
I've written a library, zip-archive, for dealing with zip archives.

Haddock documentation (with links to source code): http://johnmacfarlane.net/zip-archive/
Darcs repository: http://johnmacfarlane.net/repos/zip-archive/

It comes with an example program that duplicates some of the functionality of 'zip' (configure with '-fexecutable' to build it).

I intend to put it on HackageDB, but I thought I'd get some feedback first. Bug reports, patches, and suggestions on the API are all welcome.

John

On Mon, 2008-08-25 at 23:22 -0700, John MacFarlane wrote:
> I've written a library, zip-archive, for dealing with zip archives.
Great. I saw your query about this from a month ago.
> Haddock documentation (with links to source code): http://johnmacfarlane.net/zip-archive/
> Darcs repository: http://johnmacfarlane.net/repos/zip-archive/
> It comes with an example program that duplicates some of the functionality of 'zip' (configure with '-fexecutable' to build it).
> I intend to put it on HackageDB, but I thought I'd get some feedback first. Bug reports, patches, and suggestions on the API are all welcome.
Generally it looks good, in that the operations on the archive are mostly separated from the IO of writing out archives or creating entries from disk files etc.

Looking at the API, there feels to be slightly too much exposed. E.g. does MSDOSDateTime need to be exposed, or the (de)compressData functions?

I've been reworking the tar library recently and currently have an API that looks like:

    -- * Reading and writing the tar format
    read  :: ByteString -> Entries
    write :: [Entry] -> ByteString

    -- * Packing and unpacking files to\/from a tar archive
    pack   :: FilePath -> FilePath -> IO [Entry]
    unpack :: FilePath -> Entries -> IO ()

Entry is like your ZipEntry. Entries is a little special. Tar is really a linear/streamable format; we typically read the file front to back. Of course with zip it's more complex, as you have an index (right?) and you can jump around without reading all the data. So Entries represents the unfolding of a tar file as a sequence of entries, but with the possibility of failure (e.g. format decoding failures):

    -- | A tar archive is a sequence of entries.
    data Entries = Next Entry Entries
                 | Done
                 | Fail String

So that's why we have Entries for the result of decoding and just an ordinary list for the input to encoding. Zip is more complex of course, because you often want to add files to existing archives, or look up individual entries without just iterating through each entry.

My personal inclination is to leave off the Zip prefix in the names and use qualified imports. I'd also leave out trivial compositions like

    readZipArchive f = toZipArchive <$> B.readFile f
    writeZipArchive f = B.writeFile f . fromZipArchive

but reasonable people disagree.

For both the pack in my tar lib and your addFilesToZipArchive, there's a getDirectoryContentsRecursive function asking to get out. This function seems to come up often.
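A minimal sketch of what such a getDirectoryContentsRecursive might look like, assuming the directory and filepath packages. This is a hypothetical helper written for illustration, not a function taken from the tar or zip-archive libraries:

```haskell
import Control.Monad (forM)
import System.Directory (doesDirectoryExist, listDirectory)
import System.FilePath ((</>))

-- | List every non-directory path beneath the given directory, each
-- prefixed with the directory it was reached through.
-- Hypothetical helper: symbolic links to directories are followed,
-- so a cyclic link would make this loop forever.
getDirectoryContentsRecursive :: FilePath -> IO [FilePath]
getDirectoryContentsRecursive dir = do
  names <- listDirectory dir              -- omits "." and ".."
  paths <- forM names $ \name -> do
    let path = dir </> name
    isDir <- doesDirectoryExist path
    if isDir
      then getDirectoryContentsRecursive path  -- descend into subdirectory
      else return [path]                       -- keep ordinary entries
  return (concat paths)
```

Returning plain paths keeps the traversal separate from whatever is done with each file, so the caller decides how entries are created from them.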
Ideally pack/unpack and addFilesToZipArchive/extractFilesFromZipArchive would just be mapM_ extract or create for an individual entry over the contents of the archive or the result of a recursive traversal.

So yeah, I feel these operations ought to be simpler compositions of other things, in your lib and mine, since this is often the part where different use cases need slight variations, e.g. in how they write files, or deal with OS-specific permissions/security stuff. So if these are compositions of simpler stuff, it should be easier to add in extra stuff or replace bits.

Duncan
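One way to read the "mapM_ over the contents of the archive" idea for the Entries type quoted in this message is a small fold that runs an action per entry and reports a decoding failure. This is only a sketch: Entry below is a stand-in record with a single field, and foldEntries and forEntries_ are illustrative names, not a claim about the actual API of the tar library.

```haskell
-- Stand-in for the real entry type; only the path matters in this sketch.
newtype Entry = Entry { entryPath :: FilePath }

-- | A tar archive is a sequence of entries (as in the message above).
data Entries = Next Entry Entries
             | Done
             | Fail String

-- | Right fold over Entries, with separate cases for a clean end
-- and for a decoding failure.
foldEntries :: (Entry -> a -> a) -> a -> (String -> a) -> Entries -> a
foldEntries next done failed = go
  where
    go (Next e es) = next e (go es)
    go Done        = done
    go (Fail err)  = failed err

-- | Run an action on each entry in order. Actions for entries that
-- precede a failure have already run by the time Left is returned.
forEntries_ :: (Entry -> IO ()) -> Entries -> IO (Either String ())
forEntries_ act =
  foldEntries (\e rest -> act e >> rest) (return (Right ())) (return . Left)
```

With something like this, an unpack becomes forEntries_ applied to a per-entry extraction action, and a caller that needs different file-writing or permission handling only swaps that action.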

Thanks again for the feedback! I've modified the zip-archive library along the lines you suggested. Version 0.1 is now available on HackageDB.

John

+++ Duncan Coutts [Aug 26 08 21:36 ]:
> Generally it looks good, that the operations on the archive are mostly separated from IO of writing out archives or creating entries from disk files etc.
>
> Looking at the API there feels to be slightly too much exposed. Eg does the MSDOSDateTime need to be exposed, or the (de)compressData functions.
>
> My personal inclination is to leave off the Zip prefix in the names and use qualified imports. I'd also leave out trivial compositions like
>
>     readZipArchive f = toZipArchive <$> B.readFile f
>     writeZipArchive f = B.writeFile f . fromZipArchive
>
> but reasonable people disagree.
>
> For both the pack in my tar lib and your addFilesToZipArchive, there's a getDirectoryContentsRecursive function asking to get out. This function seems to come up often. Ideally pack/unpack and addFilesToZipArchive/extractFilesFromZipArchive would just be mapM_ extract or create for an individual entry over the contents of the archive or the result of a recursive traversal.
>
> So yeah, I feel these operations ought to be simpler compositions of other things, in your lib and mine, since this bit is often the part where different use cases need slight variations, eg in how they write files, or deal with os-specific permissions/security stuff. So if these are compositions of simpler stuff it should be easier to add in extra stuff or replace bits.
>
> Duncan

jgm:
> Thanks again for the feedback! I've modified the zip-archive library along the lines you suggested. Version 0.1 is now available on HackageDB.
And, of course, natively packaged for Arch:

http://aur.archlinux.org/packages.php?ID=19555

Go, packagers, go! :)

-- Don