
Duncan Coutts wrote: [...]
Tar.unpack dir . Tar.read . GZip.decompress =<< BS.readFile tar
or
BS.writeFile tar . GZip.compress . Tar.write =<< Tar.pack base dir [...]
The sources in cabal-install seem to be the most up to date (because of cabal-install-0.6.2), and it would make sense to take these sources and use them to replace those in the tar package.
Yes, that's what I was doing over the weekend.
thanks a lot!
darcs get http://code.haskell.org/tar/
Let me know what you think about the API and documentation. You mentioned above the exporting of internal data structures. As far as I can see, everything that is exported in the current code is needed. Let me know if you think it is too much or too little.
Ok, I think the API is too big (for a casual user). I don't want to know anything about the internals of an "Entry" or about a "TarPath". For refactoring cabal-install (using your tar package) the following interface was enough:

  create :: FilePath -> FilePath -> FilePath -> IO ()
  extract :: FilePath -> FilePath -> IO ()
  read :: ByteString -> Entries
  write :: [Entry] -> ByteString
  pack :: FilePath -> FilePath -> IO [Entry]
  unpack :: FilePath -> Entries -> IO ()

  data Entry
  fileName :: Entry -> FilePath
  fileContent :: Entry -> ByteString

  data Entries = Next Entry Entries | Done | Fail String

Maybe only an "isNormalFile" test function for an Entry is missing.

checkSecurity is not needed in the API, because it is done by unpack. (checkTarBomb does nothing currently.) Tar entries should not (and usually will not) be constructed by the user. getDirectoryContentsRecursive does not really belong in this tar package.

I would be happy if the existence of TarPath (and all the other funny entry fields) could be hidden from the user. Manipulating Entries is also not a typical user task. (Maybe the type Entries should just be "[Either String Entry]", but the given type is fine, as it only allows a final failure string.)

So rather than re-exporting almost everything from the other modules in the top module, I suggest my API above and simply exposing all other modules in case someone wants the internals.
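To make the remark about "[Either String Entry]" concrete, here is a minimal sketch of how the two representations relate. The Entry record below is only a stand-in (it uses String where the proposal has ByteString, so that the example needs nothing beyond base), and the helper names foldEntries, toEithers and fromEithers are assumptions made for illustration, not part of the API proposed above.

```haskell
-- Stand-in for the package's entry type; the real Entry has more fields
-- and would carry a ByteString. String keeps this sketch base-only.
data Entry = Entry
  { fileName    :: FilePath
  , fileContent :: String
  } deriving (Eq, Show)

-- The stream type from the proposal: a failure can only end the stream.
data Entries = Next Entry Entries | Done | Fail String

-- Hypothetical helper: a lazy right fold over the stream.
foldEntries :: (Entry -> a -> a) -> a -> (String -> a) -> Entries -> a
foldEntries next done failed = go
  where
    go (Next e es) = next e (go es)
    go Done        = done
    go (Fail msg)  = failed msg

-- Every Entries value maps to a list with at most one Left, in last position.
toEithers :: Entries -> [Either String Entry]
toEithers = foldEntries (\e rest -> Right e : rest) [] (\msg -> [Left msg])

-- The list type is looser: a Left may appear anywhere, so the conversion
-- back has to drop whatever follows the first failure.
fromEithers :: [Either String Entry] -> Entries
fromEithers []               = Done
fromEithers (Left msg : _)   = Fail msg
fromEithers (Right e : rest) = Next e (fromEithers rest)
```

A list such as [Right a, Left "bad header", Right b] type-checks but has no counterpart in Entries, which is the sense in which the given type only allows a final failure.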
Currently I get round-trip byte-for-byte compatibility with about 50% of the .tar.gz packages on my system (I'm on Gentoo, so there are lots of those). The ones that are not byte-for-byte equal after reading and writing are still readable by other tools (and are probably normalised and closer to standards-compliant), but this needs investigating in more detail.
The checking API is incomplete (security, tarbombs, portability) and there are no tests for the lazy streaming properties yet (i.e. that we can process arbitrarily large archives in constant space).
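As a rough idea of what a first streaming test could look like, the sketch below runs a consumer over an endless entry stream and checks that it still returns. The cut-down Entry type and the names endless and firstNames are assumptions for illustration only. Note that this only checks that the consumer does not force the whole stream; it says nothing about actual memory use, which would need heap profiling or a run under a limited heap.

```haskell
-- Cut-down stand-ins for the package's types; only the name is kept.
data Entry = Entry { fileName :: FilePath } deriving (Eq, Show)

data Entries = Next Entry Entries | Done | Fail String

-- A hypothetical archive that never ends. Any consumer that forces the
-- whole stream before producing output will not terminate on it.
endless :: Entries
endless = go (0 :: Integer)
  where
    go n = Next (Entry ("file" ++ show n)) (go (n + 1))

-- Names of at most the first n entries; stops early on Done or Fail.
firstNames :: Int -> Entries -> [FilePath]
firstNames n _ | n <= 0  = []
firstNames n (Next e es) = fileName e : firstNames (n - 1) es
firstNames _ _           = []
```

For example, firstNames 3 endless evaluates to ["file0","file1","file2"], which is only possible if the stream is consumed incrementally.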
I can only suggest releasing it soon, using it for cabal-install, and making a new release of cabal-install for ghc-6.10.2.

Thank you, Duncan

Christian

P.S. I could (darcs) send you my (humble) changes to cabal-install and tar.