workarounds for Codec.Compression.Zlib errors in darcs

Hi everybody,

This advisory is for people who have installed darcs 2.1.2 via the Cabal build method. As you may have noticed, the cabalised darcs sometimes fails with errors like

  Codec.Compression.Zlib: incorrect data check

Why this happens
----------------
Older versions of darcs can produce gzipped files with broken CRCs. We never noticed this because our homegrown wrapper around the C libz library does not pick up these errors. Lately, we have been working on adopting the Haskell zlib wrapper, which is made available by default in darcs.cabal. This new wrapper is more stringent and fails when it encounters these files.

Workaround 1 : use C libz instead of Haskell zlib
-------------------------------------------------
So how can you work around these errors? If you are building darcs on any Unix-y operating system (e.g. Linux or Mac OS X), you can cabal configure darcs to use the old C libz binding:

  cabal configure -f external-zlib

This will restore our homegrown wrapper, which ignores the broken CRCs. (Note that the darcs head no longer *produces* these broken files, thanks to debugging by Matthias Andree and to a bugfix by David Roundy; see http://bugs.darcs.net/issue844 for details.)

In principle, the same advice applies for Windows users, with more details hopefully to follow on how to get the C libz into a GHC-accessible location. In the meantime, you can either build darcs using the old configure-and-make technique (assuming you have MSYS and related tools), or use a binary that does not use the Haskell zlib wrapper (for example, by downgrading to http://www.haskell.org/~simonmar/darcs-2.0.2+75.zip ).

Workaround 2 : fix your broken gzipped files
--------------------------------------------
If you have control over the repositories with broken gzipped files, it should be possible to repair these files by gunzipping them and then redoing the gzip. We think that the attached script should help. Please report back if this is not the case.
How we will fix this problem in the long term
---------------------------------------------
I'm very sorry for the grief this has caused. To begin with, we will ensure that the 2.2 release gets more testing before we release it in January. It will also handle these broken CRCs more gracefully. Our plan is to:

- either extend darcs repair or provide a Haskell script to fix these broken files
- detect the broken files and advise users to run darcs repair (or the script) as needed
- somewhere in the future, disallow broken CRC files whilst still advising users on how to fix their files.

Many thanks!

-- Eric Kow http://www.nltg.brighton.ac.uk/home/Eric.Kow PGP Key ID: 08AC04F9

On Wed, Nov 26, 2008 at 14:38:32 +0000, Eric Kow wrote:
Workaround 2 : fix your broken gzipped files
--------------------------------------------
If you have control over the repositories with broken gzipped files, it should be possible to repair these files by gunzipping them and then redoing the gzip. We think that the attached script should help. Please report back if this is not the case.
No advisory is complete without a forgotten attachment. Here it is. -- Eric Kow http://www.nltg.brighton.ac.uk/home/Eric.Kow PGP Key ID: 08AC04F9

On Wed, Nov 26, 2008 at 14:38:32 +0000, Eric Kow wrote:
Workaround 1 : use C libz instead of Haskell zlib
-------------------------------------------------
So how can you work around these errors? If you are building darcs on any Unix-y operating system (e.g. Linux or Mac OS X), you can cabal configure darcs to use the old C libz binding:
cabal configure -f external-zlib
My advice here is incorrect. We want the /opposite/:

  cabal configure -f -external-zlib

Although note that a new version, darcs 2.1.2.3, will soon be on hackage with the new default (unfortunately this makes darcs a bit harder to build on Windows, unless you're willing to use Workaround #2 and actually fix your .gz files).

-- Eric Kow http://www.nltg.brighton.ac.uk/home/Eric.Kow PGP Key ID: 08AC04F9

On Wed, 2008-11-26 at 14:38 +0000, Eric Kow wrote:
Hi everybody,
This advisory is for people who have installed darcs 2.1.2 via the Cabal build method. As you may have noticed, the cabalised darcs sometimes fails with errors like
Codec.Compression.Zlib: incorrect data check
Why this happens
----------------
Older versions of darcs can produce gzipped files with broken CRCs. We never noticed this because our homegrown wrapper around the C libz library does not pick up these errors.
I should note that one moral of this story is to check that your FFI imports are correct. That is, check they import the foreign functions at the right Haskell types. In this case the mistake was that the foreign function returned a C int, but the Haskell foreign import declaration stated that the C function returned IO () rather than IO CInt.

This is where a tool really helps. The hsc2hs tool cannot check the cross-language type consistency, while c2hs can: it reads the C header files and generates the FFI imports at the correct Haskell types. The downside is that c2hs is not shipped with ghc, it is a bit slower, and it's not quite so good with structures.

I think there is a need for a tool like c2hs but that works in a checking mode rather than in a generating mode. It would use much of the same code as c2hs, but it would read the C header files and the .hs file (via the ghc api) and check that the FFI imports are using the right types. That way it could be run to check a package without the checker tool being needed at build time on every platform. The downside would be that some C header files differ between platforms; c2hs handles this fine, while a checker tool might say it's ok on one platform and that may not carry over to another. Still, it would be an improvement on just using raw FFI imports (or hsc2hs, which is really the same thing).

Duncan
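Duncan's moral can be seen in miniature with a hedged, self-contained sketch. The actual zlib binding is not reproduced here; C's abs from stdlib.h stands in for the misdeclared function. The point is that GHC takes the declared Haskell type entirely on faith:

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main where

import Foreign.C.Types (CInt)

-- Correct: the C prototype is  int abs(int) , so the Haskell type is
-- CInt -> CInt. Had we instead declared  c_abs :: CInt -> () , GHC
-- would have accepted it without complaint -- the FFI does not check
-- against the C header by default -- and the result would simply be
-- lost, which is exactly how darcs came to ignore zlib's CRC status.
foreign import ccall unsafe "stdlib.h abs"
  c_abs :: CInt -> CInt

main :: IO ()
main = print (c_abs (-42))
```

Running this prints 42; the misdeclared variant would compile just as cleanly but discard the value.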

duncan.coutts:
On Wed, 2008-11-26 at 14:38 +0000, Eric Kow wrote:
Hi everybody,
This advisory is for people who have installed darcs 2.1.2 via the Cabal build method. As you may have noticed, the cabalised darcs sometimes fails with errors like
Codec.Compression.Zlib: incorrect data check
Why this happens
----------------
Older versions of darcs can produce gzipped files with broken CRCs. We never noticed this because our homegrown wrapper around the C libz library does not pick up these errors.
I should note that one moral of this story is to check that your FFI imports are correct. That is, check they import the foreign functions at the right Haskell types. In this case the mistake was that the foreign function returned a C int, but the Haskell foreign import declaration stated that the C function returned IO () rather than IO CInt.
This is where a tool really helps. The hsc2hs tool cannot check the cross-language type consistency while c2hs can. It reads the C header files and generates the FFI imports at the correct Haskell types.
The downside is that c2hs is not shipped with ghc, it is a bit slower and it's not quite so good with structures.
I think there is a need for a tool like c2hs but that works in a checking mode rather than in a generating mode. It would use much of the same code as c2hs but it would read the C header files and the .hs file (via ghc api) and check that the FFI imports are using the right types. That way it could be run to check a package without the checker tool being needed at build time on every platform. The downside would be that some C header files differ between platforms and c2hs handles this fine while a checker tool might say it's ok on one platform and that may not carry over to another. Still, it would be an improvement on just using raw FFI imports (or hsc2hs, which is really the same thing).
Yes, this plagued xmonad's X11 bindings: almost all bugs in the last 12 months were due to FFI bindings. I'd love a hsc2hs -Wall mode. -- Don

On Wed, 2008-11-26 at 14:30 -0800, Don Stewart wrote:
I think there is a need for a tool like c2hs but that works in a checking mode rather than in a generating mode. It would use much of the same code as c2hs but it would read the C header files and the .hs file (via ghc api) and check that the FFI imports are using the right types. That way it could be run to check a package without the checker tool being needed at build time on every platform. The downside would be that some C header files differ between platforms and c2hs handles this fine while a checker tool might say it's ok on one platform and that may not carry over to another. Still, it would be an improvement on just using raw FFI imports (or hsc2hs, which is really the same thing).
Yes, this plagued xmonad's X11 bindings: almost all bugs in the last 12 months were due to FFI bindings.
I'd love a hsc2hs -Wall mode.
Right, but it cannot be hsc2hs. The model of hsc2hs simply cannot support such a thing, because it does not actually know the C types of anything. It would have to be more on the model of c2hs, using Language.C to work out the C types and then map them to Haskell ones, to check they're the same as the declared types in the .hs files.

Duncan

... to work out the C types and then map them to Haskell ones, to check they're the same as the declared types in the .hs files.
I'd like to point out that the FFI specification already has such a mechanism. That is, if you use the optional specification of a header file for each foreign import, and if your Haskell compiler can compile via C, then any checking that types match between Haskell and C can be performed automatically, by the backend C compiler.

[ OK, so that is not the whole story, and there are good reasons why it might not always work out, but I still think it was an important principle in the original FFI design. ]

Regards, Malcolm
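The header form Malcolm refers to looks like this (a minimal sketch using sin from math.h; with a compiler that goes via C, the backend C compiler sees the named header next to the generated call and can flag a mismatch in the argument types):

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main where

import Foreign.C.Types (CDouble)

-- The import string names both the header and the symbol. A via-C
-- backend can #include "math.h" alongside the generated call, so the
-- C compiler would complain if we passed, say, a pointer where the
-- prototype expects a double.
foreign import ccall "math.h sin"
  c_sin :: CDouble -> CDouble

main :: IO ()
main = print (c_sin 0)
```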

On Wed, Nov 26, 2008 at 3:16 PM, Malcolm Wallace < malcolm.wallace@cs.york.ac.uk> wrote:
... to work out the C types and then map them to Haskell ones, to
check they're the same as the declared types in the .hs files.
I'd like to point out that the FFI specification already has such a mechanism. That is, if you use the optional specification of a header file for each foreign import, and if your Haskell compiler can compile via C, then any checking that types match between Haskell and C can be performed automatically, by the backend C compiler.
[ OK, so that is not the whole story, and there are good reasons why it might not always work out, but I still think it was an important principle in the original FFI design. ]
Would this method work with return types, since C compilers tend to let you ignore those? In the example that brought up this discussion, it was in fact an ignored return value that caused the problem.

Jason

"Jason Dagit"
That is, if you use the optional specification of a header file for each foreign import, and if your Haskell compiler can compile via C, then any checking that types match between Haskell and C can be performed automatically, by the backend C compiler.
Would this method work with return types since C compilers tend to let you ignore those? In this example that brought up this discussion it was in fact an ignored return value that caused the problem.
I've tried to look at GCC's warning options, but couldn't make it emit a warning on an ignored result. (Doesn't mean it isn't in there, of course) -k -- If I haven't seen further, it is by standing in the footprints of giants

On Wed, 2008-11-26 at 23:16 +0000, Malcolm Wallace wrote:
... to work out the C types and then map them to Haskell ones, to check they're the same as the declared types in the .hs files.
I'd like to point out that the FFI specification already has such a mechanism. That is, if you use the optional specification of a header file for each foreign import, and if your Haskell compiler can compile via C, then any checking that types match between Haskell and C can be performed automatically, by the backend C compiler.
Yes, it would have caught a similar problem in an argument position, but not in the result.
[ OK, so that is not the whole story, and there are good reasons why it might not always work out, but I still think it was an important principle in the original FFI design. ]
And covering those holes requires a tool that can grok C. Duncan

Hello Duncan, Thursday, November 27, 2008, 1:28:21 AM, you wrote:
checking mode rather than in a generating mode. It would use much of the same code as c2hs but it would read the C header files and the .hs file (via ghc api) and check that the FFI imports are using the right types.
there is an FFI imports generator (HSFFIG?), written by Dmitry Golubovsky -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

On Wed, Nov 26, 2008 at 10:28:21PM +0000, Duncan Coutts wrote:
On Wed, 2008-11-26 at 14:38 +0000, Eric Kow wrote:
Older versions of darcs can produce gzipped files with broken CRCs. We never noticed this because our homegrown wrapper around the C libz library does not pick up these errors.
I should note that one moral of this story is to check that your FFI imports are correct. That is, check they import the foreign functions at the right Haskell types. In this case the mistake was that the foreign function returned a C int, but the Haskell foreign import declaration stated that the C function returned IO () rather than IO CInt.
While that's true, Haskell also makes it easy to make the same sort of error with IO (or any other Monad) values, whether created with the FFI or not. If you say

    f = do x
           y
           z

and y has type IO CInt then you won't get an error (and I don't think you can even ask for a warning with the current implementations).

Should we have

    (>>) :: (Monad m) => m () -> m a -> m a

and force you to write

    _ <- y

?

Thanks Ian
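The silent discard Ian describes can be demonstrated with a hedged, self-contained sketch (statusAction is an invented stand-in for any FFI call that reports errors in its result):

```haskell
module Main where

-- Stand-in for a foreign call that signals failure via its return value.
statusAction :: IO Int
statusAction = return 5

f :: IO ()
f = do
  statusAction        -- result thrown away silently: type-checks fine
  _ <- statusAction   -- the explicit discard Ian suggests requiring
  r <- statusAction   -- or actually bind the result and inspect it
  print r

main :: IO ()
main = f
```

Under Ian's proposed (>>) :: m () -> m a -> m a, the first line of f would be a type error instead of a silent no-op.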

On Thu, Nov 27, 2008 at 10:20 AM, Ian Lynagh
On Wed, Nov 26, 2008 at 10:28:21PM +0000, Duncan Coutts wrote:
On Wed, 2008-11-26 at 14:38 +0000, Eric Kow wrote:
Older versions of darcs can produce gzipped files with broken CRCs. We never noticed this because our homegrown wrapper around the C libz library does not pick up these errors.
I should note that one moral of this story is to check that your FFI imports are correct. That is, check they import the foreign functions at the right Haskell types. In this case the mistake was that the foreign function returned a C int, but the Haskell foreign import declaration stated that the C function returned IO () rather than IO CInt.
While that's true, Haskell also makes it easy to make the same sort of error with IO (or any other Monad) values, whether created with the FFI or not. If you say
f = do x
       y
       z
and y has type IO CInt then you won't get an error (and I don't think you can even ask for a warning with the current implementations).
Should we have (>>) :: (Monad m) => m () -> m a -> m a and force you to write _ <- y ?
I'd like that (though I certainly didn't like that prospect when I started learning). I think the option of turning on a warning would be a nice happy medium. Luke

On Thu, 2008-11-27 at 17:20 +0000, Ian Lynagh wrote:
On Wed, Nov 26, 2008 at 10:28:21PM +0000, Duncan Coutts wrote:
I should note that one moral of this story is to check that your FFI imports are correct. That is, check they import the foreign functions at the right Haskell types. In this case the mistake was that the foreign function returned a C int, but the Haskell foreign import declaration stated that the C function returned IO () rather than IO CInt.
While that's true, Haskell also makes it easy to make the same sort of error with IO (or any other Monad) values, whether created with the FFI or not. If you say
f = do x
       y
       z
and y has type IO CInt then you won't get an error (and I don't think you can even ask for a warning with the current implementations).
Right, which is why we do not use IO CInt style error handling much in Haskell. For functions that return a real result we use Maybe; for things that would otherwise be IO (), using IO exceptions is the obvious thing to do. In either case the error is hard to ignore.

Duncan
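Duncan's point can be sketched as follows (the names are invented for the example): convert the C-style status code once, at the binding layer, so Haskell callers get an exception rather than an Int they can drop on the floor.

```haskell
module Main where

import Control.Monad (when)

-- Hypothetical low-level call reporting errors as a C-style status code.
inflateEndStub :: Int -> IO Int
inflateEndStub rc = return rc

-- Wrap it once at the binding layer: a non-zero status becomes an IO
-- exception, which callers cannot silently ignore the way they can an
-- unused Int result.
checked :: String -> IO Int -> IO ()
checked name act = do
  rc <- act
  when (rc /= 0) $
    ioError (userError (name ++ ": failed with status " ++ show rc))

main :: IO ()
main = do
  checked "inflateEnd" (inflateEndStub 0)  -- status 0: succeeds quietly
  putStrLn "ok"
```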

While that's true, Haskell also makes it easy to make the same sort of error with IO (or any other Monad) values, whether created with the FFI or not. If you say
f = do x y z
and y has type IO CInt then you won't get an error (and I don't think you can even ask for a warning with the current implementations).
Should we have (>>) :: (Monad m) => m () -> m a -> m a and force you to write _ <- y
It's interesting to note that F# follows exactly your proposal. If y has a return type other than () then you do:

    y |> ignore

where ignore :: a -> (), and (|>) = flip ($). In practice, I found this quite reasonable to use. You also eliminate "errors" such as:

    do mapM deleteFile files; return 1

where mapM requires more memory than the equivalent mapM_.

Thanks Neil
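For comparison, the Haskell spelling of F#'s ignore is trivial to define (later versions of base provide it as Control.Monad.void; it is defined locally here to stay self-contained):

```haskell
module Main where

-- Local equivalent of F#'s ignore, lifted to any Monad.
ignore :: Monad m => m a -> m ()
ignore act = act >> return ()

main :: IO ()
main = do
  ignore (return (42 :: Int))  -- the discard is now explicit and visible
  -- Neil's other point: mapM_ discards each result as it goes, so
  -- unlike mapM it never builds a list nobody will look at.
  mapM_ print [1, 2, 3 :: Int]
```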

On Wed, Nov 26, 2008 at 14:38:33 +0000, Eric Kow wrote:
In principle, the same advice applies for Windows users, with more details hopefully to follow on how the C libz in a GHC-accesible location. Details to follow.
As promised, here are the details for installing darcs using our old internal binding to libz on Windows. Again, the instructions are either to

  cabal configure -f -external-zlib

or to wait for darcs 2.1.2.3 to appear on hackage. Thanks to Salvatore Insalaco, our Windows Czar!

----------------------------------------------------------------------
There are two zlib versions for Windows: one compiled with the ccall convention, and one with the stdcall convention. We need the ccall one, at this address:

  http://gnuwin32.sourceforge.net/packages/zlib.htm

The other one will *appear* to work, but darcs will segfault as soon as the first call to zlib is made. This is the step-by-step howto:

1) Download the binary from http://gnuwin32.sourceforge.net/downlinks/zlib-bin-zip.php and unzip it.

2) Copy the zlib1.dll file from the bin directory into ghc's gcc-lib directory, and then rename it libz.dll.

3a) Copy the zlib1.dll file into the c:\windows directory, WITHOUT renaming it. OR

3b) If you don't want to pollute your windows directory, just copy it into whatever directory is in the search PATH, or into the one where you'll place the darcs.exe binary.

That's all. The cabal install will then just work. By the way: this is similar to the how-to for making curl work. Maybe we can update the windows building instructions, providing "pre-install" instructions to make cabal install darcs just work.

-- Eric Kow http://www.nltg.brighton.ac.uk/home/Eric.Kow PGP Key ID: 08AC04F9
participants (10)

- Bulat Ziganshin
- Don Stewart
- Duncan Coutts
- Eric Kow
- Ian Lynagh
- Jason Dagit
- Ketil Malde
- Luke Palmer
- Malcolm Wallace
- Neil Mitchell