RE: Persistent (as in on disk) data

a recent post reminded me of a feature i'd like. for all i know it is already implemented in GHC so pointers are welcome.
i'd like to be able to dump data structures to disk, and later load them.
A Binary library was discussed recently on the libraries list. The thread starts here:

http://www.haskell.org/pipermail/libraries/2002-November/000691.html

It's currently stalled. There are several implementations of Binary: one that comes with NHC and is described in a paper (sorry, don't have a link to hand), a port of this library to GHC by Sven Panne (suffers from bitrot), a derived/simplified version used in GHC which is heavily hacked for speed, and a further derived version of this library by Hal Daume who is adapting it to support bit-by-bit serialisation.

I think the outstanding issues are

(a) is the API for GHC's Binary library acceptable, or do we need the extra bells and whistles that the NHC version has?

(b) can we make a version of Binary that uses a bit-by-bit rather than byte-by-byte serialisation of the data that is as fast (or nearly as fast) as the current byte-by-byte implementation? Perhaps performance isn't that important to the majority of people: please comment if you have an opinion.

(c) how do we derive instances of Binary?

IMHO: something is better than nothing, so I'd be in favour of just plugging in the Binary library from GHC, and marking it "experimental".

Cheers,
Simon

(a) is the API for GHC's Binary library acceptable, or do we need the extra bells and whistles that the NHC version has?
I would far prefer a library that is standard between the two compilers. As a user, I don't like using compiler-specific libraries unless absolutely necessary.
implementation? Perhaps performance isn't that important to the majority of people: please comment if you have an opinion.
Performance isn't the most important thing to me.

Amanda

"Simon Marlow"
i'd like to be able to dump data structures to disk, and later load them.
A Binary library was discussed recently on the libraries list. I think the outstanding issues are ...
(a) is the API for GHC's Binary library acceptable, or do we need the extra bells and whistles that the NHC version has?
In particular, the NHC version is platform-independent with regard to endian-ness issues, whereas I believe the GHC version is not?
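To make the endian-ness point concrete, here is a minimal sketch of writing a 32-bit word in a fixed byte order, so that the bytes on disk do not depend on the machine that wrote them. The helper names `word32ToBytesBE` and `bytesToWord32BE` are hypothetical and are not taken from either the NHC or the GHC library; the sketch makes no claim about how either library actually lays out its data.

```haskell
import Data.Bits (shiftL, shiftR, (.|.))
import Data.Word (Word8, Word32)

-- Hypothetical helper (not from any Binary library):
-- split a 32-bit word into four bytes, most significant byte first.
word32ToBytesBE :: Word32 -> [Word8]
word32ToBytesBE w = [fromIntegral (w `shiftR` s) | s <- [24, 16, 8, 0]]

-- Hypothetical inverse: rebuild the word from exactly four bytes,
-- returning Nothing if the input has any other length.
bytesToWord32BE :: [Word8] -> Maybe Word32
bytesToWord32BE bs@[_, _, _, _] = Just (foldl step 0 bs)
  where
    step acc b = (acc `shiftL` 8) .|. fromIntegral b
bytesToWord32BE _ = Nothing
```

Because the byte order is fixed by the shifts rather than by the host's memory layout, a file written this way on one architecture reads back to the same value on another.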
(b) can we make a version of Binary that uses a bit-by-bit rather than byte-by-byte serialisation of the data that is as fast (or nearly as fast) as the current byte-by-byte implementation? Perhaps performance isn't that important to the majority of people: please comment if you have an opinion.
In experiments, I found the bit-by-bit serialisation at least an order of magnitude faster than Show/Read. The extra speed margin obtained by going byte-wise might be nice, but isn't so important (to me) as that first step from text to binary.
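As a rough illustration of the three representations being compared (text via Show/Read, byte-wise, and bit-wise), the sketch below encodes a list of Bool each way. All function names are hypothetical and nothing here is taken from the NHC or GHC libraries. It only shows the difference in encoded size and says nothing about relative speed.

```haskell
import Data.Bits (setBit, testBit)
import Data.Word (Word8)

-- Text route: the Show output, which Read would parse back.
asText :: [Bool] -> String
asText = show

-- Byte-wise route (hypothetical): one whole byte per value.
asBytes :: [Bool] -> [Word8]
asBytes = map (\b -> if b then 1 else 0)

-- Bit-wise route (hypothetical): pack up to eight flags into each byte,
-- with the first flag of each group stored in bit 7.
asBits :: [Bool] -> [Word8]
asBits [] = []
asBits bs = packByte chunk : asBits rest
  where
    (chunk, rest) = splitAt 8 bs
    packByte flags = foldl setFlag 0 (zip [7, 6 .. 0] flags)
    setFlag acc (i, True)  = setBit acc i
    setFlag acc (_, False) = acc

-- Inverse of asBits, given the number of flags originally written.
fromBits :: Int -> [Word8] -> [Bool]
fromBits n ws = take n [testBit w i | w <- ws, i <- [7, 6 .. 0]]
```

For nine flags, `asBytes` yields nine bytes and `asBits` yields two, while `asText` spells out every constructor name. The bit-wise decoder needs the original count, since the last byte may be only partly used.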
(c) how do we derive instances of Binary?
The nhc98 compiler already accepts a `deriving Binary' clause. DrIFT likewise already supports {-! derives : Binary !-}. I imagine it wouldn't be too difficult to use GHC's support for Generics to code up a simple deriving mechanism as well.
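As a sketch of what a generics-based deriving mechanism could look like, the following uses the present-day `GHC.Generics` module, which may differ from the Generics support referred to above. The class names `Serial` and `GSerial`, the `put` method, and the `Shape` type are all hypothetical; only encoding is shown, and the tag scheme (one byte per choice between constructors) is an arbitrary choice for illustration, not the format of any existing Binary library.

```haskell
{-# LANGUAGE DefaultSignatures, DeriveGeneric, FlexibleContexts, TypeOperators #-}
import Data.Word (Word8)
import GHC.Generics

-- Hypothetical class: encode a value as a list of bytes.
class Serial a where
  put :: a -> [Word8]
  -- Default: go through the generic representation of the type.
  default put :: (Generic a, GSerial (Rep a)) => a -> [Word8]
  put = gput . from

-- Encoder over the generic representation.
class GSerial f where
  gput :: f p -> [Word8]

instance GSerial U1 where                       -- constructor with no fields
  gput U1 = []
instance (GSerial a, GSerial b) => GSerial (a :*: b) where  -- fields in order
  gput (a :*: b) = gput a ++ gput b
instance (GSerial a, GSerial b) => GSerial (a :+: b) where  -- tag each choice
  gput (L1 a) = 0 : gput a
  gput (R1 b) = 1 : gput b
instance GSerial a => GSerial (M1 i c a) where  -- skip metadata wrappers
  gput (M1 x) = gput x
instance Serial a => GSerial (K1 i a) where     -- a field: use its own instance
  gput (K1 x) = put x

instance Serial Word8 where
  put w = [w]

-- Hypothetical user type: the instance body is empty because the
-- generic default supplies put.
data Shape = Circle Word8 | Rect Word8 Word8
  deriving (Generic)

instance Serial Shape
```

With these definitions, `put (Rect 3 4)` produces a tag byte followed by the two fields. A matching decoder would need a second generic class walking the same representation.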
IMHO: something is better than nothing, so I'd be in favour of just plugging in the Binary library from GHC, and marking it "experimental".
Something I have never got round to asking before is what were the perceived defects in the nhc98 Binary library that encouraged (a) Sven to rewrite it for GHC, and (b) Simon to rewrite it again?

Regards,
Malcolm
participants (3): Amanda Clare, Malcolm Wallace, Simon Marlow