
Hello Simon,

Wednesday, April 19, 2006, 7:28:23 PM, you wrote:
GHC might well be able to make use of such stuff too. In general, one would like to be able to treat a file much like a database, as you suggest, with binary serialisation of data structures into it.
what do you mean by "database"? what operations do you need, in addition to sequential read and write?
GHC's serialisation also includes a simple commoning-up mechanism for "leaves", especially strings. We build a kind of dictionary, to avoid repeatedly re-serialising the same string. I guess that any good binary serialisation will want to do something similar. (Or something more dynamic, a la arithmetic coding.)
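A minimal sketch of the commoning-up idea described above, to make the mechanism concrete. It is not GHC's actual format: `Token`, `Writer`, `putString` and `written` are hypothetical names, and the writer emits symbolic tokens instead of bytes. The first occurrence of a string is written in full and later occurrences become an index into the table built so far.

```haskell
import qualified Data.Map.Strict as Map
import Data.IORef
import Data.Word (Word32)

-- Hypothetical output token: a first-time definition of a string, or a
-- back-reference to one defined earlier. A real format would emit bytes.
data Token = Define Word32 String | Ref Word32
  deriving (Eq, Show)

-- Hypothetical writer state: the strings seen so far with their indices,
-- and the tokens emitted so far (newest first).
data Writer = Writer
  { seen   :: IORef (Map.Map String Word32)
  , output :: IORef [Token]
  }

newWriter :: IO Writer
newWriter = Writer <$> newIORef Map.empty <*> newIORef []

-- Emit a string: a full definition the first time, only an index afterwards.
putString :: Writer -> String -> IO ()
putString w s = do
  table <- readIORef (seen w)
  case Map.lookup s table of
    Just ix -> modifyIORef' (output w) (Ref ix :)
    Nothing -> do
      let ix = fromIntegral (Map.size table)
      writeIORef (seen w) (Map.insert s ix table)
      modifyIORef' (output w) (Define ix s :)

-- Tokens in the order they were emitted.
written :: Writer -> IO [Token]
written w = reverse <$> readIORef (output w)
```

The table lives in the writer, not in the string type, which is why the serialisation code for strings needs access to stream-specific state.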
arithmetic coding in Haskell? :) it will be MUCH faster to use the simplest form of serialization and then call a C compression library such as ziplib

i just scanned ghc's Binary library and can say which features i have not implemented in my lib:

1) lazyGet/lazyPut. it's no problem to copy your implementation, but i still don't understand how lazyGet should work - it shares the same buffer pointer as the one used in `get`, so `get` and consuming the structure returned by lazyGet should interfere

2) i don't think that dictionary sharing should be part of a general Binary library, but i tried to implement my lib so that this can be implemented in user code. it seems that i failed, and i think that it is Haskell's drawback :) let's see: we want to use a dictionary in the get/put_ functions for FastString, so that a large data structure that includes strings can be serialized with just `put`. but `put` has the following signature:

    class Binary a where
        put :: OutByteStream IO h => h -> a -> IO ()

where OutByteStream is defined as

    class OutByteStream m h where
        vPutByte :: h -> Word8 -> m ()

so `put` only has access to OutByteStream's functions (i.e. only vPutByte) and can't deal with any data specific to the user-supplied stream, including its dictionary. well, we can redefine Binary:

    class OutByteStream m h => Binary m h a where
        put :: h -> a -> m ()

    instance Binary IO StreamWithDict FastString where
        put = ...  -- now `put` can use functions specific to StreamWithDict

but there is a catch again:

    instance (Monad m, Binary m h a) => Binary m h [a] where
        put h = mapM_ (put h)

here the internal call to `put` will again receive only the OutByteStream dictionary! the instance for FastString will just not be matched!!!

btw, Haskell type classes have many non-obvious peculiarities. for example, it was not easy for me to understand that Haskell resolves all overloading at compile time but finds which overloaded function to call at runtime (well, i can't even describe this behavior).
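The compile-time/runtime behaviour mentioned above is usually explained by dictionary passing: the type checker picks the instance at compile time, but inside a polymorphic function the method is taken from a record of functions passed in at run time. The sketch below shows a small class next to a hand-written version of that translation. The names `Put`, `PutDict`, `listDict` and `putTwice` are hypothetical and deliberately simpler than the Binary class discussed here.

```haskell
import Data.Char (ord)
import Data.Word (Word8)

-- The class as one would normally write it.
class Put a where
  put :: a -> [Word8]

instance Put Bool where
  put b = [if b then 1 else 0]

instance Put Char where
  put c = [fromIntegral (ord c)]  -- truncates to one byte; enough for a sketch

instance Put a => Put [a] where
  put = concatMap put

-- Roughly what the class becomes: one record of methods per instance.
newtype PutDict a = PutDict { putWith :: a -> [Word8] }

boolDict :: PutDict Bool
boolDict = PutDict (\b -> [if b then 1 else 0])

charDict :: PutDict Char
charDict = PutDict (\c -> [fromIntegral (ord c)])

-- The list instance is a function from the element dictionary
-- to a dictionary for lists of that element.
listDict :: PutDict a -> PutDict [a]
listDict d = PutDict (concatMap (putWith d))

-- An overloaded function ...
putTwice :: Put a => a -> [Word8]
putTwice x = put x ++ put x

-- ... and its translation, with the dictionary as an explicit argument.
putTwice' :: PutDict a -> a -> [Word8]
putTwice' d x = putWith d x ++ putWith d x
```

In this reading, a call such as `putTwice [True, False]` is compiled to something like `putTwice' (listDict boolDict) [True, False]`: which dictionary to build is decided statically from the types at the call site, while the body of `putTwice'` only sees whatever dictionary it is handed.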
can you recommend a paper to read about using the Haskell class system?

-- 
Best regards,
 Bulat                            mailto:Bulat.Ziganshin@gmail.com