(general question) Approaches for storing large amounts of simple data structures

I have a project where I want to store a data structure in a file, binary or ASCII, and I want to use Haskell to read and write the file. I will have about half a million records, so it would be nice if the format loaded quickly. I guess I could use XML, but I would rather avoid it. I have the following structure, in pseudo code:

    URL -> id -> keywords associated with that URL -> title associated with that URL -> links contained in that URL (0 ... N)

What is an approach for saving 500 thousand records of this type, such that I can load the data back into a Haskell data type?

-- Berlin Brown [berlin dot brown at gmail dot com] http://botspiritcompany.com/botlist/?
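In Haskell, that pseudo-code structure might be captured by a record along these lines (the type and field names here are only illustrative, not from the original post):

    -- One record per crawled URL; names are hypothetical.
    data UrlRecord = UrlRecord
        { urlId    :: Int        -- numeric id for the URL
        , url      :: String     -- the URL itself
        , keywords :: [String]   -- keywords associated with the URL
        , title    :: String     -- title associated with the URL
        , links    :: [String]   -- links contained in the page (0 ... N)
        }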

bbrown:
What is an approach for saving 500 thousand records of this type, such that I can load the data back into a Haskell data type?
Data.Binary is the standard approach for high-performance serialising of large data to and from Haskell types. It composes well with the gzip library too, so you can compress the stream on the way out and decompress it lazily on the way in.

http://hackage.haskell.org/cgi-bin/hackage-scripts/package/binary-0.4.1

The interface is really simple:

    encode :: Binary a => a -> ByteString
    decode :: Binary a => ByteString -> a

for marshalling a Haskell type 'a' into a bytestring and back.

-- Don
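A minimal sketch of how this composes in practice, reusing the UrlRecord type sketched above (the helper names saveRecords and loadRecords are illustrative, not part of the library):

    import Data.Binary
    import qualified Data.ByteString.Lazy as BL
    import Codec.Compression.GZip (compress, decompress)

    -- A hand-written Binary instance: get must read fields
    -- in the same order that put writes them.
    instance Binary UrlRecord where
        put (UrlRecord i u ks t ls) =
            put i >> put u >> put ks >> put t >> put ls
        get = do
            i  <- get
            u  <- get
            ks <- get
            t  <- get
            ls <- get
            return (UrlRecord i u ks t ls)

    -- Encode all records and gzip the stream on the way to disk.
    saveRecords :: FilePath -> [UrlRecord] -> IO ()
    saveRecords path = BL.writeFile path . compress . encode

    -- Read the file back, decompressing and decoding the records.
    loadRecords :: FilePath -> IO [UrlRecord]
    loadRecords path = fmap (decode . decompress) (BL.readFile path)

Both compress and decompress work over lazy ByteStrings, which is what lets the stream be compressed on the way out and decompressed lazily on the way in, as described above.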