
Dear all,

if you want to temporarily store Haskell data in a file – do you have a special way to get it done efficiently?

In an offline, standalone app, I am continuously reusing data volumes of about 200MB, representing Map-like tables of a rather simple structure,

key: (Int,Int,Int) value: [((Int,Int),LinkId)]

which take quite a good deal of time to produce.

Is there a recommendation about how to 'park' such data tables most efficiently in files – any format acceptable, quick loading time is the most desirable thing.

Thanks a lot in advance,
Nick

On Mon, Jan 23, 2012 at 9:37 PM, Nick Rudnick wrote:
> if you want to temporarily store haskell data in a file – do you have a special way to get it done efficiently?
> In an offline, standalone app, I am continuously reusing data volumes of about 200MB, representing Map like tables of a rather simple structure,
> key: (Int,Int,Int) value: [((Int,Int),LinkId)]
> which take quite a good deal of time to produce.
> Is there a recommendation about how to 'park' such data tables most efficiently in files – any format acceptable, quick loading time is the most desirable thing.

Use cereal [1], usually it's fast and easy enough. If you need to be able to access your files for a long time, consider using safecopy [2] (which internally uses cereal as well).

[1] http://hackage.haskell.org/package/cereal
[2] http://hackage.haskell.org/package/safecopy

HTH,
-- Felipe.
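A minimal sketch of what the cereal suggestion could look like for the table type from the original question. It assumes `LinkId` is a plain `Int` (the thread never shows its definition), and the names `Table`, `saveTable` and `loadTable` are made up for illustration. It relies on the `Serialize` instances that cereal provides for `Int`, tuples, lists and `Data.Map`.

```haskell
import qualified Data.ByteString as BS
import qualified Data.Map as Map
import Data.Serialize (decode, encode)

-- Hypothetical stand-in: the real LinkId type is not shown in the thread.
type LinkId = Int

-- The table shape described in the question.
type Table = Map.Map (Int, Int, Int) [((Int, Int), LinkId)]

-- Serialize the whole table to a strict ByteString and write it out.
saveTable :: FilePath -> Table -> IO ()
saveTable path table = BS.writeFile path (encode table)

-- Read the file back; Left carries cereal's error message on a bad parse.
loadTable :: FilePath -> IO (Either String Table)
loadTable path = fmap decode (BS.readFile path)
```

Calling `saveTable "table.bin" t` once after the expensive computation and `loadTable "table.bin"` on later runs would be the intended usage; whether this is fast enough for roughly 200MB of data would need to be measured.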

On 01/24/2012 07:33 AM, Gregory Crosswhite wrote:
> On 1/24/12 9:43 AM, Felipe Almeida Lessa wrote:
>> Use cereal [1], usually it's fast and easy enough.
> Out of curiosity, is binary no longer the recommended standard for such things?

binary only has an interface for processing lazy bytestrings. cereal can handle both strict and lazy bytestrings and has a partial interface like attoparsec's (which is required to do proper network/IO processing).

Fortunately it's very simple to convert between the two, since the actual serialization APIs are really close.

Features-wise, in my view, cereal is a superset of binary. The only thing missing that I've noticed is that you can't tell how many bytes you have processed with cereal.

-- Vincent
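To illustrate how close the two APIs are, here is a rough sketch of the same kind of table storage written against binary. As before, `LinkId = Int` is an assumption, and `Table`, `saveTableBinary` and `loadTableBinary` are hypothetical names. It also assumes a version of binary that exports `decodeFileOrFail`; older releases only offer `decodeFile`, which throws on malformed input.

```haskell
import Data.Binary (decodeFileOrFail, encodeFile)
import qualified Data.Map as Map

-- Hypothetical stand-in: the real LinkId type is not shown in the thread.
type LinkId = Int

-- The table shape described in the question.
type Table = Map.Map (Int, Int, Int) [((Int, Int), LinkId)]

-- binary ships its own file helper, so no explicit ByteString handling.
saveTableBinary :: FilePath -> Table -> IO ()
saveTableBinary = encodeFile

-- decodeFileOrFail reports (byte offset, message) on failure;
-- only the message is kept here.
loadTableBinary :: FilePath -> IO (Either String Table)
loadTableBinary path = do
  result <- decodeFileOrFail path
  return (either (Left . snd) Right result)
```

Apart from the module name and the file helpers, the code that uses the table stays the same, which matches the point that switching between the two libraries is cheap.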

On 1/24/12 5:51 PM, Vincent Hanquez wrote:
> binary only has an interface for processing lazy bytestrings. cereal can handle both strict and lazy bytestrings and has a partial interface like attoparsec's (which is required to do proper network/IO processing).
> Fortunately it's very simple to convert between the two, since the actual serialization APIs are really close.
> Features-wise, in my view, cereal is a superset of binary. The only thing missing that I've noticed is that you can't tell how many bytes you have processed with cereal.
Fair enough, it's just that I had gotten the impression that for some time the binary package was considered by the community to be the "standard" way of serializing/deserializing values. Is this no longer the case?

Cheers,
Greg

From my experience I can recommend msgpack (http://hackage.haskell.org/package/msgpack) as being extremely fast. It comes with optimized, ready-made instances for common data structures, which is very nice because you don't have to roll your own version as you would with a library like cereal (which is indeed very fast, but simply less convenient).
Best regards,
Krzysztof Skrzętnicki
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

It's not as efficient for Maps, but you might want to look at the
swapper package:
http://hackage.haskell.org/package/swapper
It transfers Haskell data structures (any functors) directly to and from disk.
Tom
participants (6)
- Felipe Almeida Lessa
- Gregory Crosswhite
- Krzysztof Skrzętnicki
- Nick Rudnick
- Tom Murphy
- Vincent Hanquez