
On 05/08/2009 08:44, Malcolm Wallace wrote:
Let's discuss, then have the steering committee recommend yay/nay.
We should have _some_ kind of binary library in the Platform, but I don't know whether the proposed library is the right one.
In a recent application, I found Data.Binary very slow, both to encode and to decode data. Decoding had stack-overflows, and when I increased the stack, it took about 20mins to read in an 8Mb file. When I then turned on optimisation with -O, the performance improved considerably (down to ~30secs to read the same file). This was using the standard instances of Binary for data structures like Data.Map. 30secs was still too slow, so we ended up needing to write our own improved instance for Data.Map. (Timings were similar with both ghc-6.8.3 and ghc-6.10.3.)
Ok, the way to proceed is to build a set of benchmarks. I did some rough timings myself recently, and found that binary was roughly comparable to the Binary module in GHC, which as far as I know is fairly fast (though I know there are faster libraries out there). I suspect we're all measuring different things.

Would someone like to work on benchmarking binary and identifying the weak points?

Cheers,
Simon
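
As a possible starting point for such a benchmark, here is a minimal sketch that times an encode/decode round trip of a Data.Map through the standard Binary instance. The names sampleMap, forceMap, timed and roundTrip are made up for illustration, the test data (Int keys, short String values) is an assumption and not the data from the application described above, and CPU time from System.CPUTime is only a rough measure. It should be compiled with -O, since the timings quoted above differ so much with and without optimisation.

```haskell
import qualified Data.Binary as B
import qualified Data.ByteString.Lazy as BL
import qualified Data.Map as M
import Control.Exception (evaluate)
import System.CPUTime (getCPUTime)

-- Hypothetical test data: n Int keys, each mapped to a short String.
sampleMap :: Int -> M.Map Int String
sampleMap n = M.fromList [(i, show i) | i <- [1 .. n]]

-- Force every key and value by summing the value lengths.
forceMap :: M.Map Int String -> IO Int
forceMap m = evaluate (M.foldr (\v acc -> length v + acc) 0 m)

-- Run an action and return its result with the CPU time used, in seconds.
-- getCPUTime reports picoseconds, hence the division by 1e12.
timed :: IO a -> IO (a, Double)
timed act = do
  t0 <- getCPUTime
  x  <- act
  t1 <- getCPUTime
  return (x, fromIntegral (t1 - t0) / 1e12)

-- Encode a map of size n, decode it again, and report whether the
-- round trip preserved the map, plus the encode and decode times.
roundTrip :: Int -> IO (Bool, Double, Double)
roundTrip n = do
  let m = sampleMap n
  _ <- forceMap m                      -- build the input before timing
  (bytes, tEnc) <- timed $ do
    let bs = B.encode m
    _ <- evaluate (BL.length bs)       -- force the whole lazy ByteString
    return bs
  (m', tDec) <- timed $ do
    let r = B.decode bytes :: M.Map Int String
    _ <- forceMap r                    -- force the decoded map
    return r
  return (m' == m, tEnc, tDec)

main :: IO ()
main = do
  (ok, tEnc, tDec) <- roundTrip 100000
  putStrLn ("round trip ok: " ++ show ok)
  putStrLn ("encode: " ++ show tEnc ++ " s")
  putStrLn ("decode: " ++ show tDec ++ " s")
```

Forcing the input before the clock starts, and the output before it stops, is meant to keep laziness from moving work into or out of the measured region, which may be one reason we are all measuring different things.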