
On Mon, 2009-02-23 at 17:03 -0800, Don Stewart wrote:
Here's a quick demo using Data.Binary directly.
[...]
$ time ./A dict +RTS -K20M
52848
"done"
./A dict +RTS -K20M  1.51s user 0.06s system 99% cpu 1.582 total
Ok. So 1.5s to decode a 1.3M Map. There may be better ways to build the Map, since we know the input will be sorted, but the generic Data.Binary instance can't take advantage of that.
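As a hedged sketch (not code from this thread) of what "a better way" could look like: Data.Binary serialises a Map in ascending key order, so a custom instance on a newtype wrapper can rebuild it with the O(n) fromDistinctAscList instead of the generic O(n log n) fromList. The AscMap name is made up for illustration.

```haskell
import qualified Data.Map as Map
import Data.Binary (Binary(..), encode, decode)

-- Hypothetical wrapper: its Binary instance assumes the pairs on the
-- wire are strictly ascending, which holds because we emit them with
-- toAscList, so decoding can use the linear-time fromDistinctAscList.
newtype AscMap k v = AscMap { getMap :: Map.Map k v }

instance (Binary k, Binary v) => Binary (AscMap k v) where
    put (AscMap m) = put (Map.toAscList m)
    get            = fmap (AscMap . Map.fromDistinctAscList) get

main :: IO ()
main = do
    let m = Map.fromList [(i, show i) | i <- [1 .. 10 :: Int]]
    -- Round-trip through encode/decode and check we get the same Map.
    print (getMap (decode (encode (AscMap m))) == m)
```

Whether this actually beats the stock instance on a 1.3M Map would need measuring, but it removes the per-key comparisons from the rebuild.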
[...]
$ time ./A dict.gz
52848
"done"
./A dict.gz  0.28s user 0.03s system 98% cpu 0.310 total
Interesting. So extracting the Map from a compressed bytestring in memory is a fair bit faster than loading it directly, uncompressed from disk.
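For concreteness, the two paths being compared could be sketched like this, assuming the binary and zlib packages; the file names and the Map's key/value types are illustrative, not taken from the thread:

```haskell
import qualified Data.ByteString.Lazy as L
import qualified Data.Map as Map
import Data.Binary (decode)
import Codec.Compression.GZip (decompress)

-- Decode the Map straight from the uncompressed file on disk.
loadPlain :: FilePath -> IO (Map.Map String Int)
loadPlain f = fmap decode (L.readFile f)

-- Read the gzip'd file, decompress it in memory, then decode.
loadGz :: FilePath -> IO (Map.Map String Int)
loadGz f = fmap (decode . decompress) (L.readFile f)

main :: IO ()
main = do
    m <- loadGz "dict.gz"   -- should yield the same Map as loadPlain "dict"
    print (Map.size m)
```

Both paths do the same Data.Binary decoding; the only difference is whether the bytes come through zlib first.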
That's actually rather surprising. The system time is negligible, and the difference between total and user time doesn't leave much room for time wasted doing I/O, so that's a real difference in user time. So what is going on? We're doing the same amount of binary decoding in each case, right? We're also allocating the same number of buffers, in fact slightly more in the case that uses compression. The time taken to cat a meg through a Handle using lazy bytestrings is nothing. So where is all that time going?

Duncan
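One way to start narrowing this down would be to separate the I/O from the decoding: force the whole bytestring into memory first, then time only the decode. A rough sketch, assuming the time package and an illustrative file name and Map type:

```haskell
import qualified Data.ByteString.Lazy as L
import qualified Data.Map as Map
import Data.Binary (decode)
import Data.Time.Clock (getCurrentTime, diffUTCTime)
import Control.Exception (evaluate)

main :: IO ()
main = do
    bs <- L.readFile "dict"
    _  <- evaluate (L.length bs)      -- force the file fully into memory
    t0 <- getCurrentTime
    let m = decode bs :: Map.Map String Int
    _  <- evaluate (Map.size m)       -- force the Map to be built
    t1 <- getCurrentTime
    print (diffUTCTime t1 t0)         -- decode time, with I/O excluded
```

If the pure decode times come out equal for the compressed and uncompressed paths, the difference must be in how the bytes are delivered rather than in the decoding itself.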