
Khudyakov Alexey wrote:
On Saturday 23 May 2009 02:55:17 Antoine Latter wrote:
Or you could go for the compromise position, where the list can be part of a complex data structure so you're not relying on EOF to find the end.
Interesting solution; however, it does not perform very well. I wrote a microbenchmark:
xs :: [Word32]
xs = [1..(10^6)]

Writing chunked list of Word32:

B.writeFile "chunked" . toLazyByteString . putList putWord32be $ xs

real    0m4.311s
user    0m3.272s
sys     0m0.096s

Reading chunked list of Word32:

print . last . runGet (getList getWord32be) =<< B.readFile "chunked"

real    0m0.634s
user    0m0.496s
sys     0m0.012s

Writing stream of Word32:

B.writeFile "stream" . encodeStream $ xs

real    0m0.391s
user    0m0.252s
sys     0m0.020s

Reading stream of Word32:

print . (last :: [Word32] -> Word32) . decodeStream =<< B.readFile "stream"

real    0m0.376s
user    0m0.248s
sys     0m0.020s
I didn't do any profiling, so I have no idea why writing is so slow.
If you use the top-level definition 'xs', the program might cache the list, making the second write faster. You could change the order of the tests to check whether that is the cause.
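The thread does not show the definitions of `putList`/`getList`, so the following is a hypothetical reconstruction of what such a chunked encoding might look like: each chunk carries a one-byte element count followed by that many elements, and a zero count terminates the list. The names and the chunk format here are assumptions, not the poster's actual code.

```haskell
import Control.Monad (replicateM)
import Data.Binary.Put (Put, runPut, putWord8, putWord32be)
import Data.Binary.Get (Get, runGet, getWord8, getWord32be)
import Data.Word (Word32)

-- Hypothetical chunked encoder: emit the list in chunks of up to
-- 255 elements, each prefixed with its length as a Word8; a zero
-- length byte marks the end, so no EOF detection is needed.
putList :: (a -> Put) -> [a] -> Put
putList put = go
  where
    go [] = putWord8 0
    go ys = do
      let (chunk, rest) = splitAt 255 ys
      putWord8 (fromIntegral (length chunk))
      mapM_ put chunk
      go rest

-- Matching decoder: read a length byte, then that many elements,
-- until a zero-length terminator chunk is seen.
getList :: Get a -> Get [a]
getList get = go
  where
    go = do
      n <- getWord8
      if n == 0
        then return []
        else do
          chunk <- replicateM (fromIntegral n) get
          rest  <- go
          return (chunk ++ rest)

main :: IO ()
main = do
  let xs = [1 .. 1000] :: [Word32]
      bs = runPut (putList putWord32be xs)
  print (runGet (getList getWord32be) bs == xs)  -- True: round-trips
```

By contrast, a "stream" encoder is just `mapM_ putWord32be`, with no per-chunk bookkeeping, which is consistent with the large gap between the chunked and stream write times measured above.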

On Saturday 23 May 2009 23:23:05 Henning Thielemann wrote:
Interesting solution; however, it does not perform very well. I wrote a microbenchmark:
... skipped ...
I didn't do any profiling, so I have no idea why writing is so slow.
If you use the top-level definition 'xs', the program might cache the list, making the second write faster. You could change the order of the tests to check whether that is the cause.
There was a separate process for each test, so caching could not have affected the results.

andrewcoppin:
The problem seems to boil down to this: the Binary instance for Double (and Float, by the way) is... well, I guess you could argue it's very portable, but efficient it isn't. As we all know, an IEEE-754 double-precision floating-point number occupies 64 bits: 1 sign bit, 11 exponent bits, and 52 mantissa bits (53 counting the implicit leading bit). I had assumed that the Binary instance for Double would simply write these bits to disk, requiring approximately zero computational power and exactly 64 bits of disk space. I was wrong.
Is there any danger that there might be some kind of improvement to the Double instance in the next version of Data.Binary?
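What the poster expected can be sketched as follows: serialise a Double as its raw IEEE-754 bit pattern, exactly 8 bytes. The sketch below uses `castDoubleToWord64`/`castWord64ToDouble` from GHC.Float, which only exist in base >= 4.10 (long after this thread); older code reached for `unsafeCoerce` or the data-binary-ieee754 package instead. This is an illustration of the idea, not the patch mentioned in the reply.

```haskell
import Data.Binary.Put (Put, runPut, putWord64be)
import Data.Binary.Get (Get, runGet, getWord64be)
import GHC.Float (castDoubleToWord64, castWord64ToDouble)
import qualified Data.ByteString.Lazy as B

-- Write the Double's bit pattern as a big-endian Word64:
-- no decomposition into sign/exponent/mantissa, just the raw bits.
putDoubleBits :: Double -> Put
putDoubleBits = putWord64be . castDoubleToWord64

-- Read 8 bytes back and reinterpret them as a Double.
getDoubleBits :: Get Double
getDoubleBits = fmap castWord64ToDouble getWord64be

main :: IO ()
main = do
  let d  = 3.141592653589793 :: Double
      bs = runPut (putDoubleBits d)
  print (B.length bs)                   -- 8 (exactly 64 bits on disk)
  print (runGet getDoubleBits bs == d)  -- True (bit-exact round trip)
```

The cost is near zero because the cast is a bit-level reinterpretation, not an arithmetic decomposition of the number.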
This was discussed last week. A patch was posted implementing more efficient low-level Double encodings. Google for the thread "Data.Binary and little endian encoding".

Tim
participants (3)
- Henning Thielemann
- Khudyakov Alexey
- Tim Docker