
On Tuesday 21 September 2010 19:46:02, John Millikin wrote:
On Tue, Sep 21, 2010 at 07:10, Daniel Fischer wrote:
And I'd expect it to be a heck of a lot faster than the previous implementation. Have you done any benchmarks?
Only very rough ones -- a few basic Criterion checks, but nothing extensive.
Certainly good enough for an indication.
Numbers for put/get of 64-bit big-endian:
                     getWord    getFloat    putWord    putFloat
Bitfields (0.4.1)    59 ns      8385 ns     1840 ns    11448 ns
poke/peek (0.4.2)    59 ns      305 ns      1840 ns    744 ns
Wow. That's a huge difference. I don't think there's much room for doubt that it's much faster (the exact ratios will vary, of course).
unsafeCoerce         59 ns      61 ns       1840 ns    642 ns
Odd that unsafeCoerce gains 244 ns for get, but only 102 ns for put.
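For context, a rough Criterion harness along these lines would produce comparable measurements; the benchmark names and inputs below are only illustrative, not necessarily what was actually run:

import Criterion.Main
import Data.Binary.Get (runGet, getWord64be)
import Data.Binary.Put (runPut, putWord64be)
import Data.Binary.IEEE754 (getFloat64be, putFloat64be)
import qualified Data.ByteString.Lazy as L

-- Illustrative benchmark setup; the actual benchmarks may differ.
main :: IO ()
main = do
    let bytes = runPut (putWord64be 0x4000000000000000)
    defaultMain
        [ bench "getWord64be"  $ whnf (runGet getWord64be) bytes
        , bench "getFloat64be" $ whnf (runGet getFloat64be) bytes
        , bench "putWord64be"  $ whnf (L.length . runPut . putWord64be) 42
        , bench "putFloat64be" $ whnf (L.length . runPut . putFloat64be) 42.0
        ]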
Note: I don't know why the cast-based versions can put a Double faster than a Word64;
Strange. putFloat does a putWord plus a transformation; how can that be faster than the putWord alone?
Float is (as expected) slower than Word32. Some special-case GHC optimization?
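For concreteness, the poke/peek trick amounts to a bit-for-bit reinterpretation through a small temporary buffer; a rough sketch (the helper names here are illustrative, not necessarily what the library uses):

import Data.Word (Word64)
import Foreign.Marshal.Alloc (alloca)
import Foreign.Ptr (castPtr)
import Foreign.Storable (peek, poke)
import System.IO.Unsafe (unsafePerformIO)

-- Write the Double into a temporary buffer and read the same 8 bytes
-- back as a Word64 (and vice versa).  This is the general shape of the
-- poke/peek approach; the real implementation may differ in detail.
doubleToWord64 :: Double -> Word64
doubleToWord64 d = unsafePerformIO $ alloca $ \ptr -> do
    poke ptr d
    peek (castPtr ptr)

word64ToDouble :: Word64 -> Double
word64ToDouble w = unsafePerformIO $ alloca $ \ptr -> do
    poke ptr w
    peek (castPtr ptr)

The unsafeCoerce variant avoids even that round trip through memory, which presumably accounts for the remaining gap in the get numbers.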
One problem I see with both unsafeCoerce and poke/peek is endianness. Will the bit-pattern of a double be interpreted as the same uint64_t on little-endian and on big-endian machines? In other words, is the byte order for doubles endianness-dependent too? If yes, that's fine; if no, it would break between machines of different endianness.
Endianness only matters when marshaling bytes into a single value -- Data.Binary.Get/Put handles that. Once the data is encoded as a Word, endianness is no longer relevant.
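In other words, once the bits are in a Word64, the big-endian float codecs can simply delegate to the existing Word codecs, which already handle byte order. A sketch of that layering, reusing the illustrative doubleToWord64 / word64ToDouble helpers from above (not necessarily the library's actual code):

import Data.Binary.Get (Get, getWord64be)
import Data.Binary.Put (Put, putWord64be)

-- putWord64be / getWord64be already emit and consume bytes in big-endian
-- order, so the float codecs only need the bit-for-bit conversion
-- (doubleToWord64 / word64ToDouble as sketched above).
putFloat64be :: Double -> Put
putFloat64be = putWord64be . doubleToWord64

getFloat64be :: Get Double
getFloat64be = fmap word64ToDouble getWord64be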
I mean, take e.g. 2^62 :: Word64. If you poke that to memory on a big-endian machine, you'd get the byte sequence 40 00 00 00 00 00 00 00, while on a little-endian machine you'd get 00 00 00 00 00 00 00 40, right?

If both bit-patterns are interpreted the same way as doubles -- sign bit = 0, exponent bits = 0x400 = 1024, mantissa = 0, thus yielding 1.0 * 2^(1024 - 1023) = 2.0 -- fine. But if on a little-endian machine the floating-point handling is not little-endian and the number is interpreted as sign bit = 0, exponent bits = 0, mantissa = 0x40, hence a tiny subnormal (2^(-46) * 2^(-1022) = 2^(-1068)): havoc.

I simply didn't know whether that could happen. According to http://en.wikipedia.org/wiki/Endianness#Floating-point_and_endianness it could. On the other hand, "no standard for transferring floating point values has been made. This means that floating point data written on one machine may not be readable on another", so if it breaks on weird machines, it's at least a general problem (and not Haskell's).
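A quick way to check the worked example on a given machine is a minimal sketch like the following, which dumps the in-memory bytes of 2^62 :: Word64 and then reads the same bytes back as a Double:

import Data.Word (Word8, Word64)
import Foreign.Marshal.Alloc (alloca)
import Foreign.Marshal.Array (peekArray)
import Foreign.Ptr (Ptr, castPtr)
import Foreign.Storable (peek, poke)

-- 2^62 :: Word64 is 0x4000000000000000.  The byte dump is
-- [64,0,0,0,0,0,0,0] on big-endian and [0,0,0,0,0,0,0,64] on
-- little-endian (64 = 0x40).  Reading the same 8 bytes back as a
-- Double gives 2.0 whenever the floating-point byte order follows
-- the integer byte order.
main :: IO ()
main = alloca $ \ptr -> do
    poke ptr (2 ^ 62 :: Word64)
    peekArray 8 (castPtr ptr :: Ptr Word8) >>= print
    peek (castPtr ptr :: Ptr Double) >>= print   -- 2.0 if the orders agree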