
Didn't reply to all. Sorry for the duplicate, Johan. Johan asked:
> will GHC be able to optimize away all of the Data.Binary stuff so the functions, if defined in terms of Data.Binary, are as efficient as a direct implementation?
Binary can be a bit slower than you might want. I'm guessing it would do fine for hton-like operations, but in pureMD5 I noticed a 2x slowdown when using Data.Binary instead of the low-level FFI implementation.

In C one might want:

    /* Read the idx-th 32-bit word from a byte buffer (host byte order). */
    inline uint32_t getWord32(unsigned char *bytestring, size_t idx)
    {
        uint32_t *p = (uint32_t *)bytestring;
        return p[idx];
    }

And that is extremely fast code. In low-level Haskell it looks more like:

    -- Read the n-th word directly from the ByteString's buffer.
    getNthWord n bs@(PS ptr off len) =
        inlinePerformIO $ withForeignPtr ptr $ \ptr' -> do
            let p = castPtr $ plusPtr ptr' off
            peekElemOff p n

But even this low-level code seems to be excessively slow, with 33% of the MD5 run time attributed to this function. I meant to investigate this a couple of months ago but have had zero time.

Tom
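
For comparison, here is a minimal sketch that puts the two approaches discussed above side by side: reading the n-th 32-bit word through Data.Binary's Get monad, and reading it with a direct peek. It is not the pureMD5 code. The names nthWordBinary and nthWordPeek are hypothetical, and the sketch assumes host byte order and a bytestring version that provides Data.ByteString.Lazy.fromStrict. It uses unsafeUseAsCString in place of the PS constructor and inlinePerformIO from the message. The direct reader does no bounds checking. The sketch only checks that both readers agree; it measures nothing.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Lazy as L
import Data.ByteString.Unsafe (unsafeUseAsCString)
import Data.Binary.Get (runGet, skip, getWord32host)
import Data.Word (Word32)
import Foreign.Ptr (castPtr)
import Foreign.Storable (peekElemOff)

-- Hypothetical helper: read the n-th 32-bit word via Data.Binary,
-- skipping 4*n bytes and then decoding one word in host byte order.
nthWordBinary :: Int -> B.ByteString -> Word32
nthWordBinary n bs = runGet (skip (4 * n) >> getWord32host) (L.fromStrict bs)

-- Hypothetical helper: read the n-th 32-bit word by peeking at the
-- underlying buffer. No bounds check: the caller must ensure that
-- 4 * (n + 1) <= B.length bs.
nthWordPeek :: Int -> B.ByteString -> IO Word32
nthWordPeek n bs = unsafeUseAsCString bs $ \p -> peekElemOff (castPtr p) n

main :: IO ()
main = do
  let bs = B.pack [0 .. 15]   -- 16 bytes, i.e. four 32-bit words
  w <- nthWordPeek 2 bs
  -- Both readers use host byte order, so they should agree.
  print (w == nthWordBinary 2 bs)
```

To see what GHC does with each version, one option is to compile with -O2 -ddump-simpl and compare the Core generated for the two functions.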