
Hello Simon, Tuesday, April 18, 2006, 3:02:20 PM, you wrote:
if that is due to the time of reading .hi files, my alternative Binary library should help in some future
I'd be suprised if you could improve on GHC's binary library. Using BinMem (reading/writing directly to memory), GHC's binary library is about as fast as it gets. I'm sure yours wins when dealing with files, though.
sorry, Simon, but when i found 10x difference in speed (6mb/s vs 60 mb/s) it was on membufs :) although i can't say that the difference anyway will be 10 times. for so fast lib the time required to traverse lists is essential. there are also a lot of other possible problems. i just asking - whether the time required for reading these (or any other binary) files is essential for compilation speed? (although i don't tested code in ghc 6.5, may be it is faster) if you are interested, i can comment some bits of Binary module. i optimized essentially the same code, so i know it's potential bottlenecks :) getWord8 (BinMem _ ix_r sz_r arr_r) = do ix <- readFastMutInt ix_r sz <- readFastMutInt sz_r when (ix >= sz) $ ioError (mkIOError eofErrorType "Data.Binary.getWord8" Nothing Nothing) arr <- readIORef arr_r w <- unsafeRead arr ix writeFastMutInt ix_r (ix+1) return w here you use boxed reference for storing array. i uses unboxed Ptr reference, that is of course faster. also, using ioError here means that the whole function result can't be "unboxed". at least i seen substantial slowdown when i added error processing here. of course, it's hard to omit. also the following may be faster: writeFastMutInt ix_r (ix+1) unsafeRead arr ix i also use one less variable here (only curptr and bufend) instance Binary Word16 where get h = do w1 <- getWord8 h w2 <- getWord8 h return $! ((fromIntegral w1 `shiftL` 8) .|. fromIntegral w2) first possible problem here is what getWord8 is not inlined. second - using of checked arithmetic (operations on Ints may have additional checks, unlike operations on Int#). third - it's better to make all operations on Int and only then pack all data to Int16 constructor btw, my lib use 32/64-bit independent binary files (i.e. you can write the file with 64-bit program and then read it with the 32-bit one). to achieve this, i use variable-size representation for Int and Word. i also use compiler-independent representation for Integer (the same as dor Int). may be it will be interesting for ghc and will allow to use both (32 bit and 64 bit) compilers on the same project? -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com