
Salvatore Insalaco wrote:
I've got some time to take a look at the code. It's very nice, readable and declarative, but obviously not optimized for "raw speed". There're a few things to do to have a speed-up of 2x, without going "low-level".
You have my undivided attention...
1) There's a laziness leak in Pad.hs. The sum in:
    else make_block_bs bs0 : work (n + block_size_bytes) bs1
is not strict. With a very large file (e.g. 100 MB) it means a stack overflow.
Oops! Well spotted...
To solve it just add a bang (!) before the n parameter:
work !n bs =
You've got to add {-# LANGUAGE BangPatterns #-} at the top of the file too. There are solutions that don't require BangPatterns and use only seq, but I like bang patterns!
Ah, so *that's* what bang patterns do?
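For anyone following along, here is a minimal sketch of the idea, not the actual code from Pad.hs. The names blockSize, Piece, piecesBang and piecesSeq are made up for illustration, and the 64-byte block size is an assumption. The point is only that the running count n is forced on every call instead of piling up as a chain of unevaluated additions; the second version does the same thing with seq and no language extension.

```haskell
{-# LANGUAGE BangPatterns #-}

import qualified Data.ByteString.Lazy as B
import Data.Int (Int64)

-- Assumed block size (hypothetical stand-in for block_size_bytes).
blockSize :: Int64
blockSize = 64

-- Hypothetical result type: full blocks, then one trailing piece that
-- carries the total number of input bytes seen.
data Piece
  = Full B.ByteString
  | Final Int64 B.ByteString
  deriving (Eq, Show)

-- Bang-pattern version: n is evaluated each time work is entered,
-- so it never grows into a deep thunk.
piecesBang :: B.ByteString -> [Piece]
piecesBang = work 0
  where
    work !n bs
      | B.length bs0 < blockSize = [Final (n + B.length bs0) bs0]
      | otherwise                = Full bs0 : work (n + blockSize) bs1
      where
        (bs0, bs1) = B.splitAt blockSize bs

-- Same behaviour using only seq, no BangPatterns needed for this one.
piecesSeq :: B.ByteString -> [Piece]
piecesSeq = work 0
  where
    work n bs = n `seq` step
      where
        (bs0, bs1) = B.splitAt blockSize bs
        step
          | B.length bs0 < blockSize = [Final (n + B.length bs0) bs0]
          | otherwise                = Full bs0 : work (n + blockSize) bs1

main :: IO ()
main = do
  let input = B.replicate 130 0
  mapM_ print (piecesBang input)
  -- Both versions should produce the same pieces.
  print (piecesBang input == piecesSeq input)
```

Both functions still return the list lazily; only the counter is made strict.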
2) make_block_bs is sub-optimal, and very critical to performance.
Yeah, so I see. (60% time spent here... ouch!) I'll bet that's where C is beating me...
I decided to use Data.Binary for it
I'm not familiar with that library (i.e. what it does or how you use it).
import Data.Binary
import Data.Binary.Get
import Control.Monad
// ...
instance Binary Block where
    put _ = undefined
    get = do
        xs <- replicateM 16 getWord32le
        return $ Block $ listArray (0, 15) xs

make_block_bs :: B.ByteString -> Block
make_block_bs = decode
Mmm, OK. Doesn't look too bad...
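As a rough, self-contained sketch of what that decoding step does: the version below uses runGet from Data.Binary.Get directly instead of a Binary instance, which avoids the undefined put. The Block definition here (a newtype over an unboxed array) and the names getBlock and makeBlock are assumptions for illustration; the real Block type in the program may well differ.

```haskell
import Control.Monad (replicateM)
import Data.Array.Unboxed (UArray, listArray, (!))
import Data.Binary.Get (Get, getWord32le, runGet)
import qualified Data.ByteString.Lazy as B
import Data.Word (Word32)

-- Assumed representation: sixteen 32-bit words indexed 0..15.
newtype Block = Block (UArray Int Word32)
  deriving (Eq, Show)

-- Read sixteen little-endian 32-bit words (64 bytes in total).
getBlock :: Get Block
getBlock = do
  xs <- replicateM 16 getWord32le
  return (Block (listArray (0, 15) xs))

-- Decode the first 64 bytes of a lazy ByteString into a Block.
-- runGet calls error if fewer than 64 bytes are available.
makeBlock :: B.ByteString -> Block
makeBlock = runGet getBlock

main :: IO ()
main = do
  let Block ws = makeBlock (B.pack [0 .. 63])
  -- Bytes 00 01 02 03 read little-endian give 0x03020100.
  print (ws ! 0 == 0x03020100)
```

So Data.Binary.Get is essentially a small parser for raw bytes: getWord32le consumes four bytes and assembles them least-significant first.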
3) You are accessing the fields of a strict data type (State) a lot. You can at least ask the compiler to help you a bit with -funbox-strict-fields.
I did try that, but it didn't seem to make any difference for me. [Maybe it does now because of your other improvements? Which version of GHC and which platform are you running on? I'm on GHC 6.8.2 on Windows...]
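To make the flag concrete, here is a small sketch of the two usual ways to ask for unboxed strict fields. The four-word State below and the addState function are hypothetical, made up for illustration; the program's real State is not shown in this thread. The flag can be given per file through an OPTIONS_GHC pragma, or the same layout can be requested field by field with UNPACK. As far as I know, recent GHC versions already unpack small strict fields by default when optimizing, so the effect may be less visible there.

```haskell
{-# OPTIONS_GHC -funbox-strict-fields #-}

import Data.Word (Word32)

-- Hypothetical state of four strict words. With the flag above, GHC is
-- asked to store the words directly in the constructor instead of
-- behind pointers.
data State = State !Word32 !Word32 !Word32 !Word32
  deriving (Eq, Show)

-- The same request made per field, independent of the flag.
data StateU = StateU
  {-# UNPACK #-} !Word32
  {-# UNPACK #-} !Word32
  {-# UNPACK #-} !Word32
  {-# UNPACK #-} !Word32
  deriving (Eq, Show)

-- Word-wise addition; Word32 arithmetic wraps around modulo 2^32.
addState :: State -> State -> State
addState (State a b c d) (State e f g h) =
  State (a + e) (b + f) (c + g) (d + h)

main :: IO ()
main = print (addState (State 1 2 3 4) (State 10 20 30 40))
```

Neither form changes what the program computes; it only affects how the fields are laid out in memory.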
We have got a good, nearly 2x, speed-up with very few optimizations, and we run in a very small constant amount of memory.
I'm liking the sound of that. :-D
Probably compiling with -fvia-C could help even more
...oh yeah, that's no longer default for -O2. I forgot about that!
but strangely:
Segmentation fault
Is this supposed to happen? There're no "unsafe" functions or imports used in the program. Maybe a bug in GHC?
GAH! o_O That's not good(tm)... Any chance you could email me a Darcs patch (or 2 separate ones)? I'll add it to the repo on my website, and test to see what numbers I get on my PC.