
Dominic Steinitz:
Malte Milatz:
It performs better than the SHA1 algorithm in Crypto: It is faster by a factor of approximately e. It is also competitive (regarding time) with the »unsafe« SHA1 implementation posted here some days ago,
Great. Can you post some timings?
On a 12 MB file I get malte +RTS -prof -RTS total time = 9.80 secs (196 ticks @ 50 ms) total alloc = 3,764,831,396 bytes anatoly +RTS -prof -RTS total time = 14.35 secs (287 ticks @ 50 ms) total alloc = 1,773,131,228 bytes crypto +RTS -prof -RTS total time = 24.55 secs (491 ticks @ 50 ms) total alloc = 8,703,970,360 bytes
although it uses considerably more memory. Of course, you may safely
Does it have a space leak or does it run in constant but larger space than the other versions?
I haven't discovered a space leak so far. When profiling with -S, the first column holds fairly constant values.
forget it if you're interested in performance near a C implementation such as GNU sha1sum. I don't think it's unreasonable to think we could get near to C performance and we've been getting closer.
Hm, but I suppose, if we want the code to remain in its pure style, that will mean that someone of the more clever Haskell people has to invent again a low-level library that improves performance behind the curtains. On the other hand, we can still hope for someone to find a better algorithm.
PS I had a 5 minute (well actually less) look at the code and it doesn't look dissimilar to other versions I've seen in Haskell. I wonder if the speedup is principally because of ByteString.
Before I started tweaking (with the help of the IRC channel), I had read the input in as a ByteString, then unpacked it to [Word8] for further processing. That version can be found on top of the same hpaste page. It delivers: malte_old +RTS -prof -RTS total time = 15.95 secs (319 ticks @ 50 ms) total alloc = 5,844,911,772 bytes This is still better than crypto. (By the way, I'm really astonished by this. I translated the SHA1 specification into Haskell, and as it seems so far, I'm better than a library. That's worth a party!)
PPS We have to put splitn in a library.
Yea. It's disappointing not to find it in Data.List. However, splitByN might yet be a better name. Regards, Malte