[Haskell-cafe] sha1 implementation thats "only" 12 times slower then C

1 Jul 2007

So I tried implementing a more efficient sha1 in Haskell, and I got to
about 12 times slower than C.  The darcs implementation is also around
10 to 12 times slower, and the crypto one is about 450 times slower.
I haven't yet unrolled the loop like the darcs implementation does, so
I can still get some improvement from that, but I want that to be the
last thing I do.

I think I've been getting speed improvements by minimizing
unnecessary allocations.  I went from 40 times slower to 12 times
slower by converting a foldM to a mapM that modifies a mutable array.
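
A minimal sketch of that kind of change, not the actual code from this
implementation: it fills the 80-word SHA-1 message schedule in place in
an unboxed mutable array (STUArray) instead of threading a freshly built
structure through a fold.  The name expand and the list-based input are
assumptions made for illustration only.

```haskell
import Control.Monad (forM_)
import Data.Array.ST (newArray, readArray, runSTUArray, writeArray)
import Data.Array.Unboxed (UArray, (!))
import Data.Bits (rotateL, xor)
import Data.Word (Word32)

-- Hypothetical helper: expand a 16-word block into the 80-word
-- SHA-1 message schedule, writing into one unboxed mutable array.
expand :: [Word32] -> UArray Int Word32
expand block = runSTUArray $ do
  arr <- newArray (0, 79) 0
  -- Copy the 16 input words into slots 0..15.
  forM_ (zip [0 ..] (take 16 block)) $ \(i, w) -> writeArray arr i w
  -- W[t] = rotl1 (W[t-3] xor W[t-8] xor W[t-14] xor W[t-16])
  forM_ [16 .. 79] $ \t -> do
    a <- readArray arr (t - 3)
    b <- readArray arr (t - 8)
    c <- readArray arr (t - 14)
    d <- readArray arr (t - 16)
    writeArray arr t (rotateL (a `xor` b `xor` c `xor` d) 1)
  return arr
```

For example, expand (1 : replicate 15 0) ! 16 evaluates to 2, since
only W[0] is non-zero among the four words that feed W[16].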

Does anyone have any pointers on how to get hashElem and updateElem to
run faster, or any insight into what exactly they are allocating?  To me
it seems that those functions should be able to do everything they need
to without a malloc.
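
One general thing worth checking (a guess, since it depends on code not
shown here) is whether the Word32 values in the inner loop are lazy and
boxed, because every unevaluated or boxed intermediate is a heap
allocation.  The sketch below is one way a single SHA-1 round can be
written with strict fields so that GHC, with optimization on, has a
chance to keep the state unboxed.  State, step and rounds are
hypothetical names, not the hashElem or updateElem from the profile.

```haskell
{-# LANGUAGE BangPatterns #-}
import Data.Bits (complement, rotateL, xor, (.&.), (.|.))
import Data.Word (Word32)

-- Five working variables a..e; strict fields avoid storing thunks.
data State = State !Word32 !Word32 !Word32 !Word32 !Word32
  deriving (Eq, Show)

-- One SHA-1 round: t is the round index (0..79), w the schedule word.
step :: Int -> State -> Word32 -> State
step t (State a b c d e) w = State tmp a (rotateL b 30) c d
  where
    tmp = rotateL a 5 + f + e + k + w
    f | t < 20    = (b .&. c) .|. (complement b .&. d)
      | t < 40    = b `xor` c `xor` d
      | t < 60    = (b .&. c) .|. (b .&. d) .|. (c .&. d)
      | otherwise = b `xor` c `xor` d
    k | t < 20    = 0x5A827999
      | t < 40    = 0x6ED9EBA1
      | t < 60    = 0x8F1BBCDC
      | otherwise = 0xCA62C1D6

-- Run consecutive rounds over a list of schedule words, starting at
-- round 0; the bang patterns keep the counter and state evaluated.
rounds :: State -> [Word32] -> State
rounds = go 0
  where
    go !_ !s []       = s
    go !t !s (w : ws) = go (t + 1) (step t s w) ws
```

Whether this actually removes the allocation has to be confirmed by
re-running the profile or by reading the generated Core.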

These are the profiling statistics generated from my implementation:

COST CENTRE                    MODULE               %time %alloc

hashElem                       SHA1                  42.9   66.2
updateElem                     SHA1                  12.7   16.7
unboxW                         SHA1                  10.6    0.0
hashA80                        SHA1                   5.2    0.3
temp                           SHA1                   4.6    0.0
sRotateL                       SHA1                   4.6    0.0
ffkk                           SHA1                   3.3    2.6
hashA16IntoA80                 SHA1                   3.1    0.1
sXor                           SHA1                   2.9    0.0
do60                           SHA1                   2.9    2.6
sAdd                           SHA1                   2.3    0.0
do20                           SHA1                   1.3    2.6
splitByN                       SHA1                   1.2    2.3
do80                           SHA1                   0.8    2.6
do40                           SHA1                   0.4    2.6

Thanks,
Anatoly

Anatoly Yakovenko