
dons:
aeyakovenko:
$ time ./md5sum ./md5sum [105,252,52,138,187,192,216,17,225,123,185,3,124,101,86,132]
real 0m4.790s user 0m3.688s sys 0m0.492s
$ time md5sum ./md5sum 69fc348abbc0d811e17bb9037c655684 ./md5sum
real 0m0.023s user 0m0.000s sys 0m0.008s
this is my implementation using crypto (http://www.haskell.org/crypto/). Am I doing something wrong?
module Main where
import System import qualified Data.Digest.MD5 as MD5 import qualified Data.ByteString as BS
main = do args <- getArgs dt <- BS.readFile $ head args putStrLn $ show $ MD5.hash . BS.unpack $ dt ^^^^ ^^^^^^^^^
not a good idea.
You need an MD5 over bytestrings, not [Word8].
-- Don
I note a couple of other issues: crypto is compiled with: Ghc-options: -fglasgow-exts that is, no optimisations, although it would certainly benefit from Ghc-options: -O2 -fexcess-precision -funbox-strict-fields So I'd recompile the crypto package with that first. Then be sure to compile your code with ghc -O2 -- bytestrings love -O2. Finally, to actually get C speed, use a C md5. Here's an example Haskell binding to the OpenSSL libraries 'md5' function, which you can compile and run like so: $ ghc MD5.hs -lcrypto -o hsmd5 $ time ./hsmd5 /usr/share/dict/words MD5 (/usr/share/dict/words) = e5c152147e93b81424c13772330e74b3 ./hsmd5 /usr/share/dict/words 0.01s user 0.02s system 80% cpu 0.029 total versus my system's 'md5' program: $ time md5 /usr/share/dict/words MD5 (/usr/share/dict/words) = e5c152147e93b81424c13772330e74b3 md5 /usr/share/dict/words 0.02s user 0.00s system 29% cpu 0.052 total Oh huh, that's interesting... And the code: {-# OPTIONS -O2 -fffi -#include "openssl/md5.h" #-} -- -- A few imports, should tidy these up one day. -- import System.Environment import qualified Data.ByteString.Base as B (unsafeUseAsCStringLen) import qualified Data.ByteString as B import Foreign import Foreign.C.Types import Numeric import Text.Printf main = do (f:_) <- getArgs src <- B.readFile f printf "MD5 (%s) = %s \n" f (md5sum src) -- --------------------------------------------------------------------- -- -- Fast md5 using OpenSSL and zero-copying bytestrings -- -- -- The md5 hash should be referentially transparent.. -- md5sum :: B.ByteString -> String md5sum p = unsafePerformIO $ B.unsafeUseAsCStringLen p $ \(ptr,n) -> do digest <- c_md5 ptr (fromIntegral n) nullPtr go digest 0 [] where -- print it in 0-padded hex format go :: Ptr Word8 -> Int -> [String] -> IO String go p n acc | n >= 16 = return $ concat (reverse acc) | otherwise = do w <- peekElemOff p n go p (n+1) (draw w : acc) draw w = case showHex w [] of [x] -> ['0', x] x -> x -- unsigned char *MD5(const unsigned char *d, unsigned long n, unsigned char *md); -- foreign import ccall "openssl/md5.h MD5" c_md5 :: Ptr CChar -> CULong -> Ptr CChar -> IO (Ptr Word8) Happy hacking. -- Don