
I'm trying to figure out the best way to calculate SHA1 hashes for files using Haskell. There are several libraries I tried from Hackage, but they all seem to end up computing the hash far too slowly vs. OpenSSL. Some of the libraries seem to be FFI bindings to OpenSSL, but I haven't figured out how to efficiently feed these. (For instance, hopenssl consumes a [Word8], but I haven't figured out how to lazily produce that from a file; using "unpack" on a ByteString seems to end up much too slow.) I also tried mimicking nano-md5 by writing a "nano-sha1" that was largely a "replace the name of the digest function" hack job, but I couldn't manage to get that working. Is there a particular accepted way to generate hashes that I'm missing?

Tom,

The way to lazily produce [Word8] from a file is to unpack a lazy ByteString, and you're right - it will generally be slow. I don't know of any obvious accepted way to do it.

The fast way is to work with ByteStrings. The update' function in hopenssl almost does what you want: you could unwrap your ByteStrings into Ptr Word8 using code like this:

    import qualified Data.ByteString.Internal as BI
    import Foreign.ForeignPtr
    import Foreign.Ptr (plusPtr)
    ...
        -- Split the strict ByteString into its buffer, offset and length.
        let (bsFPtr, bsOffset, bsLength) = BI.toForeignPtr bs
        withForeignPtr bsFPtr $ \bsPtr_ -> do
            -- Skip to the first byte that belongs to this ByteString.
            let bsPtr = bsPtr_ `plusPtr` bsOffset
            update' digest (bsPtr, fromIntegral bsLength)

I should also mention that lazy I/O has problems - it temporarily leaks file handles, and doesn't handle errors correctly. It is generally better to read blocks using hGet from Data.ByteString.

Steve
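
The two suggestions in the reply (read the file in blocks with hGet, and pass each block's buffer to a pointer-based update function) can be combined roughly as follows. This is only a sketch under assumptions: feedFile and updateStandIn are hypothetical names, the 64 KiB block size is arbitrary, and updateStandIn merely adds up the bytes so that the example runs without OpenSSL; a real digest update such as hopenssl's update' would take its place. It uses unsafeUseAsCStringLen from Data.ByteString.Unsafe rather than BI.toForeignPtr to reach the buffer, which is a substitution on my part and not what the reply used.

```haskell
import qualified Data.ByteString as B
import Data.ByteString.Unsafe (unsafeUseAsCStringLen)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)
import Data.Word (Word8)
import Foreign.Ptr (Ptr, castPtr)
import Foreign.Storable (peekByteOff)
import System.IO (IOMode (ReadMode), withBinaryFile)

-- Hypothetical stand-in for a digest update function: it only adds up
-- the bytes it is given, so the sketch runs without an OpenSSL binding.
updateStandIn :: IORef Int -> Ptr Word8 -> Int -> IO ()
updateStandIn acc ptr len = mapM_ step [0 .. len - 1]
  where
    step i = do
      b <- peekByteOff ptr i :: IO Word8
      modifyIORef' acc (+ fromIntegral b)

-- Read the file in fixed-size blocks with hGet and hand each block's
-- buffer to the update function. An empty block signals end of file.
feedFile :: (Ptr Word8 -> Int -> IO ()) -> FilePath -> IO ()
feedFile update path = withBinaryFile path ReadMode loop
  where
    blockSize = 64 * 1024 -- arbitrary choice for this sketch
    loop h = do
      chunk <- B.hGet h blockSize
      if B.null chunk
        then return ()
        else do
          -- The pointer is only valid inside this callback.
          unsafeUseAsCStringLen chunk $ \(p, n) -> update (castPtr p) n
          loop h

main :: IO ()
main = do
  B.writeFile "feed-demo.bin" (B.pack [1, 2, 3, 4])
  acc <- newIORef 0
  feedFile (updateStandIn acc) "feed-demo.bin"
  readIORef acc >>= print -- prints 10
```

Because withBinaryFile closes the handle when the loop finishes or throws, this avoids the handle leak mentioned for lazy I/O, and only one block is held in memory at a time.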
participants (2)
- Stephen Blackheath [to Haskell-Beginners]
- Tom Tobin