Re: cryptohash and an incremental API

Vincent said:
couple of comments around the hashes interface:
* updateCtx works on blockLength, instead of working on arbitrary size...
So for performance reasons you seem to prefer Semantics 1.2?

"""
1.2 Multiple of blockSize bytes
Implementations are encouraged to consume data (continue updating, encrypting, or decrypting) until there is less than blockSize bits available.
"""

Also, I'll amend 1.2 to say that the hashUpdate/encrypt/decrypt functions should only consume n * blockSize bytes; tracking the remainder will be done at the higher level. The higher-level default implementations should likewise pass only n * blockSize bytes to these functions. I can see how that's reasonable and am strongly considering using these semantics instead of 1.1.
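A minimal sketch of the amended 1.2 semantics, to make the division of labour concrete. Everything here is an assumption for illustration: `blockSize` is set to 4 bytes, `updateBlocks` is a toy xor fold standing in for a real hashUpdate, and `Ctx`, `initCtx` and `feed` are hypothetical names rather than anything in the proposed API.

```haskell
import qualified Data.ByteString as B
import Data.Bits (xor)
import Data.Word (Word8)

-- Hypothetical block size in bytes for the toy primitive below.
blockSize :: Int
blockSize = 4

-- Toy stand-in for a real hashUpdate: it accepts only n * blockSize bytes
-- and folds every byte into a one-byte state with xor. Not a real hash.
updateBlocks :: Word8 -> B.ByteString -> Word8
updateBlocks st bs
  | B.length bs `mod` blockSize /= 0 = error "updateBlocks: unaligned input"
  | otherwise = B.foldl' xor st bs

-- Higher-level context: the primitive's state plus the unconsumed remainder.
data Ctx = Ctx { ctxState :: Word8, ctxRest :: B.ByteString }

initCtx :: Ctx
initCtx = Ctx 0 B.empty

-- Accept input of any length, pass only whole blocks down, keep the rest.
feed :: Ctx -> B.ByteString -> Ctx
feed (Ctx st rest) input = Ctx (updateBlocks st whole) rest'
  where
    buf = rest `B.append` input
    n = (B.length buf `div` blockSize) * blockSize
    (whole, rest') = B.splitAt n buf
```

With this split the primitive never sees a partial block, and only the wrapper has to deal with buffering.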
* hash is a generic operation based on the class Hash. In my case, it improves performance by not running the exposed pure init/update/finalize functions but using the hidden impure function instead. I realized yesterday that the gain is not as large as I thought, since I had a bug in my benchmark, but it's still there (100 ms for 500 MB of data).
Hmm, 0.2 seconds / GB is significant, so again I can be swayed. It isn't as if I can't have a default definition of hash (and others) when it's part of the class instance.
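A small sketch of what "hash as a class method with a default" could look like. The class below is a deliberately simplified, hypothetical one (a single type serves as both context and digest, and input is a String); `SumDigest` and `FastSum` are toy instances, not real algorithms.

```haskell
import Data.Char (ord)
import Data.List (foldl')

-- Hypothetical, simplified Hash class. 'hash' has a default built from the
-- pure init/update/finalize trio, and an instance may override it.
class Hash d where
  initialCtx :: d
  updateCtx  :: d -> String -> d
  finalize   :: d -> d
  -- Default one-shot hash; instances with a faster path can replace it.
  hash :: String -> d
  hash = finalize . updateCtx initialCtx

-- Uses the default definition of 'hash'.
newtype SumDigest = SumDigest Int deriving (Eq, Show)

instance Hash SumDigest where
  initialCtx = SumDigest 0
  updateCtx (SumDigest n) s = SumDigest (foldl' (\a c -> a + ord c) n s)
  finalize = id

-- Overrides 'hash'; the override stands in for an optimized impure routine.
newtype FastSum = FastSum Int deriving (Eq, Show)

instance Hash FastSum where
  initialCtx = FastSum 0
  updateCtx (FastSum n) s = FastSum (n + sum (map ord s))
  finalize = id
  hash = FastSum . sum . map ord
```

Callers use `hash` the same way for both types; only the instance author decides whether the default is good enough.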
* Why is the digest of a specific type? I like representing different things with different types, but I'm not sure what you gain with digests.
This I am less flexible on. My thought on how people will use this library is centered around the instantiation of classes on the keys used or resulting digests. Anyone wanting ByteString results can simply use Data.[Serialize,Binary].encode. Here is a user getting a sha256 hash:

    let h = hash contents :: SHA256

or the type could be implicit due to context (not shown):

    let h = hash contents
* Is strength really useful in the Hash class? It might be accurate when the thing gets implemented, but I'm not sure what would happen over time as flaws are discovered. Would people actually update it?
Will people actually update it? I hope so, but if they don't, are we really worse off than not having any strength numbers? People who care about strength will likely keep track of the algorithms on which they depend. I added strength largely because the Hash class came from DRBG (NIST SP 800-90), which needed strength values. If we don't have strength, then applications like DRBG need a way to know which algorithm each data type represents and then to look up that algorithm in their own table of algorithm strengths - very messy. I'd imagine crypto-api would have to look something like:

\begin{code}
data HashAlgorithm = MD5 | SHA1 | SHA256 | SHA512 | ...

class Hash d c | d -> c, c -> d where
    ...
    algorithm :: Tagged d HashAlgorithm
    ...
\end{code}

I don't consider this a win - crypto-api would now be enumerating every hash algorithm that wants a Hash instance.
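For contrast, here is a rough sketch of the strength-in-the-class alternative, where a DRBG-like consumer reads the strength directly from the digest type. The `Tagged` newtype is a local stand-in for the one in the tagged package, the one-method `Hash` class is a cut-down assumption, and `ToyA`, `ToyB`, `strongEnough` and the strength numbers are made up for illustration.

```haskell
-- Local stand-in for Data.Tagged from the tagged package, so the sketch
-- needs nothing beyond base.
newtype Tagged t a = Tagged { unTagged :: a }

-- Hypothetical, cut-down Hash class carrying only the strength value.
class Hash d where
  strength :: Tagged d Int  -- claimed security strength in bits

data ToyA = ToyA
data ToyB = ToyB

-- Made-up strength values for two made-up digest types.
instance Hash ToyA where strength = Tagged 80
instance Hash ToyB where strength = Tagged 128

-- A DRBG-like consumer: checks the strength of whatever digest type it is
-- handed, with no central enumeration of algorithms.
strongEnough :: Hash d => Int -> d -> Bool
strongEnough wanted d = unTagged (strengthOf d) >= wanted
  where
    -- The argument is used only to pick the instance.
    strengthOf :: Hash d => d -> Tagged d Int
    strengthOf _ = strength
```

A new algorithm only has to supply its own instance; nothing in the shared library needs to learn its name.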
The blockCipher should expose the chaining modes as overridable typeclass functions, with default generic implementations that use encryptBlocks. For example, the Haskell AES package has different C implementations for each chaining mode (e.g. CBC, ECB), and I suspect that using a generic chaining implementation would slow things down.
As with "hash" being part of the hash typeclass, I don't have a strong objection here. It allows particular implementations to be slightly higher performance and does not preclude default definitions. This is rather messier than I wanted, but the reasoning seems sound. WRT your specific examples:

    encryptBlocksCBC :: k -> ByteString -> (k, ByteString)
    decryptBlocksCBC :: k -> ByteString -> (k, ByteString)

These I do object to. The key does not change as the CBC algorithm progresses, but contextual information does. My initial mode implementations have types like:

    cbc :: (BlockCipher k) => k -> IV k -> ByteString -> (ByteString, IV k)

In other words, initialization vectors are explicit and separate from the key. The type parameter on IV allows us to build an IV of proper size, something like:

    buildIV :: (BlockCipher k, MonadRandom m) => m (IV k)

and it is always true that

    iv :: IV k
    iv <- buildIV
    B.length (encode iv) == blockSize `for` (undefined :: k)
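To illustrate the shape of that `cbc` type, here is a toy sketch in which the key is only read and the IV is threaded through and returned. It is not the crypto-api implementation: the `BlockCipher` class is reduced to two hypothetical methods (`blockBytes`, `encryptBlock`), `XorKey` is an insecure toy cipher, and the input is assumed to be block-aligned.

```haskell
import qualified Data.ByteString as B
import Data.Bits (xor)
import Data.List (mapAccumL)
import Data.Word (Word8)

-- Hypothetical, simplified cipher class; the real proposal differs.
class BlockCipher k where
  blockBytes   :: k -> Int
  encryptBlock :: k -> B.ByteString -> B.ByteString

-- The phantom parameter ties an IV to the cipher it was built for.
newtype IV k = IV B.ByteString deriving (Eq, Show)

-- Toy cipher: xor every byte with a one-byte key. Insecure, illustration only.
newtype XorKey = XorKey Word8

instance BlockCipher XorKey where
  blockBytes _ = 4
  encryptBlock (XorKey w) = B.map (xor w)

-- Split block-aligned input into blocks of n bytes.
blocks :: Int -> B.ByteString -> [B.ByteString]
blocks n bs
  | B.null bs = []
  | otherwise = let (h, t) = B.splitAt n bs in h : blocks n t

-- CBC encryption: each plaintext block is xored with the previous
-- ciphertext block (or the IV) before encryption. The last ciphertext
-- block is returned as the IV for any continuation.
cbc :: BlockCipher k => k -> IV k -> B.ByteString -> (B.ByteString, IV k)
cbc k (IV iv) pt = (B.concat cts, IV lastIv)
  where
    (lastIv, cts) = mapAccumL step iv (blocks (blockBytes k) pt)
    step prev blk = let c = encryptBlock k (xorBytes prev blk) in (c, c)
    xorBytes a b = B.pack (B.zipWith xor a b)
```

Because the returned IV is the last ciphertext block, encrypting a message in two calls gives the same ciphertext as encrypting it in one, and the key value never needs to be handed back.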
And my last comment is that I don't understand the streamcipher interface you're proposing. I've got an (inefficient) RC4 implementation that has this interface:
    stream :: Ctx -> B.ByteString -> (Ctx, B.ByteString)
    streamlazy :: Ctx -> L.ByteString -> (Ctx, L.ByteString)
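As a rough illustration of how the two signatures relate, the lazy variant can be derived from the strict one by threading the context through each chunk. The cipher below is a toy stand-in, not RC4: `Ctx` is assumed to hold just the next keystream byte, which increments after every input byte.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Lazy as L
import Data.Bits (xor)
import Data.List (mapAccumL)
import Data.Word (Word8)

-- Toy context: the next keystream byte. A stand-in, not RC4.
newtype Ctx = Ctx Word8 deriving (Eq, Show)

-- Strict interface: consume a chunk, return the updated context and output.
stream :: Ctx -> B.ByteString -> (Ctx, B.ByteString)
stream (Ctx w) bs = (Ctx w', out)
  where
    -- xor each input byte with the current keystream byte, then advance it.
    (w', out) = B.mapAccumL (\k b -> (k + 1, k `xor` b)) w bs

-- Lazy variant obtained by threading the context through each strict chunk.
streamlazy :: Ctx -> L.ByteString -> (Ctx, L.ByteString)
streamlazy ctx lbs = (ctx', L.fromChunks outs)
  where
    (ctx', outs) = mapAccumL stream ctx (L.toChunks lbs)
```

The output does not depend on where the chunk boundaries fall, since the returned context carries all the state from one chunk to the next.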
My interface was just a quick hack, with me understanding it would likely change. I didn't know there was a Haskell RC4 binding or implementation and will happily follow your lead here. Is this implementation on Hackage?

Cheers,
Thomas
participants (1)
-
Thomas DuBuisson