
All,

A new pair of typeclasses is below and in the repo [1]. Mostly this is just me tweaking the Hash class and updating DRBG [2] to use the new interface (tests not yet run, so I might have broken something, but that wouldn't be the interface's fault).

The classes are listed below. Note that L. is Data.ByteString.Lazy while B. is strict ByteStrings.

=====
class (Binary d, Serialize d) => Hash ctx d | d -> ctx, ctx -> d where
    outputLength :: Tagged d BitLength
    blockLength  :: Tagged d BitLength
    initialCtx   :: ctx
    updateCtx    :: ctx -> B.ByteString -> ctx
    finalize     :: ctx -> B.ByteString -> d
    strength     :: Tagged d BitLength
=====

I was considering having a 'needAlignment :: Tagged d ByteLength' value for Hashes. The reasoning is in [3].

====
class BlockCipher k where
    blockSize    :: Tagged k BitLength
    encryptBlock :: k -> B.ByteString -> B.ByteString
    decryptBlock :: k -> B.ByteString -> B.ByteString
    buildKey     :: B.ByteString -> Maybe k
    keyLength    :: k -> BitLength
    -- ^ keyLength may inspect its argument to return the length
====

Other helper functions exist that build on the class primitives to provide operations such as hash and hash'.

The TODO list includes:
- Look harder at the other classes including "BlockCipher", "AsymCipher", "StreamCipher"
- example instances of each class
- example uses of each class
- Collecting tests, building a test framework
- Move "for" and (.::.) into the Tagged library (?)
- Decide what we want on padding
- Decide what we want with crypto-related items that aren't directly a cipher or hash (ex: pbkdf2).
- Decide on package name (replace "Crypto" or select a new name? Goes with another recent thread's topic)
- Implement modes

Individual responses:

Bas said:
Why not use the Edward Kmett's 'tagged'[1] package for these methods? As in: outputLength :: Tagged d BitLength
Done. I like it.
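
For anyone who hasn't used the approach, here is a rough sketch of how a Tagged value lets a class method report a per-type constant without taking a dummy argument. Everything in it is an assumption for illustration: the local Tagged newtype is only a stand-in with the same shape as the type in the 'tagged' package, HasOutputLength and the two digest types are hypothetical, and 'for' is just one plausible definition of the helper named in the TODO list.

```haskell
-- Local stand-in with the same shape as the 'tagged' package's type
-- (assumption: defined here only so the sketch needs nothing beyond base).
newtype Tagged t a = Tagged { unTagged :: a }

type BitLength = Int

-- Hypothetical digest types, used purely as type-level tags here.
data MD5Digest    = MD5Digest
data SHA256Digest = SHA256Digest

-- Hypothetical one-method class, cut down from the Hash class in this mail.
class HasOutputLength d where
  outputLength :: Tagged d BitLength

instance HasOutputLength MD5Digest    where outputLength = Tagged 128
instance HasOutputLength SHA256Digest where outputLength = Tagged 256

-- One plausible definition of "for": the second argument is never evaluated,
-- it only selects which instance's tagged value is returned.
for :: Tagged t a -> t -> a
for tagged _ = unTagged tagged

-- Output length in bytes for whatever digest type the argument has.
outputBytes :: HasOutputLength d => d -> Int
outputBytes d = (outputLength `for` d) `div` 8
```

Since 'for' ignores its second argument, passing 'undefined' at the right type would work as well; the value is only there to pin down the type.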
Adam Wick said:
Why two libraries instead of n+1? Wouldn't it make sense to just have one library (what you call "Crypto") define the interface as one package, and then have a number of packages that implement that interface as a series of other modules?
It will start as just one package (crypto); after that I'm leaning toward n+2, where n is the number of packages that have the desired interface and testing (currently zero). Crypto-Algs can simply re-export from algorithm-specific packages (i.e., act as a meta package) when such a package exists and is maintained. I feel there is value in a well-supported algorithm collection, namely a uniform inclusion policy and maintenance. This doesn't stop algorithm-specific packages from targeting the Crypto API; that is the whole point of keeping Crypto and Crypto-Algs separate.
Enumerating principles I support: * Lazy ByteStrings should be used for all input data
Really? Why? I've actually been considering going back to both the SHA and RSA packages and redoing them using strict ByteStrings. Recent experience has suggested that strict ByteStrings are almost always what I want, and building a fast lazy ByteString interface over strict ByteString routines seems like a pretty trivial task.
It was this comment that caused me to realize the class interface should consist entirely of strict bytestrings performing component operations (which matches the crypto definitions better anyway), with helper functions that use these component functions to provide strict and lazy operations. For example, the Hash class defines initialCtx, updateCtx, and finalize, while helper functions use these to provide hash and hash'. That design was already the idea behind the cipher class; I just hadn't consciously realized it.

Cheers,
Thomas

[1] http://code.haskell.org/~tommd/crypto/
[2] http://code.haskell.org/~tommd/DRBG/
[3] Reasoning behind the currently excluded 'needAlignment' value

The 'needAlignment' value is the byte alignment assumed by the Hash for input data (presumably 1, 2, 4, or 8). The 'hash' helper function (or any user of 'finalize' or 'updateCtx') checks the alignment of the input data; if it is not aligned, the data is copied into a newly allocated bytestring, allowing the implementation to assume 64 bit alignment (new allocation rule in Haskell 2010). Implementations that use alignment-safe word extraction (ex: Cereal) can just specify 1, while other implementations (ex: for performance reasons pureMD5 used to use an unsafePerformIO ... peekElem ...) can request proper alignment.

But this is a hack job; we need a high-performance way to extract unboxed words from a bytestring that falls back to a safe method when the alignment isn't correct (Cereal is measurably slower than unsafePerformIO with peekElem).
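
As a postscript, a minimal sketch of the layering described in the reply to Adam: the class exposes only strict component operations, and hash / hash' are built on top of them. This is not the code in the repo. The toy SumCtx/SumDigest instance (a byte sum, not a real hash), the local Tagged stand-in, the omitted Binary/Serialize superclasses, and the chunking strategy are all assumptions made to keep the example self-contained.

```haskell
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies, ScopedTypeVariables #-}

import qualified Data.ByteString as B
import qualified Data.ByteString.Lazy as L
import Data.Word (Word32)

-- Local stand-in with the same shape as the 'tagged' package's type.
newtype Tagged t a = Tagged { unTagged :: a }

type BitLength = Int

-- Simplified from the class in this mail: superclasses omitted (assumption).
class Hash ctx d | d -> ctx, ctx -> d where
  outputLength :: Tagged d BitLength
  blockLength  :: Tagged d BitLength
  initialCtx   :: ctx
  updateCtx    :: ctx -> B.ByteString -> ctx
  finalize     :: ctx -> B.ByteString -> d
  strength     :: Tagged d BitLength

-- Toy instance for illustration only: a running byte sum, NOT a real hash.
newtype SumCtx    = SumCtx Word32
newtype SumDigest = SumDigest Word32 deriving (Eq, Show)

instance Hash SumCtx SumDigest where
  outputLength = Tagged 32
  blockLength  = Tagged 64          -- 8-byte blocks
  initialCtx   = SumCtx 0
  updateCtx (SumCtx s) bs = SumCtx (B.foldl' (\a w -> a + fromIntegral w) s bs)
  finalize ctx bs = let SumCtx s = updateCtx ctx bs in SumDigest s
  strength     = Tagged 0

-- Lazy helper: feed block-sized strict pieces to updateCtx and hand the last
-- piece (at most one block) to finalize. Assumes blockLength is >= 8 bits.
hash :: forall ctx d. Hash ctx d => L.ByteString -> d
hash = go initialCtx
  where
    blockBytes = fromIntegral (unTagged (blockLength :: Tagged d BitLength) `div` 8)
    go :: ctx -> L.ByteString -> d
    go ctx rest =
      let (blk, rest') = L.splitAt blockBytes rest
      in if L.null rest'
           then finalize ctx (L.toStrict blk)
           else let ctx' = updateCtx ctx (L.toStrict blk)
                in ctx' `seq` go ctx' rest'

-- Strict helper: one updateCtx over the whole-block prefix, then finalize
-- with the remaining partial block.
hash' :: forall ctx d. Hash ctx d => B.ByteString -> d
hash' msg = finalize (updateCtx initialCtx top) end
  where
    blockBytes = unTagged (blockLength :: Tagged d BitLength) `div` 8
    (top, end) = B.splitAt (B.length msg - B.length msg `rem` blockBytes) msg
```

With this arrangement an instance author only writes the three strict primitives, and both helpers agree regardless of how the lazy input happens to be chunked.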