
On 05/04/2012 02:33 PM, Ryan Newton wrote:
1. cprng-aes is painfully slow.
When using the Haskell AES implementation, yes. With AES-NI it flies, and even more so once I have time to chunk the generation into bigger blocks (say, 128 AES blocks at a time).
One data-point -- in "intel-aes" I needed to do bigger blocks to get decent performance.
Yes, it's a somewhat arbitrary value here. It's a tradeoff between memory usage and performance, but 128 blocks would do quite well compared to any Haskell implementation that goes 1 block at a time [1].

[1] because you'll have to drop in/out of C, and reload the SSE registers each time.
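
For illustration only, here is a minimal C sketch of the chunking idea discussed above: fill a buffer of 128 blocks in one call and serve later requests from that buffer, so the per-call cost is paid once per 128 blocks rather than once per block. All names here (chunked_rng, generate_blocks, rng_bytes) are hypothetical and are not taken from cprng-aes, cryptocipher, or intel-aes. generate_blocks is a non-cryptographic stand-in that just writes a counter into each block; a real version would run AES in its place.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define BLOCK_SIZE   16   /* AES block size in bytes */
#define CHUNK_BLOCKS 128  /* blocks produced per refill (value from the thread) */

/* Hypothetical stand-in for the cipher: writes `nblocks` blocks, each holding
 * the current counter value padded with zeros. NOT cryptographic; a real
 * generator would encrypt the counter with AES here. */
static void generate_blocks(uint64_t *counter, uint8_t *out, size_t nblocks)
{
    for (size_t i = 0; i < nblocks; i++) {
        uint64_t c = (*counter)++;
        memset(out + i * BLOCK_SIZE, 0, BLOCK_SIZE);
        memcpy(out + i * BLOCK_SIZE, &c, sizeof c);
    }
}

typedef struct {
    uint64_t counter;                         /* next counter value to use */
    size_t   pos;                             /* next unread byte in buf */
    size_t   refills;                         /* number of bulk calls so far */
    uint8_t  buf[BLOCK_SIZE * CHUNK_BLOCKS];  /* one chunk of output */
} chunked_rng;

static void rng_init(chunked_rng *g)
{
    g->counter = 0;
    g->pos     = sizeof g->buf;  /* buffer starts empty: first read refills */
    g->refills = 0;
}

/* Copy `len` bytes of output into `dst`, refilling the buffer in bulk
 * only when it has been fully consumed. */
static void rng_bytes(chunked_rng *g, uint8_t *dst, size_t len)
{
    while (len > 0) {
        if (g->pos == sizeof g->buf) {
            generate_blocks(&g->counter, g->buf, CHUNK_BLOCKS);
            g->pos = 0;
            g->refills++;
        }
        size_t avail = sizeof g->buf - g->pos;
        size_t n = len < avail ? len : avail;
        memcpy(dst, g->buf + g->pos, n);
        g->pos += n;
        dst    += n;
        len    -= n;
    }
}
```

With this layout, 128 consecutive 16-byte requests trigger a single call to generate_blocks. The cost is the 2 KiB buffer held per generator, which is the memory side of the tradeoff mentioned above.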
2. It doesn't use NI instructions (or any C implementation, currently).
Support for the NI instructions is coming, and there are tons of existing C implementations that could just be added.
Oh, neat. Could you share a pointer to some C code (with GCC aes intrinsics?) that can replace what the ASM does in the "intel-aes" package?
Just have a look at cbits/aes/x86ni.c in cryptocipher.

--
Vincent