
On 05/04/2012 02:37 PM, Ryan Newton wrote:
My end goal is for users to transparently get the fastest implementation available for their architecture/CPU, provided they use the high-level module. I've uploaded the cpu package, which lets me detect the AES instructions (and the architecture) at runtime, but I've been distracted by implementing fast Galois field arithmetic for GCM and XTS modes (with AES).
Yes! A worthy goal!
I think the proposal here is that we do the build/integration work to get something good which is portable enough and install-reliable enough to replace 'random'. Then people who don't care will be using a good implementation by default.
That was my goal when I had my own small shot at this, but what I came up with was *very* build-fragile. (It depended on an assembler being available, or on prebuilt binaries being included in the package.) You can see the Setup.hs customization I attempted in intel-aes to compensate, but it's not enough.
Can we write a cabal-compatible, really robust installer that will test the user's system and always fall back rather than failing?
That was my original plan, until I found out that it's not really possible. As for the language, I think assembly is a no-no with cabal, so it needs to be embedded as gcc inline assembly if you want something that works (unless there's a secret way to run an assembler portably from cabal). That is why I settled on intrinsics: I didn't have to write the assembly directly. It also appears more portable, as every major compiler seems to support them (with differences of course, it would be too simple otherwise (!))
P.S. How are you doing the CPUID test for NI instructions? I used the *intel provided* test for this (in intel-aes) but I still had reports of incorrect identification on certain AMD CPUs...
I haven't done it yet, but it should just be a matter of this piece of code, for both Intel and AMD:

    import System.Cpuid
    import Data.Bits

    -- AES-NI support is reported in bit 25 of ECX for CPUID leaf 1.
    supportAESNI :: IO Bool
    supportAESNI = cpuid 1 >>= \(_,_,ecx,_) -> return (ecx `testBit` 25)

-- Vincent
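
For illustration, here is a minimal sketch of how the result of such a check could drive the "always fall back rather than failing" behaviour discussed above. It only covers runtime selection, not the build-time problem. The names `Backend`, `hasAESNI` and `chooseBackend` are hypothetical and do not come from any of the packages mentioned; the ECX word is passed in as a plain argument, on the assumption that it was obtained from CPUID leaf 1 as in the snippet above.

```haskell
module BackendChoice (Backend (..), hasAESNI, chooseBackend) where

import Data.Bits (testBit)
import Data.Word (Word32)

-- Hypothetical set of implementations a high-level module could pick from.
data Backend
  = AesNI     -- hardware AES instructions
  | Portable  -- plain software fallback that works everywhere
  deriving (Eq, Show)

-- Bit 25 of ECX (CPUID leaf 1) is the AES-NI feature flag.
hasAESNI :: Word32 -> Bool
hasAESNI ecx = ecx `testBit` 25

-- Pick the fast backend only when the flag is set; otherwise fall back
-- to the portable one instead of failing.
chooseBackend :: Word32 -> Backend
chooseBackend ecx
  | hasAESNI ecx = AesNI
  | otherwise    = Portable
```

Keeping the decision pure (a function of the ECX word) means it can be tested without the hardware; only the CPUID query itself has to live in IO.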