Have you run nofib, or even benchmarks restricted to your multivector code, for the current calling convention (including spilling AVX vectors to the stack, which I gather is the current plan) vs passing in registers with an LLVM built using the patches I worked out ~2 months ago? It'd be really easy to build that custom LLVM and then run the benchmarks! (I'm happy to help, and ultimately benchmarks will reveal whether it's worthwhile or not! And if the main goal is your talk, it's still valid even if it's not in the merge window over the next 4 days.)
I really think it's not obvious what the "best" ABI change would be! It really will require coming up with a list of variants, implementing them, and running nofib with each variant, which I lack the compute/human time resources to do this week. Modern hardware is complex enough that for something like an ABI change, the only healthy attitude can be "let's benchmark it!".
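For what it's worth, here's a rough sketch of how the per-variant comparison could be summarized once the runs exist. It's plain Python rather than anything from nofib itself: the function names (`geometric_mean_ratio`, `rank_variants`), the variant labels, and the timings are all made up for illustration, and I'm assuming each run has already been boiled down to a map from benchmark name to runtime in seconds.

```python
import math


def geometric_mean_ratio(baseline, variant):
    """Geometric mean of variant/baseline runtime over the benchmarks both runs share.

    A result below 1.0 means the variant is faster on average.
    """
    shared = sorted(set(baseline) & set(variant))
    if not shared:
        raise ValueError("no benchmarks in common")
    log_sum = sum(math.log(variant[name] / baseline[name]) for name in shared)
    return math.exp(log_sum / len(shared))


def rank_variants(baseline, variants):
    """Return (variant name, geometric mean ratio) pairs, fastest first."""
    scored = [
        (name, geometric_mean_ratio(baseline, times))
        for name, times in variants.items()
    ]
    return sorted(scored, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Hypothetical timings in seconds; NOT real measurements.
    current_abi = {"bench_a": 1.00, "bench_b": 2.00, "bench_c": 0.50}
    candidates = {
        "simd-only": {"bench_a": 0.98, "bench_b": 2.02, "bench_c": 0.50},
        "scalars-in-simd-regs": {"bench_a": 0.95, "bench_b": 1.90, "bench_c": 0.52},
    }
    for name, ratio in rank_variants(current_abi, candidates):
        print(f"{name}: {ratio:.3f}x the baseline runtime")
```

The geometric mean is used so that no single long-running benchmark dominates the summary; any real decision would still want the per-benchmark numbers alongside it.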
I'd really like any change in calling convention to also improve perf on code that isn't explicitly SIMD! (And a conservative SIMD-only change blocks/conflicts with that augmentation going forward, and not just for the stack pointer example I mentioned earlier.)
Not just scalar floats in SIMD registers, but perhaps also words/ints!
(Though that latter bit might be pretty ambitious and subtle; I'll need to investigate it a bit to see how feasible it may be.)
SIMD has great support for ints/words, and any partial ABI change on the LLVM backend now would make it hard to support that well later (or at least, that's what it looks like to me). Actually using SIMD effectively for scalar ints and words should be doable, but it might force us to be a bit more thoughtful about how GHC internally distinguishes ints used for address arithmetic vs ints used as data. (Interestingly, I'm not sure if any currently extant x86 calling convention does that!)
That single change would make 7.10 require a completely different LLVM and native code gen convention from our current one, plus touch all of the code gen on x86 architectures.
Basically: we're lucky that everyone builds Haskell code from source, so ABI compat across GHC versions is a non-issue. BUT, any ABI changes should be backed by benchmarks (at least when the change is performance motivated). Likewise, because we use LLVM as an external dep for the -fllvm backend, we really need to keep in mind how their release cycle interacts with ours, because people use Haskell and GHC! Which, as many like to say, is both a boon and a pain ;).
Having people hit GHC acting broken with an LLVM that was "supported before" is a risky support problem to deal with. Having an LLVM head variant support a modified ABI, and then later needing to break it for 7.10 (for one of the possible exploratory reasons above), would lead to a support headache I don't wish on anyone.
Pardon the verbose answer, but that's my offhand take.
cheers