
Edward Kmett
Currently if you try to use a DoubleX4# and don't have AVX2 turned on, it deliberately crashes out during code generation, no?
I very well be missing something, but I don't believe this is true. This program compiles just fine with merely -fllvm -msse, {-# LANGUAGE MagicHash #-} {-# LANGUAGE UnboxedTuples #-} module Hi where import GHC.Prim import GHC.Float addIt :: DoubleX4# -> DoubleX4# -> DoubleX4# addIt x y = plusDoubleX4# x y {-# NOINLINE addIt #-} It produces the following assembler,, movupd 0x10(%rbp),%xmm0 movupd 0x0(%rbp),%xmm1 movupd 0x30(%rbp),%xmm2 movupd 0x20(%rbp),%xmm3 addpd %xmm1,%xmm3 addpd %xmm0,%xmm2 movupd %xmm2,0x30(%rbp) movupd %xmm3,0x20(%rbp) mov 0x40(%rbp),%rax lea 0x20(%rbp),%rbp jmpq *%rax The reason for this is that the LLVM code generator just blindly translates DoubleX4# to LLVM's <4 x double> type. The LLVM code generator then does whatever it can to produce the code we ask of it, even if the target doesn't have support for this vector variety. Cheers, - Ben