On Wed, Mar 15, 2017 at 5:14 PM, Ben Gamari <ben@smart-cactus.org> wrote:

Edward Kmett <ekmett@gmail.com> writes:

> Currently if you try to use a DoubleX4# and don't have AVX2 turned on, it
> deliberately crashes out during code generation, no?

I very well be missing something, but I don't believe this is true. This
program compiles just fine with merely -fllvm -msse,

{-# LANGUAGE MagicHash #-}
{-# LANGUAGE UnboxedTuples #-}
module Hi where
import GHC.Prim
import GHC.Float

addIt :: DoubleX4# -> DoubleX4# -> DoubleX4#
addIt x y = plusDoubleX4# x y
{-# NOINLINE addIt #-}

It produces the following assembler,,

movupd 0x10(%rbp),%xmm0
movupd 0x0(%rbp),%xmm1
movupd 0x30(%rbp),%xmm2
movupd 0x20(%rbp),%xmm3
addpd %xmm1,%xmm3
addpd %xmm0,%xmm2
movupd %xmm2,0x30(%rbp)
movupd %xmm3,0x20(%rbp)
mov 0x40(%rbp),%rax
lea 0x20(%rbp),%rbp
jmpq *%rax

The reason for this is that the LLVM code generator just blindly
translates DoubleX4# to LLVM's <4 x double> type. The LLVM code
generator then does whatever it can to produce the code we ask of it,
even if the target doesn't have support for this vector variety.

Cheers,

- Ben