
To reiterate: any automated lowering / shimming scheme will hurt any serious user of SIMD who isn't treating it as some black-box abstraction. And those are the very users who are equipped to write and design the libraries and GHC improvements that let still *other* users pretend to have a mostly decent black-box abstraction. Our compiler-engineering bandwidth is not enough to start with any automagic in this problem domain that isn't validated by a model implementation in user space.

On Wed, Mar 15, 2017 at 3:31 PM, Carter Schonwald <carter.schonwald@gmail.com> wrote:
Agreed. And the generic vector-size stuff in LLVM is both pretty naive and not the sane/tractable way to add SIMD support to the NCG.
I'm totally OK with which vector sizes are available depending on the target CPU or whatever. Operating systems have very sane errors for that sort of mishap (e.g. SIGILL).
On Wed, Mar 15, 2017 at 3:29 PM, Edward Kmett wrote:
Currently, if you try to use a DoubleX4# and don't have AVX2 turned on, it deliberately crashes out during code generation, no? So this is very deliberately *not* a problem with the current setup as I understand it. It only becomes one if we reverse that decision and add terribly inefficient shims for this functionality at the primop level, rather than having a higher level make the right call to simply not use functionality that isn't present on the target platform.
-Edward
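[For concreteness, here is a minimal user-space sketch; it is not from the thread, the module and function names are made up, and it assumes the DoubleX4# primops exported through GHC.Exts, which at the time were only supported by the LLVM backend. Compiling it for a target without AVX is exactly the case being discussed: GHC bails out during code generation instead of silently shimming.]

    {-# LANGUAGE MagicHash, UnboxedTuples #-}
    module AvxAdd (sumX4) where

    import GHC.Exts

    -- Broadcast both scalars into 256-bit vectors, add the vectors
    -- lane-wise, then fold the four lanes back into an ordinary Double.
    sumX4 :: Double -> Double -> Double
    sumX4 (D# x) (D# y) =
      case unpackDoubleX4# (plusDoubleX4# (broadcastDoubleX4# x)
                                          (broadcastDoubleX4# y)) of
        (# a, b, c, d #) -> D# (a +## (b +## (c +## d)))

The efficient lowering only exists when the target actually has the instructions; deciding what to do otherwise is the job of a layer above the primops.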
On Wed, Mar 15, 2017 at 10:27 AM, Ben Gamari wrote:
Siddhanathan Shanmugam writes:
>> I would be happy to advise if you would like to pick this up.
> Thanks Ben!
>> This would mean that Haskell libraries compiled with different flags would not be ABI compatible.
> Wait, can we not maintain ABI compatibility if we limit the target features using a compiler flag? Sometimes (for performance reasons) it's reasonable to ask the compiler to generate only SSE instructions, even if AVX2 is available on the target. With GCC we can use the flag -msse to do just that.
I think the reasoning here is the following (please excuse the rather contrived example): consider a function f with two variants:

    {-# OPTIONS_GHC -mavx #-}
    module AvxImpl where

    f :: DoubleX4# -> DoubleX4# -> Double

and

    {-# OPTIONS_GHC -msse #-}
    module SseImpl where

    f :: DoubleX4# -> DoubleX4# -> Double
If we allow GHC to pass arguments in SIMD registers, we now have a bit of a conundrum: the calling convention for AvxImpl.f will require that we pass the two arguments in YMM registers, whereas SseImpl.f's arguments will be passed via some other means (perhaps two pairs of XMM registers).
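[To make the clash concrete, here is a hypothetical third module, not part of the original example, that links against both variants:]

    {-# LANGUAGE MagicHash #-}
    module Caller where

    import GHC.Exts (DoubleX4#)
    import qualified AvxImpl
    import qualified SseImpl

    -- Both f's have the same Haskell type, so nothing in the source records
    -- that AvxImpl.f would want its arguments in YMM registers while
    -- SseImpl.f would expect them passed some other way; the two calls
    -- below would disagree about where x and y live.
    g :: DoubleX4# -> DoubleX4# -> Double
    g x y = AvxImpl.f x y + SseImpl.f x y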
In the C world this isn't a problem, AFAIK, since intrinsic types map directly to register classes. Consequently, I can look at a C declaration like
    double f(__m256 x, __m256 y);
and tell you precisely the calling convention that would be used. In GHC, however, we have an abstract vector model, and therefore the calling convention is determined by which ISA the compiler is targeting.
I really don't know how to fix this "correctly". Currently we assume that there is a static mapping between STG registers and machine registers. Giving this up sounds quite painful.
Cheers,
- Ben