
I would be happy to advise if you would like to pick this up.
Thanks Ben!
This would mean that Haskell libraries compiled with different flags would not be ABI compatible.
Wait, can we not maintain ABI compatibility if we limit the target
features using a compiler flag? Sometimes (for performance reasons)
it's reasonable to request the compiler to only generate SSE
instructions, even if AVX2 is available on the target. On GCC we can
use the flag -msse to do just that.
On Tue, Mar 14, 2017 at 5:49 PM, Carter Schonwald wrote:
This thread is getting into a broader discussion about target-specific intrinsics as user-facing prims vs. compiler-generated code.
@ben - ed is talking about stuff like a function call that's using a specific AVX2 intrinsic, not the parameterized vector abstraction. LLVM shouldn't be lowering those. ... or clang has issues :/
On Tue, Mar 14, 2017 at 4:33 PM, Geoffrey Mainland wrote:
On 03/14/2017 04:02 PM, Ben Gamari wrote:
Edward Kmett writes:
Hrmm. In C/C++ I can tell individual functions to turn on additional ISA feature sets with compiler-specific __attribute__((target("avx2"))) tricks. This avoids complaints from the compiler when I call builtins that aren't available at my current compilation feature level. Perhaps pragmas for the codegen along those lines are what we'd ultimately need? Alternately, if we simply distinguish between what the ghc codegen produces with one set of options and what we're allowed to ask for explicitly with another, then user-land tricks like the ones I employ would remain sound.
I'm actually not sure that simply distinguishing between the user- and codegen-allowed ISA extensions is quite sufficient. After all, AFAIK LLVM doesn't make such a distinction itself: if you write a vector primitive and compile for a target that doesn't have an appropriate instruction, the code generator will lower it with software emulation.
This would mean that Haskell libraries compiled with different flags would not be ABI compatible.
Our original paper exposed a Multi type class that was meant to be the programmer interface to the primops. A Multi a would be the widest vector type supported on the current architecture, so code that used a Multi Double would always be guaranteed to work at the widest vector type available for Doubles.
The Multi approach explicitly eschewed lowering, but I would argue that if performance is the goal, then automatic lowering is not what you want. I would rather have the system pick the correct vector width for me based on the current architecture.
This does nothing to solve the problem of ABI compatibility, which is one reason I didn't push to get this upstreamed.
Is the Multi approach desirable? I think it would be nice to be able to at least provide such a solution even if it isn't some sort of default. Do we really want lowering of wider vector types?
Geoff
However, adding a pragma to allow per-function target annotations seems quite reasonable and easily doable. Moreover, contrary to my previous assertion, it shouldn't require any splitting of compilation units. I ran a quick experiment, compiling this program,
__attribute__((target("sse2"))) int hello() { return 1; }
With clang. It produced something like,
define i32 @hello() #0 { ret i32 1 }
attributes #0 = { "target-cpu"="x86-64" "target-features"="+fxsr,+mmx,+sse,+sse2,+x87" ... }
So it seems LLVM is perfectly capable of expressing this; in hindsight I'm not sure why I ever doubted this.
There are a number of details that would need to be worked out regarding how such a pragma should behave. Does the general direction sound reasonable? I've opened #13427 [1] to track this idea.
Cheers,
- Ben
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs