Re: Vector primops sizes

14 Feb 2013

I mentioned this in another thread, but Xeon Phi chips have 512-bit vector
instructions, and Intel has apparently implemented support for them in LLVM
for the ispc compiler. Apparently this hasn't been merged back upstream yet,
but I guess it is only a matter of time.
The Intel MIC architecture isn't quite x86, though.

https://github.com/ispc/ispc/issues/367
http://gpuscience.com/software/ispc-a-spmd-compiler-with-xeon-and-xeon-phi-s...

Alexander

On Thu, Feb 14, 2013 at 12:29 AM, Geoffrey Mainland wrote:
...
I haven't seen Michael's patches (where are they btw?), but there is
some extra work to be done to ensure that 256-bit values are passed in
registers. Otherwise adding support for wider vector types is fairly
straightforward.
The current plan is for 256-bit wide vector primops to always be
available. The programmer can test for the __AVX__ CPP symbol, which
indicates that these primops will be compiled to efficient code. I am
not inclined to add wider vector primops, as there is no current
platform where they can be compiled efficiently.
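As a rough illustration of the CPP test described above, a module could
branch on __AVX__ as follows. This is only a sketch: the name vectorBits is
made up for the example, and it assumes the compiler defines __AVX__ when
AVX code generation is enabled.

```haskell
{-# LANGUAGE CPP #-}
module Main where

-- Hypothetical helper (not part of GHC): the vector width, in bits,
-- that this build expects to be compiled to efficient code.
vectorBits :: Int
#if defined(__AVX__)
vectorBits = 256
#else
vectorBits = 128
#endif

main :: IO ()
main = putStrLn ("Efficient vector width: " ++ show vectorBits ++ " bits")
```

If __AVX__ is not defined, the module still compiles and simply reports the
128-bit width.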
Most programmers should use the Multi type family instead of working
with primops (or their boxed wrappers) directly. For example, by using
Multi Double instead of DoubleX2, the programmer will get 256-bit wide
vectors on platforms that support AVX, and 128-bit wide vectors
otherwise. See https://github.com/mainland/primitive for details.
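To make the idea concrete, here is a minimal sketch of how a width-selecting
type family could look. It is not the API of the primitive package: the class
MultiElem, its methods, and the constructor MD are all hypothetical, and plain
Doubles stand in for real SIMD registers so the example runs without the LLVM
backend.

```haskell
{-# LANGUAGE CPP #-}
{-# LANGUAGE TypeFamilies #-}
module Main where

-- Hypothetical class: each element type picks its own widest
-- efficiently supported vector representation.
class MultiElem a where
  data Multi a
  lanes     :: Multi a -> Int                   -- elements per vector
  broadcast :: a -> Multi a                     -- fill every lane with one value
  addMulti  :: Multi a -> Multi a -> Multi a    -- lane-wise addition
  toListM   :: Multi a -> [a]                   -- unpack the lanes

instance MultiElem Double where
#if defined(__AVX__)
  -- 256-bit case: four Double lanes.
  data Multi Double = MD !Double !Double !Double !Double
  lanes _ = 4
  broadcast x = MD x x x x
  addMulti (MD a b c d) (MD e f g h) = MD (a + e) (b + f) (c + g) (d + h)
  toListM (MD a b c d) = [a, b, c, d]
#else
  -- 128-bit case: two Double lanes.
  data Multi Double = MD !Double !Double
  lanes _ = 2
  broadcast x = MD x x
  addMulti (MD a b) (MD c d) = MD (a + c) (b + d)
  toListM (MD a b) = [a, b]
#endif

main :: IO ()
main = do
  -- Client code is written once against Multi Double, whatever the width.
  let v = addMulti (broadcast 1.5) (broadcast 2.0) :: Multi Double
  print (lanes v, toListM v)
```

The point of the design is that the code in main never mentions DoubleX2 or
DoubleX4; only the instance changes with the target.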
Geoff
On 02/13/2013 07:44 AM, Simon Peyton-Jones wrote:
...
I believe Geoff is working on adding AVX.  I expect he’d be interested
in your patches.
Simon
From: ghc-devs-bounces@haskell.org On Behalf Of Carter Schonwald
Sent: 13 February 2013 05:59
To: Michael Baikov
Cc: ghc-devs@haskell.org
Subject: Re: Vector primops sizes
Yes please! Having these (for valid target arches/CPU targets) would
be really, really valuable for me.
On Feb 13, 2013 12:07 AM, "Michael Baikov" <manpacket@gmail.com> wrote:
...
The recently merged vector primops support only 16-byte operands: Int32
x 4, Double x 2 and so on. Current AVX instructions support 256-bit
operands, and with simple cut'n'paste work it's possible to support at
least Double x 4 operands. I made those changes, and GHC generates
(using LLVM) proper AVX code using ymm registers. It might also make
sense to support primops for vector types larger than any currently
supported primitive type. I have those changes in my branch as well,
and LLVM generates pretty good code for them too. Those changes might be
useful for providing access to the LLVM shufflevector instruction, or for
writing high-performance processing of large vectors with less potential
overhead.
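For reference, the operand shapes mentioned above follow directly from the
register width divided by the element size. A small sketch of that arithmetic
(lanesFor is a made-up helper, not a GHC function):

```haskell
module Main where

import Data.Int (Int32)
import Foreign.Storable (Storable, sizeOf)

-- Hypothetical helper: how many elements of the given type fit in a
-- register of the given width in bits. The second argument is only
-- used for its type.
lanesFor :: Storable a => Int -> a -> Int
lanesFor registerBits x = registerBits `div` (8 * sizeOf x)

main :: IO ()
main = do
  print (lanesFor 128 (0 :: Int32))   -- 16-byte operand: Int32 x 4
  print (lanesFor 128 (0 :: Double))  -- 16-byte operand: Double x 2
  print (lanesFor 256 (0 :: Double))  -- 256-bit AVX operand: Double x 4
```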
Do we want to support larger vectors directly, or should GHC be made
smart enough to fuse operations on vector primops performed in
parallel into larger vectors/registers for LLVM? Do we want to provide
access to the LLVM shufflevector instruction?
_______________________________________________
ghc-devs mailing list
ghc-devs@haskell.org
http://www.haskell.org/mailman/listinfo/ghc-devs

Re: Vector primops sizes

Alexander Kjeldaas