
I mentioned this in another thread, but Xeon Phi chips have 512-bit SIMD,
and Intel has apparently implemented support in LLVM for the ispc compiler.
Also apparently this hasn't been merged back yet, but I guess it is only a
matter of time.
The Intel MIC architecture isn't quite x86 though.
https://github.com/ispc/ispc/issues/367
http://gpuscience.com/software/ispc-a-spmd-compiler-with-xeon-and-xeon-phi-s...
Alexander
On Thu, Feb 14, 2013 at 12:29 AM, Geoffrey Mainland wrote:
I haven't seen Michael's patches (where are they btw?), but there is some extra work to be done to ensure that 256-bit values are passed in registers. Otherwise adding support for wider vector types is fairly straightforward.
The current plan is for 256-bit wide vector primops to always be available. The programmer can test for the __AVX__ CPP symbol, which indicates that these primops will be compiled to efficient code. I am not inclined to add wider vector primops, as there is no current platform where they can be compiled efficiently.
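A minimal sketch of the CPP test described above. The only name taken from the message is the `__AVX__` symbol; `vectorWidthBits` is a hypothetical helper, and whether `__AVX__` is defined depends on the compiler version and flags in use.

```haskell
{-# LANGUAGE CPP #-}
module Main (main) where

-- Hypothetical helper (not part of GHC): the vector width, in bits, that this
-- build assumes. __AVX__ is the CPP symbol mentioned in the message above.
vectorWidthBits :: Int
#if defined(__AVX__)
vectorWidthBits = 256
#else
vectorWidthBits = 128
#endif

main :: IO ()
main = putStrLn ("Assumed vector width: " ++ show vectorWidthBits ++ " bits")
```

The branch is chosen at compile time, so the same source builds on targets with and without AVX.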
Most programmers should use the Multi type family instead of working with primops (or their boxed wrappers) directly. For example, by using Multi Double instead of DoubleX2, the programmer will get 256-bit wide vectors on platforms that support AVX, and 128-bit wide vectors otherwise. See https://github.com/mainland/primitive for details.
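To illustrate the idea only, here is a sketch of a data family whose representation is picked per platform. It is not the definition from the primitive package: the constructor and the helpers `multiplicity`, `replicateMulti`, `addMulti` and `toList` are hypothetical, and plain boxed fields stand in for real vector registers so that the example runs on any backend.

```haskell
{-# LANGUAGE CPP, TypeFamilies #-}
module Main (main) where

-- Hypothetical stand-in for a platform-selected SIMD vector type.
-- Boxed fields are used here in place of real vector primops.
data family Multi a

#if defined(__AVX__)
-- With AVX: four doubles, modelling a 256-bit vector.
data instance Multi Double = MultiDouble !Double !Double !Double !Double

multiplicity :: Int
multiplicity = 4

replicateMulti :: Double -> Multi Double
replicateMulti x = MultiDouble x x x x

addMulti :: Multi Double -> Multi Double -> Multi Double
addMulti (MultiDouble a b c d) (MultiDouble e f g h) =
  MultiDouble (a + e) (b + f) (c + g) (d + h)

toList :: Multi Double -> [Double]
toList (MultiDouble a b c d) = [a, b, c, d]
#else
-- Without AVX: two doubles, modelling a 128-bit vector.
data instance Multi Double = MultiDouble !Double !Double

multiplicity :: Int
multiplicity = 2

replicateMulti :: Double -> Multi Double
replicateMulti x = MultiDouble x x

addMulti :: Multi Double -> Multi Double -> Multi Double
addMulti (MultiDouble a b) (MultiDouble c d) = MultiDouble (a + c) (b + d)

toList :: Multi Double -> [Double]
toList (MultiDouble a b) = [a, b]
#endif

main :: IO ()
main = do
  let v = addMulti (replicateMulti 1.5) (replicateMulti 2.0)
  -- The number of lanes printed depends on how the module was compiled.
  putStrLn ("lanes: " ++ show multiplicity)
  print (toList v)
```

Code written against `Multi Double` and these helpers never mentions the lane count, which is the point of the approach: the width changes with the platform and the calling code does not.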
Geoff
On 02/13/2013 07:44 AM, Simon Peyton-Jones wrote:
I believe Geoff is working on adding AVX. I expect he’d be interested in your patches.
Simon
From: ghc-devs-bounces@haskell.org On Behalf Of Carter Schonwald
Sent: 13 February 2013 05:59
To: Michael Baikov
Cc: ghc-devs@haskell.org
Subject: Re: Vector primops sizes
Yes please! Having these (for valid target architectures/CPU targets) would be really, really valuable for me.
On Feb 13, 2013 12:07 AM, "Michael Baikov" <manpacket@gmail.com> wrote:
The recently merged vector primops support only 16-byte operands: Int32 x 4, Double x 2 and so on. Current AVX instructions support 256-bit operands, and with simple cut'n'paste work it's possible to support at least Double x 4 operands. I made those changes, and GHC generates (using LLVM) proper AVX code using ymm registers.
It might also make sense to support primops for vector types larger than any currently supported primitive type. I have those changes in my branch as well, and LLVM generates pretty good code for them too. Those changes might be useful for providing access to the LLVM shufflevector instruction, or for writing high-performance processing of large vectors with less potential overhead.
Do we want to support larger vectors directly, or should GHC be made smart enough to fuse vector primop operations performed in parallel into larger vectors/registers for LLVM? Do we want to provide access to the LLVM shufflevector instruction?
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org mailto:ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs