
Geoff

I'm too far from this stuff to give it a meaningful review, at least not without sitting beside you. So I suggest you just merge it! Simon Marlow may want to look.

The wiki page http://ghc.haskell.org/trac/ghc/wiki/SIMD/Design describes the design, and I think it's up to date with your patches (correct?). Thanks for doing that!

From our previous discussion, the bit I hate is this:

1. there are so many distinct data types (Int16x4, Int32x2, etc)

2. primops.txt.pp therefore has to grow a macro-like mechanism to ameliorate the burden of writing out all the zillions of types and primops

Concerning (2), the obvious rejoinder is: well, primops.txt.pp is really a program written in a domain specific language -- and that language is getting more elaborate. Solution: stop building a new language, and instead make primops.txt.pp into an embedded DSL in Haskell; just a Haskell program that we run to generate the various outputs. Then all the mechanisms you had to add will be trivial.

Concerning (1) what we want is a way to make types Int<16,4> where the parameters 16 and 4 are forced to be static literals, and where you absolutely do not get polymorphism like f :: Int<a><b> -> blah. There is some Trac discussion about this. It can't be that hard. I'm copying some FC friends!

Simon

| -----Original Message-----
| From: Geoffrey Mainland [mailto:mainland@apeiron.net]
| Sent: 16 September 2013 20:17
| To: Simon Peyton-Jones; Simon Marlow; Austin Seipp; ghc-devs@haskell.org
| Subject: simd branch ready for review/merge
|
| The SIMD branch, available as wip/simd, is ready for review/merge. It
| could use some review---Simon and Simon, I'd be especially grateful if
| you both had a quick look. Some major points:
|
| 1) I have added support for AVX 512, although this is necessarily
| untested. AVX and AVX2 are also both supported.
|
| 2) After the recent churn regarding patching LLVM's GHC calling
| convention, by default only 128-bit wide SIMD vectors are passed in
| registers, and then only on X86_64. There is a "hidden" flag,
| -fllvm-pass-vectors-in-regs, that causes GHC to generate LLVM code that
| assumes all vectors are passed in registers by LLVM. This can be used
| with a suitably patched version of LLVM, and if we get LLVM 3.4 patched,
| we can consider turning it on by default for LLVM 3.4+. This would mean
| that we couldn't mix LLVM <3.3-compiled object files with LLVM
| >3.4-compiled object files, but I don't see that as much of a problem.
|
| 3) utils/genprimcode has been hacked up to allow us to write vector
| operations once and have them instantiated at multiple vector types. I'm
| not thrilled with this solution, but after discussing with Simon PJ,
| what I've implemented seems to be the minimal reasonable solution to the
| problem of exploding primop boilerplate. The changes are documented in
| compiler/prelude/primops.txt.pp.
|
| 4) Error handling is sub-optimal. My patch checks to make sure that
| vector primops can be compiled efficiently based on the current set of
| dynamic flags. For example, if -mavx is not specified and the user tries
| to use a primop that adds together two 256-bit wide vectors of
| double-precision elements, the user will see an error message like:
|
| ghc-stage2: sorry! (unimplemented feature or known bug)
|   (GHC version 7.7.20130916 for x86_64-unknown-linux):
|     256-bit wide floating point SIMD vector instructions require at
|     least -mavx.
|
| This is because the only good place to check for this kind of error is
| during STG->Cmm translation (in compiler/codeGen/StgCmmPrim.hs), and we
| don't have much of an error handling infrastructure there in contrast to
| when we're working in the typechecking/renaming monad. If there is a
| better way/place to do this, please let me know.
|
| Thanks,
| Geoff
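
A rough sketch of the embedded-DSL idea raised in the reply for primops.txt.pp: if the primop table were an ordinary Haskell program, "write a vector operation once and instantiate it at many vector types" would be a plain list comprehension rather than a macro mechanism. Everything below is hypothetical and for illustration only. The names Elem, VecTy, PrimOp, binaryOp and render, the generated primop names, and the output format are all invented here; none of it is taken from GHC or from utils/genprimcode.

```haskell
import Data.List (intercalate)

-- Element types a SIMD vector may hold (hypothetical, deliberately small).
data Elem = IntE Int | FloatE | DoubleE
  deriving (Eq, Show)

-- A vector type is an element type plus a lane count, e.g. Int16x4.
data VecTy = VecTy Elem Int
  deriving (Eq, Show)

-- One generated primop: a name, its argument types, and its result type.
data PrimOp = PrimOp String [String] String
  deriving (Eq, Show)

elemBits :: Elem -> Int
elemBits (IntE w) = w
elemBits FloatE   = 32
elemBits DoubleE  = 64

elemName :: Elem -> String
elemName (IntE w) = "Int" ++ show w
elemName FloatE   = "Float"
elemName DoubleE  = "Double"

vecName :: VecTy -> String
vecName (VecTy e n) = elemName e ++ "x" ++ show n

allElems :: [Elem]
allElems = [IntE 8, IntE 16, IntE 32, IntE 64, FloatE, DoubleE]

-- All vector types that fill a register of the given width in bits.
vecTysOfWidth :: Int -> [VecTy]
vecTysOfWidth bits = [VecTy e (bits `div` elemBits e) | e <- allElems]

-- A binary operation, written once and instantiated at any vector type.
binaryOp :: String -> VecTy -> PrimOp
binaryOp op ty = PrimOp (op ++ vecName ty) [vecName ty, vecName ty] (vecName ty)

-- The whole table: every operation at every width and element type.
primOps :: [PrimOp]
primOps =
  [ binaryOp op ty
  | op <- ["plus", "minus", "times"]
  , w  <- [128, 256, 512]
  , ty <- vecTysOfWidth w
  ]

-- Render one entry in an invented textual format.
render :: PrimOp -> String
render (PrimOp name args res) =
  "primop " ++ name ++ " :: " ++ intercalate " -> " (args ++ [res])
```

Loaded into GHCi, `mapM_ (putStrLn . render) primOps` prints one line per instantiated primop. Adding a new operation or a new vector width means adding one element to a list, which is the sense in which the extra mechanisms become trivial once the description is ordinary Haskell.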