
On 31 January 2013 09:52, Geoffrey Mainland
On 01/31/2013 12:56 PM, Simon Marlow wrote:
On 31/01/13 11:38, Geoffrey Mainland wrote:
I've pushed my simd branch to darcs.haskell.org. Everything has been rebased against HEAD. Simon PJ and I looked over the changes together already, but I wanted to give you (and everyone on ghc-devs) the opportunity to look things over before I merge to HEAD. Simon PJ and I came up with a few questions/notes for you, but hopefully nothing that should delay a merge.
I'm happy for these to go in - we've already discussed the design a few times, and you've incorporated changes we agreed before, so as far as I'm concerned it's all good. Go for it!
Cool.
* Win32 issues
Modern 32-bit x86 *NIX systems align the stack to 16 bytes, but Win32 aligns only to 4 bytes. LLVM does not assume 16-byte stack alignment. Instead, on platforms where 16-byte stack alignment is not guaranteed, it 1) always outputs a function prologue that 2) aligns the stack to a 16-byte boundary with an "and" instruction, and it also 3) disables tail calls. Because LLVM aligns the stack for a function that has SSE register spills, it also generates movaps instructions (aligned SSE moves) for the spills.
I must be misunderstanding your use of "always" above, because that would imply that the LLVM backend doesn't work on Win32 at all. Maybe LLVM only aligns the stack when it needs to store SSE values?
You are correct---the stack-aligning prologue is only added by LLVM when SSE values are written to the stack, so this wasn't a problem before we had SSE support.
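For illustration only, the effect of the aligning "and" can be sketched as plain arithmetic on a stack-pointer value. The names alignDown16 and isAligned16 are made up for this note and do not appear in GHC or LLVM; the sketch assumes a 32-bit stack pointer.

```haskell
import Data.Bits ((.&.), complement)
import Data.Word (Word32)

-- Hypothetical helper: round a stack pointer down to a 16-byte boundary,
-- which is what an "and esp, -16" in a function prologue does.
alignDown16 :: Word32 -> Word32
alignDown16 sp = sp .&. complement 15

-- Hypothetical helper: movaps requires its memory operand to satisfy this.
isAligned16 :: Word32 -> Bool
isAligned16 sp = sp .&. 15 == 0
```

A stack pointer that is only 4-byte aligned, as Win32 guarantees, generally fails isAligned16 until it has been passed through alignDown16, which is why the prologue is needed before any movaps spill.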
This makes SSE support on Win32 difficult, and in my opinion not worth worrying about.
The alternative is to 1) patch LLVM to disable the stack-alignment code so that we recover the ability to use tail calls and so that ebp is not scribbled over by the prologue and 2) patch the mangler to rewrite LLVM's movaps (move aligned) instructions to movups (move unaligned) instructions. I have these patches, but they are not included in the simd branch.
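As a rough sketch of the second step, a mangler pass of this kind could be written as a line-by-line rewrite of the assembly text. This is not the patch mentioned above and not GHC's actual mangler code; rewriteMovaps and mangleSketch are hypothetical names, and the sketch works on String for brevity.

```haskell
-- Hypothetical helper: if the first word on the line is the mnemonic
-- "movaps", replace it with "movups"; leave every other line untouched.
rewriteMovaps :: String -> String
rewriteMovaps line
  | mnemonic == "movaps" = indent ++ "movups" ++ operands
  | otherwise            = line
  where
    isBlank c            = c == ' ' || c == '\t'
    (indent, rest)       = span isBlank line
    (mnemonic, operands) = break isBlank rest

-- Apply the rewrite to every line of an assembly file held in memory.
mangleSketch :: String -> String
mangleSketch = unlines . map rewriteMovaps . lines
```

For example, rewriteMovaps turns a tab-indented "movaps %xmm0, 16(%esp)" into the same line with "movups", while a "movl" line passes through unchanged.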
I don't have an opinion here - maybe ask David T what he'd prefer.
Requiring an LLVM hack seems pretty bad, and David yelled when I changed the mangler since he wants to get rid of it eventually. My patches are still around, so if we decide Win32 support is important, I can always add the changes.
Not supporting Win32 sucks but yes, I want to move to just requiring LLVM un-patched and no mangler. How ugly are the patches for LLVM? I'd be supportive of it if the plan is to get them merged upstream. Otherwise, I don't think it is worth the effort of having to carry around our own patched LLVM for installation on Windows.
Cheers,
David
* Could we add a CmmType field to GlobalReg's constructors? You'll see that I added a new XmmReg constructor to GlobalReg, but because I don't know the type of an XmmReg, I have to bitcast everywhere in the generated LLVM code because LLVM wants to know not just that a value is a 16-byte vector, but that it is, e.g., a 16-byte vector containing 2 64-bit doubles. Having a CmmType attached to a GlobalReg---or pairing a GlobalReg with a CmmType when assigning registers---would let me avoid all these casts.
We already have a function
globalRegType :: DynFlags -> GlobalReg -> CmmType
so I see that you're guessing in the case of XmmReg. Why not just add the necessary information to XmmReg so that you don't have to guess in globalRegType?
There doesn't seem to be a clear best choice for this extra info. A CmmType seems reasonable, and if I'm adding a CmmType to XmmReg, why not add it everywhere and simplify globalRegType? I'll go ahead and stick with what I have now.
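To make the alternative under discussion concrete, here is a minimal sketch of an XmmReg constructor that carries its own type. Every type in it is a simplified stand-in invented for this note (the real CmmType and GlobalReg in GHC are richer, and the real globalRegType also takes DynFlags); this is not what the simd branch does.

```haskell
-- Simplified stand-ins, not GHC's definitions.
data ElemType = F32 | F64 | I32 | I64
  deriving (Eq, Show)

data CmmTypeSketch
  = Scalar ElemType
  | Vec Int ElemType        -- element count and element type
  deriving (Eq, Show)

data GlobalRegSketch
  = VanillaReg Int
  | DoubleReg Int
  | XmmReg Int CmmTypeSketch  -- hypothetical: the register records its vector type
  deriving (Eq, Show)

-- With the type stored in the constructor, the XmmReg case no longer guesses.
globalRegTypeSketch :: GlobalRegSketch -> CmmTypeSketch
globalRegTypeSketch (VanillaReg _) = Scalar I64   -- assumes a 64-bit word
globalRegTypeSketch (DoubleReg _)  = Scalar F64
globalRegTypeSketch (XmmReg _ ty)  = ty
```

Under this sketch, XmmReg 1 (Vec 2 F64) reports a vector of two 64-bit doubles directly, which is the information the LLVM backend would otherwise recover with a bitcast.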
Thanks for all your answers.
Geoff