
On 01/31/2013 07:10 PM, David Terei wrote:
On 31 January 2013 09:52, Geoffrey Mainland wrote:
On 01/31/2013 12:56 PM, Simon Marlow wrote:
On 31/01/13 11:38, Geoffrey Mainland wrote:
* Win32 issues
Modern 32-bit x86 *NIX systems align the stack to 16 bytes, but Win32 aligns only to 4 bytes. LLVM does not assume 16-byte stack alignment. Instead, on platforms where 16-byte stack alignment is not guaranteed, it 1) always outputs a function prologue that 2) aligns the stack to a 16-byte boundary with an "and" instruction, and it also 3) disables tail calls. Because LLVM aligns the stack for a function that has SSE register spills, it also generates movaps instructions (aligned SSE moves) for the spills.
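The alignment step described above is plain bit masking. As a minimal sketch (the function names are hypothetical and this is not LLVM's code), rounding a stack address down to a 16-byte boundary and checking whether an address is safe for movaps could look like this in C:

```c
#include <assert.h>
#include <stdint.h>

/* Round an address down to the nearest multiple of `alignment`.
   `alignment` must be a power of two. The x86 stack grows downward,
   so rounding down moves away from the caller's frame. */
uintptr_t align_down(uintptr_t addr, uintptr_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return addr & ~(alignment - 1);
}

/* An aligned 16-byte SSE move (movaps) requires the low four
   bits of the address to be zero. */
int is_sse_aligned(uintptr_t addr)
{
    return (addr & 0xF) == 0;
}
```

For instance, a 4-byte-aligned stack pointer of 0x0012FF74 becomes 0x0012FF70, which is the effect of an instruction such as "and esp, -16" in a prologue.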
I must be misunderstanding your use of "always" above, because that would imply that the LLVM backend doesn't work on Win32 at all. Maybe LLVM only aligns the stack when it needs to store SSE values?
You are correct---the stack-aligning prologue is only added by LLVM when SSE values are written to the stack, so this wasn't a problem before we had SSE support.
This makes SSE support on Win32 difficult, and in my opinion not worth worrying about.
The alternative is to 1) patch LLVM to disable the stack-alignment code so that we recover the ability to use tail calls and so that ebp is not scribbled over by the prologue and 2) patch the mangler to rewrite LLVM's movaps (move aligned) instructions to movups (move unaligned) instructions. I have these patches, but they are not included in the simd branch.
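To illustrate step 2), here is a rough sketch of the kind of rewrite involved. It is not the mangler patch itself: relax_aligned_moves is a hypothetical helper, written in C only for illustration, and it assumes the assembly is processed one line at a time.

```c
#include <assert.h>
#include <string.h>

/* Rewrite every "movaps" in one line of assembly to "movups", in place.
   Both mnemonics are six characters long, so only the fourth character
   changes. Returns the number of rewrites performed. */
int relax_aligned_moves(char *line)
{
    int count = 0;
    char *p = line;

    assert(line != NULL);
    while ((p = strstr(p, "movaps")) != NULL) {
        p[3] = 'u';   /* mov[a]ps -> mov[u]ps */
        p += 6;       /* continue searching after this mnemonic */
        ++count;
    }
    return count;
}
```

This sketch matches the substring anywhere in the line, so it would also rewrite a label or symbol that happens to contain "movaps"; a more careful version would match only the mnemonic field.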
I don't have an opinion here - maybe ask David T what he'd prefer.
Requiring an LLVM hack seems pretty bad, and David yelled when I changed the mangler since he wants to get rid of it eventually. My patches are still around, so if we decide Win32 support is important, I can always add the changes.
Not supporting Win32 sucks but yes, I want to move to just requiring LLVM un-patched and no mangler. How ugly are the patches for LLVM? I'd be supportive of it if the plan is to get them merged upstream. Otherwise, I don't think it is worth the effort of having to carry around our own patched LLVM for installation on Windows.
The patch against LLVM 3.0 is here:

https://github.com/mainland/ghc-simd-tests/blob/master/patches/llvm-3.0.patc...

If you were to look, you'd see that it's not appropriate for upstream integration. Please don't look :)

Since we have support for Win64 as of GHC 7.6, I vote that we forget about Win32 support for SSE.

Simon, this reminds me of two other issues...

1) SSE vector values are only passed in registers on x86-64 anyway right now. MAX_REAL_FLOAT_REG and MAX_REAL_DOUBLE_REG are both #defined to 0 on x86-32 in includes/stg/MachRegs.h. Are floats and doubles not passed in registers on x86-32? I'm confused as to how this works. The GHC calling convention in LLVM certainly says they are passed in registers.

2) SSE support is processor and platform dependent. What is the proper way for the programmer to know what SSE primitives are available? A CPP define? If so, what should it be called? Right now one can look at the TARGET_* and __GLASGOW_HASKELL_LLVM__ CPP macros and make a decision as to whether or not SSE primitives are available, but that's not a great solution. Also, what happens when we want to add AVX support? How do we control the inclusion of AVX support when building GHC, and how do we let the programmer know that the AVX primops/primtypes are available for use?

Geoff
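As a point of comparison for question 2), GCC and Clang answer the same question for C code by predefining macros such as __SSE__ and __AVX__ when those instruction sets are enabled for the target. The sketch below shows that pattern only; widest_vector_bytes and vector_lanes are hypothetical helpers, and nothing here describes a macro that GHC defines.

```c
#include <assert.h>

/* Width in bytes of the widest SIMD vector the compiler was told it
   may use, based on macros that GCC and Clang predefine. */
int widest_vector_bytes(void)
{
#if defined(__AVX__)
    return 32;   /* 256-bit ymm registers */
#elif defined(__SSE__)
    return 16;   /* 128-bit xmm registers */
#else
    return 0;    /* no SSE/AVX enabled: use scalar code */
#endif
}

/* Number of elements of `elem_bytes` bytes that fit in one vector;
   0 means no vector support was enabled at compile time. */
int vector_lanes(int elem_bytes)
{
    assert(elem_bytes > 0);
    return widest_vector_bytes() / elem_bytes;
}
```

One property of this design is that adding a new instruction set only adds a new macro, so code written against the older macros keeps compiling unchanged.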