Re: simd branch ready for review

31 Jan 2013

On 01/31/2013 07:10 PM, David Terei wrote:
...
On 31 January 2013 09:52, Geoffrey Mainland  wrote:
...
On 01/31/2013 12:56 PM, Simon Marlow wrote:
...
On 31/01/13 11:38, Geoffrey Mainland wrote:
...
* Win32 issues
Modern 32-bit x86 *NIX systems align the stack to 16 bytes, but Win32
aligns only to 4 bytes. LLVM does not assume 16-byte stack
alignment. Instead, on platforms where 16-byte stack alignment is not
guaranteed, it 1) always outputs a function prologue that 2) aligns
the stack to a 16-byte boundary with an "and" instruction, and it
also 3) disables tail calls. Because LLVM aligns the stack for a
function that has SSE register spills, it also generates movaps
instructions (aligned SSE moves) for the spills.
I must be misunderstanding your use of "always" above, because that
would imply that the LLVM backend doesn't work on Win32 at all. Maybe
LLVM only aligns the stack when it needs to store SSE values?
You are correct---the stack-aligning prologue is only added by LLVM when
SSE values are written to the stack, so this wasn't a problem before we
had SSE support.
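
To make the alignment step concrete, here is a minimal sketch in C of the masking that an "and" on the stack pointer performs, and of the 16-byte condition that movaps relies on. The function names are hypothetical and chosen for illustration; this is not code from LLVM or GHC.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: round addr down to a multiple of align.
   align must be a power of two. Because the stack grows downwards,
   rounding down is what an "and" on the stack pointer achieves. */
static uintptr_t align_down(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return addr & ~(align - 1);
}

/* Hypothetical helper: movaps needs a 16-byte-aligned memory operand,
   whereas movups accepts any address. */
static int is_aligned_16(uintptr_t addr)
{
    return (addr & (uintptr_t)15) == 0;
}
```

With only the 4-byte alignment that Win32 guarantees, an address such as 0x1004 fails is_aligned_16, while align_down(0x1004, 16) gives 0x1000, which passes.
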
...
...
This makes SSE support on Win32 difficult, and in my opinion not
worth worrying about.
The alternative is to 1) patch LLVM to disable the stack-alignment
code so that we recover the ability to use tail calls and so that ebp
is not scribbled over by the prologue and 2) patch the mangler to rewrite
LLVM's movaps (move aligned) instructions to movups (move unaligned)
instructions. I have these patches, but they are not included in the
simd branch.
I don't have an opinion here - maybe ask David T what he'd prefer.
Requiring an LLVM hack seems pretty bad, and David yelled when I changed
the mangler since he wants to get rid of it eventually. My patches are
still around, so if we decide Win32 support is important, I can always
add the changes.
Not supporting Win32 sucks but yes, I want to move to just requiring
LLVM un-patched and no mangler. How ugly are the patches for LLVM? I'd
be supportive of it if the plan is to get them merged upstream.
Otherwise, I don't think it is worth the effort of having to carry
around our own patched LLVM for installation on windows.
The patch against LLVM 3.0 is here:

https://github.com/mainland/ghc-simd-tests/blob/master/patches/llvm-3.0.patc...

If you were to look, you'd see that it's not appropriate for upstream
integration. Please don't look :)

Since we have support for Win64 as of GHC 7.6, I vote that we forget
about Win32 support for SSE.

Simon, this reminds me of two other issues...

1) SSE vector values are only passed in registers on x86-64 anyway right
now. MAX_REAL_FLOAT_REG and MAX_REAL_DOUBLE_REG are both #defined to 0
on x86-32 in includes/stg/MachRegs.h. Are floats and doubles not passed
in registers on x86-32? I'm confused as to how this works. The GHC
calling convention in LLVM certainly says they are passed in registers.

2) SSE support is processor and platform dependent. What is the proper
way for the programmer to know what SSE primitives are available? A CPP
define? If so, what should it be called?

Right now one can look at the TARGET_* and __GLASGOW_HASKELL_LLVM__ CPP
macros and make a decision as to whether or not SSE primitives are
available, but that's not a great solution. Also, what happens when we
want to add AVX support? How do we control the inclusion of AVX support
when building GHC, and how do we let the programmer know that the AVX
primops/primtypes are available for use?
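
For comparison only, here is a sketch of how the same question looks in C, where GCC and Clang predefine __SSE__, __SSE2__ and __AVX__ when the corresponding instruction set is enabled for the target. The enum and function names below are hypothetical, and nothing here is an existing or proposed GHC macro.

```c
#include <assert.h>

/* Hypothetical names for this sketch; not GHC identifiers. */
enum simd_level { SIMD_NONE = 0, SIMD_SSE = 1, SIMD_SSE2 = 2, SIMD_AVX = 3 };

/* Report the widest instruction set this translation unit was compiled
   for, based on the compiler's predefined macros. */
static enum simd_level compiled_simd_level(void)
{
#if defined(__AVX__)
    return SIMD_AVX;
#elif defined(__SSE2__)
    return SIMD_SSE2;
#elif defined(__SSE__)
    return SIMD_SSE;
#else
    return SIMD_NONE;
#endif
}

/* Human-readable name for a level. */
static const char *simd_level_name(enum simd_level level)
{
    assert(level >= SIMD_NONE && level <= SIMD_AVX);
    switch (level) {
    case SIMD_AVX:  return "avx";
    case SIMD_SSE2: return "sse2";
    case SIMD_SSE:  return "sse";
    default:        return "none";
    }
}
```

The decision is made at compile time, so code that depends on wider vectors can be guarded with the same conditionals and left out entirely on targets that lack them.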

Geoff