#8033: add AVX register support to llvm calling convention
-------------------------------------+------------------------------------
Reporter: carter | Owner: carter
Type: task | Status: new
Priority: normal | Milestone:
Component: Compiler | Version: 7.7
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture: Unknown/Multiple
Type of failure: None/Unknown | Difficulty: Unknown
Test Case: | Blocked By:
Blocking: | Related Tickets:
-------------------------------------+------------------------------------
Comment (by carter):
@gmainland, is the SSE2 test always true on x86_64? That's my concern:
that -msse on x86_64 will be "wrong" with GHC HEAD now that we have SIMD
support, unless LLVM does the right instruction lowering / register
transfers when code tries to use SSE2 while set to -msse.
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/8033#comment:24>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler
#8033: add AVX register support to llvm calling convention
Comment (by gmainland):
I have way too much on my plate at the moment to do anything other than
answer questions, although I am happy to do that.
The hasSSE1 test is vacuously true on x86_64. So it doesn't affect code
correctness, but it is confusing.
I was wrong about needing the test at all on x86_32. We could make GHC pass
Floats and Doubles on the stack when SSE is off, pass Floats in registers
when -msse is set, and pass both Floats and Doubles (and vectors) in
registers when -msse2 is set. The test wouldn't be necessary, because if
-msse2 isn't set, GHC will simply pass Doubles on the stack and LLVM will
never see a function call with a double-precision argument.
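The scheme proposed in the paragraph above can be written down as a small
decision table. This is only a sketch: every name in it (SseLevel, ArgClass,
Location, passArg) is hypothetical and none of it is GHC's actual code
generator. It also assumes that vectors stay on the stack below -msse2, and
it ignores the limit on how many XMM registers are available.

```haskell
-- Hypothetical model of the proposed argument-passing scheme.
-- None of these names exist in GHC; they are for illustration only.

-- | Which SSE flag GHC was invoked with.
data SseLevel = NoSse | Sse1 | Sse2
  deriving (Eq, Ord, Show)

-- | The kind of floating-point or vector argument being passed.
data ArgClass = FloatArg | DoubleArg | VectorArg
  deriving (Eq, Show)

-- | Where the argument ends up.
data Location = OnStack | InXmmRegister
  deriving (Eq, Show)

-- | Where an argument would be passed under the proposed scheme.
-- Assumption: vectors are treated like Doubles below -msse2.
passArg :: SseLevel -> ArgClass -> Location
passArg NoSse _        = OnStack        -- SSE off: everything on the stack
passArg Sse1  FloatArg = InXmmRegister  -- -msse: only Floats in registers
passArg Sse1  _        = OnStack        -- -msse: Doubles and vectors spilled
passArg Sse2  _        = InXmmRegister  -- -msse2: Floats, Doubles, vectors
```

Because GHC would make this choice before handing code to LLVM, LLVM never
sees a Double register argument unless passArg returns InXmmRegister for it,
which is why the hasSSE test becomes unnecessary.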
I'm not concerned with 32-bit performance at this point. Do you have an
application that requires these changes for performance reasons? If you
do, I still don't think it's worth making changes to the 32-bit LLVM
calling conventions until we can utilize them in GHC. Simply changing the
GHC calling convention in LLVM isn't enough, even when using the LLVM
back-end.
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/8033#comment:23>
#8033: add AVX register support to llvm calling convention
Comment (by carter):
1. Changing to "hasSSE2()" will make the engineering for the various
levels of SIMD, and their impact on the calling convention, a tiny bit
simpler.
a. If we keep "hasSSE1", we'll want to spill Doubles to the stack at
-msse1, and pass the first few in registers at -msse2 or higher, which is
easy for us to support because we do our stack spills and loads *BEFORE*
LLVM. This is because SSE1 only supports operations on Floats and not
Doubles. (@gmainland, I see you replied already.)
b. Yes, as GHC's SIMD is currently implemented, this is a bug in LLVM
(or an oversight on our side?!). @gmainland, could you create a ticket
with a toy example using GHC HEAD that illustrates it as a bug? (LLVM
patches that are bug fixes for downstream tools are things we can get
considered for the 3.3 POINT release.) Having a ticket and an explicit
example that hits it would be handy!
2. The same sort of instruction selection / register problem comes up
once we consider supporting both AVX1 and AVX2!
a. AVX1 only supports 256-bit Float / Double short vector operations;
its Word / Int operations are 128-bit only.
b. AVX2 is required for Word / Int operations to be 256-bit (except
when you can encode them as the corresponding floating-point SIMD
operations). So we really need to sort out a good story for that
anyway (it is much the same as the SSE1 vs SSE2 problem: many machines
have AVX1, namely all Sandy Bridge and Ivy Bridge Intel chips, and only
some CPUs have AVX2, namely the recent Haswell generation).
3. (This should perhaps be done as a separate patch for LLVM.) Would it be
worth considering increasing the set of XMM / YMM registers used to
XMM0-7 / YMM0-7 (from the current XMM1-6)? This would be the same scope
of change as the x86_32-specific change, but would mean that roughly LLVM
3.4 onwards would only work with newer GHCs. (This is true anyway,
right? 7.6 and earlier are a bit sloppier in their generated bitcode,
so they don't play nicely with the newest LLVMs, right?)
a. This wouldn't change any caller-side code except for stack spilling,
or any callee-side code aside from a few reads from the stack, right?
Should I re-email the list to make sure other folks are in the loop? It
sounds like we're already considering changing the x86_32 calling
convention (albeit in a way compatible with both old and new GHCs), and
that's as good a time as any to start seriously considering any other
calling convention changes, even if we test them out using a patched GHC
and LLVM first. (I don't have the right hardware to run nofib for testing
such a change myself...)
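To make point 2 concrete, here is a rough sketch of the widest vector
operation available per element type at each instruction-set level, as
described above. The names (Isa, Elem, maxVectorBits) are hypothetical and
are not part of GHC or LLVM. The sketch also ignores the exception noted in
2b, where integer operations can be encoded as floating-point ones.

```haskell
-- Hypothetical model of point 2: widest native vector operation per
-- element type. These names are illustrative, not GHC or LLVM APIs.

-- | Instruction-set level, ordered from least to most capable.
data Isa = SSE2 | AVX1 | AVX2
  deriving (Eq, Ord, Show)

-- | Element type of a short vector.
data Elem = FloatElem | DoubleElem | IntElem | WordElem
  deriving (Eq, Show)

-- | True for the floating-point element types.
isFloating :: Elem -> Bool
isFloating FloatElem  = True
isFloating DoubleElem = True
isFloating _          = False

-- | Widest vector, in bits, with native operations on this element type.
maxVectorBits :: Isa -> Elem -> Int
maxVectorBits isa el
  | isa >= AVX2                  = 256  -- AVX2: 256-bit for every type
  | isa == AVX1 && isFloating el = 256  -- AVX1: 256-bit Float/Double only
  | otherwise                    = 128  -- SSE2, or Int/Word under AVX1
```

For example, maxVectorBits AVX1 IntElem is 128 while
maxVectorBits AVX1 DoubleElem is 256, which is the asymmetry any calling
convention that passes YMM registers would have to account for.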
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/8033#comment:22>
#4001: Implement an atomic readMVar
----------------------------+----------------------------------------------
Reporter: simonmar | Owner: ezyang
Type: task | Status: new
Priority: low | Milestone: 7.6.2
Component: Runtime System | Version: 6.12.2
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture: Unknown/Multiple
Type of failure: None/Unknown | Difficulty: Moderate (less than a day)
Test Case: | Blocked By:
Blocking: | Related Tickets:
----------------------------+----------------------------------------------
Comment (by ezyang):
While testing, I discovered this program spins infinitely when run through
ghci (all the compiled ways are fine):
{{{
module Main where
import GHC.MVar
import Control.Concurrent
main = do
    m <- newEmptyMVar
    sync <- newEmptyMVar
    let f = atomicReadMVar m
    t1 <- forkIO (f >> error "FAILURE")
    t2 <- forkIO (f >> putMVar sync ())
    killThread t1
    putMVar m (0 :: Int)
    atomicReadMVar sync
}}}
Does ghci have any sort of special MVar handling?
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/4001#comment:19>
#8033: add AVX register support to llvm calling convention
Comment (by gmainland):
SSE (the original) only allowed short vectors of 4 single-precision
floating point values. So yes, I think the current version is buggy, and
strictly speaking incorrect! With SSE1 (only), you cannot put Double
arguments into XMM*.
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/8033#comment:21>
#8033: add AVX register support to llvm calling convention
Comment (by carter):
So does this mean that the version LLVM has is currently buggy?
https://github.com/llvm-mirror/llvm/blob/master/lib/Target/X86/X86CallingConv.td#L289
(Note that a number of other conventions seem to use )
It uses the "hasSSE1()" check currently.
That said, most of the useful SSE features / SIMD capabilities don't
arrive until SSE2, so SSE1 alone is not terribly useful for us.
Strictly speaking, it is correct: in SSE1 there are XMM0-7, though the
only SIMD operations available then are for Floats.
SSE2 gets the Word + Double SIMD primops.
XMM8-15, by contrast, are only available in 64-bit mode (x86_64).
SSE2 started appearing in Intel and AMD chips in 2001-2003, and SSE3
landed in 2004.
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/8033#comment:20>
#8033: add AVX register support to llvm calling convention
Comment (by carter):
OK, I'll amend the patch so that the XMM and YMM stanzas are the same for
32-bit and 64-bit x86.
Current 32-bit x86 GHCs will still work correctly because they spill all
floating point values to the stack. We still need to figure out who has
the time to sync up the native code generator so that it supports
floating point args in registers too.
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/8033#comment:17>