- ghc-tickets - Haskell.org

Re: [GHC] #8033: add AVX register support to llvm calling convention
by GHC 09 Jul '13


#8033: add AVX register support to llvm calling convention
-------------------------------------+------------------------------------
        Reporter:  carter            |        Owner:  carter
            Type:  task              |       Status:  new
        Priority:  normal            |    Milestone:
       Component:  Compiler          |      Version:  7.7
      Resolution:                    |     Keywords:
Operating System:  Unknown/Multiple  | Architecture:  Unknown/Multiple
 Type of failure:  None/Unknown      |   Difficulty:  Unknown
       Test Case:                    |   Blocked By:
        Blocking:                    |  Related Tickets:
-------------------------------------+------------------------------------

Comment (by carter):

 @gmainland, is the sse2 test always true on x86_64?

 That's my concern: that -msse on x86_64 will be "wrong" with GHC head now that we have SIMD support, unless LLVM does the right instruction lowering / register transfers when trying to use sse2 while set to -msse.

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/8033#comment:24>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler


Re: [GHC] #8033: add AVX register support to llvm calling convention
by GHC 09 Jul '13


Comment (by gmainland):

 I have way too much on my plate at the moment to do anything other than answer questions, although I am happy to do that.

 The hasSSE1 test is vacuously true on x86_64. So it doesn't affect code correctness, but it is confusing.

 I was wrong about needing the test at all on x32. We could make GHC pass Floats and Doubles on the stack when SSE is off, pass Floats in registers when -msse is set, and pass both Floats and Doubles (and vectors) in registers when -msse2 is set. The test wouldn't be necessary, because if -msse2 isn't set, GHC will simply pass Doubles on the stack and LLVM will never see a function call with a double-precision argument.

 I'm not concerned with 32-bit performance at this point. Do you have an application that requires these changes for performance reasons? If you do, I still don't think it's worth making changes to the 32-bit LLVM calling conventions until we can utilize them in GHC. Simply changing the GHC calling convention in LLVM isn't enough, even when using the LLVM back-end.

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/8033#comment:23>
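The passing policy proposed in this comment can be sketched as a small decision function. This is only an illustrative model, not GHC or LLVM code: the names `SseLevel`, `ArgKind`, `Location` and `passArg` are hypothetical, 128-bit vectors are assumed to need the same level as Doubles, and running out of registers is ignored.

```haskell
-- Hypothetical model of the proposed 32-bit argument-passing policy.
-- None of these names exist in GHC or LLVM.

-- | SSE level selected on the command line (none, -msse, -msse2).
data SseLevel = NoSSE | SSE1 | SSE2
  deriving (Eq, Ord, Show)

-- | Kinds of floating-point argument the code generator may see.
data ArgKind = FloatArg | DoubleArg | Vec128Arg
  deriving (Eq, Show)

-- | Where an argument ends up.
data Location = OnStack | InXmmRegister
  deriving (Eq, Show)

-- | Floats need at least SSE1 to travel in an XMM register.
-- Doubles need SSE2; treating 128-bit vectors the same way is an assumption.
passArg :: SseLevel -> ArgKind -> Location
passArg level kind
  | level >= required kind = InXmmRegister
  | otherwise              = OnStack
  where
    required FloatArg  = SSE1
    required DoubleArg = SSE2
    required Vec128Arg = SSE2
```

Under this model `passArg SSE1 DoubleArg` evaluates to `OnStack`, which is why LLVM would never see a double-precision register argument unless -msse2 is set.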


Re: [GHC] #8033: add AVX register support to llvm calling convention
by GHC 09 Jul '13


Comment (by carter):

 1. Changing to "hasSSE2()" will make the engineering for the various levels of SIMD, and their impact on the calling convention, a teeny bit simpler.
   a. If we keep "hasSSE1", we'll want to spill Doubles to the stack at -msse, and pass the first few in registers at -msse2 or higher, which is easy for us to support because we do our stack spills and loads *BEFORE* llvm. This is because SSE1 only supports operations on Floats and not Doubles. (@gmainland, I see you replied already.)
   b. Yes, as GHC's SIMD is currently implemented, this is a bug in LLVM (or an oversight on our side?!). @gmainland, could you create a ticket with a toy example using GHC head that illustrates it as a bug? (LLVM patches that are bug fixes for downstream tools are things we can get considered for the 3.3 POINT release.) Having a ticket and an explicit example that hits it would be handy!

 2. The same sort of instruction selection / register problem happens once we consider supporting both AVX1 and AVX2!
   a. AVX1 only supports 256bit Float / Double short vector operations; its Word / Int operations are 128bit only.
   b. AVX2 is required for Word / Int operations to be 256bit (except when you can encode them as the corresponding floating point SIMD operations).

 So we really need to sort out a good story here anyway (which is the same as, or similar to, the sse1 vs sse2 problem: many machines have AVX1, i.e. all Sandy Bridge and Ivy Bridge Intel chips, and only some CPUs have AVX2, i.e. the recent Haswell generation).

 3. (This perhaps should be done as a different patch for llvm.) Would it be worth considering augmenting the set of XMM / YMM registers used to be '''XMM0-7/YMM0-7''' (from the current XMM1-6)? This would be the same scope of change as the x86_32 specific change, but would mean that roughly LLVM 3.4 onwards would only work with newer GHCs. (This is true anyway, right? 7.6 and earlier are a bit sloppier in their generated bitcode, so they don't play nice with the newest LLVMs, right?)
   a. This wouldn't change any caller side code except for stack spilling, nor any callee side code aside from a few reads from the stack, right?

 Should I re-email the list to make sure other folks are in the loop? It sounds like we're already considering changing the x86_32 calling convention (albeit in a way compatible with both old and new GHCs), and that's as good a time as any to start seriously considering any other calling convention changes, even if we test them out using a patched GHC and LLVM first. (I don't have the right hardware to run NOFIB for testing such a change myself...)

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/8033#comment:22>
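The AVX1 versus AVX2 distinction in point 2 can be written down as a lookup of the widest vector each level handles per element kind. This is a minimal sketch with hypothetical names (`SimdLevel`, `ElemKind`, `maxVectorBits`); it deliberately ignores the exception mentioned above, where an integer operation can be encoded as the corresponding floating-point operation.

```haskell
-- Hypothetical model of the vector widths discussed in point 2.
-- These names are illustrative and are not part of GHC or LLVM.

-- | SIMD level assumed to be available on the target.
data SimdLevel = LevelSSE2 | LevelAVX1 | LevelAVX2
  deriving (Eq, Ord, Show)

-- | Element kind of a short vector: Float/Double versus Word/Int.
data ElemKind = FloatingElem | IntegralElem
  deriving (Eq, Show)

-- | Widest vector, in bits, that has native operations at a given level.
maxVectorBits :: SimdLevel -> ElemKind -> Int
maxVectorBits level kind = case (level, kind) of
  (LevelAVX2, _)            -> 256  -- AVX2 widens Word/Int operations to 256 bits
  (LevelAVX1, FloatingElem) -> 256  -- AVX1 has 256-bit Float/Double operations
  (LevelAVX1, IntegralElem) -> 128  -- AVX1 Word/Int operations stay at 128 bits
  (LevelSSE2, _)            -> 128
```

A code generator using such a table would have to pick the vector type per element kind, not just per target level, which is the same shape of problem as sse1 versus sse2.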


Re: [GHC] #4001: Implement an atomic readMVar
by GHC 09 Jul '13


#4001: Implement an atomic readMVar
-------------------------------------+------------------------------------
        Reporter:  simonmar          |        Owner:  ezyang
            Type:  task              |       Status:  new
        Priority:  low               |    Milestone:  7.6.2
       Component:  Runtime System    |      Version:  6.12.2
      Resolution:                    |     Keywords:
Operating System:  Unknown/Multiple  | Architecture:  Unknown/Multiple
 Type of failure:  None/Unknown      |   Difficulty:  Moderate (less than a day)
       Test Case:                    |   Blocked By:
        Blocking:                    |  Related Tickets:
-------------------------------------+------------------------------------

Comment (by ezyang):

 Ah, found the bug...

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/4001#comment:20>


Re: [GHC] #4001: Implement an atomic readMVar
by GHC 09 Jul '13


Comment (by ezyang):

 While testing, I discovered this program spins infinitely when run through ghci (all the compiled ways are fine):

 {{{
 module Main where

 import GHC.MVar
 import Control.Concurrent

 main = do
     m <- newEmptyMVar
     sync <- newEmptyMVar
     let f = atomicReadMVar m
     t1 <- forkIO (f >> error "FAILURE")
     t2 <- forkIO (f >> putMVar sync ())
     killThread t1
     putMVar m (0 :: Int)
     atomicReadMVar sync
 }}}

 Does ghci have any sort of special MVar handling?

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/4001#comment:19>


Re: [GHC] #8033: add AVX register support to llvm calling convention
by GHC 09 Jul '13


Comment (by gmainland):

 SSE (the original) only allowed short vectors of 4 single-precision floating point values. So yes, I think the current version is buggy, and strictly speaking incorrect! With SSE1 (only), you cannot put Double arguments into XMM*.

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/8033#comment:21>


Re: [GHC] #8033: add AVX register support to llvm calling convention
by GHC 09 Jul '13


Comment (by carter):

 So does this mean that the version LLVM has is currently buggy?
 https://github.com/llvm-mirror/llvm/blob/master/lib/Target/X86/X86CallingConv.td#L289
 (Note that a number of other conventions seem to use )

 It uses the "hasSSE1()" check currently. That said, most of the useful SSE features / simd capabilities don't happen till sse2... so it is not terribly useful for us.

 Strictly speaking, it is correct: in SSE1 there are XMM0-7, though the only simd operations available then are for Floats. SSE2 gets the Word + Double simd primops. Likewise XMM8-15 start being available in sse3.

 sse2 started being in intel and amd chips 2001-2003, and sse3 landed in 2004.

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/8033#comment:20>


Re: [GHC] #8033: add AVX register support to llvm calling convention
by GHC 09 Jul '13


Comment (by gmainland):

 Shouldn't that be hasSSE2, not hasSSE1?

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/8033#comment:19>


Re: [GHC] #8033: add AVX register support to llvm calling convention
by GHC 09 Jul '13


Comment (by carter):

 {{{
 // Pass in STG registers for floats, doubles and 128bit simd vectors
 CCIfType<[f32, f64, v16i8, v8i16, v4i32, v2i64, v4f32, v2f64],
          CCIfSubtarget<"hasSSE1()",
          CCAssignToReg<[XMM1, XMM2, XMM3, XMM4, XMM5, XMM6]>>>,

 // Pass in STG registers for 256bit simd vectors
 CCIfType<[v32i8, v16i16, v8i32, v4i64, v8f32, v4f64],
          CCIfSubtarget<"hasAVX()",
          CCAssignToReg<[YMM0, YMM1, YMM2, YMM3, YMM4, YMM5, YMM6]>>>
 }}}

 will be added to both stanzas then

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/8033#comment:18>
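As a rough illustration of what a `CCAssignToReg` list means for a run of arguments of one class, the sketch below hands out the listed registers in order and sends the remaining arguments to the stack. It is a simplified model written for this thread, not LLVM's implementation: `Slot`, `xmmArgRegs` and `assignSlots` are hypothetical names, and the stack slot numbering is invented for the example.

```haskell
-- Simplified, hypothetical model of in-order register assignment.
-- It only mirrors the register list quoted in the stanza above.

-- | Where an argument is placed: a named register or a numbered stack slot.
data Slot = Reg String | Stack Int
  deriving (Eq, Show)

-- | The XMM registers listed in the 128-bit stanza.
xmmArgRegs :: [String]
xmmArgRegs = ["XMM1", "XMM2", "XMM3", "XMM4", "XMM5", "XMM6"]

-- | Place @n@ arguments: registers first, in list order, then stack slots
-- numbered from 0 (the numbering is an assumption of this sketch).
assignSlots :: [String] -> Int -> [Slot]
assignSlots regs n = take n (map Reg regs ++ map Stack [0 ..])
```

With the six registers above, `assignSlots xmmArgRegs 8` places the first six arguments in XMM1 to XMM6 and the last two in stack slots 0 and 1.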


Re: [GHC] #8033: add AVX register support to llvm calling convention
by GHC 09 Jul '13


Comment (by carter):

 Ok, I'll amend the patch so that the XMM and YMM stanzas are the same for 32bit and 64bit x86.

 Current 32bit x86 GHCs will still work correctly because they spill all the floating point values to the stack, and I guess we'll figure out, manpower-wise, how to do the work of syncing up the native code gen to support floating point args too.

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/8033#comment:17>
