[GHC] #8048: Register spilling produces ineffecient/highly contending code

#8048: Register spilling produces ineffecient/highly contending code -------------------------------------+------------------------------------- Reporter: schyler | Owner: Type: bug | Status: new Priority: normal | Milestone: _|_ Component: Compiler | Version: 7.6.3 Keywords: register | Operating System: Unknown/Multiple allocator spill | Type of failure: Runtime Architecture: Unknown/Multiple | performance bug Difficulty: Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | -------------------------------------+------------------------------------- The native codegen and llvm both produce ineffecient code for functions using structures with many strict fields or unboxed values. Consider the following program: {{{ {-# LANGUAGE BangPatterns #-} module Spill where import GHC.Exts data S = S !Int !Int !Int !Int !Int !Int !Int !Int !Int spill :: S -> S -> S -> S spill (S !a !b !c !d !e !f !g !h !i) (S !j !k !l !m !n !o !p !q !r) (S !s !t !u !v !w !x !y !z _) = S (a + j + s) (b + c) (k + r) (a + b + c + d + e + f + g + h + i) (j + k + l + m + n + o + p + q + r) (s + t + u + v + w + x + y + z) (a + b + c) (j + k + l) (s + t + u) }}} Parts of the code produced for this (which is identical regardless of -funbox-strict-fields) looks like: {{{ _cnc: addq $80,%r12 cmpq 144(%r13),%r12 ja _cni movq $Spill.S_con_info,-72(%r12) movq 32(%rbp),%rax movq %rax,-64(%r12) movq 24(%rbp),%rax movq %rax,-56(%r12) movq 16(%rbp),%rax movq %rax,-48(%r12) movq 8(%rbp),%rax movq %rax,-40(%r12) movq 40(%rbp),%rax movq %rax,-32(%r12) movq 48(%rbp),%rax movq %rax,-24(%r12) movq 56(%rbp),%rax movq %rax,-16(%r12) movq 64(%rbp),%rax movq %rax,-8(%r12) movq 7(%rbx),%rax movq %rax,0(%r12) leaq -71(%r12),%rbx addq $72,%rbp jmp *0(%rbp) }}} {{{ _csv: movq 63(%rbx),%rax movq %rax,-56(%rbp) movq 55(%rbx),%rax movq %rax,-48(%rbp) movq 47(%rbx),%rax movq %rax,-40(%rbp) movq 39(%rbx),%rax movq %rax,-32(%rbp) movq 31(%rbx),%rax movq %rax,-24(%rbp) movq 23(%rbx),%rax movq %rax,-16(%rbp) movq 71(%rbx),%rax movq %rax,-8(%rbp) movq 15(%rbx),%rax movq %rax,0(%rbp) }}} And likewise for LLVM: {{{ .LBB10_1: # %coZ movq 7(%rbx), %rcx movq $Spill_S_con_info, 8(%rax) movq 8(%rbp), %rdx movq %rdx, 16(%rax) movq 16(%rbp), %rdx movq %rdx, 24(%rax) movq 24(%rbp), %rdx movq %rdx, 32(%rax) movq 32(%rbp), %rdx movq %rdx, 40(%rax) movq 40(%rbp), %rdx movq %rdx, 48(%rax) movq 48(%rbp), %rdx movq %rdx, 56(%rax) movq 56(%rbp), %rdx movq %rdx, 64(%rax) movq 64(%rbp), %rdx movq %rdx, 72(%rax) movq %rcx, (%r12) movq 72(%rbp), %rax leaq 72(%rbp), %rbp leaq -71(%r12), %rbx jmpq *%rax # TAILCALL }}} Quoting from #ghc "the [register allocator] core algo is '96 vintage". Improvements are needed; * Take into consideration pipelining and handle spills less dramatically, attempting to reduce register contention * Sink memory reads in order to reduce register pressure -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8048 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#8048: Register spilling produces ineffecient/highly contending code -------------------------------------+------------------------------------- Reporter: schyler | Owner: Type: bug | Status: new Priority: normal | Milestone: _|_ Component: Compiler | Version: 7.6.3 Resolution: | Keywords: register Operating System: Unknown/Multiple | allocator spill Type of failure: Runtime | Architecture: Unknown/Multiple performance bug | Difficulty: Unknown Test Case: | Blocked By: Blocking: | Related Tickets: -------------------------------------+------------------------------------- Changes (by jstolarek): * cc: jan.stolarek@… (added) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8048#comment:1 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#8048: Register spilling produces ineffecient/highly contending code -------------------------------------+------------------------------------- Reporter: schyler | Owner: Type: bug | Status: new Priority: normal | Milestone: _|_ Component: Compiler | Version: 7.6.3 Resolution: | Keywords: register Operating System: Unknown/Multiple | allocator spill Type of failure: Runtime | Architecture: Unknown/Multiple performance bug | Difficulty: Unknown Test Case: | Blocked By: Blocking: | Related Tickets: -------------------------------------+------------------------------------- Comment (by schyler): This paper might be relevant or otherwise useful. http://ftp.cs.wisc.edu/galileo/papers/ICS88_Hsu.pdf -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8048#comment:2 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#8048: Register spilling produces ineffecient/highly contending code -------------------------------------+------------------------------------- Reporter: schyler | Owner: Type: bug | Status: new Priority: normal | Milestone: _|_ Component: Compiler | Version: 7.6.3 Resolution: | Keywords: register Operating System: Unknown/Multiple | allocator spill Type of failure: Runtime | Architecture: Unknown/Multiple performance bug | Difficulty: Unknown Test Case: | Blocked By: Blocking: | Related Tickets: -------------------------------------+------------------------------------- Comment (by carter): Looking at this, it looks like the real issue is the lack of an analogue of the llvm mem2reg optimization for the ghc stack I think... (ie what we have here is one basic block is pushing args onto the stack, then we're funcalling to another basic block that is unloading them again) (and llvm's mem2reg can't do it because GHC manages its own stack in a way llvm doesn't understand) https://github.com/mlite/HsLlvm/blob/master/src/Llvm/Pass/Mem2Reg.hs is a implementation of mem2reg using hoopl, for the llvm ir, might be worth figuring out how to make that work for cmm. (what follows is me sketching out an idea, though I or someone else will have to do some experimentation to see if it helps ) naively: what I '''think''' is going on is we have a few different basic blocks as CMM functions going on in the test code for the ```spill``` function, and we're moving between these blocks using general function calls, when we actually went to ''' "fuse" ''' the blocks together so that we don't have this busy work of doing the trivial spill to stack only to immediately unspill again. ok, now that i've thought about it out loud, might be easier to do this optimization before we go to CMM, because then we dont have to invert the spill / unspill code -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8048#comment:3 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#8048: Register spilling produces ineffecient/highly contending code -------------------------------------+------------------------------------- Reporter: schyler | Owner: Type: bug | Status: new Priority: normal | Milestone: _|_ Component: Compiler | Version: 7.6.3 Resolution: | Keywords: register Operating System: Unknown/Multiple | allocator spill Type of failure: Runtime | Architecture: Unknown/Multiple performance bug | Difficulty: Unknown Test Case: | Blocked By: Blocking: | Related Tickets: -------------------------------------+------------------------------------- Comment (by carter): dcoutts has pointed out this mem2reg work could be done nicely on the CMM side, though not sure how much is needed to make it so -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8048#comment:4 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#8048: Register spilling produces ineffecient/highly contending code -------------------------------------+------------------------------------- Reporter: schyler | Owner: Type: bug | Status: new Priority: normal | Milestone: _|_ Component: Compiler | Version: 7.6.3 Resolution: | Keywords: register Operating System: Unknown/Multiple | allocator spill Type of failure: Runtime | Architecture: Unknown/Multiple performance bug | Difficulty: Unknown Test Case: | Blocked By: Blocking: | Related Tickets: -------------------------------------+------------------------------------- Comment (by schyler): Unsure if this helps, but there's a project implementing mem2reg using Hoopl on github: https://github.com/mlite/HsLlvm -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8048#comment:5 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#8048: Register spilling produces ineffecient/highly contending code -------------------------------------+------------------------------------- Reporter: schyler | Owner: Type: bug | Status: new Priority: high | Milestone: _|_ Component: Compiler | Version: 7.6.3 Resolution: | Keywords: register Operating System: Unknown/Multiple | allocator spill Type of failure: Runtime | Architecture: Unknown/Multiple performance bug | Difficulty: Unknown Test Case: | Blocked By: Blocking: | Related Tickets: -------------------------------------+------------------------------------- Changes (by schyler): * priority: normal => high -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8048#comment:6 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#8048: Register spilling produces ineffecient/highly contending code -------------------------------------+------------------------------------- Reporter: schyler | Owner: Type: bug | Status: new Priority: normal | Milestone: _|_ Component: Compiler | Version: 7.6.3 Resolution: | Keywords: register Operating System: Unknown/Multiple | allocator spill Type of failure: Runtime | Architecture: Unknown/Multiple performance bug | Difficulty: Unknown Test Case: | Blocked By: Blocking: | Related Tickets: -------------------------------------+------------------------------------- Changes (by carter): * priority: high => normal Comment: Schyler, do you have a engineering plan of attack to resolve the problem? Do you have some example codes where you can share benchmarks on the ticket that demonstrate this actually creates a performance problem? I remember you saying you did, but also that you said later "register renaming on newer CPUs seems to make the problem go away". Is it an actual performance issue, or merely an aesthetical one? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8048#comment:7 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#8048: Register spilling produces ineffecient/highly contending code -------------------------------------+------------------------------------- Reporter: schyler | Owner: Type: bug | Status: new Priority: normal | Milestone: _|_ Component: Compiler | Version: 7.6.3 Resolution: | Keywords: register Operating System: Unknown/Multiple | allocator spill Type of failure: Runtime | Architecture: Unknown/Multiple performance bug | Difficulty: Unknown Test Case: | Blocked By: Blocking: | Related Tickets: -------------------------------------+------------------------------------- Comment (by schyler): Using a pretty crude benchmark, found a 9-10% hit. https://gist.github.com/kvanberendonck/6836713 It's hard to benchmark something so small though - those of you with Intel processors might have less luck running this one, but AMD is a little more linear and gives quite stable results. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8048#comment:8 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#8048: Register spilling produces ineffecient/highly contending code -------------------------------------+------------------------------------- Reporter: schyler | Owner: Type: bug | Status: new Priority: normal | Milestone: _|_ Component: Compiler | Version: 7.6.3 Resolution: | Keywords: register Operating System: Unknown/Multiple | allocator spill Type of failure: Runtime | Architecture: Unknown/Multiple performance bug | Difficulty: Unknown Test Case: | Blocked By: Blocking: | Related Tickets: -------------------------------------+------------------------------------- Changes (by schyler): * cc: schyler (added) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8048#comment:9 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#8048: Register spilling produces ineffecient/highly contending code -------------------------------------+------------------------------------- Reporter: schyler | Owner: Type: bug | Status: new Priority: normal | Milestone: ⊥ Component: Compiler (NCG) | Version: 7.6.3 Resolution: | Keywords: register Operating System: Unknown/Multiple | allocator spill Type of failure: Runtime | Architecture: Unknown/Multiple performance bug | Difficulty: Unknown Test Case: | Blocked By: Blocking: | Related Tickets: -------------------------------------+------------------------------------- Changes (by simonmar): * cc: simonmar (added) * component: Compiler => Compiler (NCG) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8048#comment:10 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#8048: Register spilling produces ineffecient/highly contending code -------------------------------------+------------------------------------- Reporter: schyler | Owner: Type: bug | Status: new Priority: normal | Milestone: ⊥ Component: Compiler | Version: 7.6.3 Resolution: | Keywords: register Operating System: Unknown/Multiple | allocator spill Type of failure: Runtime | Architecture: Unknown/Multiple performance bug | Difficulty: Unknown Test Case: | Blocked By: Blocking: | Related Tickets: -------------------------------------+------------------------------------- Changes (by schyler): * component: Compiler (NCG) => Compiler Comment: Actually simon, I was able to reproduce this with LLVM also and carter can confirm. Interestingly, when we fiddled with some flags like -march=native and such, llvm seemed to bandaid it somehow by using mmx where the rotation happened properly. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8048#comment:11 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#8048: Register spilling produces ineffecient/highly contending code -------------------------------------+------------------------------------- Reporter: schyler | Owner: Type: bug | Status: new Priority: normal | Milestone: ⊥ Component: Compiler | Version: 7.6.3 (CodeGen) | Keywords: register Resolution: | allocator spill Operating System: Unknown/Multiple | Architecture: Unknown/Multiple Type of failure: Runtime | Difficulty: Unknown performance bug | Blocked By: Test Case: | Related Tickets: Blocking: | -------------------------------------+------------------------------------- Changes (by simonmar): * component: Compiler => Compiler (CodeGen) Comment: Alright then, if we're being nitpicky about the Component field :-) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8048#comment:12 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#8048: Register spilling produces ineffecient/highly contending code -------------------------------------+------------------------------------- Reporter: schyler | Owner: Type: bug | Status: new Priority: normal | Milestone: ⊥ Component: Compiler | Version: 7.6.3 (CodeGen) | Keywords: register Resolution: | allocator spill Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Changes (by jstolarek): * cc: jan.stolarek@… (removed) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8048#comment:13 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#8048: Register spilling produces ineffecient/highly contending code -------------------------------------+------------------------------------- Reporter: schyler | Owner: Type: bug | Status: new Priority: normal | Milestone: ⊥ Component: Compiler | Version: 7.6.3 (CodeGen) | Keywords: register Resolution: | allocator spill Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #10012 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by bgamari): * related: => #10012 -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8048#comment:14 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#8048: Register spilling produces ineffecient/highly contending code -------------------------------------+------------------------------------- Reporter: schyler | Owner: Type: bug | Status: new Priority: normal | Milestone: ⊥ Component: Compiler | Version: 7.6.3 (CodeGen) | Keywords: register Resolution: | allocator spill Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #10012 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by tjakway): * cc: tjakway (added) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8048#comment:15 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#8048: Register spilling produces ineffecient/highly contending code -------------------------------------+------------------------------------- Reporter: schyler | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: ⊥ Component: Compiler | Version: 7.6.3 (CodeGen) | Keywords: register Resolution: | allocator spill Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #10012 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by michalt): * cc: michalt (added) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8048#comment:16 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler
participants (1)
-
GHC