
#13570: CoreFVs patch makes n-body slower
-------------------------------------+-------------------------------------
        Reporter:  simonpj           |                Owner:  (none)
            Type:  bug               |               Status:  new
        Priority:  normal            |            Milestone:
       Component:  Compiler          |              Version:  8.0.1
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
                                     |  Unknown/Multiple
 Type of failure:  None/Unknown      |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by kavon):

 It seems the Cmm Sinker won't move the loads in (1) closer to their use in
 (3) because there is an intervening foreign call to the sqrt function in
 (2). Right now, the Sinker's analysis conservatively says that it's not
 safe to commute a memory load with a foreign call, because the call might
 write to the heap.

 Ideally, there would be a marker for foreign calls that we know are pure,
 i.e., math functions like sqrt, so that the loads can move past them. I'm
 not sure whether this marker already exists.

 In this case, the importance of turning the order into 2,1,3 via sinking is
 that fewer values are live across the call to sqrt. That call causes any
 floating-point values held in registers to be saved to the stack, negating
 any benefit of loading them so early. Here's the output of the 1,2,3
 version I'm seeing with the NCG:

 {{{
 movsd (%rax), %xmm5       ; load A into xmm5 very early
 ; ...
 movsd %xmm5, 128(%rsp)    ; save A to stack
 ; ...
 call _sqrt
 ; ...
 movsd 128(%rsp), %xmm4    ; restore A from stack
 subsd %xmm3, %xmm4        ; actually use A
 }}}

 I think 2,1,3 is always desirable in Cmm, as the instruction scheduler
 should be hiding the load latency.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13570#comment:5
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler
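
To make the proposed purity marker concrete, here is a minimal sketch of a sinking pass over a toy straight-line IR. Everything in it (`Node`, `Purity`, `canSinkPast`, `sinkLoads`, the register names) is hypothetical and chosen for illustration; it is not GHC's Cmm representation or the actual Sinker code. It also assumes that a call marked `Pure` never writes memory the load could observe.

```haskell
-- Toy model only: these types are illustrative, not GHC's Cmm data types.
type Reg = String

-- Hypothetical marker: does a foreign call possibly write to the heap?
data Purity = Pure | MayWriteHeap
  deriving (Eq, Show)

data Node
  = Load Reg Reg                  -- Load dst addr      : dst := [addr]
  | Call String Purity Reg [Reg]  -- Call f p res args  : res := f(args)
  | Arith Reg [Reg]               -- Arith dst srcs     : register-only op
  deriving (Eq, Show)

-- Registers written by a node.
defs :: Node -> [Reg]
defs (Load d _)     = [d]
defs (Call _ _ r _) = [r]
defs (Arith d _)    = [d]

-- Registers read by a node.
uses :: Node -> [Reg]
uses (Load _ a)      = [a]
uses (Call _ _ _ as) = as
uses (Arith _ ss)    = ss

-- Only calls not known to be pure are treated as possible memory writes.
mayWriteMemory :: Node -> Bool
mayWriteMemory (Call _ MayWriteHeap _ _) = True
mayWriteMemory _                         = False

-- Can the load (dst := [addr]) be moved below node n?
canSinkPast :: Reg -> Reg -> Node -> Bool
canSinkPast dst addr n =
  not (mayWriteMemory n)       -- n cannot change the loaded memory
    && dst  `notElem` uses n   -- n does not read the loaded value
    && dst  `notElem` defs n   -- n does not overwrite the destination
    && addr `notElem` defs n   -- n does not change the address register

-- Push every load down to just before the first node it conflicts with.
sinkLoads :: [Node] -> [Node]
sinkLoads = foldr step []
  where
    step (Load d a) rest =
      let (before, after) = span (canSinkPast d a) rest
      in  before ++ Load d a : after
    step n rest = n : rest

-- The 1,2,3 shape from the comment: early load, call to sqrt, use of A.
example :: Purity -> [Node]
example p =
  [ Load "A" "rax"          -- (1) load A early
  , Call "sqrt" p "r" ["x"] -- (2) foreign call
  , Arith "t" ["A", "r"]    -- (3) first use of A
  ]
```

Loaded into GHCi, `sinkLoads (example MayWriteHeap)` leaves the 1,2,3 order unchanged, while `sinkLoads (example Pure)` produces the 2,1,3 order, with the load placed directly before its first use so that A is no longer live across the call.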