[GHC] #15056: Wrappers inlined too late

#15056: Wrappers inlined too late -------------------------------------+------------------------------------- Reporter: simonpj | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: 8.2.2 Keywords: | Operating System: Unknown/Multiple Architecture: | Type of failure: None/Unknown Unknown/Multiple | Test Case: | Blocked By: Blocking: | Related Tickets: Differential Rev(s): | Wiki Page: -------------------------------------+------------------------------------- Consider these two modules {{{ module Foo where test :: Int -> Bool test 0 = True test n = test (n-1) foo :: Foldable t => Int -> t Int -> Int foo n xs | test n = foldr (+) n xs | otherwise = n+7 }}} and {{{ module Bar where import Foo blam x = f 3 [1..x] }}} If we simply inline `foo` into `blam`, we get `foldr/build` fusion. And that is exactly what happens if you compile `Foo` with `-fno-strictness`. But what actually happens is * When compiling `Foo`, strictness analysis does worker/wrapper for `foo`. * The wrapper currently gets an aciviation of "active only in phase 0"; see `Note [Wrapper activation]` in `WorkWrap`. * So `foo` is ultimately inlined (actually its worker is inlined too) but too late for fusion to take place. This is bad: optimising `Foo` makes the overall runtime increase. I have seen other examples of this. The general pattern is: * Without worker/wrapper, a function `f` may inline, which exposes opportunities for functions in its body to participate in rule rewrites * With worker/wrapper, no inlining happens until phase 0; and that is too late for the rules to fire. Obvious thought: allow wrappers from strictness analysis to inline earlier. I'll try that. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15056 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15056: Wrappers inlined too late -------------------------------------+------------------------------------- Reporter: simonpj | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): Another example comes from `nofib/real/veritas`. In `Edlib.hs` we have {{{ (...) :: Xio_fn -> Xio_fn -> Xin -> Xio (...) f g xin = f xin \\\ g }}} and in `X_interface` we get {{{ x_send_argL argL = x_send x_multi_arg_esc_seq ... x_send [toEnum (length argL)] ... app ( map x_send_arg argL ) ... x_send x_end_arg_esc_seq }}} It turns out that `...` gets w/w'd (because it is CPR'd in fact), and that means it doesn't inline until phase 0. If we inline in phase 2 we get lots of fold/build fusion, which leads to a whopping 25% improvement in the allocation of the veritas benchmark. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15056#comment:1 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15056: Wrappers inlined too late
-------------------------------------+-------------------------------------
Reporter: simonpj | Owner: (none)
Type: bug | Status: new
Priority: normal | Milestone: 8.6.1
Component: Compiler | Version: 8.2.2
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture:
| Unknown/Multiple
Type of failure: None/Unknown | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by Simon Peyton Jones

#15056: Wrappers inlined too late -------------------------------------+------------------------------------- Reporter: simonpj | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): The `mate` regression was very delicate. In Move.moveDetailsFor, we inline {{{ bd = Board.rmPieceAt c_a25I sq_a2np bd_a25J }}} into two branches of a case (to hopefully avoid allocating it). I think this is postInlineUnconditionally. But then it is floated out and has to be laboriously CSE'd again! (Idea: this aggressive inlining could be done rather late, and perhaps be a late float-in pass. The only gain is less alloc.) However, more delicately we end up with this, where `rPA1` is short for `rmPieceAt1`: $j1 (rPA1 x) --inline-> let v = rPA1 in ...x...x.. $j2 (rPA1 x) --no inline-> $j2 (rPA1 x) In the first case `(rPA1 x)` gets `RhsCtxt`, and hence is inlined (not boring). But in the second, the context is boring, so not inlined. Then both these expressions are floated out, but they do not get CSEd togehter becuase they look different (one has been inlined and the other hasn't). So then we get {{{ lvl1 = ...x..x.. lvl2 = rPA1 x }}} And now the `rPA1 x` is in an `RhsCtxt` and hence is inlined, giving two identical expressions. But it's too late for CSE. (Indeed, CSE is done immediately after float-out, but before any simplification. All very delicate, and not the fault of this patch. Makes me wonder if we should should really distinguish between `BoringCtxt` and `RhsCtxt`. Idea: just try collapsing them. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15056#comment:3 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15056: Wrappers inlined too late -------------------------------------+------------------------------------- Reporter: simonpj | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): The `fulsom` regression was in `Csg.calc`, which has nested join points. Several of them end up like this: {{{ join j (x :: Double) = Board (case x of { D# y -> ... }) ... in ....(j (D# v)).... }}} So the call to $j$ ends up with an explicit `(D# v)` argument, and `SpecConstr` should catch it. But alas it does not because this happens nestedly, and `SpecConstr` is worried about exponential blowup. Why is this a regression? Because something got inlined, that made a join point a bit bigger, which mean that the ''join point'' wasn't inlined. A classic inline-cascade problem. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15056#comment:4 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15056: Wrappers inlined too late -------------------------------------+------------------------------------- Reporter: simonpj | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: 8.8.1 Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by bgamari): * milestone: 8.6.1 => 8.8.1 Comment: It looks like you intend to continue pondering the regressions, simonpj, so I'll leave this open. However, it's unlikely much else will happen for 8.6 so I'm bumping the milestone. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15056#comment:5 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15056: Wrappers inlined too late -------------------------------------+------------------------------------- Reporter: simonpj | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: 8.10.1 Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Simplifier Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by simonpj): * keywords: => Simplifier -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15056#comment:7 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler
participants (1)
-
GHC