[GHC] #13143: NOINLINE and worker/wrapper

#13143: NOINLINE and worker/wrapper -------------------------------------+------------------------------------- Reporter: simonpj | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.0.1 Keywords: | Operating System: Unknown/Multiple Architecture: | Type of failure: None/Unknown Unknown/Multiple | Test Case: | Blocked By: Blocking: | Related Tickets: Differential Rev(s): | Wiki Page: -------------------------------------+------------------------------------- Currently we do no worker/wrapper on a NOINLINE thing. In `WorkWrap`: {{{ tryWW dflags fam_envs is_rec fn_id rhs | isNeverActive inline_act -- No point in worker/wrappering if the thing is never inlined! -- Because the no-inline prag will prevent the wrapper ever -- being inlined at a call site. }}} But if we have, say, {{{ {-# NOINLINE f #-} f (x,y) = error (show x) g True p = f p g False p = snd p + 1 }}} then strictness analysis will discover `f` is strict, and `g`, but ''because `f` has no wrapper'', the worker for `g` will rebox the thing. So we get {{{ f (x,y) = error (show x) $wg b x y = let p = (x,y) -- Yikes! Reboxing! in case b of True -> f p False -> y + 1 g b p = case p of (x,y) -> $wg b x y }}} Now, in this case the reboxing will float into the `True` branch, an so the allocation will only happen on the error path. But it won't float inwards if there are multiple branches that call `(f p)`, so the reboxing will happen on every call of `g`. Disaster. Solution: do worker/wrapper even on NOINLINE things; but move the NOINLINE pragma to the worker. --------------------------- This actually happens! In `GHC.Arr` we have {{{ {-# NOINLINE indexError #-} indexError :: Show a => (a,a) -> a -> String -> b indexError rng i tp = error (...) index b i | inRange b i = unsafeIndex b i | otherwise = indexError b i "Char" }}} The `inRange` generates multiple alternatives, which the `indexError` is duplicated into, and exactly this phenomenon takes place. Eric (gridaphobe) offered this standalone example {{{ module Err where tabulate :: (Int -> a) -> (Int, Int) -> [Int] tabulate f (l,u) = array (l,u) [l..u] {-# INLINE array #-} array :: (Int, Int) -> [Int] -> [Int] array (l,u) is = [index (l,u) i | i <- is] {-# INLINE index #-} index :: (Int, Int) -> Int -> Int index b@(l,h) i | l <= i && i < h = 0 | otherwise = indexError b i 0 {-# NOINLINE indexError #-} indexError :: (Int, Int) -> Int -> Int -> b indexError rng i tp = error (show rng) }}} Compile this with GHC 8, and shudder at the terrible code we get for `$wtabulate`. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13143 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13143: NOINLINE and worker/wrapper -------------------------------------+------------------------------------- Reporter: simonpj | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.0.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by gridaphobe): Hi Simon, I'm a bit confused by your simplified example. On HEAD I get {{{ g = \ @ a @ p $dNum $dShow ds p1 -> case ds of { False -> + $dNum (case p1 of { (ds1, y) -> y }) (fromInteger $dNum g1); True -> f $dShow p1 } }}} No reboxing (or worker) in sight. But if I pattern match on the pair inside `g` {{{ g True (x,y) = f (x,y) g False (x,y) = x + 1 }}} I get {{{ $wg = \ @ b @ p w w1 w2 ww -> case w2 of { False -> + w ww (fromInteger w g2); True -> f w1 (ww, g1) } g = \ @ b @ p w w1 w2 w3 -> case w3 of { (ww1, ww2) -> $wg w w1 w2 ww1 } }}} which does do the needless reboxing. Unfortunately, removing the `isNeverActive` case in `tryWW` does not change the result, and glancing through `-dverbose-core2core` it doesn't appear that we end up performing W/W on `f`. It also doesn't improve the situation for the GHC.Arr example as you've written it, though interestingly it '''does''' improve over HEAD if you define `indexError` as a simple diverging loop {{{ indexError rng i tp = indexError rng i tp }}} rather than an explicit `error` call. Furthermore, even though the motivating example is not yet improved, there are some small gains in nofib {{{ -------------------------------------------------------------------------------- Program Size Allocs Runtime Elapsed TotalMem -------------------------------------------------------------------------------- Min -0.9% -0.3% -16.9% -24.7% -2.4% Max 0.0% 0.0% +14.5% +15.8% +0.9% Geometric Mean -0.1% -0.0% +0.4% -1.2% -0.0% }}} It seems we're on the right track, but there must be another missing piece to the puzzle. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13143#comment:1 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13143: NOINLINE and worker/wrapper -------------------------------------+------------------------------------- Reporter: simonpj | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.0.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): I made several errors. First, here's a better test {{{ {-# NOINLINE f #-} f :: Int -> a f x = error (show x) g :: Bool -> Bool -> Int -> Int g True True p = f p g False True p = p + 1 g b False p = g b True p }}} I've made `g` recursive to guarantee a w/w split. And I've given it a type signature to avoid the overloading. With HEAD we get {{{ $wg_s2kz [InlPrag=[0], Occ=LoopBreaker] :: Bool -> Bool -> GHC.Prim.Int# -> GHC.Prim.Int# $wg_s2kz = \ (w_s2kp :: Bool) (w_s2kq :: Bool) (ww_s2ku :: GHC.Prim.Int#) -> case w_s2kp of { False -> case w_s2kq of { False -> $wg_s2kz GHC.Types.False GHC.Types.True ww_s2ku; True -> GHC.Prim.+# ww_s2ku 1# }; True -> case w_s2kq of { False -> $wg_s2kz GHC.Types.True GHC.Types.True ww_s2ku; True -> case f @ Int (GHC.Types.I# ww_s2ku) of wild_00 { } } } }}} Note the re-boxing of the argument `(I# ww_s2ku)`, in the call to `f`. Second, here is a patch that does the job {{{ diff --git a/compiler/coreSyn/CoreUnfold.hs b/compiler/coreSyn/CoreUnfold.hs index f7e4265..efae22c 100644 --- a/compiler/coreSyn/CoreUnfold.hs +++ b/compiler/coreSyn/CoreUnfold.hs @@ -54,7 +54,7 @@ import DataCon import Literal import PrimOp import IdInfo -import BasicTypes ( Arity ) +import BasicTypes ( Arity, InlineSpec(..), inlinePragmaSpec ) import Type import PrelNames import TysPrim ( realWorldStatePrimTy ) @@ -997,6 +997,9 @@ certainlyWillInline dflags fn_info -- See Note [certainlyWillInline: INLINABLE] do_cunf expr (UnfIfGoodArgs { ug_size = size, ug_args = args }) | not (null args) -- See Note [certainlyWillInline: be careful of thunks] + , case inlinePragmaSpec (inlinePragInfo fn_info) of + NoInline -> False -- NOINLINE; do not say certainlyWillInline! + _ -> True -- INLINE, INLINABLE, or nothing , let arity = length args , size - (10 * (arity + 1)) <= ufUseThreshold dflags = Just (fn_unf { uf_src = InlineStable diff --git a/compiler/stranal/WorkWrap.hs b/compiler/stranal/WorkWrap.hs index d50bb22..9a0ccc5 100644 --- a/compiler/stranal/WorkWrap.hs +++ b/compiler/stranal/WorkWrap.hs @@ -283,12 +283,6 @@ tryWW :: DynFlags -- if two, then a worker and a -- wrapper. tryWW dflags fam_envs is_rec fn_id rhs - | isNeverActive inline_act - -- No point in worker/wrappering if the thing is never inlined! - -- Because the no-inline prag will prevent the wrapper ever - -- being inlined at a call site. - = return [ (new_fn_id, rhs) ] - | Just stable_unf <- certainlyWillInline dflags fn_info = return [ (fn_id `setIdUnfolding` stable_unf, rhs) ] -- See Note [Don't w/w INLINE things] @@ -305,7 +299,6 @@ tryWW dflags fam_envs is_rec fn_id rhs where fn_info = idInfo fn_id - inline_act = inlinePragmaActivation (inlinePragInfo fn_info) (wrap_dmds, res_info) = splitStrictSig (strictnessInfo fn_info) new_fn_id = zapIdUsedOnceInfo (zapIdUsageEnvInfo fn_id) }}} There are two components * Remove the `isNeverActtive` branch in `tryWW`, as described above * Make `certainlyWillInline` return `False` for any NOINLINE thing. I hadn't realised that this would be necessary but it's obviously true that a NOINLINE thing shouldn't reply `True` to `certainlyWillInline`. Now we get no re-boxing in the example. Would you like to try that? Simon -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13143#comment:2 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13143: NOINLINE and worker/wrapper -------------------------------------+------------------------------------- Reporter: simonpj | Owner: Type: bug | Status: patch Priority: normal | Milestone: Component: Compiler | Version: 8.0.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): D3046 Wiki Page: | -------------------------------------+------------------------------------- Changes (by gridaphobe): * status: new => patch * differential: => D3046 -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13143#comment:3 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13143: NOINLINE and worker/wrapper
-------------------------------------+-------------------------------------
Reporter: simonpj | Owner:
Type: bug | Status: closed
Priority: normal | Milestone: 8.2.1
Component: Compiler | Version: 8.0.1
Resolution: fixed | Keywords:
Operating System: Unknown/Multiple | Architecture:
| Unknown/Multiple
Type of failure: None/Unknown | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s): D3046
Wiki Page: |
-------------------------------------+-------------------------------------
Changes (by bgamari):
* status: patch => closed
* resolution: => fixed
* milestone: => 8.2.1
Comment:
This was committed as,
{{{
commit b572aadb20c2e41e2f6d7b48401bd0b4239ce9f8
Author: Eric Seidel

#13143: NOINLINE and worker/wrapper -------------------------------------+------------------------------------- Reporter: simonpj | Owner: Type: bug | Status: closed Priority: normal | Milestone: 8.2.1 Component: Compiler | Version: 8.0.1 Resolution: fixed | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): D3046 Wiki Page: | -------------------------------------+------------------------------------- Comment (by nomeata): Hmm. I guess I am a bit late to the party, but I feel uneasy about this. On some level, to me, `NOINLINE` means “Dear compiler, in all uses of this, please treat this as a black box with precisely the interface I give here.” Allowing W/W moves a part of what the function does (e.g. taking a constructor apart) to the uses. I guess the test case did not come up with anything, but does this interfere with rules? A rule might mention the `NOINLINE` thing, but aftetr W/W, the worker remains, and a rule will no longer fire. `NOINLINE` is also very useful to prevent the compiler from optimizing test cases in our test suite too much (although I see that that should not be a driving factor). -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13143#comment:5 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13143: NOINLINE and worker/wrapper -------------------------------------+------------------------------------- Reporter: simonpj | Owner: Type: bug | Status: closed Priority: normal | Milestone: 8.2.1 Component: Compiler | Version: 8.0.1 Resolution: fixed | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): D3046 Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj):
I guess the test suite did not come up with anything, but does this interfere with rules? A rule might mention the NOINLINE thing, but aftetr W/W, the worker remains, and a rule will no longer fire.
That's a worry for all functions; but the wrapper always get an inline activation that allows it to inline only in the last phase. See `Note [Wrapper activation]` in `WorkWrap`. What you say has merit; but it was bad bad bad before we made this change. If there's actually a problem with this new story, let's see it. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13143#comment:6 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13143: NOINLINE and worker/wrapper -------------------------------------+------------------------------------- Reporter: simonpj | Owner: Type: bug | Status: closed Priority: normal | Milestone: 8.2.1 Component: Compiler | Version: 8.0.1 Resolution: fixed | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): D3046 Wiki Page: | -------------------------------------+------------------------------------- Comment (by darchon): As A GHC API user, I recognise my "primitives" by their name, so it is absolutely critical that they do not get inlined. If this new behaviour ever becomes a problem for me, would anyone be opposed to putting this behaviour behind a dynflag which is enabled by default whenever the W/W transformation is enabled, but can be disabled by GHC API users like me? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13143#comment:7 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13143: NOINLINE and worker/wrapper -------------------------------------+------------------------------------- Reporter: simonpj | Owner: Type: bug | Status: closed Priority: normal | Milestone: 8.2.1 Component: Compiler | Version: 8.0.1 Resolution: fixed | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): D3046 Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): darchon: The new behaviour here is only active when w/w happens, so what you want is already the case. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13143#comment:8 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler
participants (1)
-
GHC