[Git][ghc/ghc][wip/T26722] Add evals for strict data-con args in worker-functions
Simon Peyton Jones pushed to branch wip/T26722 at Glasgow Haskell Compiler / GHC Commits: ee290e4b by Simon Peyton Jones at 2026-01-06T17:41:33+00:00 Add evals for strict data-con args in worker-functions This fixes #26722, by adding an eval in a worker for arguments of strict data constructors, even if the function body uses them strictly. See (WIS1) in Note [Which Ids should be strictified] - - - - - 10 changed files: - compiler/GHC/Core/Opt/SpecConstr.hs - compiler/GHC/Core/Tidy.hs - compiler/GHC/Core/Utils.hs - compiler/GHC/Stg/EnforceEpt.hs - compiler/GHC/StgToCmm/Closure.hs - compiler/GHC/StgToCmm/Expr.hs - hadrian/src/Flavour.hs - + testsuite/tests/simplCore/should_compile/T26722.hs - + testsuite/tests/simplCore/should_compile/T26722.stderr - testsuite/tests/simplCore/should_compile/all.T Changes: ===================================== compiler/GHC/Core/Opt/SpecConstr.hs ===================================== @@ -2062,7 +2062,7 @@ mkSeqs seqees res_ty rhs = addEval :: Var -> CoreExpr -> CoreExpr addEval arg_id rhs -- Argument representing strict field and it's worth passing via cbv - | shouldStrictifyIdForCbv arg_id + | wantCbvForId arg_id = Case (Var arg_id) (localiseId arg_id) -- See (SCF1) in Note [SpecConstr and strict fields] res_ty ===================================== compiler/GHC/Core/Tidy.hs ===================================== @@ -38,7 +38,7 @@ import GHC.Utils.Outputable import GHC.Types.RepType (typePrimRep) import GHC.Utils.Panic import GHC.Types.Basic (isMarkedCbv, CbvMark (..)) -import GHC.Core.Utils (shouldUseCbvForId) +import GHC.Core.Utils ( wantCbvForId ) {- ************************************************************************ @@ -197,7 +197,7 @@ computeCbvInfo fun_id rhs isSimplePrimRep _ = False mkMark arg - | not $ shouldUseCbvForId arg = NotMarkedCbv + | not $ wantCbvForId arg = NotMarkedCbv -- We can only safely use cbv for strict arguments | (isStrUsedDmd (idDemandInfo arg)) , not (isDeadEndId fun_id) = MarkedCbv ===================================== compiler/GHC/Core/Utils.hs ===================================== @@ -58,7 +58,7 @@ module GHC.Core.Utils ( isJoinBind, -- * Tag inference - mkStrictFieldSeqs, shouldStrictifyIdForCbv, shouldUseCbvForId, + mkStrictFieldSeqs, wantCbvForId, -- * unsafeEqualityProof isUnsafeEqualityCase, @@ -2902,18 +2902,24 @@ Here comes the tricky part: If we make $wloop strict in both x/y and we get: }; end Rec } -Here both x and y are known to be tagged in the function body since we pass strict worker args using unlifted cbv. -This means the seqs on x and y both become no-ops and compared to the first version the seq on `y` disappears at runtime. - -The downside is that the caller of $wfoo potentially has to evaluate `y` once if we can't prove it isn't already evaluated. -But y coming out of a strict field is in WHNF so safe to evaluated. And most of the time it will be properly tagged+evaluated -already at the call site because of the EPT Invariant! See Note [EPT enforcement] for more in this. -This makes GHC itself around 1% faster despite doing slightly more work! So this is generally quite good. - -We only apply this when we think there is a benefit in doing so however. There are a number of cases in which -it would be useless to insert an extra seq. ShouldStrictifyIdForCbv tries to identify these to avoid churn in the +Here both x and y are known to be tagged in the function body since we pass +strict worker args using unlifted cbv. This means the seqs on x and y both +become no-ops (via (EPT-codegen) in Not [EPT enforcement]) and, compared to the +first version, the seq on `y` disappears at runtime. + +The downside is that the caller of $wfoo potentially has to evaluate `y` once if +we can't prove it isn't already evaluated. But y coming out of a strict field +is in WHNF so safe to evaluated. And most of the time it will be properly +tagged+evaluated already at the call site because of the EPT Invariant! See Note +[EPT enforcement] for more in this. This makes GHC itself around 1% faster +despite doing slightly more work! So this is generally quite good. + +We only apply this when we think there is a benefit in doing so however. There +are a number of cases in which it would be useless to insert an extra +seq. `wantCbvForId` tries to identify these to avoid churn in the simplifier. See Note [Which Ids should be strictified] for details on this. -} + mkStrictFieldSeqs :: [(Id,StrictnessMark)] -> CoreExpr -> (CoreExpr) mkStrictFieldSeqs args rhs = foldr addEval rhs args @@ -2923,7 +2929,7 @@ mkStrictFieldSeqs args rhs = addEval (arg_id,arg_cbv) (rhs) -- Argument representing strict field. | isMarkedStrict arg_cbv - , shouldStrictifyIdForCbv arg_id + , wantCbvForId arg_id -- Make sure to remove unfoldings here to avoid the simplifier dropping those for OtherCon[] unfoldings. = Case (Var $! zapIdUnfolding arg_id) arg_id case_ty ([Alt DEFAULT [] rhs]) -- Normal argument @@ -2943,87 +2949,99 @@ There are multiple reasons why we might not want to insert a seq in the rhs to strictify a functions argument: 1) The argument doesn't exist at runtime. - -For zero width types (like Types) there is no benefit as we don't operate on them -at runtime at all. This includes things like void#, coercions and state tokens. + For zero width types (like Types) there is no benefit as we don't operate on them + at runtime at all. This includes things like void#, coercions and state tokens. 2) The argument is a unlifted type. - -If the argument is a unlifted type the calling convention already is explicitly -cbv. This means inserting a seq on this argument wouldn't do anything as the seq -would be a no-op *and* it wouldn't affect the calling convention. + If the argument is a unlifted type the calling convention already is explicitly + cbv. This means inserting a seq on this argument wouldn't do anything as the seq + would be a no-op *and* it wouldn't affect the calling convention. 3) The argument is absent. + If the argument is absent in the body there is no advantage to it being passed as + cbv to the function. The function won't ever look at it so we don't save any work. -If the argument is absent in the body there is no advantage to it being passed as -cbv to the function. The function won't ever look at it so we don't safe any work. - -This mostly happens for join point. For example we might have: - - data T = MkT ![Int] [Char] - f t = case t of MkT xs{strict} ys-> snd (xs,ys) - -and abstract the case alternative to: + This mostly happens for join points. For example we might have: - f t = join j1 = \xs ys -> snd (xs,ys) - in case t of MkT xs{strict} ys-> j1 xs xy + data T = MkT ![Int] [Char] + f t = case t of MkT xs{strict} ys-> snd (xs,ys) -While we "use" xs inside `j1` it's not used inside the function `snd` we pass it to. -In short a absent demand means neither our RHS, nor any function we pass the argument -to will inspect it. So there is no work to be saved by forcing `xs` early. + and abstract the case alternative to: -NB: There is an edge case where if we rebox we *can* end up seqing an absent value. -Note [Absent fillers] has an example of this. However this is so rare it's not worth -caring about here. + f t = join j1 = \xs ys -> snd (xs,ys) + in case t of MkT xs{strict} ys-> j1 xs xy -4) The argument is already strict. + While we "use" xs inside `j1` it's not used inside the function `snd` we pass it to. + In short a absent demand means neither our RHS, nor any function we pass the argument + to will inspect it. So there is no work to be saved by forcing `xs` early. -Consider this code: - - data T = MkT ![Int] - f t = case t of MkT xs{strict} -> reverse xs - -The `xs{strict}` indicates that `xs` is used strictly by the `reverse xs`. -If we do a w/w split, and add the extra eval on `xs`, we'll get - - $wf xs = - case xs of xs1 -> - let t = MkT xs1 in - case t of MkT xs2 -> reverse xs2 - -That's not wrong; but the w/w body will simplify to - - $wf xs = case xs of xs1 -> reverse xs1 - -and now we'll drop the `case xs` because `xs1` is used strictly in its scope. -Adding that eval was a waste of time. So don't add it for strictly-demanded Ids. + NB: There is an edge case where if we rebox we *can* end up seqing an absent value. + Note [Absent fillers] has an example of this. However this is so rare it's not worth + caring about here. 5) Functions - -Functions are tricky (see Note [TagInfo of functions] in EnforceEpt). -But the gist of it even if we make a higher order function argument strict -we can't avoid the tag check when it's used later in the body. -So there is no benefit. + Functions are tricky (see Note [TagInfo of functions] in EnforceEpt). + But the gist of it even if we make a higher order function argument strict + we can't avoid the tag check when it's used later in the body. + So there is no benefit. + +Wrinkles: + +(WIS1) You might have thought that we can omit the eval if the argument is used + strictly demanded in the body. But you'd be wrong. Consider this code: + data T = MkT ![Int] + f t = case t of MkT xs{Dmd=STR} -> reverse xs + + The `xs{Dmd=STR}` indicates that `xs` is used strictly by the `reverse xs`. + If we do a w/w split, and add the extra eval on `xs`, we'll get + $wf xs = case xs of xs1 -> + let t = MkT xs1 in + case t of MkT xs2 -> reverse xs2 + + That's not wrong; but you might wonder if the eval on `xs` is needed + when it is certainly evaluated by the `reverse`. But yes, it is (#26722): + g s True t = f s t t + g s False t = g s True t + + f True (MkT xs) t = f False (MkT xs) t + f False (MkT xs) _ = xs + + After worker/wrapper we get: + g s b t = case t of MkT ww -> $wg s b ww + $wg s ds ww = case ds of { + False -> case ww of wg { __DEFAULT -> Bar.$wg s True wg } + True -> let { t1 = MkT ww } in f s t1 t1 } + + We must make `f` inline inside `$wg`, because `f` too is ww'd, and we + don't want to rebox `t1` before passing it to `f`. BUT while `t1` + looks like a HNF, `exprIsHNF` will say False because `MkT` is strict + and `ww` isn't evaluated. So `f` doesn't inline and we get lots of + reboxing. + + The Right Thing to to is to add the eval for the data con argument: + $wg s ds ww = case ww of ww' { DEFAULT -> + case ds of { + False -> case ww of wg { __DEFAULT -> Bar.$wg s True wg } + True -> let { t1 = MkT ww' } in f s t1 t1 } } + + Now `t1` will be a HNF, and `f` will inline, and we get + $wg s ds ww = case ww of ww' { DEFAULT -> + case ds of { + False -> Bar.$wg s True ww' + True -> $wf s ww' + + (Ultimately `$wg` will be a CBV function, so that `case ww` will be a + no-op: see (EPT-codegen) in Note [EPT enforcement] in GHC.Stg.EnforceEpt.) -} --- | Do we expect there to be any benefit if we make this var strict --- in order for it to get treated as as cbv argument? --- See Note [Which Ids should be strictified] --- See Note [CBV Function Ids] for more background. -shouldStrictifyIdForCbv :: Var -> Bool -shouldStrictifyIdForCbv = wantCbvForId False - --- Like shouldStrictifyIdForCbv but also wants to use cbv for strict args. -shouldUseCbvForId :: Var -> Bool -shouldUseCbvForId = wantCbvForId True -- When we strictify we want to skip strict args otherwise the logic is the same --- as for shouldUseCbvForId so we common up the logic here. +-- as for wantCbvForId so we common up the logic here. -- Basically returns true if it would be beneficial for runtime to pass this argument -- as CBV independent of weither or not it's correct. E.g. it might return true for lazy args -- we are not allowed to force. -wantCbvForId :: Bool -> Var -> Bool -wantCbvForId cbv_for_strict v +wantCbvForId :: Var -> Bool +wantCbvForId v -- Must be a runtime var. -- See Note [Which Ids should be strictified] point 1) | isId v @@ -3037,9 +3055,6 @@ wantCbvForId cbv_for_strict v , not $ isFunTy ty -- If the var is strict already a seq is redundant. -- See Note [Which Ids should be strictified] point 4) - , not (isStrictDmd dmd) || cbv_for_strict - -- If the var is absent a seq is almost always useless. - -- See Note [Which Ids should be strictified] point 3) , not (isAbsDmd dmd) = True | otherwise ===================================== compiler/GHC/Stg/EnforceEpt.hs ===================================== @@ -140,9 +140,11 @@ Afterwards, the *EPT rewriter* inserts the actual evals realising Upcasts. Implementation -------------- -* EPT analysis is implemented in GHC.Stg.EnforceEpt.inferTags. +(EPT-anal) EPT analysis is implemented in `GHC.Stg.EnforceEpt.inferTags.` It attaches its result to /binders/, not occurrence sites. -* The EPT rewriter establishes the EPT invariant by inserting evals. That is, if + +(EPT-rewrite) The EPT rewriter, `GHC.Stg.EnforceEpt.Rewrite.rewriteTopBinds`, + establishes the EPT invariant by inserting evals. That is, if (a) a binder x is used to * construct a strict field (`SP x y`), or * passed as a CBV argument (`$wf x`), @@ -152,17 +154,27 @@ Implementation case x of x' { __ DEFAULT -> SP x' y }. case x of x' { __ DEFAULT -> $wf x' }. (Recall that the case binder x' is always EPT.) - This is implemented in GHC.Stg.EnforceEpt.Rewrite.rewriteTopBinds. + This pass also propagates the EPTness from binders to occurrences. + It is sound to insert evals on strict fields (Note [Strict fields in Core]), and on CBV arguments as well (Note [CBV Function Ids]). -* We also export the EPTness of top level bindings to allow this optimisation + +(EPT-codegen) Finally, code generation for (case x of alts) skips the thunk check + when `x` is EPT. This is done (a bit indirectly) thus: + * GHC.StgToCmm.Expr.cgCase: builds a `sequel`, and recurses into `cgExpr` on `x`. + * When `cgExpr` sees a `x` goes to `cgIdApp`, which uses `getCallMethod`. + * Then `getCallMethod` sees that `x` is EPT (via `idTagSigMaybe`), and + returns `InferredReturnIt`. + * Now `cgIdApp` can jump straight to the case-alternative switch in the `sequel` + constructed by `cgCase`. + +(EPT-export) We also export the EPTness of top level bindings to allow this optimisation to work across module boundaries. + NB: The EPT Invariant *must* be upheld, regardless of the optimisation level; hence EPTness is practically part of the internal ABI of a strict data constructor or CBV function. Note [CBV Function Ids] contains the details. -* Finally, code generation skips the thunk check when branching on binders that - are EPT. This is done by `cgExpr`/`cgCase` in the backend. Evaluation ---------- ===================================== compiler/GHC/StgToCmm/Closure.hs ===================================== @@ -617,12 +617,15 @@ getCallMethod cfg name id (LFThunk _ _ updatable std_form_info is_fun) getCallMethod cfg name id (LFUnknown might_be_a_function) n_args _cg_locs _self_loop_info | n_args == 0 , Just sig <- idTagSig_maybe id - , isTaggedSig sig -- Infered to be already evaluated by EPT analysis - -- When profiling we must enter all potential functions to make sure we update the SCC - -- even if the function itself is already evaluated. + , isTaggedSig sig -- This `id` is evaluated and properly tagged; no need to enter it + -- See (EPT-codegen) in Note [EPT enforcement] in GHC.Stg.EnforceEpt + + -- When profiling we must enter all potential functions to make sure we update + -- the SCC even if the function itself is already evaluated. -- See Note [Evaluating functions with profiling] in rts/Apply.cmm , not (profileIsProfiling (stgToCmmProfile cfg) && might_be_a_function) - = InferedReturnIt -- See Note [EPT enforcement] + + = InferedReturnIt -- See (EPT-codegen) in Note [EPT enforcement] | might_be_a_function = SlowCall ===================================== compiler/GHC/StgToCmm/Expr.hs ===================================== @@ -1053,6 +1053,7 @@ cgIdApp fun_id args = do | otherwise -> emitReturn [fun] -- A value infered to be in WHNF, so we can just return it. + -- See (EPT-codegen) in Note [EPT enforcement] in GHC.Stg.EnforceEpt InferedReturnIt | isZeroBitTy (idType fun_id) -> trace >> emitReturn [] | otherwise -> trace >> assertTag >> ===================================== hadrian/src/Flavour.hs ===================================== @@ -70,7 +70,8 @@ flavourTransformers = M.fromList , "fully_static" =: fullyStatic , "host_fully_static" =: hostFullyStatic , "collect_timings" =: collectTimings - , "assertions" =: enableAssertions + , "assertions" =: enableAssertions Stage2 + , "assertions_stage1" =: enableAssertions Stage1 , "debug_ghc" =: debugGhc Stage2 , "debug_stage1_ghc" =: debugGhc Stage1 , "lint" =: enableLinting @@ -394,11 +395,11 @@ enableLateCCS = addArgs ? arg "-fprof-late" -- | Enable assertions for the stage2 compiler -enableAssertions :: Flavour -> Flavour -enableAssertions flav = flav { ghcDebugAssertions = f } +enableAssertions :: Stage -> Flavour -> Flavour +enableAssertions stage flav = flav { ghcDebugAssertions = f } where - f Stage2 = True - f st = ghcDebugAssertions flav st + f s | s == stage = True + | otherwise = ghcDebugAssertions flav s -- | Build the stage3 compiler using the non-moving GC. enableBootNonmovingGc :: Flavour -> Flavour ===================================== testsuite/tests/simplCore/should_compile/T26722.hs ===================================== @@ -0,0 +1,9 @@ +module T26722 where + +data T = MkT ![Int] + +g s True t = f s t t +g s False t = g s True t + +f True (MkT xs) t = f False (MkT xs) t +f False (MkT xs) _ = xs ===================================== testsuite/tests/simplCore/should_compile/T26722.stderr ===================================== @@ -0,0 +1 @@ + \ No newline at end of file ===================================== testsuite/tests/simplCore/should_compile/all.T ===================================== @@ -563,3 +563,6 @@ test('T26115', [grep_errmsg(r'DFun')], compile, ['-O -ddump-simpl -dsuppress-uni test('T26116', normal, compile, ['-O -ddump-rules']) test('T26117', [grep_errmsg(r'==')], compile, ['-O -ddump-simpl -dsuppress-uniques']) test('T26349', normal, compile, ['-O -ddump-rules']) + +# T26722: there should be no reboxing in $wg +test('T26722', [grep_errmsg(r'SPEC')], compile, ['-O -dno-typeable-binds']) View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/ee290e4bfc327e60346f927f618cce42... -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/ee290e4bfc327e60346f927f618cce42... You're receiving this email because of your account on gitlab.haskell.org.
participants (1)
-
Simon Peyton Jones (@simonpj)