[Git][ghc/ghc][master] 2 commits: Add evals for strict data-con args in worker-functions
Marge Bot pushed to branch master at Glasgow Haskell Compiler / GHC

Commits:

c56567ec by Simon Peyton Jones at 2026-01-15T23:19:04+00:00
Add evals for strict data-con args in worker-functions

This fixes #26722, by adding an eval in a worker for
arguments of strict data constructors, even if the function
body uses them strictly.  See (WIS1) in
Note [Which Ids should be strictified]

I took the opportunity to make substantial improvements in the
documentation for call-by-value functions.  See especially
  Note [CBV Function Ids: overview] in GHC.Types.Id.Info
  Note [Which Ids should be CBV candidates?] ditto
  Note [EPT enforcement] in GHC.Stg.EnforceEpt
among others.

- - - - -
9719ce5d by Simon Peyton Jones at 2026-01-15T23:19:04+00:00
Improve `interestingArg`

This function analyses a function's argument to see if it is
interesting enough to deserve an inlining discount.  Improvements
for
  * LitRubbish arguments
  * exprIsExpandable arguments

See Note [Interesting arguments] which is substantially rewritten.
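
Editorial illustration (not part of the commits): the first commit message describes adding an eval in a worker for an argument that came out of a strict constructor field. The sketch below is a hand-written, source-level analogue of that idea, using the `f`/`g` example quoted in (WIS1) later in this diff. GHC performs the real transformation on Core; the names `gWrapper` and `gWorker` are hypothetical, and the `seq` only stands in for the extra `case ww of ww' -> ...` eval. It shows that the split preserves results, not that it improves inlining.

```haskell
module Main (main) where

-- A type with one strict field, as in the (WIS1) example.
data T = MkT ![Int]

-- Source-level functions, transcribed from the Note's example.
f :: Bool -> T -> T -> [Int]
f True  (MkT xs) t = f False (MkT xs) t
f False (MkT xs) _ = xs

g :: Bool -> Bool -> T -> [Int]
g s True  t = f s t t
g s False t = g s True t

-- Hypothetical hand-written wrapper: unpack the strict field and
-- pass it to the worker unboxed.
gWrapper :: Bool -> Bool -> T -> [Int]
gWrapper s b (MkT ww) = gWorker s b ww

-- Hypothetical hand-written worker. The `seq` models the extra eval:
-- once `ww` is known to be evaluated, reboxing it as `MkT ww` is a value.
gWorker :: Bool -> Bool -> [Int] -> [Int]
gWorker s ds ww = ww `seq` case ds of
  False -> gWorker s True ww
  True  -> let t1 = MkT ww in f s t1 t1

main :: IO ()
main = do
  let t      = MkT [1, 2, 3]
      inputs = [(s, b) | s <- [False, True], b <- [False, True]]
      agree  = and [g s b t == gWrapper s b t | (s, b) <- inputs]
  print (g True False t)
  putStrLn (if agree then "wrapper/worker agree with g" else "MISMATCH")
```

Because `ww` was taken from a strict field, the `seq` never does real work here; per the Notes in this diff, the code generator is expected to drop the corresponding eval once the worker is a CBV function.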
- - - - -


20 changed files:

- compiler/GHC/Core/Opt/Arity.hs
- compiler/GHC/Core/Opt/Simplify/Utils.hs
- compiler/GHC/Core/Opt/SpecConstr.hs
- compiler/GHC/Core/Opt/WorkWrap.hs
- compiler/GHC/Core/Opt/WorkWrap/Utils.hs
- compiler/GHC/Core/Tidy.hs
- compiler/GHC/Core/Unfold.hs
- compiler/GHC/Core/Utils.hs
- compiler/GHC/CoreToStg/Prep.hs
- compiler/GHC/Stg/EnforceEpt.hs
- compiler/GHC/Stg/Lint.hs
- compiler/GHC/StgToCmm/Closure.hs
- compiler/GHC/StgToCmm/Expr.hs
- compiler/GHC/Types/Id.hs
- compiler/GHC/Types/Id/Info.hs
- compiler/GHC/Types/Id/Make.hs
- testsuite/tests/simplCore/should_compile/T18013.stderr
- + testsuite/tests/simplCore/should_compile/T26722.hs
- + testsuite/tests/simplCore/should_compile/T26722.stderr
- testsuite/tests/simplCore/should_compile/all.T


Changes:

=====================================
compiler/GHC/Core/Opt/Arity.hs
=====================================
@@ -2515,7 +2515,7 @@ eta-reduce that are specific to Core and GHC:
     See Note [Eta expanding primops].

 (W) We may not undersaturate StrictWorkerIds.
-    See Note [CBV Function Ids] in GHC.Types.Id.Info.
+    See Note [CBV Function Ids: overview] in GHC.Types.Id.Info.

 Here is a list of historic accidents surrounding unsound eta-reduction:

@@ -2848,7 +2848,7 @@ cantEtaReduceFun fun
     || (isJust (idCbvMarks_maybe fun)) -- (W)
        -- Don't undersaturate StrictWorkerIds.
-       -- See Note [CBV Function Ids] in GHC.Types.Id.Info.
+       -- See Note [CBV Function Ids: overview] in GHC.Types.Id.Info.

 {- *********************************************************************


=====================================
compiler/GHC/Core/Opt/Simplify/Utils.hs
=====================================
@@ -982,15 +982,58 @@ But we don't regard (f x y) as interesting, unless f is unsaturated.
 If it's saturated and f hasn't inlined, then it's probably not going
 to now!

-Note [Conlike is interesting]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-Consider
-        f d = ...((*) d x y)...
-        ... f (df d')...
-where df is con-like.
Then we'd really like to inline 'f' so that the
-rule for (*) (df d) can fire. To do this
-  a) we give a discount for being an argument of a class-op (eg (*) d)
-  b) we say that a con-like argument (eg (df d)) is interesting
+Wrinkles:
+
+(IA1) Conlike is interesting.
+   Consider
+        f d = ...((*) d x y)...
+        ... f (df d')...
+   where df is con-like. Then we'd really like to inline 'f' so that the
+   rule for (*) (df d) can fire. To do this
+     a) we give a discount for being an argument of a class-op (eg (*) d)
+     b) we say that a con-like argument (eg (df d)) is interesting
+
+(IA2) OtherCon.
+   interestingArg returns
+     (a) NonTrivArg for an arg with an OtherCon [] unfolding
+     (b) ValueArg for an arg with an OtherCon [c1,c2..] unfolding.
+
+   Reason for (a): I found (in the GHC.Internal.Bignum.Integer module) that I was
+   inlining a pretty big function when all we knew was that its arguments
+   were evaluated, nothing more. That in turn made the enclosing function
+   too big to inline elsewhere.
+
+   Reason for (b): we want to inline integerCompare here
+       integerLt# :: Integer -> Integer -> Bool#
+       integerLt# (IS x) (IS y) = x <# y
+       integerLt# x y | LT <- integerCompare x y = 1#
+       integerLt# _ _ = 0#
+
+(IA3) Rubbish literals.
+   In a worker we might see
+       $wfoo x = let y = RUBBISH in
+                 ...(g y True)...
+   where `g` has a wrapper that discards its first argument. We really really
+   want to inline g's wrapper, to expose that it discards its RUBBISH arg.
+   That may not happen if RUBBISH looks like TrivArg, so we use NonTrivArg
+   instead. See #26722. (This reverses the plan in #20035, but the problem
+   reported there appears to have gone away.)
+
+(IA4) Consider a call `f (g x)`. If `f` has an argument discount on its argument,
+   then f's body scrutinises its argument in a `case` expression, or perhaps applies
+   it. We give the arg `(g x)` an ArgSummary of `NonTrivArg` so that `f` has a bit
+   of encouragement to inline in these cases.
+
+   Now consider `let y = g x in f y`.
Now we have to look through y's unfolding.
+   When should we do so? Suppose we did inline `f` so we ended up with
+       let y = g x in ...(case y of alts)...
+   Then we'll call `exprIsConApp_maybe` on `y`; and that looks through "expandable"
+   unfoldings; indeed that's the whole purpose of `exprIsExpandable`. See
+   Note [exprIsExpandable] in GHC.Core.Utils.
+
+   Conclusion: `interestingArg` should give some encouragement (NonTrivArg) to `f`
+   when the argument is expandable. Hence `uf_expandable` in the `Var` case.
+
 -}

 interestingArg :: SimplEnv -> CoreExpr -> ArgSummary
@@ -1005,7 +1048,7 @@ interestingArg env e = go env 0 e
            ContEx tvs cvs ids e -> go (setSubstEnv env tvs cvs ids) n e

     go _ _ (Lit l)
-       | isLitRubbish l = TrivArg -- Leads to unproductive inlining in WWRec, #20035
+       | isLitRubbish l = NonTrivArg -- See (IA3) in Note [Interesting arguments]
        | otherwise = ValueArg
     go _ _ (Type _) = TrivArg
     go _ _ (Coercion _) = TrivArg
@@ -1027,45 +1070,29 @@ interestingArg env e = go env 0 e
     go_var n v
        | isConLikeId v = ValueArg -- Experimenting with 'conlike' rather than
                                   -- data constructors here
-                                  -- DFuns are con-like; see Note [Conlike is interesting]
+                                  -- DFuns are con-like;
+                                  -- see (IA1) in Note [Interesting arguments]
        | idArity v > n = ValueArg -- Catches (eg) primops with arity but no unfolding
        | n > 0 = NonTrivArg -- Saturated or unknown call
        | otherwise -- n==0, no value arguments; look for an interesting unfolding
        = case idUnfolding v of
            OtherCon [] -> NonTrivArg -- It's evaluated, but that's all we know
            OtherCon _ -> ValueArg -- Evaluated and we know it isn't these constructors
-              -- See Note [OtherCon and interestingArg]
+              -- See (IA2) in Note [Interesting arguments]
            DFunUnfolding {} -> ValueArg -- We know that idArity=0
            CoreUnfolding{ uf_cache = cache }
              | uf_is_conlike cache -> ValueArg -- Includes constructor applications
-             | uf_is_value cache -> NonTrivArg -- Things like partial applications
+             | uf_expandable cache -> NonTrivArg -- See (IA4)
              | otherwise ->
TrivArg
            BootUnfolding -> TrivArg
            NoUnfolding -> TrivArg

-{- Note [OtherCon and interestingArg]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-interstingArg returns
-   (a) NonTrivArg for an arg with an OtherCon [] unfolding
-   (b) ValueArg for an arg with an OtherCon [c1,c2..] unfolding.
-
-Reason for (a): I found (in the GHC.Internal.Bignum.Integer module) that I was
-inlining a pretty big function when all we knew was that its arguments
-were evaluated, nothing more. That in turn make the enclosing function
-too big to inline elsewhere.
-
-Reason for (b): we want to inline integerCompare here
-   integerLt# :: Integer -> Integer -> Bool#
-   integerLt# (IS x) (IS y) = x <# y
-   integerLt# x y | LT <- integerCompare x y = 1#
-   integerLt# _ _ = 0#
-************************************************************************
+{- *********************************************************************
 * *
 SimplMode
 * *
-************************************************************************
--}
+********************************************************************* -}

 updModeForStableUnfoldings :: ActivationGhc -> SimplMode -> SimplMode
 -- See Note [The environments of the Simplify pass]


=====================================
compiler/GHC/Core/Opt/SpecConstr.hs
=====================================
@@ -1994,7 +1994,7 @@ spec_one env fn arg_bndrs body (call_pat, rule_number)
         spec_arity = count isId spec_lam_args
         spec_join_arity | isJoinId fn = JoinPoint (length spec_call_args)
                         | otherwise = NotJoinPoint
-        spec_id = asWorkerLikeId $
+        spec_id = setCbvCandidate $
                   mkLocalId spec_name ManyTy spec_id_ty
                   -- See Note [Transfer strictness]
                   `setIdDmdSig` spec_sig
@@ -2065,7 +2065,7 @@ mkSeqs seqees res_ty rhs =
 addEval :: Var -> CoreExpr -> CoreExpr
 addEval arg_id rhs
   -- Argument representing strict field and it's worth passing via cbv
-  | shouldStrictifyIdForCbv arg_id
+  | wantCbvForId arg_id
   = Case (Var arg_id)
          (localiseId arg_id) -- See (SCF1) in Note [SpecConstr and strict fields]
          res_ty
=====================================
compiler/GHC/Core/Opt/WorkWrap.hs
=====================================
@@ -848,7 +848,7 @@ mkWWBindPair ww_opts fn_id fn_info fn_args fn_body work_uniq div
         -- worker is join point iff wrapper is join point
         -- (see Note [Don't w/w join points for CPR])

-    work_id = asWorkerLikeId $
+    work_id = setCbvCandidate $
               mkWorkerId work_uniq fn_id (exprType work_rhs)
                 `setIdOccInfo` occInfo fn_info
                         -- Copy over occurrence info from parent


=====================================
compiler/GHC/Core/Opt/WorkWrap/Utils.hs
=====================================
@@ -914,23 +914,26 @@ C) Unlift *any* (non-boot exported) functions arguments if they are strict.
       an impedance matcher function. Leading to massive code bloat.
       Essentially we end up creating a impromptu wrapper function
       wherever we wouldn't inline the wrapper with a W/W approach.
-    ~ There is the option of achieving this without eta-expansion if we instead expand
-      the partial application code to check for demands on the calling convention and
-      for it to evaluate the arguments. The main downsides there would be the complexity
-      of the implementation and that it carries a certain overhead even for functions who
-      don't take advantage of this functionality. I haven't tried this approach because it's
-      not trivial to implement and doing W/W splits seems to work well enough.
-
-Currently we use the first approach A) by default, with a flag that allows users to fall back to the
-more aggressive approach B).
-
-I also tried the third approach C) using eta-expansion at call sites to avoid modifying the PAP-handling
-code which wasn't fruitful. See https://gitlab.haskell.org/ghc/ghc/-/merge_requests/5614#note_389903.
-We could still try to do C) in the future by having PAP calls which will evaluate the required arguments
-before calling the partially applied function. But this would be neither a small nor simple change so we
-stick with A) and a flag for B) for now.
-
-See also Note [EPT enforcement] and Note [CBV Function Ids]
+    ~ There is the option of achieving this without eta-expansion if we instead
+      expand the partial application code to check for demands on the calling
+      convention and for it to evaluate the arguments. The main downsides there
+      would be the complexity of the implementation and that it carries a
+      certain overhead even for functions who don't take advantage of this
+      functionality. I haven't tried this approach because it's not trivial to
+      implement and doing W/W splits seems to work well enough.
+
+Currently we use the first approach A) by default, with a flag that allows users
+to fall back to the more aggressive approach B).
+
+I also tried the third approach C) using eta-expansion at call sites to avoid
+modifying the PAP-handling code which wasn't fruitful. See
+https://gitlab.haskell.org/ghc/ghc/-/merge_requests/5614#note_389903. We could
+still try to do C) in the future by having PAP calls which will evaluate the
+required arguments before calling the partially applied function. But this would
+be neither a small nor simple change so we stick with A) and a flag for B) for
+now.
+
+See also Note [EPT enforcement] and Note [CBV Function Ids: overview]

 Note [Worker/wrapper for strict arguments]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -954,7 +957,7 @@ an "eval" (see `GHC.StgToCmm.Expr.cgCase`).
 A call (f (a:as)) will have the wrapper inlined, and will drop the `case x`,
 so no eval happens at all.

-The worker `$wf` is a CBV function (see `Note [CBV Function Ids]`
+The worker `$wf` is a CBV function (see `Note [CBV Function Ids: overview]`
 in GHC.Types.Id.Info) and the code generator guarantees that every
 call to `$wf` has a properly tagged argument (see `GHC.Stg.EnforceEpt.Rewrite`).
@@ -1044,7 +1047,7 @@ mkWWstr_one opts arg str_mark
   = DontUnbox

   | isStrictDmd arg_dmd || isMarkedStrict str_mark
-  , wwUseForUnlifting opts -- See Note [CBV Function Ids]
+  , wwUseForUnlifting opts -- See Note [WW for calling convention]
   , not (isFunTy arg_ty)
   , not (isUnliftedType arg_ty) -- Already unlifted!
       -- NB: function arguments have a fixed RuntimeRep,
@@ -1311,12 +1314,13 @@ Needless to say, there are some wrinkles:
      But that also means we emit a rubbish lit for other args that have
      cardinality 'C_10' (say, the arg to a bottoming function) where we could've
      used an error-thunk.
-     NB from Andreas: But I think using an error thunk there would be dodgy no matter what
-     for example if we decide to pass the argument to the bottoming function cbv.
-     As we might do if the function in question is a worker.
-     See Note [CBV Function Ids] in GHC.Types.Id.Info. So I just left the strictness check
-     in place on top of threading through the marks from the constructor. It's a *really* cheap
-     and easy check to make anyway.
+
+     NB from Andreas: But I think using an error thunk there would be dodgy no
+     matter what for example if we decide to pass the argument to the bottoming
+     function cbv. As we might do if the function in question is a worker. See
+     Note [CBV Function Ids: overview] in GHC.Types.Id.Info. So I just left the
+     strictness check in place on top of threading through the marks from the
+     constructor. It's a *really* cheap and easy check to make anyway.

 (AF3) We can only emit a LitRubbish if the arg's type `arg_ty` is mono-rep, e.g.
      of the form `TYPE rep` where `rep` is not (and doesn't contain) a variable.
=====================================
compiler/GHC/Core/Tidy.hs
=====================================
@@ -38,7 +38,7 @@ import GHC.Utils.Outputable
 import GHC.Types.RepType (typePrimRep)
 import GHC.Utils.Panic
 import GHC.Types.Basic (isMarkedCbv, CbvMark (..))
-import GHC.Core.Utils (shouldUseCbvForId)
+import GHC.Core.Utils ( wantCbvForId )

 {-
 ************************************************************************
@@ -70,52 +70,52 @@ tidyBind env (Rec prs)
     (env', Rec (zip bndrs' rhss'))


--- Note [Attaching CBV Marks to ids]
--- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
--- See Note [CBV Function Ids] for the *why*.
--- Before tidy, we turn all worker functions into worker like ids.
--- This way we can later tell if we can assume the existence of a wrapper. This also applies to
--- specialized versions of functions generated by SpecConstr for which we, in a sense,
--- consider the unspecialized version to be the wrapper.
--- During tidy we take the demands on the arguments for these ids and compute
--- CBV (call-by-value) semantics for each individual argument.
--- The marks themselves then are put onto the function id itself.
--- This means the code generator can get the full calling convention by only looking at the function
--- itself without having to inspect the RHS.
---
--- The actual logic is in computeCbvInfo and takes:
--- * The function id
--- * The functions rhs
--- And gives us back the function annotated with the marks.
--- We call it in:
--- * tidyTopPair for top level bindings
--- * tidyBind for local bindings.
---
--- Not that we *have* to look at the untidied rhs.
--- During tidying some knot-tying occurs which can blow up
--- if we look at the post-tidy types of the arguments here.
--- However we only care if the types are unlifted and that doesn't change during tidy.
--- so we can just look at the untidied types.
---
--- If the id is boot-exported we don't use a cbv calling convention via marks,
--- as the boot file won't contain them.
Which means code calling boot-exported
--- ids might expect these ids to have a vanilla calling convention even if we
--- determine a different one here.
--- To be able to avoid this we pass a set of boot exported ids for this module around.
--- For non top level ids we can skip this. Local ids are never boot-exported
--- as boot files don't have unfoldings. So there this isn't a concern.
--- See also Note [CBV Function Ids]
-
-
--- See Note [CBV Function Ids]
+{- Note [Attaching CBV Marks to ids]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+See Note [CBV Function Ids: overview] for what is happening here.
+
+During tidy we take the demands on the arguments for any CBV-candidates and
+compute CBV (call-by-value) semantics for each individual argument. The marks
+themselves then are put onto the function id itself. This means the code
+generator can get the full calling convention by only looking at the function
+itself without having to inspect the RHS.
+
+The actual logic is in computeCbvInfo and takes:
+  * The function id
+  * The functions rhs
+And gives us back the function annotated with the marks. We call it in:
+  * tidyTopPair for top level bindings
+  * tidyBind for local bindings.
+
+Wrinkles
+
+(ACBV1) Note that we *have* to look at the untidied rhs. During tidying some
+  knot-tying occurs which can blow up if we look at the post-tidy types of the
+  arguments here. However we only care if the types are unlifted and that
+  doesn't change during tidy. so we can just look at the untidied types.
+
+(ACBV2) If the id is boot-exported we don't use a cbv calling convention via
+  marks, as the boot file won't contain them. Which means code calling
+  boot-exported ids might expect these ids to have a vanilla calling convention
+  even if we determine a different one here.
+
+  To be able to avoid this we pass a set of boot exported ids for this module
+  around. For non top level ids we can skip this. Local ids are never
+  boot-exported as boot files don't have unfoldings.
So there this isn't a
+  concern. See also Note [CBV Function Ids: overview]
+-}
+
 tidyCbvInfoTop :: HasDebugCallStack => NameSet -> Id -> CoreExpr -> Id
+-- See Note [CBV Function Ids: overview]
 tidyCbvInfoTop boot_exports id rhs
-  -- Can't change calling convention for boot exported things
-  | elemNameSet (idName id) boot_exports = id
-  | otherwise = computeCbvInfo id rhs
+  | elemNameSet (idName id) boot_exports
+  = id -- Can't change calling convention for boot exported things
+       -- See (ACBV2) in Note [Attaching CBV Marks to ids]
+  | otherwise
+  = computeCbvInfo id rhs

--- See Note [CBV Function Ids]
 tidyCbvInfoLocal :: HasDebugCallStack => Id -> CoreExpr -> Id
+-- See Note [CBV Function Ids: overview]
 tidyCbvInfoLocal id rhs = computeCbvInfo id rhs

 -- | For a binding we:
@@ -124,9 +124,8 @@ tidyCbvInfoLocal id rhs = computeCbvInfo id rhs
 -- - It's argument to a worker and demanded strictly
 -- - Unless it's an unlifted type already
 -- * Update the id
--- See Note [CBV Function Ids]
+-- See Note [CBV Function Ids: overview]
 -- See Note [Attaching CBV Marks to ids]
-
 computeCbvInfo :: HasCallStack
                => Id -- The function
                -> CoreExpr -- It's RHS
@@ -172,7 +171,7 @@ computeCbvInfo fun_id rhs
   | otherwise =
     -- pprTraceDebug "computeCbvInfo: Worker seems to take unboxed tuple/sum types!"
     -- (ppr fun_id <+> ppr rhs)
-    asNonWorkerLikeId fun_id
+    removeCbvCandidate fun_id
     -- We don't set CBV marks on functions which take unboxed tuples or sums as
     -- arguments.
Doing so would require us to compute the result of unarise
@@ -197,7 +196,7 @@ computeCbvInfo fun_id rhs
     isSimplePrimRep _ = False

     mkMark arg
-      | not $ shouldUseCbvForId arg = NotMarkedCbv
+      | not $ wantCbvForId arg = NotMarkedCbv
       -- We can only safely use cbv for strict arguments
       | (isStrUsedDmd (idDemandInfo arg))
       , not (isDeadEndId fun_id) = MarkedCbv


=====================================
compiler/GHC/Core/Unfold.hs
=====================================
@@ -152,10 +152,12 @@ updateCaseScaling n opts = opts { unfoldingCaseScaling = n }
 updateReportPrefix :: Maybe String -> UnfoldingOpts -> UnfoldingOpts
 updateReportPrefix n opts = opts { unfoldingReportPrefix = n }

-data ArgSummary = TrivArg -- Nothing interesting
-                | NonTrivArg -- Arg has structure
-                | ValueArg -- Arg is a con-app or PAP
-                           -- ..or con-like. Note [Conlike is interesting]
+data ArgSummary
+  = TrivArg -- Nothing interesting
+  | NonTrivArg -- Arg has structure
+  | ValueArg -- Arg is a con-app or PAP
+             -- ..or con-like. See (IA1) in Note [Interesting arguments]
+             -- in GHC.Core.Opt.Simplify.Utils

 instance Outputable ArgSummary where
   ppr TrivArg = text "TrivArg"
@@ -776,7 +778,7 @@ litSize _other = 0 -- Must match size of nullary constructors
                    -- (eg via case binding)

 classOpSize :: UnfoldingOpts -> Class -> [Id] -> [CoreExpr] -> ExprSize
--- See Note [Conlike is interesting]
+-- See (IA1) in Note [Interesting arguments] in GHC.Core.Opt.Simplify.Utils
 classOpSize opts cls top_args args
   | isUnaryClass cls
   = sizeZero -- See (UCM4) in Note [Unary class magic] in GHC.Core.TyCon


=====================================
compiler/GHC/Core/Utils.hs
=====================================
@@ -59,7 +59,7 @@ module GHC.Core.Utils (
         isJoinBind,

         -- * Tag inference
-        mkStrictFieldSeqs, shouldStrictifyIdForCbv, shouldUseCbvForId,
+        mkStrictFieldSeqs, wantCbvForId,

         -- * unsafeEqualityProof
         isUnsafeEqualityCase,
@@ -1509,6 +1509,23 @@ going to put up with this, because the previous more aggressive inlining
 (which treated
'noFactor' as work-free) was duplicating primops, which
 in turn was making inner loops of array calculations runs slow (#5623)

+Wrinkles
+
+(WF1) Strict constructor fields. We regard (K x) as work-free even if
+   K is a strict data constructor (see Note [Strict fields in Core])
+       data T a = K !a
+   If we have
+       let t = K x in ...(case t of K y -> blah)...
+   we want to treat t's binding as expandable so that `exprIsConApp_maybe`
+   will look through its unfolding. (NB: exprIsWorkFree implies
+   exprIsExpandable.)
+
+   Note, however, that because K is strict, after inlining we'll get a leftover
+   eval on x, which may or may not disappear
+       let t = K x in ...(case x of y -> blah)...
+   We put up with this extra eval: in effect we count duplicating the eval as
+   work-free.
+
 Note [Case expressions are work-free]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 Are case-expressions work-free? Consider
@@ -1650,7 +1667,8 @@ isWorkFreeApp fn n_val_args
   = True
   | otherwise
   = case idDetails fn of
-      DataConWorkId {} -> True
+      DataConWorkId {} -> True -- Even if the data constructor is strict
+                               -- See (WF1) in Note [exprIsWorkFree]
       PrimOpId op _ -> primOpIsWorkFree op
       _ -> False

@@ -1751,6 +1769,8 @@ expansion. Specifically:
   duplicate the (a +# b) primop, which we should not do lightly.
   (It's quite hard to trigger this bug, but T13155 does so for GHC 8.0.)

+NB: exprIsWorkFree implies exprIsExpandable.
+
 Note [isExpandableApp: bottoming functions]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 It's important that isExpandableApp does not respond True to bottoming
@@ -2901,29 +2921,38 @@ dumpIdInfoOfProgram dump_locals ppr_id_info binds = vcat (map printId ids)

 {- Note [Call-by-value for worker args]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-If we unbox a constructor with strict fields we want to
-preserve the information that some of the arguments came
-out of strict fields and therefore should be already properly
-tagged, however we can't express this directly in core.
-
-Instead what we do is generate a worker like this:
+If we unbox a constructor with strict fields we want to preserve the information
+that some of the arguments came out of strict fields and therefore should be
+already evaluated and properly tagged (EPT) throughout the body of the
+function. We express this fact in Core like this:

     data T = MkT A !B

     foo = case T of MkT a b -> $wfoo a b

     $wfoo a b = case b of b' -> rhs[b/b']
+                ^^^^ The "extra eval"
+
+Now
+ * Throughout `rhs` the Simplifier can see that `b` is EPT, and can (say)
+   drop evals on `b`.

-This makes the worker strict in b causing us to use a more efficient
-calling convention for `b` where the caller needs to ensure `b` is
-properly tagged and evaluated before it's passed to $wfoo. See Note [CBV Function Ids].
+ * The EPT enforcement pass will make $wfoo into a CBV function, where
+   the caller guarantees to pass an EPT argument (see Note [EPT enforcement] in
+   GHC.Stg.EnforceEpt)

-Usually the argument will be known to be properly tagged at the call site so there is
+ * The code generator will discard that "extra eval" case, because $wfoo is
+   CBV.
+
+See also Note [CBV Function Ids: overview].
+
+In this case the argument is known to be properly tagged at the call site so there is
 no additional work for the caller and the worker can be more efficient since it can
 assume the presence of a tag.

 This is especially true for recursive functions like this:
     -- myPred expects its argument properly tagged
+    -- The EnforceEPT pass has made it a CBV function
     myPred !x = ...

     loop :: MyPair -> Int
@@ -2933,11 +2962,10 @@ This is especially true for recursive functions like this:
       B -> 2
       _ -> loop (MyPair (myPred x) (myPred y))

-Here we would ordinarily not be strict in y after unboxing.
-However if we pass it as a regular argument then this means on
-every iteration of loop we will incur an extra seq on y before
-we can pass it to `myPred` which isn't great!
That is in STG after
-tag inference we get:
+Here we would ordinarily not be strict in y after unboxing. However if we pass
+it as a regular argument then this means on every iteration of loop we will
+incur an extra seq on y before we can pass it to `myPred` which isn't great!
+That is in STG after tag inference we get:

     Rec {
     Find.$wloop [InlPrag=[2], Occ=LoopBreaker]
@@ -2962,7 +2990,7 @@ tag inference we get:
         };
     end Rec }

-Here comes the tricky part: If we make $wloop strict in both x/y and we get:
+But if we add an extra eval on `y` during worker/wrapper we get this:

     Rec {
     Find.$wloop [InlPrag=[2], Occ=LoopBreaker]
@@ -2986,18 +3014,24 @@ Here comes the tricky part: If we make $wloop strict in both x/y and we get:
         };
     end Rec }

-Here both x and y are known to be tagged in the function body since we pass strict worker args using unlifted cbv.
-This means the seqs on x and y both become no-ops and compared to the first version the seq on `y` disappears at runtime.
-
-The downside is that the caller of $wfoo potentially has to evaluate `y` once if we can't prove it isn't already evaluated.
-But y coming out of a strict field is in WHNF so safe to evaluated. And most of the time it will be properly tagged+evaluated
-already at the call site because of the EPT Invariant! See Note [EPT enforcement] for more in this.
-This makes GHC itself around 1% faster despite doing slightly more work! So this is generally quite good.
-
-We only apply this when we think there is a benefit in doing so however. There are a number of cases in which
-it would be useless to insert an extra seq. ShouldStrictifyIdForCbv tries to identify these to avoid churn in the
+Here both x and y are known to be tagged in the function body since we pass
+strict worker args using unlifted cbv. This means the seqs on x and y both
+become no-ops (via (EPT-codegen) in Note [EPT enforcement]) and, compared to the
+first version, the seq on `y` disappears at runtime.
+
+The downside is that the caller of $wfoo potentially has to evaluate `y` once if
+we can't prove it isn't already evaluated. The wrapper, which calls `$wfoo` has
+just pulled `y` out of a strict field of a data constructor, so it will always
+be EPT. See Note [EPT enforcement] for more on this. This makes GHC itself
+around 1% faster despite doing slightly more work! So this is generally quite
+good.
+
+We only apply this when we think there is a benefit in doing so however. There
+are a number of cases in which it would be useless to insert an extra
+seq. `wantCbvForId` tries to identify these to avoid churn in the
 simplifier. See Note [Which Ids should be strictified] for details on this.
 -}
+
 mkStrictFieldSeqs :: [(Id,StrictnessMark)] -> CoreExpr -> (CoreExpr)
 mkStrictFieldSeqs args rhs =
   foldr addEval rhs args
@@ -3007,7 +3041,7 @@ mkStrictFieldSeqs args rhs =
     addEval (arg_id,arg_cbv) (rhs)
       -- Argument representing strict field.
       | isMarkedStrict arg_cbv
-      , shouldStrictifyIdForCbv arg_id
+      , wantCbvForId arg_id
       -- Make sure to remove unfoldings here to avoid the simplifier dropping those for OtherCon[] unfoldings.
       = Case (Var $! zapIdUnfolding arg_id) arg_id case_ty ([Alt DEFAULT [] rhs])
       -- Normal argument
@@ -3027,87 +3061,99 @@ There are multiple reasons why we might not want to insert a seq in the rhs to
 strictify a functions argument:

 1) The argument doesn't exist at runtime.
-
-For zero width types (like Types) there is no benefit as we don't operate on them
-at runtime at all. This includes things like void#, coercions and state tokens.
+   For zero width types (like Types) there is no benefit as we don't operate on them
+   at runtime at all. This includes things like void#, coercions and state tokens.

 2) The argument is a unlifted type.
-
-If the argument is a unlifted type the calling convention already is explicitly
-cbv.
This means inserting a seq on this argument wouldn't do anything as the seq
-would be a no-op *and* it wouldn't affect the calling convention.
+   If the argument is a unlifted type the calling convention already is explicitly
+   cbv. This means inserting a seq on this argument wouldn't do anything as the seq
+   would be a no-op *and* it wouldn't affect the calling convention.

 3) The argument is absent.
+   If the argument is absent in the body there is no advantage to it being passed as
+   cbv to the function. The function won't ever look at it so we don't save any work.

-If the argument is absent in the body there is no advantage to it being passed as
-cbv to the function. The function won't ever look at it so we don't safe any work.
-
-This mostly happens for join point. For example we might have:
-
-    data T = MkT ![Int] [Char]
-    f t = case t of MkT xs{strict} ys-> snd (xs,ys)
-
-and abstract the case alternative to:
-
-    f t = join j1 = \xs ys -> snd (xs,ys)
-          in case t of MkT xs{strict} ys-> j1 xs xy
-
-While we "use" xs inside `j1` it's not used inside the function `snd` we pass it to.
-In short a absent demand means neither our RHS, nor any function we pass the argument
-to will inspect it. So there is no work to be saved by forcing `xs` early.
+   This mostly happens for join points. For example we might have:

-NB: There is an edge case where if we rebox we *can* end up seqing an absent value.
-Note [Absent fillers] has an example of this. However this is so rare it's not worth
-caring about here.
+     data T = MkT ![Int] [Char]
+     f t = case t of MkT xs{strict} ys-> snd (xs,ys)

-4) The argument is already strict.
+   and abstract the case alternative to:

-Consider this code:
+     f t = join j1 = \xs ys -> snd (xs,ys)
+           in case t of MkT xs{strict} ys-> j1 xs xy

-    data T = MkT ![Int]
-    f t = case t of MkT xs{strict} -> reverse xs
+   While we "use" xs inside `j1` it's not used inside the function `snd` we pass it to.
+   In short an absent demand means neither our RHS, nor any function we pass the argument
+   to will inspect it. So there is no work to be saved by forcing `xs` early.

-The `xs{strict}` indicates that `xs` is used strictly by the `reverse xs`.
-If we do a w/w split, and add the extra eval on `xs`, we'll get
-
-    $wf xs =
-        case xs of xs1 ->
-        let t = MkT xs1 in
-        case t of MkT xs2 -> reverse xs2
-
-That's not wrong; but the w/w body will simplify to
-
-    $wf xs = case xs of xs1 -> reverse xs1
-
-and now we'll drop the `case xs` because `xs1` is used strictly in its scope.
-Adding that eval was a waste of time. So don't add it for strictly-demanded Ids.
+   NB: There is an edge case where if we rebox we *can* end up seqing an absent value.
+   Note [Absent fillers] has an example of this. However this is so rare it's not worth
+   caring about here.

 5) Functions
-
-Functions are tricky (see Note [TagInfo of functions] in EnforceEpt).
-But the gist of it even if we make a higher order function argument strict
-we can't avoid the tag check when it's used later in the body.
-So there is no benefit.
+   Functions are tricky (see Note [TagInfo of functions] in EnforceEpt).
+   But the gist of it even if we make a higher order function argument strict
+   we can't avoid the tag check when it's used later in the body.
+   So there is no benefit.
+
+Wrinkles:
+
+(WIS1) You might have thought that we can omit the eval if the argument is
+   strictly demanded in the body. But you'd be wrong. Consider this code:
+     data T = MkT ![Int]
+     f t = case t of MkT xs{Dmd=STR} -> reverse xs
+
+   The `xs{Dmd=STR}` indicates that `xs` is used strictly by the `reverse xs`.
+   If we do a w/w split, and add the extra eval on `xs`, we'll get
+     $wf xs = case xs of xs1 ->
+              let t = MkT xs1 in
+              case t of MkT xs2 -> reverse xs2
+
+   That's not wrong; but you might wonder if the eval on `xs` is needed
+   when it is certainly evaluated by the `reverse`.
But yes, it is (#26722): + g s True t = f s t t + g s False t = g s True t + + f True (MkT xs) t = f False (MkT xs) t + f False (MkT xs) _ = xs + + After worker/wrapper we get: + g s b t = case t of MkT ww -> $wg s b ww + $wg s ds ww = case ds of { + False -> case ww of wg { __DEFAULT -> Bar.$wg s True wg } + True -> let { t1 = MkT ww } in f s t1 t1 } + + We must make `f` inline inside `$wg`, because `f` too is ww'd, and we + don't want to rebox `t1` before passing it to `f`. BUT while `t1` + looks like a HNF, `exprIsHNF` will say False because `MkT` is strict + and `ww` isn't evaluated. So `f` doesn't inline and we get lots of + reboxing. + + The Right Thing to do is to add the eval for the data con argument: + $wg s ds ww = case ww of ww' { __DEFAULT -> + case ds of { + False -> case ww of wg { __DEFAULT -> Bar.$wg s True wg } + True -> let { t1 = MkT ww' } in f s t1 t1 } } + + Now `t1` will be a HNF, and `f` will inline, and we get + $wg s ds ww = case ww of ww' { __DEFAULT -> + case ds of { + False -> Bar.$wg s True ww' + True -> $wf s ww' } } + + (Ultimately `$wg` will be a CBV function, so that `case ww` will be a + no-op: see (EPT-codegen) in Note [EPT enforcement] in GHC.Stg.EnforceEpt.) -} --- | Do we expect there to be any benefit if we make this var strict --- in order for it to get treated as as cbv argument? --- See Note [Which Ids should be strictified] --- See Note [CBV Function Ids] for more background. -shouldStrictifyIdForCbv :: Var -> Bool -shouldStrictifyIdForCbv = wantCbvForId False - --- Like shouldStrictifyIdForCbv but also wants to use cbv for strict args. -shouldUseCbvForId :: Var -> Bool -shouldUseCbvForId = wantCbvForId True -- When we strictify we want to skip strict args otherwise the logic is the same --- as for shouldUseCbvForId so we common up the logic here. +-- as for wantCbvForId so we common up the logic here. 
-- Basically returns true if it would be beneficial for runtime to pass this argument -- as CBV independent of whether or not it's correct. E.g. it might return true for lazy args -- we are not allowed to force. -wantCbvForId :: Bool -> Var -> Bool -wantCbvForId cbv_for_strict v +wantCbvForId :: Var -> Bool +wantCbvForId v -- Must be a runtime var. -- See Note [Which Ids should be strictified] point 1) | isId v @@ -3121,9 +3167,6 @@ wantCbvForId cbv_for_strict v , not $ isFunTy ty -- If the var is strict already a seq is redundant. -- See Note [Which Ids should be strictified] point 4) - , not (isStrictDmd dmd) || cbv_for_strict - -- If the var is absent a seq is almost always useless. - -- See Note [Which Ids should be strictified] point 3) , not (isAbsDmd dmd) = True | otherwise ===================================== compiler/GHC/CoreToStg/Prep.hs ===================================== @@ -1582,7 +1582,8 @@ maybeSaturate fn expr n_args unsat_ticks -- See Note [Eta expansion of hasNoBinding things in CorePrep] = return $ wrapLamBody (\body -> foldr mkTick body unsat_ticks) sat_expr - | mark_arity > 0 -- A call-by-value function. See Note [CBV Function Ids] + | mark_arity > 0 -- A call-by-value function. + -- See Note [CBV Function Ids: overview] , not applied_marks = assertPpr ( not (isJoinId fn)) -- See Note [Do not eta-expand join points] ===================================== compiler/GHC/Stg/EnforceEpt.hs ===================================== @@ -66,13 +66,13 @@ SG thinks it would be good to fix this; see #21792. Note [EPT enforcement] ~~~~~~~~~~~~~~~~~~~~~~ The goal of EnforceEPT pass is to mark as many binders as possible as EPT -(see Note [Evaluated and Properly Tagged]). -To find more EPT binders, it establishes the following +(see Note [Evaluated and Properly Tagged]). It establishes the following +invariant: EPT INVARIANT:
Any binder of * a strict field (see Note [Strict fields in Core]), or -> * a CBV argument (see Note [CBV Function Ids]) +> * a CBV argument (see Note [CBV Function Ids: overview]) is EPT.
(Note that prior to EPT enforcement, this invariant may *not* always be upheld. @@ -105,7 +105,7 @@ however, we presently only promote worker functions such as $wf to CBV because we see all its call sites and can use the proper by-value calling convention. More precisely, with -O0, we guarantee that no CBV functions are visible in the interface file, so that naïve clients do not need to know how to call CBV -functions. See Note [CBV Function Ids] for more details. +functions. See Note [CBV Function Ids: overview] for more details. Specification ------------- @@ -140,9 +140,11 @@ Afterwards, the *EPT rewriter* inserts the actual evals realising Upcasts. Implementation -------------- -* EPT analysis is implemented in GHC.Stg.EnforceEpt.inferTags. +(EPT-anal) EPT analysis is implemented in `GHC.Stg.EnforceEpt.inferTags.` It attaches its result to /binders/, not occurrence sites. -* The EPT rewriter establishes the EPT invariant by inserting evals. That is, if + +(EPT-rewrite) The EPT rewriter, `GHC.Stg.EnforceEpt.Rewrite.rewriteTopBinds`, + establishes the EPT invariant by inserting evals. That is, if (a) a binder x is used to * construct a strict field (`SP x y`), or * passed as a CBV argument (`$wf x`), @@ -152,17 +154,27 @@ Implementation case x of x' { __ DEFAULT -> SP x' y }. case x of x' { __ DEFAULT -> $wf x' }. (Recall that the case binder x' is always EPT.) - This is implemented in GHC.Stg.EnforceEpt.Rewrite.rewriteTopBinds. + This pass also propagates the EPTness from binders to occurrences. + It is sound to insert evals on strict fields (Note [Strict fields in Core]), - and on CBV arguments as well (Note [CBV Function Ids]). -* We also export the EPTness of top level bindings to allow this optimisation + and on CBV arguments as well (Note [CBV Function Ids: overview]). + +(EPT-codegen) Finally, code generation for (case x of alts) skips the thunk check + when `x` is EPT. 
This is done (a bit indirectly) thus: + * GHC.StgToCmm.Expr.cgCase: builds a `sequel`, and recurses into `cgExpr` on `x`. + * When `cgExpr` sees `x`, it goes to `cgIdApp`, which uses `getCallMethod`. + * Then `getCallMethod` sees that `x` is EPT (via `idTagSig_maybe`), and + returns `InferedReturnIt`. + * Now `cgIdApp` can jump straight to the case-alternative switch in the `sequel` + constructed by `cgCase`. + +(EPT-export) We also export the EPTness of top level bindings to allow this optimisation to work across module boundaries. + NB: The EPT Invariant *must* be upheld, regardless of the optimisation level; hence EPTness is practically part of the internal ABI of a strict data - constructor or CBV function. Note [CBV Function Ids] contains the details. -* Finally, code generation skips the thunk check when branching on binders that - are EPT. This is done by `cgExpr`/`cgCase` in the backend. + constructor or CBV function. Note [CBV Function Ids: overview] has the details. Evaluation ---------- ===================================== compiler/GHC/Stg/Lint.hs ===================================== @@ -420,7 +420,7 @@ lintAppCbvMarks e@(StgApp fun args) = do when (lf_unarised lf) $ do -- A function which expects a unlifted argument as n'th argument -- always needs to be applied to n arguments. - -- See Note [CBV Function Ids]. + -- See Note [CBV Function Ids: overview]. let marks = fromMaybe [] $ idCbvMarks_maybe fun when (length (dropWhileEndLE (not . 
isMarkedCbv) marks) > length args) $ do addErrL $ hang (text "Undersatured cbv marked ID in App" <+> ppr e ) 2 $ ===================================== compiler/GHC/StgToCmm/Closure.hs ===================================== @@ -617,12 +617,15 @@ getCallMethod cfg name id (LFThunk _ _ updatable std_form_info is_fun) getCallMethod cfg name id (LFUnknown might_be_a_function) n_args _cg_locs _self_loop_info | n_args == 0 , Just sig <- idTagSig_maybe id - , isTaggedSig sig -- Infered to be already evaluated by EPT analysis - -- When profiling we must enter all potential functions to make sure we update the SCC - -- even if the function itself is already evaluated. + , isTaggedSig sig -- This `id` is evaluated and properly tagged; no need to enter it + -- See (EPT-codegen) in Note [EPT enforcement] in GHC.Stg.EnforceEpt + + -- When profiling we must enter all potential functions to make sure we update + -- the SCC even if the function itself is already evaluated. -- See Note [Evaluating functions with profiling] in rts/Apply.cmm , not (profileIsProfiling (stgToCmmProfile cfg) && might_be_a_function) - = InferedReturnIt -- See Note [EPT enforcement] + + = InferedReturnIt -- See (EPT-codegen) in Note [EPT enforcement] | might_be_a_function = SlowCall ===================================== compiler/GHC/StgToCmm/Expr.hs ===================================== @@ -1053,6 +1053,7 @@ cgIdApp fun_id args = do | otherwise -> emitReturn [fun] -- A value infered to be in WHNF, so we can just return it. 
+ -- See (EPT-codegen) in Note [EPT enforcement] in GHC.Stg.EnforceEpt InferedReturnIt | isZeroBitTy (idType fun_id) -> trace >> emitReturn [] | otherwise -> trace >> assertTag >> ===================================== compiler/GHC/Types/Id.hs ===================================== @@ -117,7 +117,7 @@ module GHC.Types.Id ( setIdCbvMarks, idCbvMarks_maybe, idCbvMarkArity, - asWorkerLikeId, asNonWorkerLikeId, + setCbvCandidate, removeCbvCandidate, idDemandInfo, idDmdSig, @@ -563,7 +563,7 @@ isDataConId id = case Var.idDetails id of -- | An Id for which we might require all callers to pass strict arguments properly tagged + evaluated. -- --- See Note [CBV Function Ids] +-- See Note [CBV Function Ids: overview] isWorkerLikeId :: Id -> Bool isWorkerLikeId id = case Var.idDetails id of WorkerLikeId _ -> True @@ -668,19 +668,20 @@ idJoinArity id = case idJoinPointHood id of NotJoinPoint -> pprPanic "idJoinArity" (ppr id) asJoinId :: Id -> JoinArity -> JoinId -asJoinId id arity = warnPprTrace (not (isLocalId id)) - "global id being marked as join var" (ppr id) $ - warnPprTrace (not (is_vanilla_or_join id)) - "asJoinId" - (ppr id <+> pprIdDetails (idDetails id)) $ - id `setIdDetails` JoinId arity (idCbvMarks_maybe id) +asJoinId id arity + = warnPprTrace (not (isLocalId id)) + "global id being marked as join var" (ppr id) $ + id `setIdDetails` JoinId arity cbv_info where - is_vanilla_or_join id = case Var.idDetails id of - VanillaId -> True - -- Can workers become join ids? Yes! - WorkerLikeId {} -> pprTraceDebug "asJoinId (call by value function)" (ppr id) True - JoinId {} -> True - _ -> False + cbv_info = case Var.idDetails id of + VanillaId -> Nothing + WorkerLikeId marks -> Just marks + JoinId _ mb_marks -> mb_marks + _ -> pprTraceDebug "asJoinId" + (ppr id <+> pprIdDetails (idDetails id)) $ + Nothing + -- Can workers become join ids? Yes! 
+ -- See Note [CBV Function Ids: overview] in GHC.Types.Id.Info zapJoinId :: Id -> Id -- May be a regular id already @@ -691,7 +692,7 @@ zapJoinId jid | isJoinId jid = zapIdTailCallInfo (newIdDetails `seq` jid `setIdD where newIdDetails = case idDetails jid of -- We treat join points as CBV functions. Even after they are floated out. - -- See Note [Use CBV semantics only for join points and workers] + -- See Note [Which Ids should be CBV candidates?] JoinId _ (Just marks) -> WorkerLikeId marks JoinId _ Nothing -> WorkerLikeId [] _ -> panic "zapJoinId: newIdDetails can only be used if Id was a join Id." @@ -840,7 +841,7 @@ setIdCbvMarks id marks -- Perhaps that's sensible but for now be conservative. -- Similarly we don't need any lazy marks at the end of the list. -- This way the length of the list is always exactly number of arguments - -- that must be visible to CodeGen. See See Note [CBV Function Ids] + -- that must be visible to CodeGen. See Note [CBV Function Ids: overview] -- for more details. trimmedMarks = dropWhileEndLE (not . isMarkedCbv) $ take (idArity id) marks @@ -855,18 +856,10 @@ idCbvMarks_maybe id = case idDetails id of idCbvMarkArity :: Id -> Arity idCbvMarkArity fn = maybe 0 length (idCbvMarks_maybe fn) --- | Remove any cbv marks on arguments from a given Id. -asNonWorkerLikeId :: Id -> Id -asNonWorkerLikeId id = - let details = case idDetails id of - WorkerLikeId{} -> Just $ VanillaId - JoinId arity Just{} -> Just $ JoinId arity Nothing - _ -> Nothing - in maybeModifyIdDetails details id - --- | Turn this id into a WorkerLikeId if possible. -asWorkerLikeId :: Id -> Id -asWorkerLikeId id = +-- | Make this Id into a candidate for CBV treatment, if possible. 
+-- See Note [CBV Function Ids: overview] in GHC.Types.Id.Info +setCbvCandidate :: Id -> Id +setCbvCandidate id = let details = case idDetails id of WorkerLikeId{} -> Nothing JoinId _arity Just{} -> Nothing @@ -875,6 +868,16 @@ asWorkerLikeId id = _ -> Nothing in maybeModifyIdDetails details id +-- | Remove any CBV-candidate info from a given Id. +-- See Note [CBV Function Ids: overview] in GHC.Types.Id.Info +removeCbvCandidate :: Id -> Id +removeCbvCandidate id = + let details = case idDetails id of + WorkerLikeId{} -> Just $ VanillaId + JoinId arity Just{} -> Just $ JoinId arity Nothing + _ -> Nothing + in maybeModifyIdDetails details id + setCaseBndrEvald :: StrictnessMark -> Id -> Id -- Used for variables bound by a case expressions, both the case-binder -- itself, and any pattern-bound variables that are argument of a ===================================== compiler/GHC/Types/Id/Info.hs ===================================== @@ -208,14 +208,14 @@ data IdDetails -- ^ An 'Id' for a join point taking n arguments -- Note [Join points] in "GHC.Core" -- Can also work as a WorkerLikeId if given `CbvMark`s. - -- See Note [CBV Function Ids] + -- See Note [CBV Function Ids: overview] -- The [CbvMark] is always empty (and ignored) until after Tidy. | WorkerLikeId [CbvMark] -- ^ An 'Id' for a worker like function, which might expect some arguments to be -- passed both evaluated and tagged. -- Worker like functions are create by W/W and SpecConstr and we can expect that they -- aren't used unapplied. - -- See Note [CBV Function Ids] + -- See Note [CBV Function Ids: overview] -- See Note [EPT enforcement] -- The [CbvMark] is always empty (and ignored) until after Tidy for ids from the current -- module. @@ -244,85 +244,114 @@ conLikesRecSelInfo con_likes lbls has_fld dc lbl = any (\ fl -> flLabel fl == lbl) (conLikeFieldLabels dc) -{- Note [CBV Function Ids] -~~~~~~~~~~~~~~~~~~~~~~~~~~ -A WorkerLikeId essentially allows us to constrain the calling convention -for the given Id. 
Each such Id carries with it a list of CbvMarks -with each element representing a value argument. Arguments who have -a matching `MarkedCbv` entry in the list need to be passed evaluated+*properly tagged*. - -CallByValueFunIds give us additional expressiveness which we use to improve -runtime. This is all part of the EPT enforcement work. See also Note [EPT enforcement]. - -They allows us to express the fact that an argument is not only evaluated to WHNF once we -entered it's RHS but also that an lifted argument is already *properly tagged* once we jump -into the RHS. -This means when e.g. branching on such an argument the RHS doesn't needed to perform -an eval check to ensure the argument isn't an indirection. All seqs on such an argument in -the functions body become no-ops as well. - -The invariants around the arguments of call by value function like Ids are then: - -* In any call `(f e1 .. en)`, if `f`'s i'th argument is marked `MarkedCbv`, - then the caller must ensure that the i'th argument - * points directly to the value (and hence is certainly evaluated before the call) - * is a properly tagged pointer to that value - -* The following functions (and only these functions) have `CbvMarks`: - * Any `WorkerLikeId` - * Some `JoinId` bindings. - -This works analogous to the EPT Invariant. See also Note [EPT enforcement]. - -To make this work what we do is: -* During W/W and SpecConstr any worker/specialized binding we introduce - is marked as a worker binding by `asWorkerLikeId`. -* W/W and SpecConstr further set OtherCon[] unfoldings on arguments which - represent contents of a strict fields. -* During Tidy we look at all bindings. - For any callByValueLike Id and join point we mark arguments as cbv if they - Are strict. We don't do so for regular bindings. - See Note [Use CBV semantics only for join points and workers] for why. - We might have made some ids rhs *more* strict in order to make their arguments - be passed CBV. 
See Note [Call-by-value for worker args] for why. -* During CorePrep calls to CallByValueFunIds are eta expanded. +{- Note [CBV Function Ids: overview] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +GHC can decide to use a call-by-value (CBV) calling convention for +(some arguments of) a function, implying that: + +* The caller /must/ pass an argument that is evaluated and properly + tagged (EPT). See Note [Evaluated and Properly Tagged] in GHC.Stg.EnforceEpt. + +* The function may /assume/ that the argument is EPT, and thereby omit + evals that would otherwise be necessary. + +CBV-ness is part of the calling convention; it is not optional. If a function +is compiled with CBV arguments, callers /must/ respect it, else seg-faults +beckon. + +Apart from the more efficient calling convention, a compelling reason for +a CBV calling convention is worker-functions for strict data types. +Example: + data T a = MkT ![a] + f :: T Int -> blah + f (MkT y) = ... +We get a w/w split + $wf y = let x = MkT y in ... + f x = case x of MkT y -> $wf y +But in `$wf`, in general, we'd need to evaluate `y`, because `MkT` is strict. +With a CBV calling convention we can drop that stupid extra eval. + +Here's how it all works: + +* We identify some function Ids as "CBV candidates"; + see Note [Which Ids should be CBV candidates?] + +* During W/W and SpecConstr: any worker/specialized binding we introduce + is marked as a CBV-candidate by `setCbvCandidate`. This simply marks + the binding as a candidate for CBV-ness, using IdDetails `WorkerLikeId []`. + See Note [Which Ids should be CBV candidates?]. + + See also Note [Call-by-value for worker args] for how we build the worker RHS. + +* A CBV candidate may become a join point; we are careful to retain + its CBV-candidature; see `GHC.Types.Id.asJoinId`. (Actually that hardly + matters because all join points are CBV-candidates.) 
A join point can also + become an ordinary Id, due to floating (see `zapJoinId`); again we are + careful to retain CBV-candidature. + +* During Tidy, for CBV-candidate Ids, including join points, we mark any + /strict/ arguments as CBV. This is the point at which the CbvMarks inside a + WorkerLikeId are set. See `GHC.Core.Tidy.computeCbvInfo`, and + + This step is informed by a late demand analysis, performed just before tidying + to identify strict arguments. See Note [Call-by-value for worker args] for + how a worker guarantees to be strict in strict datacon fields. + + TODO: We currently don't do this for arguments that are unboxed sums or tuples, + because then we'd have to predict the result of unarisation. But it would be nice to + do so. See `computeCbvInfo`. + +* During CorePrep calls to CBV Ids are eta expanded. + See `GHC.CoreToStg.Prep.maybeSaturate`. + * During Stg CodeGen: * When we see a call to a callByValueLike Id: * We check if all arguments marked to be passed unlifted are already tagged. * If they aren't we will wrap the call in case expressions which will evaluate+tag these arguments before jumping to the function. + See (EPT-rewrite) in Note [EPT enforcement] in GHC.Stg.EnforceEpt + * During Cmm codeGen: * When generating code for the RHS of a StrictWorker binding we omit tag checks when using arguments marked as tagged. + See (EPT-codegen) in Note [EPT enforcement] in GHC.Stg.EnforceEpt + +* Imported functions may be CBV, and then there is no point in eta-reducing + them; we'll just have to eta-expand later; see GHC.Core.Opt.Arity.cantEtaReduceFun. +*** SPJ really? Andreas? **** We only use this for workers and specialized versions of SpecConstr But we also check other functions during tidy and potentially turn some of them into call by value functions and mark some of their arguments as call-by-value by looking at argument unfoldings. 
-NB: I choose to put the information into a new Id constructor since these are loaded -at all optimization levels. This makes it trivial to ensure the additional -calling convention demands are available at all call sites. Putting it into -IdInfo would require us at the very least to always decode the IdInfo + +NB: I choose to put the CBV information into the IdDetails since these are +loaded at all optimization levels. This makes it trivial to ensure the +additional calling convention demands are available at all call sites. Putting +it into IdInfo would require us at the very least to always decode the IdInfo just to decide if we need to throw it away or not after. -Note [Use CBV semantics only for join points and workers] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -A function with cbv-semantics requires arguments to be visible -and if no arguments are visible requires us to eta-expand it's -call site. That is for a binding with three cbv arguments like -`w[WorkerLikeId[!,!,!]]` we would need to eta expand undersaturated -occurrences like `map w xs` into `map (\x1 x2 x3 -> w x1 x2 x3) xs. - -In experiments it turned out that the code size increase of doing so -can outweigh the performance benefits of doing so. -So we only do this for join points, workers and -specialized functions (from SpecConstr). -Join points are naturally always called saturated so -this problem can't occur for them. -For workers and specialized functions there are also always at least -some applied arguments as we won't inline the wrapper/apply their rule -if there are unapplied occurrences like `map f xs`. +Note [Which Ids should be CBV candidates?] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +In principle, we could use a CBV calling convention for /any/ strict function. +But when we use CBV semantics the caller must obey the EPT calling convention, +and that may mean eta-expansion. 
For example, for a binding with three CBV +arguments like `foo[WorkerLikeId[!,!,!]]` we would need to eta expand undersaturated +occurrences like `map foo xs` into `map (\x1 x2 x3 -> foo x1 x2 x3) xs`. + +In experiments it turned out that the code size increase of doing so can +outweigh the performance benefits. + +So we treat only certain functions as candidates for CBV treatment: + * Workers created by worker/wrapper. + * Specialised functions from SpecConstr + * Join points + +Reason: + * All of these are always called saturated (at birth anyway) + * For workers in particular we want to use CBV for strict + fields of data constructors -} -- | Parent of a record selector function. ===================================== compiler/GHC/Types/Id/Make.hs ===================================== @@ -1038,7 +1038,7 @@ until the final simplifier phase; see Note [Activation for data constructor wrappers]. For further reading, see: - * Note [Conlike is interesting] in GHC.Core.Op.Simplify.Utils + * (IA1) in Note [Interesting arguments] in GHC.Core.Opt.Simplify.Utils * Note [Lone variables] in GHC.Core.Unfold * Note [exprIsConApp_maybe on data constructors with wrappers] in GHC.Core.SimpleOpt ===================================== testsuite/tests/simplCore/should_compile/T18013.stderr ===================================== @@ -143,14 +143,14 @@ T18013.$wmapMaybeRule [InlPrag=NOINLINE] Unf=OtherCon []] T18013.$wmapMaybeRule = \ (@a) (@b) (@s) (ww :: s) (ww1 :: s -> a -> IO (Result s b)) -> + case ww of ww2 { __DEFAULT -> case ww1 of wild { __DEFAULT -> - case ww of wild1 { __DEFAULT -> T18013a.Rule @IO @(Maybe a) @(Maybe b) @s - wild1 + ww2 ((\ (s2 :: s) (a1 :: Maybe a) (s1 :: GHC.Internal.Prim.State# GHC.Internal.Prim.RealWorld) -> @@ -158,7 +158,7 @@ T18013.$wmapMaybeRule Nothing -> (# s1, T18013a.Result - @s @(Maybe b) wild1 (GHC.Internal.Maybe.Nothing @b) #); + @s @(Maybe b) ww2 (GHC.Internal.Maybe.Nothing @b) #); Just x -> case ((wild s2 x) `cast` Co:4 :: IO (Result s 
b) ===================================== testsuite/tests/simplCore/should_compile/T26722.hs ===================================== @@ -0,0 +1,9 @@ +module T26722 where + +data T = MkT ![Int] + +g s True t = f s t t +g s False t = g s True t + +f True (MkT xs) t = f False (MkT xs) t +f False (MkT xs) _ = xs ===================================== testsuite/tests/simplCore/should_compile/T26722.stderr ===================================== @@ -0,0 +1 @@ + \ No newline at end of file ===================================== testsuite/tests/simplCore/should_compile/all.T ===================================== @@ -575,3 +575,6 @@ test('T26682', normal, multimod_compile, ['T26682', '-O -v0']) # In the bug report #26615, the overloaded calls were signalled by a dictionary # argument like fEqList_xxxx, so we grep for that. Not a very robust test test('T26615', [grep_errmsg(r'fEqList')], multimod_compile, ['T26615', '-O -fspec-constr -ddump-simpl -dsuppress-uniques']) + +# T26722: there should be no reboxing in $wg +test('T26722', [grep_errmsg(r'SPEC')], compile, ['-O -dno-typeable-binds']) View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/5e1cd595b98fc153beaea795ee079f3... -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/5e1cd595b98fc153beaea795ee079f3... You're receiving this email because of your account on gitlab.haskell.org.