Marge Bot pushed to branch master at Glasgow Haskell Compiler / GHC
Commits:
-
c56567ec
by Simon Peyton Jones at 2026-01-15T23:19:04+00:00
-
9719ce5d
by Simon Peyton Jones at 2026-01-15T23:19:04+00:00
20 changed files:
- compiler/GHC/Core/Opt/Arity.hs
- compiler/GHC/Core/Opt/Simplify/Utils.hs
- compiler/GHC/Core/Opt/SpecConstr.hs
- compiler/GHC/Core/Opt/WorkWrap.hs
- compiler/GHC/Core/Opt/WorkWrap/Utils.hs
- compiler/GHC/Core/Tidy.hs
- compiler/GHC/Core/Unfold.hs
- compiler/GHC/Core/Utils.hs
- compiler/GHC/CoreToStg/Prep.hs
- compiler/GHC/Stg/EnforceEpt.hs
- compiler/GHC/Stg/Lint.hs
- compiler/GHC/StgToCmm/Closure.hs
- compiler/GHC/StgToCmm/Expr.hs
- compiler/GHC/Types/Id.hs
- compiler/GHC/Types/Id/Info.hs
- compiler/GHC/Types/Id/Make.hs
- testsuite/tests/simplCore/should_compile/T18013.stderr
- + testsuite/tests/simplCore/should_compile/T26722.hs
- + testsuite/tests/simplCore/should_compile/T26722.stderr
- testsuite/tests/simplCore/should_compile/all.T
Changes:
| ... | ... | @@ -2515,7 +2515,7 @@ eta-reduce that are specific to Core and GHC: |
| 2515 | 2515 | See Note [Eta expanding primops].
|
| 2516 | 2516 | |
| 2517 | 2517 | (W) We may not undersaturate StrictWorkerIds.
|
| 2518 | - See Note [CBV Function Ids] in GHC.Types.Id.Info.
|
|
| 2518 | + See Note [CBV Function Ids: overview] in GHC.Types.Id.Info.
|
|
| 2519 | 2519 | |
| 2520 | 2520 | Here is a list of historic accidents surrounding unsound eta-reduction:
|
| 2521 | 2521 | |
| ... | ... | @@ -2848,7 +2848,7 @@ cantEtaReduceFun fun |
| 2848 | 2848 | |
| 2849 | 2849 | || (isJust (idCbvMarks_maybe fun)) -- (W)
|
| 2850 | 2850 | -- Don't undersaturate StrictWorkerIds.
|
| 2851 | - -- See Note [CBV Function Ids] in GHC.Types.Id.Info.
|
|
| 2851 | + -- See Note [CBV Function Ids: overview] in GHC.Types.Id.Info.
|
|
| 2852 | 2852 | |
| 2853 | 2853 | |
| 2854 | 2854 | {- *********************************************************************
|
| ... | ... | @@ -982,15 +982,58 @@ But we don't regard (f x y) as interesting, unless f is unsaturated. |
| 982 | 982 | If it's saturated and f hasn't inlined, then it's probably not going
|
| 983 | 983 | to now!
|
| 984 | 984 | |
| 985 | -Note [Conlike is interesting]
|
|
| 986 | -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
| 987 | -Consider
|
|
| 988 | - f d = ...((*) d x y)...
|
|
| 989 | - ... f (df d')...
|
|
| 990 | -where df is con-like. Then we'd really like to inline 'f' so that the
|
|
| 991 | -rule for (*) (df d) can fire. To do this
|
|
| 992 | - a) we give a discount for being an argument of a class-op (eg (*) d)
|
|
| 993 | - b) we say that a con-like argument (eg (df d)) is interesting
|
|
| 985 | +Wrinkles:
|
|
| 986 | + |
|
| 987 | +(IA1) Conlike is interesting.
|
|
| 988 | + Consider
|
|
| 989 | + f d = ...((*) d x y)...
|
|
| 990 | + ... f (df d')...
|
|
| 991 | + where df is con-like. Then we'd really like to inline 'f' so that the
|
|
| 992 | + rule for (*) (df d) can fire. To do this
|
|
| 993 | + a) we give a discount for being an argument of a class-op (eg (*) d)
|
|
| 994 | + b) we say that a con-like argument (eg (df d)) is interesting
|
|
| 995 | + |
|
| 996 | +(IA2) OtherCon.
|
|
| 997 | + interestingArg returns
|
|
| 998 | + (a) NonTrivArg for an arg with an OtherCon [] unfolding
|
|
| 999 | + (b) ValueArg for an arg with an OtherCon [c1,c2..] unfolding.
|
|
| 1000 | + |
|
| 1001 | + Reason for (a): I found (in the GHC.Internal.Bignum.Integer module) that I was
|
|
| 1002 | + inlining a pretty big function when all we knew was that its arguments
|
|
| 1003 | + were evaluated, nothing more. That in turn make the enclosing function
|
|
| 1004 | + too big to inline elsewhere.
|
|
| 1005 | + |
|
| 1006 | + Reason for (b): we want to inline integerCompare here
|
|
| 1007 | + integerLt# :: Integer -> Integer -> Bool#
|
|
| 1008 | + integerLt# (IS x) (IS y) = x <# y
|
|
| 1009 | + integerLt# x y | LT <- integerCompare x y = 1#
|
|
| 1010 | + integerLt# _ _ = 0#
|
|
| 1011 | + |
|
| 1012 | +(IA3) Rubbish literals.
|
|
| 1013 | + In a worker we might see
|
|
| 1014 | + $wfoo x = let y = RUBBISH in
|
|
| 1015 | + ...(g y True)...
|
|
| 1016 | + where `g` has a wrapper that discards its first argment. We really really
|
|
| 1017 | + want to inline g's wrapper, to expose that it discards its RUBBISH arg.
|
|
| 1018 | + That may not happen if RUBBISH looks like TrivArg, so we use NonTrivArg
|
|
| 1019 | + instead. See #26722. (This reverses the plan in #20035, but the problem
|
|
| 1020 | + reported there appears to have gone away.)
|
|
| 1021 | + |
|
| 1022 | +(IA4) Consider a call `f (g x)`. If `f` has a an argument discount on its argument,
|
|
| 1023 | + then f's body scrutinises its argument in a `case` expression, or perhaps applies
|
|
| 1024 | + it. We give the arg `(g x)` an ArgSummary of `NonTrivArg` so that `f` has a bit
|
|
| 1025 | + of encouragment to inline in these cases.
|
|
| 1026 | + |
|
| 1027 | + Now consider `let y = g x in f y`. Now we have to look through y's unfolding.
|
|
| 1028 | + When should we do so? Suppose we did inline `f` so we ended up with
|
|
| 1029 | + let y = g x in ...(case y of alts)...
|
|
| 1030 | + Then we'll call `exprIsConApp_maybe` on `y`; and that looks through "expandable"
|
|
| 1031 | + unfoldings; indeed that's the whole purpose of `exprIsExpanadable`. See
|
|
| 1032 | + Note [exprIsExpandable] in GHC.Core.Utils.
|
|
| 1033 | + |
|
| 1034 | + Conclusion: `interestingArg` should give some encouragement (NonTrivArg) to `f`
|
|
| 1035 | + when the argument is expandable. Hence `uf_expandable` in the `Var` case.
|
|
| 1036 | + |
|
| 994 | 1037 | -}
|
| 995 | 1038 | |
| 996 | 1039 | interestingArg :: SimplEnv -> CoreExpr -> ArgSummary
|
| ... | ... | @@ -1005,7 +1048,7 @@ interestingArg env e = go env 0 e |
| 1005 | 1048 | ContEx tvs cvs ids e -> go (setSubstEnv env tvs cvs ids) n e
|
| 1006 | 1049 | |
| 1007 | 1050 | go _ _ (Lit l)
|
| 1008 | - | isLitRubbish l = TrivArg -- Leads to unproductive inlining in WWRec, #20035
|
|
| 1051 | + | isLitRubbish l = NonTrivArg -- See (IA3) in Note [Interesting arguments]
|
|
| 1009 | 1052 | | otherwise = ValueArg
|
| 1010 | 1053 | go _ _ (Type _) = TrivArg
|
| 1011 | 1054 | go _ _ (Coercion _) = TrivArg
|
| ... | ... | @@ -1027,45 +1070,29 @@ interestingArg env e = go env 0 e |
| 1027 | 1070 | go_var n v
|
| 1028 | 1071 | | isConLikeId v = ValueArg -- Experimenting with 'conlike' rather that
|
| 1029 | 1072 | -- data constructors here
|
| 1030 | - -- DFuns are con-like; see Note [Conlike is interesting]
|
|
| 1073 | + -- DFuns are con-like;
|
|
| 1074 | + -- see (IA1) in Note [Interesting arguments]
|
|
| 1031 | 1075 | | idArity v > n = ValueArg -- Catches (eg) primops with arity but no unfolding
|
| 1032 | 1076 | | n > 0 = NonTrivArg -- Saturated or unknown call
|
| 1033 | 1077 | | otherwise -- n==0, no value arguments; look for an interesting unfolding
|
| 1034 | 1078 | = case idUnfolding v of
|
| 1035 | 1079 | OtherCon [] -> NonTrivArg -- It's evaluated, but that's all we know
|
| 1036 | 1080 | OtherCon _ -> ValueArg -- Evaluated and we know it isn't these constructors
|
| 1037 | - -- See Note [OtherCon and interestingArg]
|
|
| 1081 | + -- See (IA2) in Note [Interesting arguments]
|
|
| 1038 | 1082 | DFunUnfolding {} -> ValueArg -- We konw that idArity=0
|
| 1039 | 1083 | CoreUnfolding{ uf_cache = cache }
|
| 1040 | 1084 | | uf_is_conlike cache -> ValueArg -- Includes constructor applications
|
| 1041 | - | uf_is_value cache -> NonTrivArg -- Things like partial applications
|
|
| 1085 | + | uf_expandable cache -> NonTrivArg -- See (IA4)
|
|
| 1042 | 1086 | | otherwise -> TrivArg
|
| 1043 | 1087 | BootUnfolding -> TrivArg
|
| 1044 | 1088 | NoUnfolding -> TrivArg
|
| 1045 | 1089 | |
| 1046 | -{- Note [OtherCon and interestingArg]
|
|
| 1047 | -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
| 1048 | -interstingArg returns
|
|
| 1049 | - (a) NonTrivArg for an arg with an OtherCon [] unfolding
|
|
| 1050 | - (b) ValueArg for an arg with an OtherCon [c1,c2..] unfolding.
|
|
| 1051 | - |
|
| 1052 | -Reason for (a): I found (in the GHC.Internal.Bignum.Integer module) that I was
|
|
| 1053 | -inlining a pretty big function when all we knew was that its arguments
|
|
| 1054 | -were evaluated, nothing more. That in turn make the enclosing function
|
|
| 1055 | -too big to inline elsewhere.
|
|
| 1056 | - |
|
| 1057 | -Reason for (b): we want to inline integerCompare here
|
|
| 1058 | - integerLt# :: Integer -> Integer -> Bool#
|
|
| 1059 | - integerLt# (IS x) (IS y) = x <# y
|
|
| 1060 | - integerLt# x y | LT <- integerCompare x y = 1#
|
|
| 1061 | - integerLt# _ _ = 0#
|
|
| 1062 | 1090 | |
| 1063 | -************************************************************************
|
|
| 1091 | +{- *********************************************************************
|
|
| 1064 | 1092 | * *
|
| 1065 | 1093 | SimplMode
|
| 1066 | 1094 | * *
|
| 1067 | -************************************************************************
|
|
| 1068 | --}
|
|
| 1095 | +********************************************************************* -}
|
|
| 1069 | 1096 | |
| 1070 | 1097 | updModeForStableUnfoldings :: ActivationGhc -> SimplMode -> SimplMode
|
| 1071 | 1098 | -- See Note [The environments of the Simplify pass]
|
| ... | ... | @@ -1994,7 +1994,7 @@ spec_one env fn arg_bndrs body (call_pat, rule_number) |
| 1994 | 1994 | spec_arity = count isId spec_lam_args
|
| 1995 | 1995 | spec_join_arity | isJoinId fn = JoinPoint (length spec_call_args)
|
| 1996 | 1996 | | otherwise = NotJoinPoint
|
| 1997 | - spec_id = asWorkerLikeId $
|
|
| 1997 | + spec_id = setCbvCandidate $
|
|
| 1998 | 1998 | mkLocalId spec_name ManyTy spec_id_ty
|
| 1999 | 1999 | -- See Note [Transfer strictness]
|
| 2000 | 2000 | `setIdDmdSig` spec_sig
|
| ... | ... | @@ -2065,7 +2065,7 @@ mkSeqs seqees res_ty rhs = |
| 2065 | 2065 | addEval :: Var -> CoreExpr -> CoreExpr
|
| 2066 | 2066 | addEval arg_id rhs
|
| 2067 | 2067 | -- Argument representing strict field and it's worth passing via cbv
|
| 2068 | - | shouldStrictifyIdForCbv arg_id
|
|
| 2068 | + | wantCbvForId arg_id
|
|
| 2069 | 2069 | = Case (Var arg_id)
|
| 2070 | 2070 | (localiseId arg_id) -- See (SCF1) in Note [SpecConstr and strict fields]
|
| 2071 | 2071 | res_ty
|
| ... | ... | @@ -848,7 +848,7 @@ mkWWBindPair ww_opts fn_id fn_info fn_args fn_body work_uniq div |
| 848 | 848 | -- worker is join point iff wrapper is join point
|
| 849 | 849 | -- (see Note [Don't w/w join points for CPR])
|
| 850 | 850 | |
| 851 | - work_id = asWorkerLikeId $
|
|
| 851 | + work_id = setCbvCandidate $
|
|
| 852 | 852 | mkWorkerId work_uniq fn_id (exprType work_rhs)
|
| 853 | 853 | `setIdOccInfo` occInfo fn_info
|
| 854 | 854 | -- Copy over occurrence info from parent
|
| ... | ... | @@ -914,23 +914,26 @@ C) Unlift *any* (non-boot exported) functions arguments if they are strict. |
| 914 | 914 | an impedance matcher function. Leading to massive code bloat.
|
| 915 | 915 | Essentially we end up creating a impromptu wrapper function
|
| 916 | 916 | wherever we wouldn't inline the wrapper with a W/W approach.
|
| 917 | - ~ There is the option of achieving this without eta-expansion if we instead expand
|
|
| 918 | - the partial application code to check for demands on the calling convention and
|
|
| 919 | - for it to evaluate the arguments. The main downsides there would be the complexity
|
|
| 920 | - of the implementation and that it carries a certain overhead even for functions who
|
|
| 921 | - don't take advantage of this functionality. I haven't tried this approach because it's
|
|
| 922 | - not trivial to implement and doing W/W splits seems to work well enough.
|
|
| 923 | - |
|
| 924 | -Currently we use the first approach A) by default, with a flag that allows users to fall back to the
|
|
| 925 | -more aggressive approach B).
|
|
| 926 | - |
|
| 927 | -I also tried the third approach C) using eta-expansion at call sites to avoid modifying the PAP-handling
|
|
| 928 | -code which wasn't fruitful. See https://gitlab.haskell.org/ghc/ghc/-/merge_requests/5614#note_389903.
|
|
| 929 | -We could still try to do C) in the future by having PAP calls which will evaluate the required arguments
|
|
| 930 | -before calling the partially applied function. But this would be neither a small nor simple change so we
|
|
| 931 | -stick with A) and a flag for B) for now.
|
|
| 932 | - |
|
| 933 | -See also Note [EPT enforcement] and Note [CBV Function Ids]
|
|
| 917 | + ~ There is the option of achieving this without eta-expansion if we instead
|
|
| 918 | + expand the partial application code to check for demands on the calling
|
|
| 919 | + convention and for it to evaluate the arguments. The main downsides there
|
|
| 920 | + would be the complexity of the implementation and that it carries a
|
|
| 921 | + certain overhead even for functions who don't take advantage of this
|
|
| 922 | + functionality. I haven't tried this approach because it's not trivial to
|
|
| 923 | + implement and doing W/W splits seems to work well enough.
|
|
| 924 | + |
|
| 925 | +Currently we use the first approach A) by default, with a flag that allows users
|
|
| 926 | +to fall back to the more aggressive approach B).
|
|
| 927 | + |
|
| 928 | +I also tried the third approach C) using eta-expansion at call sites to avoid
|
|
| 929 | +modifying the PAP-handling code which wasn't fruitful. See
|
|
| 930 | +https://gitlab.haskell.org/ghc/ghc/-/merge_requests/5614#note_389903. We could
|
|
| 931 | +still try to do C) in the future by having PAP calls which will evaluate the
|
|
| 932 | +required arguments before calling the partially applied function. But this would
|
|
| 933 | +be neither a small nor simple change so we stick with A) and a flag for B) for
|
|
| 934 | +now.
|
|
| 935 | + |
|
| 936 | +See also Note [EPT enforcement] and Note [CBV Function Ids: overview]
|
|
| 934 | 937 | |
| 935 | 938 | Note [Worker/wrapper for strict arguments]
|
| 936 | 939 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
| ... | ... | @@ -954,7 +957,7 @@ an "eval" (see `GHC.StgToCmm.Expr.cgCase`). A call (f (a:as)) will |
| 954 | 957 | have the wrapper inlined, and will drop the `case x`, so no eval
|
| 955 | 958 | happens at all.
|
| 956 | 959 | |
| 957 | -The worker `$wf` is a CBV function (see `Note [CBV Function Ids]`
|
|
| 960 | +The worker `$wf` is a CBV function (see `Note [CBV Function Ids: overview]`
|
|
| 958 | 961 | in GHC.Types.Id.Info) and the code generator guarantees that every
|
| 959 | 962 | call to `$wf` has a properly tagged argument (see `GHC.Stg.EnforceEpt.Rewrite`).
|
| 960 | 963 | |
| ... | ... | @@ -1044,7 +1047,7 @@ mkWWstr_one opts arg str_mark = |
| 1044 | 1047 | |
| 1045 | 1048 | DontUnbox
|
| 1046 | 1049 | | isStrictDmd arg_dmd || isMarkedStrict str_mark
|
| 1047 | - , wwUseForUnlifting opts -- See Note [CBV Function Ids]
|
|
| 1050 | + , wwUseForUnlifting opts -- See Note [WW for calling convention]
|
|
| 1048 | 1051 | , not (isFunTy arg_ty)
|
| 1049 | 1052 | , not (isUnliftedType arg_ty) -- Already unlifted!
|
| 1050 | 1053 | -- NB: function arguments have a fixed RuntimeRep,
|
| ... | ... | @@ -1311,12 +1314,13 @@ Needless to say, there are some wrinkles: |
| 1311 | 1314 | But that also means we emit a rubbish lit for other args that have
|
| 1312 | 1315 | cardinality 'C_10' (say, the arg to a bottoming function) where we could've
|
| 1313 | 1316 | used an error-thunk.
|
| 1314 | - NB from Andreas: But I think using an error thunk there would be dodgy no matter what
|
|
| 1315 | - for example if we decide to pass the argument to the bottoming function cbv.
|
|
| 1316 | - As we might do if the function in question is a worker.
|
|
| 1317 | - See Note [CBV Function Ids] in GHC.Types.Id.Info. So I just left the strictness check
|
|
| 1318 | - in place on top of threading through the marks from the constructor. It's a *really* cheap
|
|
| 1319 | - and easy check to make anyway.
|
|
| 1317 | + |
|
| 1318 | + NB from Andreas: But I think using an error thunk there would be dodgy no
|
|
| 1319 | + matter what for example if we decide to pass the argument to the bottoming
|
|
| 1320 | + function cbv. As we might do if the function in question is a worker. See
|
|
| 1321 | + Note [CBV Function Ids: overview] in GHC.Types.Id.Info. So I just left the
|
|
| 1322 | + strictness check in place on top of threading through the marks from the
|
|
| 1323 | + constructor. It's a *really* cheap and easy check to make anyway.
|
|
| 1320 | 1324 | |
| 1321 | 1325 | (AF3) We can only emit a LitRubbish if the arg's type `arg_ty` is mono-rep, e.g.
|
| 1322 | 1326 | of the form `TYPE rep` where `rep` is not (and doesn't contain) a variable.
|
| ... | ... | @@ -38,7 +38,7 @@ import GHC.Utils.Outputable |
| 38 | 38 | import GHC.Types.RepType (typePrimRep)
|
| 39 | 39 | import GHC.Utils.Panic
|
| 40 | 40 | import GHC.Types.Basic (isMarkedCbv, CbvMark (..))
|
| 41 | -import GHC.Core.Utils (shouldUseCbvForId)
|
|
| 41 | +import GHC.Core.Utils ( wantCbvForId )
|
|
| 42 | 42 | |
| 43 | 43 | {-
|
| 44 | 44 | ************************************************************************
|
| ... | ... | @@ -70,52 +70,52 @@ tidyBind env (Rec prs) |
| 70 | 70 | (env', Rec (zip bndrs' rhss'))
|
| 71 | 71 | |
| 72 | 72 | |
| 73 | --- Note [Attaching CBV Marks to ids]
|
|
| 74 | --- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
| 75 | --- See Note [CBV Function Ids] for the *why*.
|
|
| 76 | --- Before tidy, we turn all worker functions into worker like ids.
|
|
| 77 | --- This way we can later tell if we can assume the existence of a wrapper. This also applies to
|
|
| 78 | --- specialized versions of functions generated by SpecConstr for which we, in a sense,
|
|
| 79 | --- consider the unspecialized version to be the wrapper.
|
|
| 80 | --- During tidy we take the demands on the arguments for these ids and compute
|
|
| 81 | --- CBV (call-by-value) semantics for each individual argument.
|
|
| 82 | --- The marks themselves then are put onto the function id itself.
|
|
| 83 | --- This means the code generator can get the full calling convention by only looking at the function
|
|
| 84 | --- itself without having to inspect the RHS.
|
|
| 85 | ---
|
|
| 86 | --- The actual logic is in computeCbvInfo and takes:
|
|
| 87 | --- * The function id
|
|
| 88 | --- * The functions rhs
|
|
| 89 | --- And gives us back the function annotated with the marks.
|
|
| 90 | --- We call it in:
|
|
| 91 | --- * tidyTopPair for top level bindings
|
|
| 92 | --- * tidyBind for local bindings.
|
|
| 93 | ---
|
|
| 94 | --- Not that we *have* to look at the untidied rhs.
|
|
| 95 | --- During tidying some knot-tying occurs which can blow up
|
|
| 96 | --- if we look at the post-tidy types of the arguments here.
|
|
| 97 | --- However we only care if the types are unlifted and that doesn't change during tidy.
|
|
| 98 | --- so we can just look at the untidied types.
|
|
| 99 | ---
|
|
| 100 | --- If the id is boot-exported we don't use a cbv calling convention via marks,
|
|
| 101 | --- as the boot file won't contain them. Which means code calling boot-exported
|
|
| 102 | --- ids might expect these ids to have a vanilla calling convention even if we
|
|
| 103 | --- determine a different one here.
|
|
| 104 | --- To be able to avoid this we pass a set of boot exported ids for this module around.
|
|
| 105 | --- For non top level ids we can skip this. Local ids are never boot-exported
|
|
| 106 | --- as boot files don't have unfoldings. So there this isn't a concern.
|
|
| 107 | --- See also Note [CBV Function Ids]
|
|
| 108 | - |
|
| 109 | - |
|
| 110 | --- See Note [CBV Function Ids]
|
|
| 73 | +{- Note [Attaching CBV Marks to ids]
|
|
| 74 | +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
| 75 | +See Note [CBV Function Ids: overview] for what is happening here.
|
|
| 76 | + |
|
| 77 | +During tidy we take the demands on the arguments for any CBV-candidates and
|
|
| 78 | +compute CBV (call-by-value) semantics for each individual argument. The marks
|
|
| 79 | +themselves then are put onto the function id itself. This means the code
|
|
| 80 | +generator can get the full calling convention by only looking at the function
|
|
| 81 | +itself without having to inspect the RHS.
|
|
| 82 | + |
|
| 83 | +The actual logic is in computeCbvInfo and takes:
|
|
| 84 | + * The function id
|
|
| 85 | + * The functions rhs
|
|
| 86 | +And gives us back the function annotated with the marks. We call it in:
|
|
| 87 | + * tidyTopPair for top level bindings
|
|
| 88 | + * tidyBind for local bindings.
|
|
| 89 | + |
|
| 90 | +Wrinkles
|
|
| 91 | + |
|
| 92 | +(ACBV1) Note that we *have* to look at the untidied rhs. During tidying some
|
|
| 93 | + knot-tying occurs which can blow up if we look at the post-tidy types of the
|
|
| 94 | + arguments here. However we only care if the types are unlifted and that
|
|
| 95 | + doesn't change during tidy. so we can just look at the untidied types.
|
|
| 96 | + |
|
| 97 | +(ACBV2) If the id is boot-exported we don't use a cbv calling convention via
|
|
| 98 | + marks, as the boot file won't contain them. Which means code calling
|
|
| 99 | + boot-exported ids might expect these ids to have a vanilla calling convention
|
|
| 100 | + even if we determine a different one here.
|
|
| 101 | + |
|
| 102 | + To be able to avoid this we pass a set of boot exported ids for this module
|
|
| 103 | + around. For non top level ids we can skip this. Local ids are never
|
|
| 104 | + boot-exported as boot files don't have unfoldings. So there this isn't a
|
|
| 105 | + concern. See also Note [CBV Function Ids: overview]
|
|
| 106 | +-}
|
|
| 107 | + |
|
| 111 | 108 | tidyCbvInfoTop :: HasDebugCallStack => NameSet -> Id -> CoreExpr -> Id
|
| 109 | +-- See Note [CBV Function Ids: overview]
|
|
| 112 | 110 | tidyCbvInfoTop boot_exports id rhs
|
| 113 | - -- Can't change calling convention for boot exported things
|
|
| 114 | - | elemNameSet (idName id) boot_exports = id
|
|
| 115 | - | otherwise = computeCbvInfo id rhs
|
|
| 111 | + | elemNameSet (idName id) boot_exports
|
|
| 112 | + = id -- Can't change calling convention for boot exported things
|
|
| 113 | + -- See (ACBV2) in Note [Attaching CBV Marks to ids]
|
|
| 114 | + | otherwise
|
|
| 115 | + = computeCbvInfo id rhs
|
|
| 116 | 116 | |
| 117 | --- See Note [CBV Function Ids]
|
|
| 118 | 117 | tidyCbvInfoLocal :: HasDebugCallStack => Id -> CoreExpr -> Id
|
| 118 | +-- See Note [CBV Function Ids: overview]
|
|
| 119 | 119 | tidyCbvInfoLocal id rhs = computeCbvInfo id rhs
|
| 120 | 120 | |
| 121 | 121 | -- | For a binding we:
|
| ... | ... | @@ -124,9 +124,8 @@ tidyCbvInfoLocal id rhs = computeCbvInfo id rhs |
| 124 | 124 | -- - It's argument to a worker and demanded strictly
|
| 125 | 125 | -- - Unless it's an unlifted type already
|
| 126 | 126 | -- * Update the id
|
| 127 | --- See Note [CBV Function Ids]
|
|
| 127 | +-- See Note [CBV Function Ids: overview]
|
|
| 128 | 128 | -- See Note [Attaching CBV Marks to ids]
|
| 129 | - |
|
| 130 | 129 | computeCbvInfo :: HasCallStack
|
| 131 | 130 | => Id -- The function
|
| 132 | 131 | -> CoreExpr -- It's RHS
|
| ... | ... | @@ -172,7 +171,7 @@ computeCbvInfo fun_id rhs |
| 172 | 171 | | otherwise
|
| 173 | 172 | = -- pprTraceDebug "computeCbvInfo: Worker seems to take unboxed tuple/sum types!"
|
| 174 | 173 | -- (ppr fun_id <+> ppr rhs)
|
| 175 | - asNonWorkerLikeId fun_id
|
|
| 174 | + removeCbvCandidate fun_id
|
|
| 176 | 175 | |
| 177 | 176 | -- We don't set CBV marks on functions which take unboxed tuples or sums as
|
| 178 | 177 | -- arguments. Doing so would require us to compute the result of unarise
|
| ... | ... | @@ -197,7 +196,7 @@ computeCbvInfo fun_id rhs |
| 197 | 196 | isSimplePrimRep _ = False
|
| 198 | 197 | |
| 199 | 198 | mkMark arg
|
| 200 | - | not $ shouldUseCbvForId arg = NotMarkedCbv
|
|
| 199 | + | not $ wantCbvForId arg = NotMarkedCbv
|
|
| 201 | 200 | -- We can only safely use cbv for strict arguments
|
| 202 | 201 | | (isStrUsedDmd (idDemandInfo arg))
|
| 203 | 202 | , not (isDeadEndId fun_id) = MarkedCbv
|
| ... | ... | @@ -152,10 +152,12 @@ updateCaseScaling n opts = opts { unfoldingCaseScaling = n } |
| 152 | 152 | updateReportPrefix :: Maybe String -> UnfoldingOpts -> UnfoldingOpts
|
| 153 | 153 | updateReportPrefix n opts = opts { unfoldingReportPrefix = n }
|
| 154 | 154 | |
| 155 | -data ArgSummary = TrivArg -- Nothing interesting
|
|
| 156 | - | NonTrivArg -- Arg has structure
|
|
| 157 | - | ValueArg -- Arg is a con-app or PAP
|
|
| 158 | - -- ..or con-like. Note [Conlike is interesting]
|
|
| 155 | +data ArgSummary
|
|
| 156 | + = TrivArg -- Nothing interesting
|
|
| 157 | + | NonTrivArg -- Arg has structure
|
|
| 158 | + | ValueArg -- Arg is a con-app or PAP
|
|
| 159 | + -- ..or con-like. See (IA1) in Note [Interesting arguments]
|
|
| 160 | + -- in GHC.Core.Opt.Simplify.Utils
|
|
| 159 | 161 | |
| 160 | 162 | instance Outputable ArgSummary where
|
| 161 | 163 | ppr TrivArg = text "TrivArg"
|
| ... | ... | @@ -776,7 +778,7 @@ litSize _other = 0 -- Must match size of nullary constructors |
| 776 | 778 | -- (eg via case binding)
|
| 777 | 779 | |
| 778 | 780 | classOpSize :: UnfoldingOpts -> Class -> [Id] -> [CoreExpr] -> ExprSize
|
| 779 | --- See Note [Conlike is interesting]
|
|
| 781 | +-- See (IA1) in Note [Interesting arguments] in GHC.Core.Opt.Simplify.Utils
|
|
| 780 | 782 | classOpSize opts cls top_args args
|
| 781 | 783 | | isUnaryClass cls
|
| 782 | 784 | = sizeZero -- See (UCM4) in Note [Unary class magic] in GHC.Core.TyCon
|
| ... | ... | @@ -59,7 +59,7 @@ module GHC.Core.Utils ( |
| 59 | 59 | isJoinBind,
|
| 60 | 60 | |
| 61 | 61 | -- * Tag inference
|
| 62 | - mkStrictFieldSeqs, shouldStrictifyIdForCbv, shouldUseCbvForId,
|
|
| 62 | + mkStrictFieldSeqs, wantCbvForId,
|
|
| 63 | 63 | |
| 64 | 64 | -- * unsafeEqualityProof
|
| 65 | 65 | isUnsafeEqualityCase,
|
| ... | ... | @@ -1509,6 +1509,23 @@ going to put up with this, because the previous more aggressive inlining |
| 1509 | 1509 | (which treated 'noFactor' as work-free) was duplicating primops, which
|
| 1510 | 1510 | in turn was making inner loops of array calculations runs slow (#5623)
|
| 1511 | 1511 | |
| 1512 | +Wrinkles
|
|
| 1513 | + |
|
| 1514 | +(WF1) Strict constructor fields. We regard (K x) as work-free even if
|
|
| 1515 | + K is a strict data constructor (see Note [Strict fields in Core])
|
|
| 1516 | + data T a = K !a
|
|
| 1517 | + If we have
|
|
| 1518 | + let t = K x in ...(case t of K y -> blah)...
|
|
| 1519 | + we want to treat t's binding as expandable so that `exprIsConApp_maybe`
|
|
| 1520 | + will look through its unfolding. (NB: exprIsWorkFree implies
|
|
| 1521 | + exprIsExpandable.)
|
|
| 1522 | + |
|
| 1523 | + Note, however, that because K is strict, after inlining we'll get a leftover
|
|
| 1524 | + eval on x, which may or may not disappear
|
|
| 1525 | + let t = K x in ...(case x of y -> blah)...
|
|
| 1526 | + We put up with this extra eval: in effect we count duplicating the eval as
|
|
| 1527 | + work-free.
|
|
| 1528 | + |
|
| 1512 | 1529 | Note [Case expressions are work-free]
|
| 1513 | 1530 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
| 1514 | 1531 | Are case-expressions work-free? Consider
|
| ... | ... | @@ -1650,7 +1667,8 @@ isWorkFreeApp fn n_val_args |
| 1650 | 1667 | = True
|
| 1651 | 1668 | | otherwise
|
| 1652 | 1669 | = case idDetails fn of
|
| 1653 | - DataConWorkId {} -> True
|
|
| 1670 | + DataConWorkId {} -> True -- Even if the data constructor is strict
|
|
| 1671 | + -- See (WF1) in Note [exprIsWorkFree]
|
|
| 1654 | 1672 | PrimOpId op _ -> primOpIsWorkFree op
|
| 1655 | 1673 | _ -> False
|
| 1656 | 1674 | |
| ... | ... | @@ -1751,6 +1769,8 @@ expansion. Specifically: |
| 1751 | 1769 | duplicate the (a +# b) primop, which we should not do lightly.
|
| 1752 | 1770 | (It's quite hard to trigger this bug, but T13155 does so for GHC 8.0.)
|
| 1753 | 1771 | |
| 1772 | +NB: exprIsWorkFree implies exprIsExpandable.
|
|
| 1773 | + |
|
| 1754 | 1774 | Note [isExpandableApp: bottoming functions]
|
| 1755 | 1775 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
| 1756 | 1776 | It's important that isExpandableApp does not respond True to bottoming
|
| ... | ... | @@ -2901,29 +2921,38 @@ dumpIdInfoOfProgram dump_locals ppr_id_info binds = vcat (map printId ids) |
| 2901 | 2921 | |
| 2902 | 2922 | {- Note [Call-by-value for worker args]
|
| 2903 | 2923 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
| 2904 | -If we unbox a constructor with strict fields we want to
|
|
| 2905 | -preserve the information that some of the arguments came
|
|
| 2906 | -out of strict fields and therefore should be already properly
|
|
| 2907 | -tagged, however we can't express this directly in core.
|
|
| 2908 | - |
|
| 2909 | -Instead what we do is generate a worker like this:
|
|
| 2924 | +If we unbox a constructor with strict fields we want to preserve the information
|
|
| 2925 | +that some of the arguments came out of strict fields and therefore should be
|
|
| 2926 | +already evaluated and properly tagged (EPT) throughout the body of the
|
|
| 2927 | +function. We express this fact in Core like this:
|
|
| 2910 | 2928 | |
| 2911 | 2929 | data T = MkT A !B
|
| 2912 | 2930 | |
| 2913 | 2931 | foo = case T of MkT a b -> $wfoo a b
|
| 2914 | 2932 | |
| 2915 | 2933 | $wfoo a b = case b of b' -> rhs[b/b']
|
| 2934 | + ^^^^ The "extra eval"
|
|
| 2935 | + |
|
| 2936 | +Now
|
|
| 2937 | + * Throughout `rhs` the Simplifier can see that `b` is EPT, and can (say)
|
|
| 2938 | + drop evals on `b`.
|
|
| 2916 | 2939 | |
| 2917 | -This makes the worker strict in b causing us to use a more efficient
|
|
| 2918 | -calling convention for `b` where the caller needs to ensure `b` is
|
|
| 2919 | -properly tagged and evaluated before it's passed to $wfoo. See Note [CBV Function Ids].
|
|
| 2940 | + * The EPT enforcement pass will make $wfoo into a CBV function, where
|
|
| 2941 | + the caller guarantees to pass an EPT argument (see Note [EPT enforcement] in
|
|
| 2942 | + GHC.Core.Stg.EnforceEpt)
|
|
| 2920 | 2943 | |
| 2921 | -Usually the argument will be known to be properly tagged at the call site so there is
|
|
| 2944 | + * The code generator will discard that "extra eval" case, because $wfoo is
|
|
| 2945 | + CBV.
|
|
| 2946 | + |
|
| 2947 | +See also Note [CBV Function Ids: overview].
|
|
| 2948 | + |
|
| 2949 | +In tihs case the argument is known to be properly tagged at the call site so there is
|
|
| 2922 | 2950 | no additional work for the caller and the worker can be more efficient since it can
|
| 2923 | 2951 | assume the presence of a tag.
|
| 2924 | 2952 | |
| 2925 | 2953 | This is especially true for recursive functions like this:
|
| 2926 | 2954 | -- myPred expect it's argument properly tagged
|
| 2955 | + -- The EnforceEPT pass has made it a CBV function
|
|
| 2927 | 2956 | myPred !x = ...
|
| 2928 | 2957 | |
| 2929 | 2958 | loop :: MyPair -> Int
|
| ... | ... | @@ -2933,11 +2962,10 @@ This is especially true for recursive functions like this: |
| 2933 | 2962 | B -> 2
|
| 2934 | 2963 | _ -> loop (MyPair (myPred x) (myPred y))
|
| 2935 | 2964 | |
| 2936 | -Here we would ordinarily not be strict in y after unboxing.
|
|
| 2937 | -However if we pass it as a regular argument then this means on
|
|
| 2938 | -every iteration of loop we will incur an extra seq on y before
|
|
| 2939 | -we can pass it to `myPred` which isn't great! That is in STG after
|
|
| 2940 | -tag inference we get:
|
|
| 2965 | +Here we would ordinarily not be strict in y after unboxing. However if we pass
|
|
| 2966 | +it as a regular argument then this means on every iteration of loop we will
|
|
| 2967 | +incur an extra seq on y before we can pass it to `myPred` which isn't great!
|
|
| 2968 | +That is in STG after tag inference we get:
|
|
| 2941 | 2969 | |
| 2942 | 2970 | Rec {
|
| 2943 | 2971 | Find.$wloop [InlPrag=[2], Occ=LoopBreaker]
|
| ... | ... | @@ -2962,7 +2990,7 @@ tag inference we get: |
| 2962 | 2990 | };
|
| 2963 | 2991 | end Rec }
|
| 2964 | 2992 | |
| 2965 | -Here comes the tricky part: If we make $wloop strict in both x/y and we get:
|
|
| 2993 | +But if we add an extra eval on `y` during worker/wrapper we this this:
|
|
| 2966 | 2994 | |
| 2967 | 2995 | Rec {
|
| 2968 | 2996 | Find.$wloop [InlPrag=[2], Occ=LoopBreaker]
|
| ... | ... | @@ -2986,18 +3014,24 @@ Here comes the tricky part: If we make $wloop strict in both x/y and we get: |
| 2986 | 3014 | };
|
| 2987 | 3015 | end Rec }
|
| 2988 | 3016 | |
| 2989 | -Here both x and y are known to be tagged in the function body since we pass strict worker args using unlifted cbv.
|
|
| 2990 | -This means the seqs on x and y both become no-ops and compared to the first version the seq on `y` disappears at runtime.
|
|
| 2991 | - |
|
| 2992 | -The downside is that the caller of $wfoo potentially has to evaluate `y` once if we can't prove it isn't already evaluated.
|
|
| 2993 | -But y coming out of a strict field is in WHNF so safe to evaluated. And most of the time it will be properly tagged+evaluated
|
|
| 2994 | -already at the call site because of the EPT Invariant! See Note [EPT enforcement] for more in this.
|
|
| 2995 | -This makes GHC itself around 1% faster despite doing slightly more work! So this is generally quite good.
|
|
| 2996 | - |
|
| 2997 | -We only apply this when we think there is a benefit in doing so however. There are a number of cases in which
|
|
| 2998 | -it would be useless to insert an extra seq. ShouldStrictifyIdForCbv tries to identify these to avoid churn in the
|
|
| 3017 | +Here both x and y are known to be tagged in the function body since we pass
|
|
| 3018 | +strict worker args using unlifted cbv. This means the seqs on x and y both
|
|
| 3019 | +become no-ops (via (EPT-codegen) in Not [EPT enforcement]) and, compared to the
|
|
| 3020 | +first version, the seq on `y` disappears at runtime.
|
|
| 3021 | + |
|
| 3022 | +The downside is that the caller of $wfoo potentially has to evaluate `y` once if
|
|
| 3023 | +we can't prove it isn't already evaluated. The wrapper, which calls `$wfoo` has
|
|
| 3024 | +just pulled `y` out of a strict field of a data constructor, so it will always
|
|
| 3025 | +be EPT. See Note [EPT enforcement] for more in this. This makes GHC itself
|
|
| 3026 | +around 1% faster despite doing slightly more work! So this is generally quite
|
|
| 3027 | +good.
|
|
| 3028 | + |
|
| 3029 | +We only apply this when we think there is a benefit in doing so however. There
|
|
| 3030 | +are a number of cases in which it would be useless to insert an extra
|
|
| 3031 | +seq. `wantCbvForId` tries to identify these to avoid churn in the
|
|
| 2999 | 3032 | simplifier. See Note [Which Ids should be strictified] for details on this.
|
| 3000 | 3033 | -}
|
| 3034 | + |
|
| 3001 | 3035 | mkStrictFieldSeqs :: [(Id,StrictnessMark)] -> CoreExpr -> (CoreExpr)
|
| 3002 | 3036 | mkStrictFieldSeqs args rhs =
|
| 3003 | 3037 | foldr addEval rhs args
|
| ... | ... | @@ -3007,7 +3041,7 @@ mkStrictFieldSeqs args rhs = |
| 3007 | 3041 | addEval (arg_id,arg_cbv) (rhs)
|
| 3008 | 3042 | -- Argument representing strict field.
|
| 3009 | 3043 | | isMarkedStrict arg_cbv
|
| 3010 | - , shouldStrictifyIdForCbv arg_id
|
|
| 3044 | + , wantCbvForId arg_id
|
|
| 3011 | 3045 | -- Make sure to remove unfoldings here to avoid the simplifier dropping those for OtherCon[] unfoldings.
|
| 3012 | 3046 | = Case (Var $! zapIdUnfolding arg_id) arg_id case_ty ([Alt DEFAULT [] rhs])
|
| 3013 | 3047 | -- Normal argument
|
| ... | ... | @@ -3027,87 +3061,99 @@ There are multiple reasons why we might not want to insert a seq in the rhs to |
| 3027 | 3061 | strictify a functions argument:
|
| 3028 | 3062 | |
| 3029 | 3063 | 1) The argument doesn't exist at runtime.
|
| 3030 | - |
|
| 3031 | -For zero width types (like Types) there is no benefit as we don't operate on them
|
|
| 3032 | -at runtime at all. This includes things like void#, coercions and state tokens.
|
|
| 3064 | + For zero width types (like Types) there is no benefit as we don't operate on them
|
|
| 3065 | + at runtime at all. This includes things like void#, coercions and state tokens.
|
|
| 3033 | 3066 | |
| 3034 | 3067 | 2) The argument is a unlifted type.
|
| 3035 | - |
|
| 3036 | -If the argument is a unlifted type the calling convention already is explicitly
|
|
| 3037 | -cbv. This means inserting a seq on this argument wouldn't do anything as the seq
|
|
| 3038 | -would be a no-op *and* it wouldn't affect the calling convention.
|
|
| 3068 | + If the argument is a unlifted type the calling convention already is explicitly
|
|
| 3069 | + cbv. This means inserting a seq on this argument wouldn't do anything as the seq
|
|
| 3070 | + would be a no-op *and* it wouldn't affect the calling convention.
|
|
| 3039 | 3071 | |
| 3040 | 3072 | 3) The argument is absent.
|
| 3073 | + If the argument is absent in the body there is no advantage to it being passed as
|
|
| 3074 | + cbv to the function. The function won't ever look at it so we don't save any work.
|
|
| 3041 | 3075 | |
| 3042 | -If the argument is absent in the body there is no advantage to it being passed as
|
|
| 3043 | -cbv to the function. The function won't ever look at it so we don't safe any work.
|
|
| 3044 | - |
|
| 3045 | -This mostly happens for join point. For example we might have:
|
|
| 3046 | - |
|
| 3047 | - data T = MkT ![Int] [Char]
|
|
| 3048 | - f t = case t of MkT xs{strict} ys-> snd (xs,ys)
|
|
| 3049 | - |
|
| 3050 | -and abstract the case alternative to:
|
|
| 3051 | - |
|
| 3052 | - f t = join j1 = \xs ys -> snd (xs,ys)
|
|
| 3053 | - in case t of MkT xs{strict} ys-> j1 xs xy
|
|
| 3054 | - |
|
| 3055 | -While we "use" xs inside `j1` it's not used inside the function `snd` we pass it to.
|
|
| 3056 | -In short a absent demand means neither our RHS, nor any function we pass the argument
|
|
| 3057 | -to will inspect it. So there is no work to be saved by forcing `xs` early.
|
|
| 3076 | + This mostly happens for join points. For example we might have:
|
|
| 3058 | 3077 | |
| 3059 | -NB: There is an edge case where if we rebox we *can* end up seqing an absent value.
|
|
| 3060 | -Note [Absent fillers] has an example of this. However this is so rare it's not worth
|
|
| 3061 | -caring about here.
|
|
| 3078 | + data T = MkT ![Int] [Char]
|
|
| 3079 | + f t = case t of MkT xs{strict} ys-> snd (xs,ys)
|
|
| 3062 | 3080 | |
| 3063 | -4) The argument is already strict.
|
|
| 3081 | + and abstract the case alternative to:
|
|
| 3064 | 3082 | |
| 3065 | -Consider this code:
|
|
| 3083 | + f t = join j1 = \xs ys -> snd (xs,ys)
|
|
| 3084 | + in case t of MkT xs{strict} ys-> j1 xs xy
|
|
| 3066 | 3085 | |
| 3067 | - data T = MkT ![Int]
|
|
| 3068 | - f t = case t of MkT xs{strict} -> reverse xs
|
|
| 3086 | + While we "use" xs inside `j1` it's not used inside the function `snd` we pass it to.
|
|
| 3087 | + In short a absent demand means neither our RHS, nor any function we pass the argument
|
|
| 3088 | + to will inspect it. So there is no work to be saved by forcing `xs` early.
|
|
| 3069 | 3089 | |
| 3070 | -The `xs{strict}` indicates that `xs` is used strictly by the `reverse xs`.
|
|
| 3071 | -If we do a w/w split, and add the extra eval on `xs`, we'll get
|
|
| 3072 | - |
|
| 3073 | - $wf xs =
|
|
| 3074 | - case xs of xs1 ->
|
|
| 3075 | - let t = MkT xs1 in
|
|
| 3076 | - case t of MkT xs2 -> reverse xs2
|
|
| 3077 | - |
|
| 3078 | -That's not wrong; but the w/w body will simplify to
|
|
| 3079 | - |
|
| 3080 | - $wf xs = case xs of xs1 -> reverse xs1
|
|
| 3081 | - |
|
| 3082 | -and now we'll drop the `case xs` because `xs1` is used strictly in its scope.
|
|
| 3083 | -Adding that eval was a waste of time. So don't add it for strictly-demanded Ids.
|
|
| 3090 | + NB: There is an edge case where if we rebox we *can* end up seqing an absent value.
|
|
| 3091 | + Note [Absent fillers] has an example of this. However this is so rare it's not worth
|
|
| 3092 | + caring about here.
|
|
| 3084 | 3093 | |
| 3085 | 3094 | 5) Functions
|
| 3086 | - |
|
| 3087 | -Functions are tricky (see Note [TagInfo of functions] in EnforceEpt).
|
|
| 3088 | -But the gist of it even if we make a higher order function argument strict
|
|
| 3089 | -we can't avoid the tag check when it's used later in the body.
|
|
| 3090 | -So there is no benefit.
|
|
| 3095 | + Functions are tricky (see Note [TagInfo of functions] in EnforceEpt).
|
|
| 3096 | + But the gist of it even if we make a higher order function argument strict
|
|
| 3097 | + we can't avoid the tag check when it's used later in the body.
|
|
| 3098 | + So there is no benefit.
|
|
| 3099 | + |
|
| 3100 | +Wrinkles:
|
|
| 3101 | + |
|
| 3102 | +(WIS1) You might have thought that we can omit the eval if the argument is used
|
|
| 3103 | + strictly demanded in the body. But you'd be wrong. Consider this code:
|
|
| 3104 | + data T = MkT ![Int]
|
|
| 3105 | + f t = case t of MkT xs{Dmd=STR} -> reverse xs
|
|
| 3106 | + |
|
| 3107 | + The `xs{Dmd=STR}` indicates that `xs` is used strictly by the `reverse xs`.
|
|
| 3108 | + If we do a w/w split, and add the extra eval on `xs`, we'll get
|
|
| 3109 | + $wf xs = case xs of xs1 ->
|
|
| 3110 | + let t = MkT xs1 in
|
|
| 3111 | + case t of MkT xs2 -> reverse xs2
|
|
| 3112 | + |
|
| 3113 | + That's not wrong; but you might wonder if the eval on `xs` is needed
|
|
| 3114 | + when it is certainly evaluated by the `reverse`. But yes, it is (#26722):
|
|
| 3115 | + g s True t = f s t t
|
|
| 3116 | + g s False t = g s True t
|
|
| 3117 | + |
|
| 3118 | + f True (MkT xs) t = f False (MkT xs) t
|
|
| 3119 | + f False (MkT xs) _ = xs
|
|
| 3120 | + |
|
| 3121 | + After worker/wrapper we get:
|
|
| 3122 | + g s b t = case t of MkT ww -> $wg s b ww
|
|
| 3123 | + $wg s ds ww = case ds of {
|
|
| 3124 | + False -> case ww of wg { __DEFAULT -> Bar.$wg s True wg }
|
|
| 3125 | + True -> let { t1 = MkT ww } in f s t1 t1 }
|
|
| 3126 | + |
|
| 3127 | + We must make `f` inline inside `$wg`, because `f` too is ww'd, and we
|
|
| 3128 | + don't want to rebox `t1` before passing it to `f`. BUT while `t1`
|
|
| 3129 | + looks like a HNF, `exprIsHNF` will say False because `MkT` is strict
|
|
| 3130 | + and `ww` isn't evaluated. So `f` doesn't inline and we get lots of
|
|
| 3131 | + reboxing.
|
|
| 3132 | + |
|
| 3133 | + The Right Thing to to is to add the eval for the data con argument:
|
|
| 3134 | + $wg s ds ww = case ww of ww' { DEFAULT ->
|
|
| 3135 | + case ds of {
|
|
| 3136 | + False -> case ww of wg { __DEFAULT -> Bar.$wg s True wg }
|
|
| 3137 | + True -> let { t1 = MkT ww' } in f s t1 t1 } }
|
|
| 3138 | + |
|
| 3139 | + Now `t1` will be a HNF, and `f` will inline, and we get
|
|
| 3140 | + $wg s ds ww = case ww of ww' { DEFAULT ->
|
|
| 3141 | + case ds of {
|
|
| 3142 | + False -> Bar.$wg s True ww'
|
|
| 3143 | + True -> $wf s ww'
|
|
| 3144 | + |
|
| 3145 | + (Ultimately `$wg` will be a CBV function, so that `case ww` will be a
|
|
| 3146 | + no-op: see (EPT-codegen) in Note [EPT enforcement] in GHC.Stg.EnforceEpt.)
|
|
| 3091 | 3147 | |
| 3092 | 3148 | -}
|
| 3093 | --- | Do we expect there to be any benefit if we make this var strict
|
|
| 3094 | --- in order for it to get treated as as cbv argument?
|
|
| 3095 | --- See Note [Which Ids should be strictified]
|
|
| 3096 | --- See Note [CBV Function Ids] for more background.
|
|
| 3097 | -shouldStrictifyIdForCbv :: Var -> Bool
|
|
| 3098 | -shouldStrictifyIdForCbv = wantCbvForId False
|
|
| 3099 | - |
|
| 3100 | --- Like shouldStrictifyIdForCbv but also wants to use cbv for strict args.
|
|
| 3101 | -shouldUseCbvForId :: Var -> Bool
|
|
| 3102 | -shouldUseCbvForId = wantCbvForId True
|
|
| 3103 | 3149 | |
| 3104 | 3150 | -- When we strictify we want to skip strict args otherwise the logic is the same
|
| 3151 | +-- as for wantCbvForId so we common up the logic here.
|
|
| 3105 | 3152 | -- Basically returns true if it would be beneficial for runtime to pass this argument
|
| 3106 | 3153 | -- as CBV independent of weither or not it's correct. E.g. it might return true for lazy args
|
| 3107 | 3154 | -- we are not allowed to force.
|
| 3108 | --- we are not allowed to force.
|
|
| 3109 | -wantCbvForId :: Bool -> Var -> Bool
|
|
| 3155 | +wantCbvForId :: Var -> Bool
|
|
| 3156 | +wantCbvForId v
|
|
| 3110 | 3157 | -- Must be a runtime var.
|
| 3111 | 3158 | -- See Note [Which Ids should be strictified] point 1)
|
| 3112 | 3159 | | isId v
|
| ... | ... | @@ -3121,9 +3167,6 @@ wantCbvForId cbv_for_strict v |
| 3121 | 3167 | , not $ isFunTy ty
|
| 3122 | 3168 | -- If the var is strict already a seq is redundant.
|
| 3123 | 3169 | -- See Note [Which Ids should be strictified] point 4)
|
| 3124 | - , not (isStrictDmd dmd) || cbv_for_strict
|
|
| 3125 | - -- If the var is absent a seq is almost always useless.
|
|
| 3126 | - -- See Note [Which Ids should be strictified] point 3)
|
|
| 3127 | 3170 | , not (isAbsDmd dmd)
|
| 3128 | 3171 | = True
|
| 3129 | 3172 | | otherwise
|
| ... | ... | @@ -1582,7 +1582,8 @@ maybeSaturate fn expr n_args unsat_ticks |
| 1582 | 1582 | -- See Note [Eta expansion of hasNoBinding things in CorePrep]
|
| 1583 | 1583 | = return $ wrapLamBody (\body -> foldr mkTick body unsat_ticks) sat_expr
|
| 1584 | 1584 | |
| 1585 | - | mark_arity > 0 -- A call-by-value function. See Note [CBV Function Ids]
|
|
| 1585 | + | mark_arity > 0 -- A call-by-value function.
|
|
| 1586 | + -- See Note [CBV Function Ids: overview]
|
|
| 1586 | 1587 | , not applied_marks
|
| 1587 | 1588 | = assertPpr
|
| 1588 | 1589 | ( not (isJoinId fn)) -- See Note [Do not eta-expand join points]
|
| ... | ... | @@ -66,13 +66,13 @@ SG thinks it would be good to fix this; see #21792. |
| 66 | 66 | Note [EPT enforcement]
|
| 67 | 67 | ~~~~~~~~~~~~~~~~~~~~~~
|
| 68 | 68 | The goal of EnforceEPT pass is to mark as many binders as possible as EPT
|
| 69 | -(see Note [Evaluated and Properly Tagged]).
|
|
| 70 | -To find more EPT binders, it establishes the following
|
|
| 69 | +(see Note [Evaluated and Properly Tagged]). It establishes the following
|
|
| 70 | +invariant:
|
|
| 71 | 71 | |
| 72 | 72 | EPT INVARIANT:
|
| 73 | 73 | > Any binder of
|
| 74 | 74 | > * a strict field (see Note [Strict fields in Core]), or
|
| 75 | -> * a CBV argument (see Note [CBV Function Ids])
|
|
| 75 | +> * a CBV argument (see Note [CBV Function Ids: overview])
|
|
| 76 | 76 | > is EPT.
|
| 77 | 77 | |
| 78 | 78 | (Note that prior to EPT enforcement, this invariant may *not* always be upheld.
|
| ... | ... | @@ -105,7 +105,7 @@ however, we presently only promote worker functions such as $wf to CBV because |
| 105 | 105 | we see all its call sites and can use the proper by-value calling convention.
|
| 106 | 106 | More precisely, with -O0, we guarantee that no CBV functions are visible in
|
| 107 | 107 | the interface file, so that naïve clients do not need to know how to call CBV
|
| 108 | -functions. See Note [CBV Function Ids] for more details.
|
|
| 108 | +functions. See Note [CBV Function Ids: overview] for more details.
|
|
| 109 | 109 | |
| 110 | 110 | Specification
|
| 111 | 111 | -------------
|
| ... | ... | @@ -140,9 +140,11 @@ Afterwards, the *EPT rewriter* inserts the actual evals realising Upcasts. |
| 140 | 140 | Implementation
|
| 141 | 141 | --------------
|
| 142 | 142 | |
| 143 | -* EPT analysis is implemented in GHC.Stg.EnforceEpt.inferTags.
|
|
| 143 | +(EPT-anal) EPT analysis is implemented in `GHC.Stg.EnforceEpt.inferTags.`
|
|
| 144 | 144 | It attaches its result to /binders/, not occurrence sites.
|
| 145 | -* The EPT rewriter establishes the EPT invariant by inserting evals. That is, if
|
|
| 145 | + |
|
| 146 | +(EPT-rewrite) The EPT rewriter, `GHC.Stg.EnforceEpt.Rewrite.rewriteTopBinds`,
|
|
| 147 | + establishes the EPT invariant by inserting evals. That is, if
|
|
| 146 | 148 | (a) a binder x is used to
|
| 147 | 149 | * construct a strict field (`SP x y`), or
|
| 148 | 150 | * passed as a CBV argument (`$wf x`),
|
| ... | ... | @@ -152,17 +154,27 @@ Implementation |
| 152 | 154 | case x of x' { __ DEFAULT -> SP x' y }.
|
| 153 | 155 | case x of x' { __ DEFAULT -> $wf x' }.
|
| 154 | 156 | (Recall that the case binder x' is always EPT.)
|
| 155 | - This is implemented in GHC.Stg.EnforceEpt.Rewrite.rewriteTopBinds.
|
|
| 157 | + |
|
| 156 | 158 | This pass also propagates the EPTness from binders to occurrences.
|
| 159 | + |
|
| 157 | 160 | It is sound to insert evals on strict fields (Note [Strict fields in Core]),
|
| 158 | - and on CBV arguments as well (Note [CBV Function Ids]).
|
|
| 159 | -* We also export the EPTness of top level bindings to allow this optimisation
|
|
| 161 | + and on CBV arguments as well (Note [CBV Function Ids: overview]).
|
|
| 162 | + |
|
| 163 | +(EPT-codegen) Finally, code generation for (case x of alts) skips the thunk check
|
|
| 164 | + when `x` is EPT. This is done (a bit indirectly) thus:
|
|
| 165 | + * GHC.StgToCmm.Expr.cgCase: builds a `sequel`, and recurses into `cgExpr` on `x`.
|
|
| 166 | + * When `cgExpr` sees a `x` goes to `cgIdApp`, which uses `getCallMethod`.
|
|
| 167 | + * Then `getCallMethod` sees that `x` is EPT (via `idTagSigMaybe`), and
|
|
| 168 | + returns `InferredReturnIt`.
|
|
| 169 | + * Now `cgIdApp` can jump straight to the case-alternative switch in the `sequel`
|
|
| 170 | + constructed by `cgCase`.
|
|
| 171 | + |
|
| 172 | +(EPT-export) We also export the EPTness of top level bindings to allow this optimisation
|
|
| 160 | 173 | to work across module boundaries.
|
| 174 | + |
|
| 161 | 175 | NB: The EPT Invariant *must* be upheld, regardless of the optimisation level;
|
| 162 | 176 | hence EPTness is practically part of the internal ABI of a strict data
|
| 163 | - constructor or CBV function. Note [CBV Function Ids] contains the details.
|
|
| 164 | -* Finally, code generation skips the thunk check when branching on binders that
|
|
| 165 | - are EPT. This is done by `cgExpr`/`cgCase` in the backend.
|
|
| 177 | + constructor or CBV function. Note [CBV Function Ids: overview] has the details.
|
|
| 166 | 178 | |
| 167 | 179 | Evaluation
|
| 168 | 180 | ----------
|
| ... | ... | @@ -420,7 +420,7 @@ lintAppCbvMarks e@(StgApp fun args) = do |
| 420 | 420 | when (lf_unarised lf) $ do
|
| 421 | 421 | -- A function which expects a unlifted argument as n'th argument
|
| 422 | 422 | -- always needs to be applied to n arguments.
|
| 423 | - -- See Note [CBV Function Ids].
|
|
| 423 | + -- See Note [CBV Function Ids: overview].
|
|
| 424 | 424 | let marks = fromMaybe [] $ idCbvMarks_maybe fun
|
| 425 | 425 | when (length (dropWhileEndLE (not . isMarkedCbv) marks) > length args) $ do
|
| 426 | 426 | addErrL $ hang (text "Undersatured cbv marked ID in App" <+> ppr e ) 2 $
|
| ... | ... | @@ -617,12 +617,15 @@ getCallMethod cfg name id (LFThunk _ _ updatable std_form_info is_fun) |
| 617 | 617 | getCallMethod cfg name id (LFUnknown might_be_a_function) n_args _cg_locs _self_loop_info
|
| 618 | 618 | | n_args == 0
|
| 619 | 619 | , Just sig <- idTagSig_maybe id
|
| 620 | - , isTaggedSig sig -- Infered to be already evaluated by EPT analysis
|
|
| 621 | - -- When profiling we must enter all potential functions to make sure we update the SCC
|
|
| 622 | - -- even if the function itself is already evaluated.
|
|
| 620 | + , isTaggedSig sig -- This `id` is evaluated and properly tagged; no need to enter it
|
|
| 621 | + -- See (EPT-codegen) in Note [EPT enforcement] in GHC.Stg.EnforceEpt
|
|
| 622 | + |
|
| 623 | + -- When profiling we must enter all potential functions to make sure we update
|
|
| 624 | + -- the SCC even if the function itself is already evaluated.
|
|
| 623 | 625 | -- See Note [Evaluating functions with profiling] in rts/Apply.cmm
|
| 624 | 626 | , not (profileIsProfiling (stgToCmmProfile cfg) && might_be_a_function)
|
| 625 | - = InferedReturnIt -- See Note [EPT enforcement]
|
|
| 627 | + |
|
| 628 | + = InferedReturnIt -- See (EPT-codegen) in Note [EPT enforcement]
|
|
| 626 | 629 | |
| 627 | 630 | | might_be_a_function = SlowCall
|
| 628 | 631 |
| ... | ... | @@ -1053,6 +1053,7 @@ cgIdApp fun_id args = do |
| 1053 | 1053 | | otherwise -> emitReturn [fun]
|
| 1054 | 1054 | |
| 1055 | 1055 | -- A value infered to be in WHNF, so we can just return it.
|
| 1056 | + -- See (EPT-codegen) in Note [EPT enforcement] in GHC.Stg.EnforceEpt
|
|
| 1056 | 1057 | InferedReturnIt
|
| 1057 | 1058 | | isZeroBitTy (idType fun_id) -> trace >> emitReturn []
|
| 1058 | 1059 | | otherwise -> trace >> assertTag >>
|
| ... | ... | @@ -117,7 +117,7 @@ module GHC.Types.Id ( |
| 117 | 117 | setIdCbvMarks,
|
| 118 | 118 | idCbvMarks_maybe,
|
| 119 | 119 | idCbvMarkArity,
|
| 120 | - asWorkerLikeId, asNonWorkerLikeId,
|
|
| 120 | + setCbvCandidate, removeCbvCandidate,
|
|
| 121 | 121 | |
| 122 | 122 | idDemandInfo,
|
| 123 | 123 | idDmdSig,
|
| ... | ... | @@ -563,7 +563,7 @@ isDataConId id = case Var.idDetails id of |
| 563 | 563 | |
| 564 | 564 | -- | An Id for which we might require all callers to pass strict arguments properly tagged + evaluated.
|
| 565 | 565 | --
|
| 566 | --- See Note [CBV Function Ids]
|
|
| 566 | +-- See Note [CBV Function Ids: overview]
|
|
| 567 | 567 | isWorkerLikeId :: Id -> Bool
|
| 568 | 568 | isWorkerLikeId id = case Var.idDetails id of
|
| 569 | 569 | WorkerLikeId _ -> True
|
| ... | ... | @@ -668,19 +668,20 @@ idJoinArity id = case idJoinPointHood id of |
| 668 | 668 | NotJoinPoint -> pprPanic "idJoinArity" (ppr id)
|
| 669 | 669 | |
| 670 | 670 | asJoinId :: Id -> JoinArity -> JoinId
|
| 671 | -asJoinId id arity = warnPprTrace (not (isLocalId id))
|
|
| 672 | - "global id being marked as join var" (ppr id) $
|
|
| 673 | - warnPprTrace (not (is_vanilla_or_join id))
|
|
| 674 | - "asJoinId"
|
|
| 675 | - (ppr id <+> pprIdDetails (idDetails id)) $
|
|
| 676 | - id `setIdDetails` JoinId arity (idCbvMarks_maybe id)
|
|
| 671 | +asJoinId id arity
|
|
| 672 | + = warnPprTrace (not (isLocalId id))
|
|
| 673 | + "global id being marked as join var" (ppr id) $
|
|
| 674 | + id `setIdDetails` JoinId arity cbv_info
|
|
| 677 | 675 | where
|
| 678 | - is_vanilla_or_join id = case Var.idDetails id of
|
|
| 679 | - VanillaId -> True
|
|
| 680 | - -- Can workers become join ids? Yes!
|
|
| 681 | - WorkerLikeId {} -> pprTraceDebug "asJoinId (call by value function)" (ppr id) True
|
|
| 682 | - JoinId {} -> True
|
|
| 683 | - _ -> False
|
|
| 676 | + cbv_info = case Var.idDetails id of
|
|
| 677 | + VanillaId -> Nothing
|
|
| 678 | + WorkerLikeId marks -> Just marks
|
|
| 679 | + JoinId _ mb_marks -> mb_marks
|
|
| 680 | + _ -> pprTraceDebug "asJoinId"
|
|
| 681 | + (ppr id <+> pprIdDetails (idDetails id)) $
|
|
| 682 | + Nothing
|
|
| 683 | + -- Can workers become join ids? Yes!
|
|
| 684 | + -- See Note [CBV Function Ids: overview] in GHC.Types.Id.Info
|
|
| 684 | 685 | |
| 685 | 686 | zapJoinId :: Id -> Id
|
| 686 | 687 | -- May be a regular id already
|
| ... | ... | @@ -691,7 +692,7 @@ zapJoinId jid | isJoinId jid = zapIdTailCallInfo (newIdDetails `seq` jid `setIdD |
| 691 | 692 | where
|
| 692 | 693 | newIdDetails = case idDetails jid of
|
| 693 | 694 | -- We treat join points as CBV functions. Even after they are floated out.
|
| 694 | - -- See Note [Use CBV semantics only for join points and workers]
|
|
| 695 | + -- See Note [Which Ids should be CBV candidates?]
|
|
| 695 | 696 | JoinId _ (Just marks) -> WorkerLikeId marks
|
| 696 | 697 | JoinId _ Nothing -> WorkerLikeId []
|
| 697 | 698 | _ -> panic "zapJoinId: newIdDetails can only be used if Id was a join Id."
|
| ... | ... | @@ -840,7 +841,7 @@ setIdCbvMarks id marks |
| 840 | 841 | -- Perhaps that's sensible but for now be conservative.
|
| 841 | 842 | -- Similarly we don't need any lazy marks at the end of the list.
|
| 842 | 843 | -- This way the length of the list is always exactly number of arguments
|
| 843 | - -- that must be visible to CodeGen. See See Note [CBV Function Ids]
|
|
| 844 | + -- that must be visible to CodeGen. See Note [CBV Function Ids: overview]
|
|
| 844 | 845 | -- for more details.
|
| 845 | 846 | trimmedMarks = dropWhileEndLE (not . isMarkedCbv) $ take (idArity id) marks
|
| 846 | 847 | |
| ... | ... | @@ -855,18 +856,10 @@ idCbvMarks_maybe id = case idDetails id of |
| 855 | 856 | idCbvMarkArity :: Id -> Arity
|
| 856 | 857 | idCbvMarkArity fn = maybe 0 length (idCbvMarks_maybe fn)
|
| 857 | 858 | |
| 858 | --- | Remove any cbv marks on arguments from a given Id.
|
|
| 859 | -asNonWorkerLikeId :: Id -> Id
|
|
| 860 | -asNonWorkerLikeId id =
|
|
| 861 | - let details = case idDetails id of
|
|
| 862 | - WorkerLikeId{} -> Just $ VanillaId
|
|
| 863 | - JoinId arity Just{} -> Just $ JoinId arity Nothing
|
|
| 864 | - _ -> Nothing
|
|
| 865 | - in maybeModifyIdDetails details id
|
|
| 866 | - |
|
| 867 | --- | Turn this id into a WorkerLikeId if possible.
|
|
| 868 | -asWorkerLikeId :: Id -> Id
|
|
| 869 | -asWorkerLikeId id =
|
|
| 859 | +-- | Make this Id into a candidate for CBV treatment, if possible.
|
|
| 860 | +-- See Note [CBV Function Ids: overview] in GHC.Types.Id.Info
|
|
| 861 | +setCbvCandidate :: Id -> Id
|
|
| 862 | +setCbvCandidate id =
|
|
| 870 | 863 | let details = case idDetails id of
|
| 871 | 864 | WorkerLikeId{} -> Nothing
|
| 872 | 865 | JoinId _arity Just{} -> Nothing
|
| ... | ... | @@ -875,6 +868,16 @@ asWorkerLikeId id = |
| 875 | 868 | _ -> Nothing
|
| 876 | 869 | in maybeModifyIdDetails details id
|
| 877 | 870 | |
| 871 | +-- | Remove any CBV-candidate info from a given Id.
|
|
| 872 | +-- See Note [CBV Function Ids: overview] in GHC.Types.Id.Info
|
|
| 873 | +removeCbvCandidate :: Id -> Id
|
|
| 874 | +removeCbvCandidate id =
|
|
| 875 | + let details = case idDetails id of
|
|
| 876 | + WorkerLikeId{} -> Just $ VanillaId
|
|
| 877 | + JoinId arity Just{} -> Just $ JoinId arity Nothing
|
|
| 878 | + _ -> Nothing
|
|
| 879 | + in maybeModifyIdDetails details id
|
|
| 880 | + |
|
| 878 | 881 | setCaseBndrEvald :: StrictnessMark -> Id -> Id
|
| 879 | 882 | -- Used for variables bound by a case expressions, both the case-binder
|
| 880 | 883 | -- itself, and any pattern-bound variables that are argument of a
|
| ... | ... | @@ -208,14 +208,14 @@ data IdDetails |
| 208 | 208 | -- ^ An 'Id' for a join point taking n arguments
|
| 209 | 209 | -- Note [Join points] in "GHC.Core"
|
| 210 | 210 | -- Can also work as a WorkerLikeId if given `CbvMark`s.
|
| 211 | - -- See Note [CBV Function Ids]
|
|
| 211 | + -- See Note [CBV Function Ids: overview]
|
|
| 212 | 212 | -- The [CbvMark] is always empty (and ignored) until after Tidy.
|
| 213 | 213 | | WorkerLikeId [CbvMark]
|
| 214 | 214 | -- ^ An 'Id' for a worker like function, which might expect some arguments to be
|
| 215 | 215 | -- passed both evaluated and tagged.
|
| 216 | 216 | -- Worker like functions are create by W/W and SpecConstr and we can expect that they
|
| 217 | 217 | -- aren't used unapplied.
|
| 218 | - -- See Note [CBV Function Ids]
|
|
| 218 | + -- See Note [CBV Function Ids: overview]
|
|
| 219 | 219 | -- See Note [EPT enforcement]
|
| 220 | 220 | -- The [CbvMark] is always empty (and ignored) until after Tidy for ids from the current
|
| 221 | 221 | -- module.
|
| ... | ... | @@ -244,85 +244,114 @@ conLikesRecSelInfo con_likes lbls |
| 244 | 244 | has_fld dc lbl = any (\ fl -> flLabel fl == lbl) (conLikeFieldLabels dc)
|
| 245 | 245 | |
| 246 | 246 | |
| 247 | -{- Note [CBV Function Ids]
|
|
| 248 | -~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
| 249 | -A WorkerLikeId essentially allows us to constrain the calling convention
|
|
| 250 | -for the given Id. Each such Id carries with it a list of CbvMarks
|
|
| 251 | -with each element representing a value argument. Arguments who have
|
|
| 252 | -a matching `MarkedCbv` entry in the list need to be passed evaluated+*properly tagged*.
|
|
| 253 | - |
|
| 254 | -CallByValueFunIds give us additional expressiveness which we use to improve
|
|
| 255 | -runtime. This is all part of the EPT enforcement work. See also Note [EPT enforcement].
|
|
| 256 | - |
|
| 257 | -They allows us to express the fact that an argument is not only evaluated to WHNF once we
|
|
| 258 | -entered it's RHS but also that an lifted argument is already *properly tagged* once we jump
|
|
| 259 | -into the RHS.
|
|
| 260 | -This means when e.g. branching on such an argument the RHS doesn't needed to perform
|
|
| 261 | -an eval check to ensure the argument isn't an indirection. All seqs on such an argument in
|
|
| 262 | -the functions body become no-ops as well.
|
|
| 263 | - |
|
| 264 | -The invariants around the arguments of call by value function like Ids are then:
|
|
| 265 | - |
|
| 266 | -* In any call `(f e1 .. en)`, if `f`'s i'th argument is marked `MarkedCbv`,
|
|
| 267 | - then the caller must ensure that the i'th argument
|
|
| 268 | - * points directly to the value (and hence is certainly evaluated before the call)
|
|
| 269 | - * is a properly tagged pointer to that value
|
|
| 270 | - |
|
| 271 | -* The following functions (and only these functions) have `CbvMarks`:
|
|
| 272 | - * Any `WorkerLikeId`
|
|
| 273 | - * Some `JoinId` bindings.
|
|
| 274 | - |
|
| 275 | -This works analogous to the EPT Invariant. See also Note [EPT enforcement].
|
|
| 276 | - |
|
| 277 | -To make this work what we do is:
|
|
| 278 | -* During W/W and SpecConstr any worker/specialized binding we introduce
|
|
| 279 | - is marked as a worker binding by `asWorkerLikeId`.
|
|
| 280 | -* W/W and SpecConstr further set OtherCon[] unfoldings on arguments which
|
|
| 281 | - represent contents of a strict fields.
|
|
| 282 | -* During Tidy we look at all bindings.
|
|
| 283 | - For any callByValueLike Id and join point we mark arguments as cbv if they
|
|
| 284 | - Are strict. We don't do so for regular bindings.
|
|
| 285 | - See Note [Use CBV semantics only for join points and workers] for why.
|
|
| 286 | - We might have made some ids rhs *more* strict in order to make their arguments
|
|
| 287 | - be passed CBV. See Note [Call-by-value for worker args] for why.
|
|
| 288 | -* During CorePrep calls to CallByValueFunIds are eta expanded.
|
|
| 247 | +{- Note [CBV Function Ids: overview]
|
|
| 248 | +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
| 249 | +GHC can decide to use a call-by-value (CBV) calling convention for
|
|
| 250 | +(some arguments of) a function, implying that:
|
|
| 251 | + |
|
| 252 | +* The caller /must/ pass an argument that is evaluated and properly
|
|
| 253 | + tagged (EPT). See Note [Evaluated and Properly Tagged] in GHC.Stg.EnforceEpt.
|
|
| 254 | + |
|
| 255 | +* The function may /assume/ that the argument is EPT, and thereby omit
|
|
| 256 | + evals that would otherwise be necessary.
|
|
| 257 | + |
|
| 258 | +CBV-ness is part of the calling convention; it is not optional. If a function
|
|
| 259 | +is compiled with CBV arguments, callers /must/ respect it, else seg-fault
|
|
| 260 | +beckon.
|
|
| 261 | + |
|
| 262 | +Apart from the more efficent calling convention, a compelling reason for
|
|
| 263 | +a CBV calling conventions is worker-functions for strict data types.
|
|
| 264 | +Example:
|
|
| 265 | + data T a = MkT ![a]
|
|
| 266 | + f :: T Int -> blah
|
|
| 267 | + f (MkT y) = ...
|
|
| 268 | +We get a w/w split
|
|
| 269 | + $wf y = let x = MkT y in ...
|
|
| 270 | + f x = case x of MkT y -> $wf y
|
|
| 271 | +But in `$wf`, in general, we'd need to evaluate `y`, becuase `MkT` is strict.
|
|
| 272 | +With a CBV calling convention we can drop that stupid extra eval.
|
|
| 273 | + |
|
| 274 | +Here's how it all works:
|
|
| 275 | + |
|
| 276 | +* We identify some function Ids as "CBV candidates";
|
|
| 277 | + see Note [Which Ids should be CBV candidates?]
|
|
| 278 | + |
|
| 279 | +* During W/W and SpecConstr: any worker/specialized binding we introduce
|
|
| 280 | + is marked as a CBV-candidate by `asCbvCandidate`. This simply marks
|
|
| 281 | + the binding as a candidate for CBV-ness, using IdDetails `WorkerLikeId []`.
|
|
| 282 | + See Note [Which Ids should be CBV candidates?].
|
|
| 283 | + |
|
| 284 | + See also Note [Call-by-value for worker args] for how we build the worker RHS.
|
|
| 285 | + |
|
| 286 | +* A CBV candidate may become a join point; we are careful to retain
|
|
| 287 | + its CBV-candidature; see `GHC.Types.Id.asJoinId`. (Actually that hardly
|
|
| 288 | + matters because all join points are CBV-candidates.) A join point can also
|
|
| 289 | + become an ordinary Id, due to floating (see `zapJoinId`); again we are
|
|
| 290 | + careful to retain CBV-candidature.
|
|
| 291 | + |
|
| 292 | +* During Tidy, for CBV-candidate Ids, including join points, we mark any
|
|
| 293 | + /strict/ arguments as CBV. This is the point at which the CbvMarks inside a
|
|
| 294 | + WorkerLikeId are set. See `GHC.Core.Tidy.computeCbvInfo`, and
|
|
| 295 | + |
|
| 296 | + This step is informed by a late demand analysis, performed just before tidying
|
|
| 297 | + to identify strict arguments. See Note [Call-by-value for worker args] for
|
|
| 298 | + how a worker guarantees to be strict in strict datacon fields.
|
|
| 299 | + |
|
| 300 | + TODO: We currently don't do this for arguments that are unboxed sums or tuples,
|
|
| 301 | + because then we'd have to predict the result of unarisation. But it would be nice to
|
|
| 302 | + do so. See `computeCbvInfo`.
|
|
| 303 | + |
|
| 304 | +* During CorePrep calls to CBV Ids are eta expanded.
|
|
| 305 | + See `GHC.CoreToStg.Prep.maybeSaturate`.
|
|
| 306 | + |
|
| 289 | 307 | * During Stg CodeGen:
|
| 290 | 308 | * When we see a call to a callByValueLike Id:
|
| 291 | 309 | * We check if all arguments marked to be passed unlifted are already tagged.
|
| 292 | 310 | * If they aren't we will wrap the call in case expressions which will evaluate+tag
|
| 293 | 311 | these arguments before jumping to the function.
|
| 312 | + See (EPT-rewrite) in Note [EPT enforcement] in GHC.Stg.EnforceEpt
|
|
| 313 | + |
|
| 294 | 314 | * During Cmm codeGen:
|
| 295 | 315 | * When generating code for the RHS of a StrictWorker binding
|
| 296 | 316 | we omit tag checks when using arguments marked as tagged.
|
| 317 | + See (EPT-codegen) in Note [EPT enforcement] in GHC.Stg.EnforceEpt
|
|
| 318 | + |
|
| 319 | +* Imported functions may be CBV, and then there is no point in eta-reducing
|
|
| 320 | + them; we'll just have to eta-expand later; see GHC.Core.Opt.Arity.cantEtaReduceFun.
|
|
| 297 | 321 | |
| 322 | +*** SPJ really? Andreas? ****
|
|
| 298 | 323 | We only use this for workers and specialized versions of SpecConstr
|
| 299 | 324 | But we also check other functions during tidy and potentially turn some of them into
|
| 300 | 325 | call by value functions and mark some of their arguments as call-by-value by looking at
|
| 301 | 326 | argument unfoldings.
|
| 302 | 327 | |
| 303 | -NB: I choose to put the information into a new Id constructor since these are loaded
|
|
| 304 | -at all optimization levels. This makes it trivial to ensure the additional
|
|
| 305 | -calling convention demands are available at all call sites. Putting it into
|
|
| 306 | -IdInfo would require us at the very least to always decode the IdInfo
|
|
| 328 | + |
|
| 329 | +NB: I choose to put the CBV information into the IdDetails since these are
|
|
| 330 | +loaded at all optimization levels. This makes it trivial to ensure the
|
|
| 331 | +additional calling convention demands are available at all call sites. Putting
|
|
| 332 | +it into IdInfo would require us at the very least to always decode the IdInfo
|
|
| 307 | 333 | just to decide if we need to throw it away or not after.
|
| 308 | 334 | |
| 309 | -Note [Use CBV semantics only for join points and workers]
|
|
| 310 | -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
| 311 | -A function with cbv-semantics requires arguments to be visible
|
|
| 312 | -and if no arguments are visible requires us to eta-expand it's
|
|
| 313 | -call site. That is for a binding with three cbv arguments like
|
|
| 314 | -`w[WorkerLikeId[!,!,!]]` we would need to eta expand undersaturated
|
|
| 315 | -occurrences like `map w xs` into `map (\x1 x2 x3 -> w x1 x2 x3) xs.
|
|
| 316 | - |
|
| 317 | -In experiments it turned out that the code size increase of doing so
|
|
| 318 | -can outweigh the performance benefits of doing so.
|
|
| 319 | -So we only do this for join points, workers and
|
|
| 320 | -specialized functions (from SpecConstr).
|
|
| 321 | -Join points are naturally always called saturated so
|
|
| 322 | -this problem can't occur for them.
|
|
| 323 | -For workers and specialized functions there are also always at least
|
|
| 324 | -some applied arguments as we won't inline the wrapper/apply their rule
|
|
| 325 | -if there are unapplied occurrences like `map f xs`.
|
|
| 335 | +Note [Which Ids should be CBV candidates?]
|
|
| 336 | +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
| 337 | +In principle, we could use a CBV calling convention for /any/ strict function.
|
|
| 338 | +But when we use CBV semantics the caller must obey the EPT calling convention,
|
|
| 339 | +and that may mean eta-expansion. For example, for a binding with three CBV
|
|
| 340 | +arguments like `foo[WorkerLikeId[!,!,!]]` we would need to eta expand undersaturated
|
|
| 341 | +occurrences like `map foo xs` into `map (\x1 x2 x3 -> w x1 x2 x3) xs.
|
|
| 342 | + |
|
| 343 | +In experiments it turned out that the code size increase of doing so can
|
|
| 344 | +outweigh the performance benefits of doing so.
|
|
| 345 | + |
|
| 346 | +So we treat only certain functions as candidates for CBV treatment:
|
|
| 347 | + * Workers created by worker/wrapper.
|
|
| 348 | + * Specialised functions from SpecConstr
|
|
| 349 | + * Join points
|
|
| 350 | + |
|
| 351 | +Reason:
|
|
| 352 | + * All of these are always called saturated (at birth anyway)
|
|
| 353 | + * For workers in particular we want to use CBV for strict
|
|
| 354 | + fields of data constructors
|
|
| 326 | 355 | -}
|
| 327 | 356 | |
| 328 | 357 | -- | Parent of a record selector function.
|
| ... | ... | @@ -1038,7 +1038,7 @@ until the final simplifier phase; see Note [Activation for data |
| 1038 | 1038 | constructor wrappers].
|
| 1039 | 1039 | |
| 1040 | 1040 | For further reading, see:
|
| 1041 | - * Note [Conlike is interesting] in GHC.Core.Op.Simplify.Utils
|
|
| 1041 | + * (IA1) in Note [Interesting arguments] in GHC.Core.Op.Simplify.Utils
|
|
| 1042 | 1042 | * Note [Lone variables] in GHC.Core.Unfold
|
| 1043 | 1043 | * Note [exprIsConApp_maybe on data constructors with wrappers]
|
| 1044 | 1044 | in GHC.Core.SimpleOpt
|
| ... | ... | @@ -143,14 +143,14 @@ T18013.$wmapMaybeRule [InlPrag=NOINLINE] |
| 143 | 143 | Unf=OtherCon []]
|
| 144 | 144 | T18013.$wmapMaybeRule
|
| 145 | 145 | = \ (@a) (@b) (@s) (ww :: s) (ww1 :: s -> a -> IO (Result s b)) ->
|
| 146 | + case ww of ww2 { __DEFAULT ->
|
|
| 146 | 147 | case ww1 of wild { __DEFAULT ->
|
| 147 | - case ww of wild1 { __DEFAULT ->
|
|
| 148 | 148 | T18013a.Rule
|
| 149 | 149 | @IO
|
| 150 | 150 | @(Maybe a)
|
| 151 | 151 | @(Maybe b)
|
| 152 | 152 | @s
|
| 153 | - wild1
|
|
| 153 | + ww2
|
|
| 154 | 154 | ((\ (s2 :: s)
|
| 155 | 155 | (a1 :: Maybe a)
|
| 156 | 156 | (s1 :: GHC.Internal.Prim.State# GHC.Internal.Prim.RealWorld) ->
|
| ... | ... | @@ -158,7 +158,7 @@ T18013.$wmapMaybeRule |
| 158 | 158 | Nothing ->
|
| 159 | 159 | (# s1,
|
| 160 | 160 | T18013a.Result
|
| 161 | - @s @(Maybe b) wild1 (GHC.Internal.Maybe.Nothing @b) #);
|
|
| 161 | + @s @(Maybe b) ww2 (GHC.Internal.Maybe.Nothing @b) #);
|
|
| 162 | 162 | Just x ->
|
| 163 | 163 | case ((wild s2 x)
|
| 164 | 164 | `cast` <Co:4> :: IO (Result s b)
|
| 1 | +module T26722 where
|
|
| 2 | + |
|
| 3 | +data T = MkT ![Int]
|
|
| 4 | + |
|
| 5 | +g s True t = f s t t
|
|
| 6 | +g s False t = g s True t
|
|
| 7 | + |
|
| 8 | +f True (MkT xs) t = f False (MkT xs) t
|
|
| 9 | +f False (MkT xs) _ = xs |
| 1 | + |
|
| \ No newline at end of file |
| ... | ... | @@ -575,3 +575,6 @@ test('T26682', normal, multimod_compile, ['T26682', '-O -v0']) |
| 575 | 575 | # In the bug report #26615, the overloaded calls were signalled by a dictionary
|
| 576 | 576 | # argument like fEqList_xxxx, so we grep for that. Not a very robust test
|
| 577 | 577 | test('T26615', [grep_errmsg(r'fEqList')], multimod_compile, ['T26615', '-O -fspec-constr -ddump-simpl -dsuppress-uniques'])
|
| 578 | + |
|
| 579 | +# T26722: there should be no reboxing in $wg
|
|
| 580 | +test('T26722', [grep_errmsg(r'SPEC')], compile, ['-O -dno-typeable-binds']) |