Marge Bot pushed to branch master at Glasgow Haskell Compiler / GHC

Commits:

20 changed files:

Changes:

  • compiler/GHC/Core/Opt/Arity.hs
    ... ... @@ -2515,7 +2515,7 @@ eta-reduce that are specific to Core and GHC:
    2515 2515
         See Note [Eta expanding primops].
    
    2516 2516
     
    
    2517 2517
     (W) We may not undersaturate StrictWorkerIds.
    
    2518
    -    See Note [CBV Function Ids] in GHC.Types.Id.Info.
    
    2518
    +    See Note [CBV Function Ids: overview] in GHC.Types.Id.Info.
    
    2519 2519
     
    
    2520 2520
     Here is a list of historic accidents surrounding unsound eta-reduction:
    
    2521 2521
     
    
    ... ... @@ -2848,7 +2848,7 @@ cantEtaReduceFun fun
    2848 2848
     
    
    2849 2849
         || (isJust (idCbvMarks_maybe fun)) -- (W)
    
    2850 2850
            -- Don't undersaturate StrictWorkerIds.
    
    2851
    -       -- See Note [CBV Function Ids] in GHC.Types.Id.Info.
    
    2851
    +       -- See Note [CBV Function Ids: overview] in GHC.Types.Id.Info.
    
    2852 2852
     
    
    2853 2853
     
    
    2854 2854
     {- *********************************************************************
    

  • compiler/GHC/Core/Opt/Simplify/Utils.hs
    ... ... @@ -982,15 +982,58 @@ But we don't regard (f x y) as interesting, unless f is unsaturated.
    982 982
     If it's saturated and f hasn't inlined, then it's probably not going
    
    983 983
     to now!
    
    984 984
     
    
    985
    -Note [Conlike is interesting]
    
    986
    -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    
    987
    -Consider
    
    988
    -        f d = ...((*) d x y)...
    
    989
    -        ... f (df d')...
    
    990
    -where df is con-like. Then we'd really like to inline 'f' so that the
    
    991
    -rule for (*) (df d) can fire.  To do this
    
    992
    -  a) we give a discount for being an argument of a class-op (eg (*) d)
    
    993
    -  b) we say that a con-like argument (eg (df d)) is interesting
    
    985
    +Wrinkles:
    
    986
    +
    
    987
    +(IA1) Conlike is interesting.
    
    988
    +   Consider
    
    989
    +           f d = ...((*) d x y)...
    
    990
    +           ... f (df d')...
    
    991
    +   where df is con-like. Then we'd really like to inline 'f' so that the
    
    992
    +   rule for (*) (df d) can fire.  To do this
    
    993
    +     a) we give a discount for being an argument of a class-op (eg (*) d)
    
    994
    +     b) we say that a con-like argument (eg (df d)) is interesting
    
    995
    +
    
    996
    +(IA2) OtherCon.
    
    997
    +   interestingArg returns
    
    998
    +      (a) NonTrivArg for an arg with an OtherCon [] unfolding
    
    999
    +      (b) ValueArg for an arg with an OtherCon [c1,c2..] unfolding.
    
    1000
    +
    
    1001
    +   Reason for (a): I found (in the GHC.Internal.Bignum.Integer module) that I was
    
    1002
    +   inlining a pretty big function when all we knew was that its arguments
    
    1003
    +   were evaluated, nothing more.  That in turn make the enclosing function
    
    1004
    +   too big to inline elsewhere.
    
    1005
    +
    
    1006
    +   Reason for (b): we want to inline integerCompare here
    
    1007
    +     integerLt# :: Integer -> Integer -> Bool#
    
    1008
    +     integerLt# (IS x) (IS y)                  = x <# y
    
    1009
    +     integerLt# x y | LT <- integerCompare x y = 1#
    
    1010
    +     integerLt# _ _                            = 0#
    
    1011
    +
    
    1012
    +(IA3) Rubbish literals.
    
    1013
    +   In a worker we might see
    
    1014
    +     $wfoo x = let y = RUBBISH in
    
    1015
    +               ...(g y True)...
    
    1016
    +   where `g` has a wrapper that discards its first argment.  We really really
    
    1017
    +   want to inline g's wrapper, to expose that it discards its RUBBISH arg.
    
    1018
    +   That may not happen if RUBBISH looks like TrivArg, so we use NonTrivArg
    
    1019
    +   instead.  See #26722.  (This reverses the plan in #20035, but the problem
    
    1020
    +   reported there appears to have gone away.)
    
    1021
    +
    
    1022
    +(IA4) Consider a call `f (g x)`. If `f` has a an argument discount on its argument,
    
    1023
    +   then f's body scrutinises its argument in a `case` expression, or perhaps applies
    
    1024
    +   it.  We give the arg `(g x)` an ArgSummary of `NonTrivArg` so that `f` has a bit
    
    1025
    +   of encouragment to inline in these cases.
    
    1026
    +
    
    1027
    +   Now consider `let y = g x in f y`.  Now we have to look through y's unfolding.
    
    1028
    +   When should we do so?  Suppose we did inline `f` so we ended up with
    
    1029
    +       let y = g x in ...(case y of alts)...
    
    1030
    +   Then we'll call `exprIsConApp_maybe` on `y`; and that looks through "expandable"
    
    1031
    +   unfoldings; indeed that's the whole purpose of `exprIsExpanadable`. See
    
    1032
    +   Note [exprIsExpandable] in GHC.Core.Utils.
    
    1033
    +
    
    1034
    +   Conclusion: `interestingArg` should give some encouragement (NonTrivArg) to `f`
    
    1035
    +   when the argument is expandable. Hence `uf_expandable` in the `Var` case.
    
    1036
    +
    
    994 1037
     -}
    
    995 1038
     
    
    996 1039
     interestingArg :: SimplEnv -> CoreExpr -> ArgSummary
    
    ... ... @@ -1005,7 +1048,7 @@ interestingArg env e = go env 0 e
    1005 1048
                ContEx tvs cvs ids e -> go (setSubstEnv env tvs cvs ids) n e
    
    1006 1049
     
    
    1007 1050
         go _   _ (Lit l)
    
    1008
    -       | isLitRubbish l        = TrivArg -- Leads to unproductive inlining in WWRec, #20035
    
    1051
    +       | isLitRubbish l        = NonTrivArg -- See (IA3) in Note [Interesting arguments]
    
    1009 1052
            | otherwise             = ValueArg
    
    1010 1053
         go _   _ (Type _)          = TrivArg
    
    1011 1054
         go _   _ (Coercion _)      = TrivArg
    
    ... ... @@ -1027,45 +1070,29 @@ interestingArg env e = go env 0 e
    1027 1070
         go_var n v
    
    1028 1071
            | isConLikeId v = ValueArg   -- Experimenting with 'conlike' rather that
    
    1029 1072
                                         --    data constructors here
    
    1030
    -                                    -- DFuns are con-like; see Note [Conlike is interesting]
    
    1073
    +                                    -- DFuns are con-like;
    
    1074
    +                                    --    see (IA1) in Note [Interesting arguments]
    
    1031 1075
            | idArity v > n = ValueArg   -- Catches (eg) primops with arity but no unfolding
    
    1032 1076
            | n > 0         = NonTrivArg -- Saturated or unknown call
    
    1033 1077
            | otherwise  -- n==0, no value arguments; look for an interesting unfolding
    
    1034 1078
            = case idUnfolding v of
    
    1035 1079
                OtherCon [] -> NonTrivArg   -- It's evaluated, but that's all we know
    
    1036 1080
                OtherCon _  -> ValueArg     -- Evaluated and we know it isn't these constructors
    
    1037
    -              -- See Note [OtherCon and interestingArg]
    
    1081
    +              -- See (IA2) in Note [Interesting arguments]
    
    1038 1082
                DFunUnfolding {} -> ValueArg   -- We konw that idArity=0
    
    1039 1083
                CoreUnfolding{ uf_cache = cache }
    
    1040 1084
                  | uf_is_conlike cache -> ValueArg    -- Includes constructor applications
    
    1041
    -             | uf_is_value cache   -> NonTrivArg  -- Things like partial applications
    
    1085
    +             | uf_expandable cache -> NonTrivArg  -- See (IA4)
    
    1042 1086
                  | otherwise           -> TrivArg
    
    1043 1087
                BootUnfolding           -> TrivArg
    
    1044 1088
                NoUnfolding             -> TrivArg
    
    1045 1089
     
    
    1046
    -{- Note [OtherCon and interestingArg]
    
    1047
    -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    
    1048
    -interstingArg returns
    
    1049
    -   (a) NonTrivArg for an arg with an OtherCon [] unfolding
    
    1050
    -   (b) ValueArg for an arg with an OtherCon [c1,c2..] unfolding.
    
    1051
    -
    
    1052
    -Reason for (a): I found (in the GHC.Internal.Bignum.Integer module) that I was
    
    1053
    -inlining a pretty big function when all we knew was that its arguments
    
    1054
    -were evaluated, nothing more.  That in turn make the enclosing function
    
    1055
    -too big to inline elsewhere.
    
    1056
    -
    
    1057
    -Reason for (b): we want to inline integerCompare here
    
    1058
    -  integerLt# :: Integer -> Integer -> Bool#
    
    1059
    -  integerLt# (IS x) (IS y)                  = x <# y
    
    1060
    -  integerLt# x y | LT <- integerCompare x y = 1#
    
    1061
    -  integerLt# _ _                            = 0#
    
    1062 1090
     
    
    1063
    -************************************************************************
    
    1091
    +{- *********************************************************************
    
    1064 1092
     *                                                                      *
    
    1065 1093
                       SimplMode
    
    1066 1094
     *                                                                      *
    
    1067
    -************************************************************************
    
    1068
    --}
    
    1095
    +********************************************************************* -}
    
    1069 1096
     
    
    1070 1097
     updModeForStableUnfoldings :: ActivationGhc -> SimplMode -> SimplMode
    
    1071 1098
     -- See Note [The environments of the Simplify pass]
    

  • compiler/GHC/Core/Opt/SpecConstr.hs
    ... ... @@ -1994,7 +1994,7 @@ spec_one env fn arg_bndrs body (call_pat, rule_number)
    1994 1994
                   spec_arity = count isId spec_lam_args
    
    1995 1995
                   spec_join_arity | isJoinId fn = JoinPoint (length spec_call_args)
    
    1996 1996
                                   | otherwise   = NotJoinPoint
    
    1997
    -              spec_id    = asWorkerLikeId $
    
    1997
    +              spec_id    = setCbvCandidate $
    
    1998 1998
                                mkLocalId spec_name ManyTy spec_id_ty
    
    1999 1999
                                  -- See Note [Transfer strictness]
    
    2000 2000
                                  `setIdDmdSig`     spec_sig
    
    ... ... @@ -2065,7 +2065,7 @@ mkSeqs seqees res_ty rhs =
    2065 2065
           addEval :: Var -> CoreExpr -> CoreExpr
    
    2066 2066
           addEval arg_id rhs
    
    2067 2067
             -- Argument representing strict field and it's worth passing via cbv
    
    2068
    -        | shouldStrictifyIdForCbv arg_id
    
    2068
    +        | wantCbvForId arg_id
    
    2069 2069
             = Case (Var arg_id)
    
    2070 2070
                    (localiseId arg_id)  -- See (SCF1) in Note [SpecConstr and strict fields]
    
    2071 2071
                    res_ty
    

  • compiler/GHC/Core/Opt/WorkWrap.hs
    ... ... @@ -848,7 +848,7 @@ mkWWBindPair ww_opts fn_id fn_info fn_args fn_body work_uniq div
    848 848
           -- worker is join point iff wrapper is join point
    
    849 849
           -- (see Note [Don't w/w join points for CPR])
    
    850 850
     
    
    851
    -    work_id  = asWorkerLikeId $
    
    851
    +    work_id  = setCbvCandidate $
    
    852 852
                    mkWorkerId work_uniq fn_id (exprType work_rhs)
    
    853 853
                     `setIdOccInfo` occInfo fn_info
    
    854 854
                             -- Copy over occurrence info from parent
    

  • compiler/GHC/Core/Opt/WorkWrap/Utils.hs
    ... ... @@ -914,23 +914,26 @@ C) Unlift *any* (non-boot exported) functions arguments if they are strict.
    914 914
           an impedance matcher function. Leading to massive code bloat.
    
    915 915
           Essentially we end up creating a impromptu wrapper function
    
    916 916
           wherever we wouldn't inline the wrapper with a W/W approach.
    
    917
    -    ~ There is the option of achieving this without eta-expansion if we instead expand
    
    918
    -      the partial application code to check for demands on the calling convention and
    
    919
    -      for it to evaluate the arguments. The main downsides there would be the complexity
    
    920
    -      of the implementation and that it carries a certain overhead even for functions who
    
    921
    -      don't take advantage of this functionality. I haven't tried this approach because it's
    
    922
    -      not trivial to implement and doing W/W splits seems to work well enough.
    
    923
    -
    
    924
    -Currently we use the first approach A) by default, with a flag that allows users to fall back to the
    
    925
    -more aggressive approach B).
    
    926
    -
    
    927
    -I also tried the third approach C) using eta-expansion at call sites to avoid modifying the PAP-handling
    
    928
    -code which wasn't fruitful. See https://gitlab.haskell.org/ghc/ghc/-/merge_requests/5614#note_389903.
    
    929
    -We could still try to do C) in the future by having PAP calls which will evaluate the required arguments
    
    930
    -before calling the partially applied function. But this would be neither a small nor simple change so we
    
    931
    -stick with A) and a flag for B) for now.
    
    932
    -
    
    933
    -See also Note [EPT enforcement] and Note [CBV Function Ids]
    
    917
    +    ~ There is the option of achieving this without eta-expansion if we instead
    
    918
    +      expand the partial application code to check for demands on the calling
    
    919
    +      convention and for it to evaluate the arguments. The main downsides there
    
    920
    +      would be the complexity of the implementation and that it carries a
    
    921
    +      certain overhead even for functions who don't take advantage of this
    
    922
    +      functionality. I haven't tried this approach because it's not trivial to
    
    923
    +      implement and doing W/W splits seems to work well enough.
    
    924
    +
    
    925
    +Currently we use the first approach A) by default, with a flag that allows users
    
    926
    +to fall back to the more aggressive approach B).
    
    927
    +
    
    928
    +I also tried the third approach C) using eta-expansion at call sites to avoid
    
    929
    +modifying the PAP-handling code which wasn't fruitful. See
    
    930
    +https://gitlab.haskell.org/ghc/ghc/-/merge_requests/5614#note_389903.  We could
    
    931
    +still try to do C) in the future by having PAP calls which will evaluate the
    
    932
    +required arguments before calling the partially applied function. But this would
    
    933
    +be neither a small nor simple change so we stick with A) and a flag for B) for
    
    934
    +now.
    
    935
    +
    
    936
    +See also Note [EPT enforcement] and Note [CBV Function Ids: overview]
    
    934 937
     
    
    935 938
     Note [Worker/wrapper for strict arguments]
    
    936 939
     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    
    ... ... @@ -954,7 +957,7 @@ an "eval" (see `GHC.StgToCmm.Expr.cgCase`). A call (f (a:as)) will
    954 957
     have the wrapper inlined, and will drop the `case x`, so no eval
    
    955 958
     happens at all.
    
    956 959
     
    
    957
    -The worker `$wf` is a CBV function (see `Note [CBV Function Ids]`
    
    960
    +The worker `$wf` is a CBV function (see `Note [CBV Function Ids: overview]`
    
    958 961
     in GHC.Types.Id.Info) and the code generator guarantees that every
    
    959 962
     call to `$wf` has a properly tagged argument (see `GHC.Stg.EnforceEpt.Rewrite`).
    
    960 963
     
    
    ... ... @@ -1044,7 +1047,7 @@ mkWWstr_one opts arg str_mark =
    1044 1047
     
    
    1045 1048
         DontUnbox
    
    1046 1049
           | isStrictDmd arg_dmd || isMarkedStrict str_mark
    
    1047
    -      , wwUseForUnlifting opts  -- See Note [CBV Function Ids]
    
    1050
    +      , wwUseForUnlifting opts  -- See Note [WW for calling convention]
    
    1048 1051
           , not (isFunTy arg_ty)
    
    1049 1052
           , not (isUnliftedType arg_ty) -- Already unlifted!
    
    1050 1053
             -- NB: function arguments have a fixed RuntimeRep,
    
    ... ... @@ -1311,12 +1314,13 @@ Needless to say, there are some wrinkles:
    1311 1314
          But that also means we emit a rubbish lit for other args that have
    
    1312 1315
          cardinality 'C_10' (say, the arg to a bottoming function) where we could've
    
    1313 1316
          used an error-thunk.
    
    1314
    -     NB from Andreas: But I think using an error thunk there would be dodgy no matter what
    
    1315
    -     for example if we decide to pass the argument to the bottoming function cbv.
    
    1316
    -     As we might do if the function in question is a worker.
    
    1317
    -     See Note [CBV Function Ids] in GHC.Types.Id.Info. So I just left the strictness check
    
    1318
    -     in place on top of threading through the marks from the constructor. It's a *really* cheap
    
    1319
    -     and easy check to make anyway.
    
    1317
    +
    
    1318
    +     NB from Andreas: But I think using an error thunk there would be dodgy no
    
    1319
    +     matter what for example if we decide to pass the argument to the bottoming
    
    1320
    +     function cbv.  As we might do if the function in question is a worker.  See
    
    1321
    +     Note [CBV Function Ids: overview] in GHC.Types.Id.Info. So I just left the
    
    1322
    +     strictness check in place on top of threading through the marks from the
    
    1323
    +     constructor. It's a *really* cheap and easy check to make anyway.
    
    1320 1324
     
    
    1321 1325
     (AF3) We can only emit a LitRubbish if the arg's type `arg_ty` is mono-rep, e.g.
    
    1322 1326
          of the form `TYPE rep` where `rep` is not (and doesn't contain) a variable.
    

  • compiler/GHC/Core/Tidy.hs
    ... ... @@ -38,7 +38,7 @@ import GHC.Utils.Outputable
    38 38
     import GHC.Types.RepType (typePrimRep)
    
    39 39
     import GHC.Utils.Panic
    
    40 40
     import GHC.Types.Basic (isMarkedCbv, CbvMark (..))
    
    41
    -import GHC.Core.Utils (shouldUseCbvForId)
    
    41
    +import GHC.Core.Utils ( wantCbvForId )
    
    42 42
     
    
    43 43
     {-
    
    44 44
     ************************************************************************
    
    ... ... @@ -70,52 +70,52 @@ tidyBind env (Rec prs)
    70 70
         (env', Rec (zip bndrs' rhss'))
    
    71 71
     
    
    72 72
     
    
    73
    --- Note [Attaching CBV Marks to ids]
    
    74
    --- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    
    75
    --- See Note [CBV Function Ids] for the *why*.
    
    76
    --- Before tidy, we turn all worker functions into worker like ids.
    
    77
    --- This way we can later tell if we can assume the existence of a wrapper. This also applies to
    
    78
    --- specialized versions of functions generated by SpecConstr for which we, in a sense,
    
    79
    --- consider the unspecialized version to be the wrapper.
    
    80
    --- During tidy we take the demands on the arguments for these ids and compute
    
    81
    --- CBV (call-by-value) semantics for each individual argument.
    
    82
    --- The marks themselves then are put onto the function id itself.
    
    83
    --- This means the code generator can get the full calling convention by only looking at the function
    
    84
    --- itself without having to inspect the RHS.
    
    85
    ---
    
    86
    --- The actual logic is in computeCbvInfo and takes:
    
    87
    --- * The function id
    
    88
    --- * The functions rhs
    
    89
    --- And gives us back the function annotated with the marks.
    
    90
    --- We call it in:
    
    91
    --- * tidyTopPair for top level bindings
    
    92
    --- * tidyBind for local bindings.
    
    93
    ---
    
    94
    --- Not that we *have* to look at the untidied rhs.
    
    95
    --- During tidying some knot-tying occurs which can blow up
    
    96
    --- if we look at the post-tidy types of the arguments here.
    
    97
    --- However we only care if the types are unlifted and that doesn't change during tidy.
    
    98
    --- so we can just look at the untidied types.
    
    99
    ---
    
    100
    --- If the id is boot-exported we don't use a cbv calling convention via marks,
    
    101
    --- as the boot file won't contain them. Which means code calling boot-exported
    
    102
    --- ids might expect these ids to have a vanilla calling convention even if we
    
    103
    --- determine a different one here.
    
    104
    --- To be able to avoid this we pass a set of boot exported ids for this module around.
    
    105
    --- For non top level ids we can skip this. Local ids are never boot-exported
    
    106
    --- as boot files don't have unfoldings. So there this isn't a concern.
    
    107
    --- See also Note [CBV Function Ids]
    
    108
    -
    
    109
    -
    
    110
    --- See Note [CBV Function Ids]
    
    73
    +{- Note [Attaching CBV Marks to ids]
    
    74
    +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    
    75
    +See Note [CBV Function Ids: overview] for what is happening here.
    
    76
    +
    
    77
    +During tidy we take the demands on the arguments for any CBV-candidates and
    
    78
    +compute CBV (call-by-value) semantics for each individual argument.  The marks
    
    79
    +themselves then are put onto the function id itself.  This means the code
    
    80
    +generator can get the full calling convention by only looking at the function
    
    81
    +itself without having to inspect the RHS.
    
    82
    +
    
    83
    +The actual logic is in computeCbvInfo and takes:
    
    84
    +  * The function id
    
    85
    +  * The functions rhs
    
    86
    +And gives us back the function annotated with the marks. We call it in:
    
    87
    +  * tidyTopPair for top level bindings
    
    88
    +  * tidyBind for local bindings.
    
    89
    +
    
    90
    +Wrinkles
    
    91
    +
    
    92
    +(ACBV1) Note that we *have* to look at the untidied rhs.  During tidying some
    
    93
    +  knot-tying occurs which can blow up if we look at the post-tidy types of the
    
    94
    +  arguments here.  However we only care if the types are unlifted and that
    
    95
    +  doesn't change during tidy.  so we can just look at the untidied types.
    
    96
    +
    
    97
    +(ACBV2) If the id is boot-exported we don't use a cbv calling convention via
    
    98
    +  marks, as the boot file won't contain them. Which means code calling
    
    99
    +  boot-exported ids might expect these ids to have a vanilla calling convention
    
    100
    +  even if we determine a different one here.
    
    101
    +
    
    102
    +  To be able to avoid this we pass a set of boot exported ids for this module
    
    103
    +  around.  For non top level ids we can skip this. Local ids are never
    
    104
    +  boot-exported as boot files don't have unfoldings. So there this isn't a
    
    105
    +  concern.  See also Note [CBV Function Ids: overview]
    
    106
    +-}
    
    107
    +
    
    111 108
     tidyCbvInfoTop :: HasDebugCallStack => NameSet -> Id -> CoreExpr -> Id
    
    109
    +-- See Note [CBV Function Ids: overview]
    
    112 110
     tidyCbvInfoTop boot_exports id rhs
    
    113
    -  -- Can't change calling convention for boot exported things
    
    114
    -  | elemNameSet (idName id) boot_exports = id
    
    115
    -  | otherwise = computeCbvInfo id rhs
    
    111
    +  | elemNameSet (idName id) boot_exports
    
    112
    +  = id  -- Can't change calling convention for boot exported things
    
    113
    +        -- See (ACBV2) in Note [Attaching CBV Marks to ids]
    
    114
    +  | otherwise
    
    115
    +  = computeCbvInfo id rhs
    
    116 116
     
    
    117
    --- See Note [CBV Function Ids]
    
    118 117
     tidyCbvInfoLocal :: HasDebugCallStack => Id -> CoreExpr -> Id
    
    118
    +-- See Note [CBV Function Ids: overview]
    
    119 119
     tidyCbvInfoLocal id rhs = computeCbvInfo id rhs
    
    120 120
     
    
    121 121
     -- | For a binding we:
    
    ... ... @@ -124,9 +124,8 @@ tidyCbvInfoLocal id rhs = computeCbvInfo id rhs
    124 124
     --   - It's argument to a worker and demanded strictly
    
    125 125
     --   - Unless it's an unlifted type already
    
    126 126
     -- * Update the id
    
    127
    --- See Note [CBV Function Ids]
    
    127
    +-- See Note [CBV Function Ids: overview]
    
    128 128
     -- See Note [Attaching CBV Marks to ids]
    
    129
    -
    
    130 129
     computeCbvInfo :: HasCallStack
    
    131 130
                    => Id            -- The function
    
    132 131
                    -> CoreExpr      -- It's RHS
    
    ... ... @@ -172,7 +171,7 @@ computeCbvInfo fun_id rhs
    172 171
                  | otherwise
    
    173 172
                  = -- pprTraceDebug "computeCbvInfo: Worker seems to take unboxed tuple/sum types!"
    
    174 173
                    --    (ppr fun_id <+> ppr rhs)
    
    175
    -               asNonWorkerLikeId fun_id
    
    174
    +               removeCbvCandidate fun_id
    
    176 175
     
    
    177 176
         -- We don't set CBV marks on functions which take unboxed tuples or sums as
    
    178 177
         -- arguments.  Doing so would require us to compute the result of unarise
    
    ... ... @@ -197,7 +196,7 @@ computeCbvInfo fun_id rhs
    197 196
             isSimplePrimRep _   = False
    
    198 197
     
    
    199 198
         mkMark arg
    
    200
    -      | not $ shouldUseCbvForId arg = NotMarkedCbv
    
    199
    +      | not $ wantCbvForId arg = NotMarkedCbv
    
    201 200
           -- We can only safely use cbv for strict arguments
    
    202 201
           | (isStrUsedDmd (idDemandInfo arg))
    
    203 202
           , not (isDeadEndId fun_id) = MarkedCbv
    

  • compiler/GHC/Core/Unfold.hs
    ... ... @@ -152,10 +152,12 @@ updateCaseScaling n opts = opts { unfoldingCaseScaling = n }
    152 152
     updateReportPrefix :: Maybe String -> UnfoldingOpts -> UnfoldingOpts
    
    153 153
     updateReportPrefix n opts = opts { unfoldingReportPrefix = n }
    
    154 154
     
    
    155
    -data ArgSummary = TrivArg       -- Nothing interesting
    
    156
    -                | NonTrivArg    -- Arg has structure
    
    157
    -                | ValueArg      -- Arg is a con-app or PAP
    
    158
    -                                -- ..or con-like. Note [Conlike is interesting]
    
    155
    +data ArgSummary
    
    156
    +  = TrivArg       -- Nothing interesting
    
    157
    +  | NonTrivArg    -- Arg has structure
    
    158
    +  | ValueArg      -- Arg is a con-app or PAP
    
    159
    +                  -- ..or con-like. See (IA1) in Note [Interesting arguments]
    
    160
    +                  --                          in GHC.Core.Opt.Simplify.Utils
    
    159 161
     
    
    160 162
     instance Outputable ArgSummary where
    
    161 163
       ppr TrivArg    = text "TrivArg"
    
    ... ... @@ -776,7 +778,7 @@ litSize _other = 0 -- Must match size of nullary constructors
    776 778
                           --            (eg via case binding)
    
    777 779
     
    
    778 780
     classOpSize :: UnfoldingOpts -> Class -> [Id] -> [CoreExpr] -> ExprSize
    
    779
    --- See Note [Conlike is interesting]
    
    781
    +-- See (IA1) in Note [Interesting arguments] in GHC.Core.Opt.Simplify.Utils
    
    780 782
     classOpSize opts cls top_args args
    
    781 783
       | isUnaryClass cls
    
    782 784
       = sizeZero   -- See (UCM4) in Note [Unary class magic] in GHC.Core.TyCon
    

  • compiler/GHC/Core/Utils.hs
    ... ... @@ -59,7 +59,7 @@ module GHC.Core.Utils (
    59 59
             isJoinBind,
    
    60 60
     
    
    61 61
             -- * Tag inference
    
    62
    -        mkStrictFieldSeqs, shouldStrictifyIdForCbv, shouldUseCbvForId,
    
    62
    +        mkStrictFieldSeqs, wantCbvForId,
    
    63 63
     
    
    64 64
             -- * unsafeEqualityProof
    
    65 65
             isUnsafeEqualityCase,
    
    ... ... @@ -1509,6 +1509,23 @@ going to put up with this, because the previous more aggressive inlining
    1509 1509
     (which treated 'noFactor' as work-free) was duplicating primops, which
    
    1510 1510
     in turn was making inner loops of array calculations runs slow (#5623)
    
    1511 1511
     
    
    1512
    +Wrinkles
    
    1513
    +
    
    1514
    +(WF1) Strict constructor fields.  We regard (K x) as work-free even if
    
    1515
    +  K is a strict data constructor (see Note [Strict fields in Core])
    
    1516
    +      data T a = K !a
    
    1517
    +  If we have
    
    1518
    +    let t = K x in ...(case t of K y -> blah)...
    
    1519
    +  we want to treat t's binding as expandable so that `exprIsConApp_maybe`
    
    1520
    +  will look through its unfolding. (NB: exprIsWorkFree implies
    
    1521
    +  exprIsExpandable.)
    
    1522
    +
    
    1523
    +  Note, however, that because K is strict, after inlining we'll get a leftover
    
    1524
    +  eval on x, which may or may not disappear
    
    1525
    +    let t = K x in ...(case x of y -> blah)...
    
    1526
    +  We put up with this extra eval: in effect we count duplicating the eval as
    
    1527
    +  work-free.
    
    1528
    +
    
    1512 1529
     Note [Case expressions are work-free]
    
    1513 1530
     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    
    1514 1531
     Are case-expressions work-free?  Consider
    
    ... ... @@ -1650,7 +1667,8 @@ isWorkFreeApp fn n_val_args
    1650 1667
       = True
    
    1651 1668
       | otherwise
    
    1652 1669
       = case idDetails fn of
    
    1653
    -      DataConWorkId {} -> True
    
    1670
    +      DataConWorkId {} -> True  -- Even if the data constructor is strict
    
    1671
    +                                -- See (WF1) in Note [exprIsWorkFree]
    
    1654 1672
           PrimOpId op _    -> primOpIsWorkFree op
    
    1655 1673
           _                -> False
    
    1656 1674
     
    
    ... ... @@ -1751,6 +1769,8 @@ expansion. Specifically:
    1751 1769
       duplicate the (a +# b) primop, which we should not do lightly.
    
    1752 1770
       (It's quite hard to trigger this bug, but T13155 does so for GHC 8.0.)
    
    1753 1771
     
    
    1772
    +NB: exprIsWorkFree implies exprIsExpandable.
    
    1773
    +
    
    1754 1774
     Note [isExpandableApp: bottoming functions]
    
    1755 1775
     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    
    1756 1776
     It's important that isExpandableApp does not respond True to bottoming
    
    ... ... @@ -2901,29 +2921,38 @@ dumpIdInfoOfProgram dump_locals ppr_id_info binds = vcat (map printId ids)
    2901 2921
     
    
    2902 2922
     {- Note [Call-by-value for worker args]
    
    2903 2923
     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    
    2904
    -If we unbox a constructor with strict fields we want to
    
    2905
    -preserve the information that some of the arguments came
    
    2906
    -out of strict fields and therefore should be already properly
    
    2907
    -tagged, however we can't express this directly in core.
    
    2908
    -
    
    2909
    -Instead what we do is generate a worker like this:
    
    2924
    +If we unbox a constructor with strict fields we want to preserve the information
    
    2925
    +that some of the arguments came out of strict fields and therefore should be
    
    2926
    +already evaluated and properly tagged (EPT) throughout the body of the
    
    2927
    +function.  We express this fact in Core like this:
    
    2910 2928
     
    
    2911 2929
       data T = MkT A !B
    
    2912 2930
     
    
    2913 2931
       foo = case T of MkT a b -> $wfoo a b
    
    2914 2932
     
    
    2915 2933
       $wfoo a b = case b of b' -> rhs[b/b']
    
    2934
    +              ^^^^ The "extra eval"
    
    2935
    +
    
    2936
    +Now
    
    2937
    + * Throughout `rhs` the Simplifier can see that `b` is EPT, and can (say)
    
    2938
    +   drop evals on `b`.
    
    2916 2939
     
    
    2917
    -This makes the worker strict in b causing us to use a more efficient
    
    2918
    -calling convention for `b` where the caller needs to ensure `b` is
    
    2919
    -properly tagged and evaluated before it's passed to $wfoo. See Note [CBV Function Ids].
    
    2940
    + * The EPT enforcement pass will make $wfoo into a CBV function, where
    
    2941
    +   the caller guarantees to pass an EPT argument (see Note [EPT enforcement] in
    
    2942
    +   GHC.Core.Stg.EnforceEpt)
    
    2920 2943
     
    
    2921
    -Usually the argument will be known to be properly tagged at the call site so there is
    
    2944
    + * The code generator will discard that "extra eval" case, because $wfoo is
    
    2945
    +   CBV.
    
    2946
    +
    
    2947
    +See also Note [CBV Function Ids: overview].
    
    2948
    +
    
    2949
    +In tihs case the argument is known to be properly tagged at the call site so there is
    
    2922 2950
     no additional work for the caller and the worker can be more efficient since it can
    
    2923 2951
     assume the presence of a tag.
    
    2924 2952
     
    
    2925 2953
     This is especially true for recursive functions like this:
    
    2926 2954
         -- myPred expect it's argument properly tagged
    
    2955
    +    -- The EnforceEPT pass has made it a CBV function
    
    2927 2956
         myPred !x = ...
    
    2928 2957
     
    
    2929 2958
         loop :: MyPair -> Int
    
    ... ... @@ -2933,11 +2962,10 @@ This is especially true for recursive functions like this:
    2933 2962
                 B -> 2
    
    2934 2963
                 _ -> loop (MyPair (myPred x) (myPred y))
    
    2935 2964
     
    
    2936
    -Here we would ordinarily not be strict in y after unboxing.
    
    2937
    -However if we pass it as a regular argument then this means on
    
    2938
    -every iteration of loop we will incur an extra seq on y before
    
    2939
    -we can pass it to `myPred` which isn't great! That is in STG after
    
    2940
    -tag inference we get:
    
    2965
    +Here we would ordinarily not be strict in y after unboxing.  However if we pass
    
    2966
    +it as a regular argument then this means on every iteration of loop we will
    
    2967
    +incur an extra seq on y before we can pass it to `myPred` which isn't great!
    
    2968
    +That is in STG after tag inference we get:
    
    2941 2969
     
    
    2942 2970
         Rec {
    
    2943 2971
         Find.$wloop [InlPrag=[2], Occ=LoopBreaker]
    
    ... ... @@ -2962,7 +2990,7 @@ tag inference we get:
    2962 2990
                 };
    
    2963 2991
         end Rec }
    
    2964 2992
     
    
    2965
    -Here comes the tricky part: If we make $wloop strict in both x/y and we get:
    
    2993
    +But if we add an extra eval on `y` during worker/wrapper we this this:
    
    2966 2994
     
    
    2967 2995
         Rec {
    
    2968 2996
         Find.$wloop [InlPrag=[2], Occ=LoopBreaker]
    
    ... ... @@ -2986,18 +3014,24 @@ Here comes the tricky part: If we make $wloop strict in both x/y and we get:
    2986 3014
                 };
    
    2987 3015
         end Rec }
    
    2988 3016
     
    
    2989
    -Here both x and y are known to be tagged in the function body since we pass strict worker args using unlifted cbv.
    
    2990
    -This means the seqs on x and y both become no-ops and compared to the first version the seq on `y` disappears at runtime.
    
    2991
    -
    
    2992
    -The downside is that the caller of $wfoo potentially has to evaluate `y` once if we can't prove it isn't already evaluated.
    
    2993
    -But y coming out of a strict field is in WHNF so safe to evaluated. And most of the time it will be properly tagged+evaluated
    
    2994
    -already at the call site because of the EPT Invariant! See Note [EPT enforcement] for more in this.
    
    2995
    -This makes GHC itself around 1% faster despite doing slightly more work! So this is generally quite good.
    
    2996
    -
    
    2997
    -We only apply this when we think there is a benefit in doing so however. There are a number of cases in which
    
    2998
    -it would be useless to insert an extra seq. ShouldStrictifyIdForCbv tries to identify these to avoid churn in the
    
    3017
    +Here both x and y are known to be tagged in the function body since we pass
    
    3018
    +strict worker args using unlifted cbv.  This means the seqs on x and y both
    
    3019
    +become no-ops (via (EPT-codegen) in Not [EPT enforcement]) and, compared to the
    
    3020
    +first version, the seq on `y` disappears at runtime.
    
    3021
    +
    
    3022
    +The downside is that the caller of $wfoo potentially has to evaluate `y` once if
    
    3023
    +we can't prove it isn't already evaluated.  The wrapper, which calls `$wfoo` has
    
    3024
    +just pulled `y` out of a strict field of a data constructor, so it will always
    
    3025
    +be EPT.  See Note [EPT enforcement] for more in this.  This makes GHC itself
    
    3026
    +around 1% faster despite doing slightly more work! So this is generally quite
    
    3027
    +good.
    
    3028
    +
    
    3029
    +We only apply this when we think there is a benefit in doing so however. There
    
    3030
    +are a number of cases in which it would be useless to insert an extra
    
    3031
    +seq. `wantCbvForId` tries to identify these to avoid churn in the
    
    2999 3032
     simplifier. See Note [Which Ids should be strictified] for details on this.
    
    3000 3033
     -}
    
    3034
    +
    
    3001 3035
     mkStrictFieldSeqs :: [(Id,StrictnessMark)] -> CoreExpr -> (CoreExpr)
    
    3002 3036
     mkStrictFieldSeqs args rhs =
    
    3003 3037
       foldr addEval rhs args
    
    ... ... @@ -3007,7 +3041,7 @@ mkStrictFieldSeqs args rhs =
    3007 3041
           addEval (arg_id,arg_cbv) (rhs)
    
    3008 3042
             -- Argument representing strict field.
    
    3009 3043
             | isMarkedStrict arg_cbv
    
    3010
    -        , shouldStrictifyIdForCbv arg_id
    
    3044
    +        , wantCbvForId arg_id
    
    3011 3045
             -- Make sure to remove unfoldings here to avoid the simplifier dropping those for OtherCon[] unfoldings.
    
    3012 3046
             = Case (Var $! zapIdUnfolding arg_id) arg_id case_ty ([Alt DEFAULT [] rhs])
    
    3013 3047
             -- Normal argument
    
    ... ... @@ -3027,87 +3061,99 @@ There are multiple reasons why we might not want to insert a seq in the rhs to
    3027 3061
     strictify a functions argument:
    
    3028 3062
     
    
    3029 3063
     1) The argument doesn't exist at runtime.
    
    3030
    -
    
    3031
    -For zero width types (like Types) there is no benefit as we don't operate on them
    
    3032
    -at runtime at all. This includes things like void#, coercions and state tokens.
    
    3064
    +   For zero width types (like Types) there is no benefit as we don't operate on them
    
    3065
    +   at runtime at all. This includes things like void#, coercions and state tokens.
    
    3033 3066
     
    
    3034 3067
     2) The argument is a unlifted type.
    
    3035
    -
    
    3036
    -If the argument is a unlifted type the calling convention already is explicitly
    
    3037
    -cbv. This means inserting a seq on this argument wouldn't do anything as the seq
    
    3038
    -would be a no-op *and* it wouldn't affect the calling convention.
    
    3068
    +   If the argument is a unlifted type the calling convention already is explicitly
    
    3069
    +   cbv. This means inserting a seq on this argument wouldn't do anything as the seq
    
    3070
    +   would be a no-op *and* it wouldn't affect the calling convention.
    
    3039 3071
     
    
    3040 3072
     3) The argument is absent.
    
    3073
    +   If the argument is absent in the body there is no advantage to it being passed as
    
    3074
    +   cbv to the function. The function won't ever look at it so we don't save any work.
    
    3041 3075
     
    
    3042
    -If the argument is absent in the body there is no advantage to it being passed as
    
    3043
    -cbv to the function. The function won't ever look at it so we don't safe any work.
    
    3044
    -
    
    3045
    -This mostly happens for join point. For example we might have:
    
    3046
    -
    
    3047
    -    data T = MkT ![Int] [Char]
    
    3048
    -    f t = case t of MkT xs{strict} ys-> snd (xs,ys)
    
    3049
    -
    
    3050
    -and abstract the case alternative to:
    
    3051
    -
    
    3052
    -    f t = join j1 = \xs ys -> snd (xs,ys)
    
    3053
    -          in case t of MkT xs{strict} ys-> j1 xs xy
    
    3054
    -
    
    3055
    -While we "use" xs inside `j1` it's not used inside the function `snd` we pass it to.
    
    3056
    -In short a absent demand means neither our RHS, nor any function we pass the argument
    
    3057
    -to will inspect it. So there is no work to be saved by forcing `xs` early.
    
    3076
    +   This mostly happens for join points. For example we might have:
    
    3058 3077
     
    
    3059
    -NB: There is an edge case where if we rebox we *can* end up seqing an absent value.
    
    3060
    -Note [Absent fillers] has an example of this. However this is so rare it's not worth
    
    3061
    -caring about here.
    
    3078
    +       data T = MkT ![Int] [Char]
    
    3079
    +       f t = case t of MkT xs{strict} ys-> snd (xs,ys)
    
    3062 3080
     
    
    3063
    -4) The argument is already strict.
    
    3081
    +   and abstract the case alternative to:
    
    3064 3082
     
    
    3065
    -Consider this code:
    
    3083
    +       f t = join j1 = \xs ys -> snd (xs,ys)
    
    3084
    +             in case t of MkT xs{strict} ys-> j1 xs xy
    
    3066 3085
     
    
    3067
    -    data T = MkT ![Int]
    
    3068
    -    f t = case t of MkT xs{strict} -> reverse xs
    
    3086
    +   While we "use" xs inside `j1` it's not used inside the function `snd` we pass it to.
    
    3087
    +   In short a absent demand means neither our RHS, nor any function we pass the argument
    
    3088
    +   to will inspect it. So there is no work to be saved by forcing `xs` early.
    
    3069 3089
     
    
    3070
    -The `xs{strict}` indicates that `xs` is used strictly by the `reverse xs`.
    
    3071
    -If we do a w/w split, and add the extra eval on `xs`, we'll get
    
    3072
    -
    
    3073
    -    $wf xs =
    
    3074
    -        case xs of xs1 ->
    
    3075
    -            let t = MkT xs1 in
    
    3076
    -            case t of MkT xs2 -> reverse xs2
    
    3077
    -
    
    3078
    -That's not wrong; but the w/w body will simplify to
    
    3079
    -
    
    3080
    -    $wf xs = case xs of xs1 -> reverse xs1
    
    3081
    -
    
    3082
    -and now we'll drop the `case xs` because `xs1` is used strictly in its scope.
    
    3083
    -Adding that eval was a waste of time.  So don't add it for strictly-demanded Ids.
    
    3090
    +   NB: There is an edge case where if we rebox we *can* end up seqing an absent value.
    
    3091
    +   Note [Absent fillers] has an example of this. However this is so rare it's not worth
    
    3092
    +   caring about here.
    
    3084 3093
     
    
    3085 3094
     5) Functions
    
    3086
    -
    
    3087
    -Functions are tricky (see Note [TagInfo of functions] in EnforceEpt).
    
    3088
    -But the gist of it even if we make a higher order function argument strict
    
    3089
    -we can't avoid the tag check when it's used later in the body.
    
    3090
    -So there is no benefit.
    
    3095
    +   Functions are tricky (see Note [TagInfo of functions] in EnforceEpt).
    
    3096
    +   But the gist of it even if we make a higher order function argument strict
    
    3097
    +   we can't avoid the tag check when it's used later in the body.
    
    3098
    +   So there is no benefit.
    
    3099
    +
    
    3100
    +Wrinkles:
    
    3101
    +
    
    3102
    +(WIS1) You might have thought that we can omit the eval if the argument is used
    
    3103
    +   strictly demanded in the body.  But you'd be wrong. Consider this code:
    
    3104
    +          data T = MkT ![Int]
    
    3105
    +          f t = case t of MkT xs{Dmd=STR} -> reverse xs
    
    3106
    +
    
    3107
    +   The `xs{Dmd=STR}` indicates that `xs` is used strictly by the `reverse xs`.
    
    3108
    +   If we do a w/w split, and add the extra eval on `xs`, we'll get
    
    3109
    +       $wf xs = case xs of xs1 ->
    
    3110
    +                  let t = MkT xs1 in
    
    3111
    +                  case t of MkT xs2 -> reverse xs2
    
    3112
    +
    
    3113
    +   That's not wrong; but you might wonder if the eval on `xs` is needed
    
    3114
    +   when it is certainly evaluated by the `reverse`.  But yes, it is (#26722):
    
    3115
    +       g s True  t = f s t t
    
    3116
    +       g s False t = g s True t
    
    3117
    +
    
    3118
    +       f True  (MkT xs) t = f False (MkT xs) t
    
    3119
    +       f False (MkT xs) _ = xs
    
    3120
    +
    
    3121
    +   After worker/wrapper we get:
    
    3122
    +       g s b t = case t of MkT ww -> $wg s b ww
    
    3123
    +       $wg s ds ww = case ds of {
    
    3124
    +                      False -> case ww of wg { __DEFAULT -> Bar.$wg s True wg }
    
    3125
    +                      True  -> let { t1 = MkT ww } in f s t1 t1 }
    
    3126
    +
    
    3127
    +   We must make `f` inline inside `$wg`, because `f` too is ww'd, and we
    
    3128
    +   don't want to rebox `t1` before passing it to `f`.  BUT while `t1`
    
    3129
    +   looks like a HNF, `exprIsHNF` will say False because `MkT` is strict
    
    3130
    +   and `ww` isn't evaluated.  So `f` doesn't inline and we get lots of
    
    3131
    +   reboxing.
    
    3132
    +
    
    3133
    +   The Right Thing to to is to add the eval for the data con argument:
    
    3134
    +       $wg s ds ww = case ww of ww' { DEFAULT ->
    
    3135
    +                     case ds of {
    
    3136
    +                      False -> case ww of wg { __DEFAULT -> Bar.$wg s True wg }
    
    3137
    +                      True  -> let { t1 = MkT ww' } in f s t1 t1 } }
    
    3138
    +
    
    3139
    +   Now `t1` will be a HNF, and `f` will inline, and we get
    
    3140
    +       $wg s ds ww = case ww of ww' { DEFAULT ->
    
    3141
    +                     case ds of {
    
    3142
    +                      False -> Bar.$wg s True ww'
    
    3143
    +                      True  -> $wf s ww'
    
    3144
    +
    
    3145
    +  (Ultimately `$wg` will be a CBV function, so that `case ww` will be a
    
    3146
    +  no-op: see (EPT-codegen) in Note [EPT enforcement] in GHC.Stg.EnforceEpt.)
    
    3091 3147
     
    
    3092 3148
     -}
    
    3093
    --- | Do we expect there to be any benefit if we make this var strict
    
    3094
    --- in order for it to get treated as as cbv argument?
    
    3095
    --- See Note [Which Ids should be strictified]
    
    3096
    --- See Note [CBV Function Ids] for more background.
    
    3097
    -shouldStrictifyIdForCbv :: Var -> Bool
    
    3098
    -shouldStrictifyIdForCbv = wantCbvForId False
    
    3099
    -
    
    3100
    --- Like shouldStrictifyIdForCbv but also wants to use cbv for strict args.
    
    3101
    -shouldUseCbvForId :: Var -> Bool
    
    3102
    -shouldUseCbvForId = wantCbvForId True
    
    3103 3149
     
    
    3104 3150
     -- When we strictify we want to skip strict args otherwise the logic is the same
    
    3151
    +-- as for wantCbvForId so we common up the logic here.
    
    3105 3152
     -- Basically returns true if it would be beneficial for runtime to pass this argument
    
    3106 3153
     -- as CBV independent of weither or not it's correct. E.g. it might return true for lazy args
    
    3107 3154
     -- we are not allowed to force.
    
    3108
    --- we are not allowed to force.
    
    3109
    -wantCbvForId :: Bool -> Var -> Bool
    
    3155
    +wantCbvForId :: Var -> Bool
    
    3156
    +wantCbvForId v
    
    3110 3157
       -- Must be a runtime var.
    
    3111 3158
       -- See Note [Which Ids should be strictified] point 1)
    
    3112 3159
       | isId v
    
    ... ... @@ -3121,9 +3167,6 @@ wantCbvForId cbv_for_strict v
    3121 3167
       , not $ isFunTy ty
    
    3122 3168
       -- If the var is strict already a seq is redundant.
    
    3123 3169
       -- See Note [Which Ids should be strictified] point 4)
    
    3124
    -  , not (isStrictDmd dmd) || cbv_for_strict
    
    3125
    -  -- If the var is absent a seq is almost always useless.
    
    3126
    -  -- See Note [Which Ids should be strictified] point 3)
    
    3127 3170
       , not (isAbsDmd dmd)
    
    3128 3171
       = True
    
    3129 3172
       | otherwise
    

  • compiler/GHC/CoreToStg/Prep.hs
    ... ... @@ -1582,7 +1582,8 @@ maybeSaturate fn expr n_args unsat_ticks
    1582 1582
         -- See Note [Eta expansion of hasNoBinding things in CorePrep]
    
    1583 1583
       = return $ wrapLamBody (\body -> foldr mkTick body unsat_ticks) sat_expr
    
    1584 1584
     
    
    1585
    -  | mark_arity > 0 -- A call-by-value function. See Note [CBV Function Ids]
    
    1585
    +  | mark_arity > 0 -- A call-by-value function.
    
    1586
    +                   -- See Note [CBV Function Ids: overview]
    
    1586 1587
       , not applied_marks
    
    1587 1588
       = assertPpr
    
    1588 1589
           ( not (isJoinId fn)) -- See Note [Do not eta-expand join points]
    

  • compiler/GHC/Stg/EnforceEpt.hs
    ... ... @@ -66,13 +66,13 @@ SG thinks it would be good to fix this; see #21792.
    66 66
     Note [EPT enforcement]
    
    67 67
     ~~~~~~~~~~~~~~~~~~~~~~
    
    68 68
     The goal of EnforceEPT pass is to mark as many binders as possible as EPT
    
    69
    -(see Note [Evaluated and Properly Tagged]).
    
    70
    -To find more EPT binders, it establishes the following
    
    69
    +(see Note [Evaluated and Properly Tagged]). It establishes the following
    
    70
    +invariant:
    
    71 71
     
    
    72 72
     EPT INVARIANT:
    
    73 73
     > Any binder of
    
    74 74
     >   * a strict field (see Note [Strict fields in Core]), or
    
    75
    ->   * a CBV argument (see Note [CBV Function Ids])
    
    75
    +>   * a CBV argument (see Note [CBV Function Ids: overview])
    
    76 76
     > is EPT.
    
    77 77
     
    
    78 78
     (Note that prior to EPT enforcement, this invariant may *not* always be upheld.
    
    ... ... @@ -105,7 +105,7 @@ however, we presently only promote worker functions such as $wf to CBV because
    105 105
     we see all its call sites and can use the proper by-value calling convention.
    
    106 106
     More precisely, with -O0, we guarantee that no CBV functions are visible in
    
    107 107
     the interface file, so that naïve clients do not need to know how to call CBV
    
    108
    -functions. See Note [CBV Function Ids] for more details.
    
    108
    +functions. See Note [CBV Function Ids: overview] for more details.
    
    109 109
     
    
    110 110
     Specification
    
    111 111
     -------------
    
    ... ... @@ -140,9 +140,11 @@ Afterwards, the *EPT rewriter* inserts the actual evals realising Upcasts.
    140 140
     Implementation
    
    141 141
     --------------
    
    142 142
     
    
    143
    -* EPT analysis is implemented in GHC.Stg.EnforceEpt.inferTags.
    
    143
    +(EPT-anal) EPT analysis is implemented in `GHC.Stg.EnforceEpt.inferTags.`
    
    144 144
       It attaches its result to /binders/, not occurrence sites.
    
    145
    -* The EPT rewriter establishes the EPT invariant by inserting evals. That is, if
    
    145
    +
    
    146
    +(EPT-rewrite) The EPT rewriter, `GHC.Stg.EnforceEpt.Rewrite.rewriteTopBinds`,
    
    147
    +   establishes the EPT invariant by inserting evals. That is, if
    
    146 148
         (a) a binder x is used to
    
    147 149
               * construct a strict field (`SP x y`), or
    
    148 150
               * passed as a CBV argument (`$wf x`),
    
    ... ... @@ -152,17 +154,27 @@ Implementation
    152 154
         case x of x' { __ DEFAULT -> SP x' y }.
    
    153 155
         case x of x' { __ DEFAULT -> $wf x' }.
    
    154 156
       (Recall that the case binder x' is always EPT.)
    
    155
    -  This is implemented in GHC.Stg.EnforceEpt.Rewrite.rewriteTopBinds.
    
    157
    +
    
    156 158
       This pass also propagates the EPTness from binders to occurrences.
    
    159
    +
    
    157 160
       It is sound to insert evals on strict fields (Note [Strict fields in Core]),
    
    158
    -  and on CBV arguments as well (Note [CBV Function Ids]).
    
    159
    -* We also export the EPTness of top level bindings to allow this optimisation
    
    161
    +  and on CBV arguments as well (Note [CBV Function Ids: overview]).
    
    162
    +
    
    163
    +(EPT-codegen) Finally, code generation for (case x of alts) skips the thunk check
    
    164
    +  when `x` is EPT. This is done (a bit indirectly) thus:
    
    165
    +   * GHC.StgToCmm.Expr.cgCase: builds a `sequel`, and recurses into `cgExpr` on `x`.
    
    166
    +   * When `cgExpr` sees a `x` goes to `cgIdApp`, which uses `getCallMethod`.
    
    167
    +   * Then `getCallMethod` sees that `x` is EPT (via `idTagSigMaybe`), and
    
    168
    +     returns `InferredReturnIt`.
    
    169
    +   * Now `cgIdApp` can jump straight to the case-alternative switch in the `sequel`
    
    170
    +     constructed by `cgCase`.
    
    171
    +
    
    172
    +(EPT-export) We also export the EPTness of top level bindings to allow this optimisation
    
    160 173
       to work across module boundaries.
    
    174
    +
    
    161 175
       NB: The EPT Invariant *must* be upheld, regardless of the optimisation level;
    
    162 176
       hence EPTness is practically part of the internal ABI of a strict data
    
    163
    -  constructor or CBV function. Note [CBV Function Ids] contains the details.
    
    164
    -* Finally, code generation skips the thunk check when branching on binders that
    
    165
    -  are EPT. This is done by `cgExpr`/`cgCase` in the backend.
    
    177
    +  constructor or CBV function. Note [CBV Function Ids: overview] has the details.
    
    166 178
     
    
    167 179
     Evaluation
    
    168 180
     ----------
    

  • compiler/GHC/Stg/Lint.hs
    ... ... @@ -420,7 +420,7 @@ lintAppCbvMarks e@(StgApp fun args) = do
    420 420
       when (lf_unarised lf) $ do
    
    421 421
         -- A function which expects a unlifted argument as n'th argument
    
    422 422
         -- always needs to be applied to n arguments.
    
    423
    -    -- See Note [CBV Function Ids].
    
    423
    +    -- See Note [CBV Function Ids: overview].
    
    424 424
         let marks = fromMaybe [] $ idCbvMarks_maybe fun
    
    425 425
         when (length (dropWhileEndLE (not . isMarkedCbv) marks) > length args) $ do
    
    426 426
           addErrL $ hang (text "Undersatured cbv marked ID in App" <+> ppr e ) 2 $
    

  • compiler/GHC/StgToCmm/Closure.hs
    ... ... @@ -617,12 +617,15 @@ getCallMethod cfg name id (LFThunk _ _ updatable std_form_info is_fun)
    617 617
     getCallMethod cfg name id (LFUnknown might_be_a_function) n_args _cg_locs _self_loop_info
    
    618 618
       | n_args == 0
    
    619 619
       , Just sig <- idTagSig_maybe id
    
    620
    -  , isTaggedSig sig -- Infered to be already evaluated by EPT analysis
    
    621
    -  -- When profiling we must enter all potential functions to make sure we update the SCC
    
    622
    -  -- even if the function itself is already evaluated.
    
    620
    +  , isTaggedSig sig -- This `id` is evaluated and properly tagged; no need to enter it
    
    621
    +                    -- See (EPT-codegen) in Note [EPT enforcement] in GHC.Stg.EnforceEpt
    
    622
    +
    
    623
    +  -- When profiling we must enter all potential functions to make sure we update
    
    624
    +  -- the SCC even if the function itself is already evaluated.
    
    623 625
       -- See Note [Evaluating functions with profiling] in rts/Apply.cmm
    
    624 626
       , not (profileIsProfiling (stgToCmmProfile cfg) && might_be_a_function)
    
    625
    -  = InferedReturnIt -- See Note [EPT enforcement]
    
    627
    +
    
    628
    +  = InferedReturnIt -- See (EPT-codegen) in Note [EPT enforcement]
    
    626 629
     
    
    627 630
       | might_be_a_function = SlowCall
    
    628 631
     
    

  • compiler/GHC/StgToCmm/Expr.hs
    ... ... @@ -1053,6 +1053,7 @@ cgIdApp fun_id args = do
    1053 1053
               | otherwise                -> emitReturn [fun]
    
    1054 1054
     
    
    1055 1055
             -- A value infered to be in WHNF, so we can just return it.
    
    1056
    +        -- See (EPT-codegen) in Note [EPT enforcement] in GHC.Stg.EnforceEpt
    
    1056 1057
             InferedReturnIt
    
    1057 1058
               | isZeroBitTy (idType fun_id) -> trace >> emitReturn []
    
    1058 1059
               | otherwise                   -> trace >> assertTag >>
    

  • compiler/GHC/Types/Id.hs
    ... ... @@ -117,7 +117,7 @@ module GHC.Types.Id (
    117 117
             setIdCbvMarks,
    
    118 118
             idCbvMarks_maybe,
    
    119 119
             idCbvMarkArity,
    
    120
    -        asWorkerLikeId, asNonWorkerLikeId,
    
    120
    +        setCbvCandidate, removeCbvCandidate,
    
    121 121
     
    
    122 122
             idDemandInfo,
    
    123 123
             idDmdSig,
    
    ... ... @@ -563,7 +563,7 @@ isDataConId id = case Var.idDetails id of
    563 563
     
    
    564 564
     -- | An Id for which we might require all callers to pass strict arguments properly tagged + evaluated.
    
    565 565
     --
    
    566
    --- See Note [CBV Function Ids]
    
    566
    +-- See Note [CBV Function Ids: overview]
    
    567 567
     isWorkerLikeId :: Id -> Bool
    
    568 568
     isWorkerLikeId id = case Var.idDetails id of
    
    569 569
       WorkerLikeId _  -> True
    
    ... ... @@ -668,19 +668,20 @@ idJoinArity id = case idJoinPointHood id of
    668 668
                        NotJoinPoint -> pprPanic "idJoinArity" (ppr id)
    
    669 669
     
    
    670 670
     asJoinId :: Id -> JoinArity -> JoinId
    
    671
    -asJoinId id arity = warnPprTrace (not (isLocalId id))
    
    672
    -                         "global id being marked as join var"  (ppr id) $
    
    673
    -                    warnPprTrace (not (is_vanilla_or_join id))
    
    674
    -                         "asJoinId"
    
    675
    -                         (ppr id <+> pprIdDetails (idDetails id)) $
    
    676
    -                    id `setIdDetails` JoinId arity (idCbvMarks_maybe id)
    
    671
    +asJoinId id arity
    
    672
    +  = warnPprTrace (not (isLocalId id))
    
    673
    +      "global id being marked as join var"  (ppr id) $
    
    674
    +    id `setIdDetails` JoinId arity cbv_info
    
    677 675
       where
    
    678
    -    is_vanilla_or_join id = case Var.idDetails id of
    
    679
    -                              VanillaId -> True
    
    680
    -                              -- Can workers become join ids? Yes!
    
    681
    -                              WorkerLikeId {} -> pprTraceDebug "asJoinId (call by value function)" (ppr id) True
    
    682
    -                              JoinId {} -> True
    
    683
    -                              _         -> False
    
    676
    +   cbv_info = case Var.idDetails id of
    
    677
    +                 VanillaId          -> Nothing
    
    678
    +                 WorkerLikeId marks -> Just marks
    
    679
    +                 JoinId _ mb_marks  -> mb_marks
    
    680
    +                 _ -> pprTraceDebug "asJoinId"
    
    681
    +                         (ppr id <+> pprIdDetails (idDetails id)) $
    
    682
    +                      Nothing
    
    683
    +   -- Can workers become join ids? Yes!
    
    684
    +   -- See Note [CBV Function Ids: overview] in GHC.Types.Id.Info
    
    684 685
     
    
    685 686
     zapJoinId :: Id -> Id
    
    686 687
     -- May be a regular id already
    
    ... ... @@ -691,7 +692,7 @@ zapJoinId jid | isJoinId jid = zapIdTailCallInfo (newIdDetails `seq` jid `setIdD
    691 692
                   where
    
    692 693
                     newIdDetails = case idDetails jid of
    
    693 694
                       -- We treat join points as CBV functions. Even after they are floated out.
    
    694
    -                  -- See Note [Use CBV semantics only for join points and workers]
    
    695
    +                  -- See Note [Which Ids should be CBV candidates?]
    
    695 696
                       JoinId _ (Just marks) -> WorkerLikeId marks
    
    696 697
                       JoinId _ Nothing      -> WorkerLikeId []
    
    697 698
                       _                     -> panic "zapJoinId: newIdDetails can only be used if Id was a join Id."
    
    ... ... @@ -840,7 +841,7 @@ setIdCbvMarks id marks
    840 841
           -- Perhaps that's sensible but for now be conservative.
    
    841 842
           -- Similarly we don't need any lazy marks at the end of the list.
    
    842 843
           -- This way the length of the list is always exactly number of arguments
    
    843
    -      -- that must be visible to CodeGen. See See Note [CBV Function Ids]
    
    844
    +      -- that must be visible to CodeGen. See Note [CBV Function Ids: overview]
    
    844 845
           -- for more details.
    
    845 846
           trimmedMarks = dropWhileEndLE (not . isMarkedCbv) $ take (idArity id) marks
    
    846 847
     
    
    ... ... @@ -855,18 +856,10 @@ idCbvMarks_maybe id = case idDetails id of
    855 856
     idCbvMarkArity :: Id -> Arity
    
    856 857
     idCbvMarkArity fn = maybe 0 length (idCbvMarks_maybe fn)
    
    857 858
     
    
    858
    --- | Remove any cbv marks on arguments from a given Id.
    
    859
    -asNonWorkerLikeId :: Id -> Id
    
    860
    -asNonWorkerLikeId id =
    
    861
    -  let details = case idDetails id of
    
    862
    -        WorkerLikeId{}      -> Just $ VanillaId
    
    863
    -        JoinId arity Just{} -> Just $ JoinId arity Nothing
    
    864
    -        _                   -> Nothing
    
    865
    -  in maybeModifyIdDetails details id
    
    866
    -
    
    867
    --- | Turn this id into a WorkerLikeId if possible.
    
    868
    -asWorkerLikeId :: Id -> Id
    
    869
    -asWorkerLikeId id =
    
    859
    +-- | Make this Id into a candidate for CBV treatment, if possible.
    
    860
    +-- See Note [CBV Function Ids: overview] in GHC.Types.Id.Info
    
    861
    +setCbvCandidate :: Id -> Id
    
    862
    +setCbvCandidate id =
    
    870 863
       let details = case idDetails id of
    
    871 864
             WorkerLikeId{}        -> Nothing
    
    872 865
             JoinId _arity Just{}  -> Nothing
    
    ... ... @@ -875,6 +868,16 @@ asWorkerLikeId id =
    875 868
             _                     -> Nothing
    
    876 869
       in maybeModifyIdDetails details id
    
    877 870
     
    
    871
    +-- | Remove any CBV-candidate info from a given Id.
    
    872
    +-- See Note [CBV Function Ids: overview] in GHC.Types.Id.Info
    
    873
    +removeCbvCandidate :: Id -> Id
    
    874
    +removeCbvCandidate id =
    
    875
    +  let details = case idDetails id of
    
    876
    +        WorkerLikeId{}      -> Just $ VanillaId
    
    877
    +        JoinId arity Just{} -> Just $ JoinId arity Nothing
    
    878
    +        _                   -> Nothing
    
    879
    +  in maybeModifyIdDetails details id
    
    880
    +
    
    878 881
     setCaseBndrEvald :: StrictnessMark -> Id -> Id
    
    879 882
     -- Used for variables bound by a case expressions, both the case-binder
    
    880 883
     -- itself, and any pattern-bound variables that are argument of a
    

  • compiler/GHC/Types/Id/Info.hs
    ... ... @@ -208,14 +208,14 @@ data IdDetails
    208 208
             -- ^ An 'Id' for a join point taking n arguments
    
    209 209
             -- Note [Join points] in "GHC.Core"
    
    210 210
             -- Can also work as a WorkerLikeId if given `CbvMark`s.
    
    211
    -        -- See Note [CBV Function Ids]
    
    211
    +        -- See Note [CBV Function Ids: overview]
    
    212 212
             -- The [CbvMark] is always empty (and ignored) until after Tidy.
    
    213 213
       | WorkerLikeId [CbvMark]
    
    214 214
             -- ^ An 'Id' for a worker like function, which might expect some arguments to be
    
    215 215
             -- passed both evaluated and tagged.
    
    216 216
             -- Worker like functions are create by W/W and SpecConstr and we can expect that they
    
    217 217
             -- aren't used unapplied.
    
    218
    -        -- See Note [CBV Function Ids]
    
    218
    +        -- See Note [CBV Function Ids: overview]
    
    219 219
             -- See Note [EPT enforcement]
    
    220 220
             -- The [CbvMark] is always empty (and ignored) until after Tidy for ids from the current
    
    221 221
             -- module.
    
    ... ... @@ -244,85 +244,114 @@ conLikesRecSelInfo con_likes lbls
    244 244
         has_fld dc lbl = any (\ fl -> flLabel fl == lbl) (conLikeFieldLabels dc)
    
    245 245
     
    
    246 246
     
    
    247
    -{- Note [CBV Function Ids]
    
    248
    -~~~~~~~~~~~~~~~~~~~~~~~~~~
    
    249
    -A WorkerLikeId essentially allows us to constrain the calling convention
    
    250
    -for the given Id. Each such Id carries with it a list of CbvMarks
    
    251
    -with each element representing a value argument. Arguments who have
    
    252
    -a matching `MarkedCbv` entry in the list need to be passed evaluated+*properly tagged*.
    
    253
    -
    
    254
    -CallByValueFunIds give us additional expressiveness which we use to improve
    
    255
    -runtime. This is all part of the EPT enforcement work. See also Note [EPT enforcement].
    
    256
    -
    
    257
    -They allows us to express the fact that an argument is not only evaluated to WHNF once we
    
    258
    -entered it's RHS but also that an lifted argument is already *properly tagged* once we jump
    
    259
    -into the RHS.
    
    260
    -This means when e.g. branching on such an argument the RHS doesn't needed to perform
    
    261
    -an eval check to ensure the argument isn't an indirection. All seqs on such an argument in
    
    262
    -the functions body become no-ops as well.
    
    263
    -
    
    264
    -The invariants around the arguments of call by value function like Ids are then:
    
    265
    -
    
    266
    -* In any call `(f e1 .. en)`, if `f`'s i'th argument is marked `MarkedCbv`,
    
    267
    -  then the caller must ensure that the i'th argument
    
    268
    -  * points directly to the value (and hence is certainly evaluated before the call)
    
    269
    -  * is a properly tagged pointer to that value
    
    270
    -
    
    271
    -* The following functions (and only these functions) have `CbvMarks`:
    
    272
    -  * Any `WorkerLikeId`
    
    273
    -  * Some `JoinId` bindings.
    
    274
    -
    
    275
    -This works analogous to the EPT Invariant. See also Note [EPT enforcement].
    
    276
    -
    
    277
    -To make this work what we do is:
    
    278
    -* During W/W and SpecConstr any worker/specialized binding we introduce
    
    279
    -  is marked as a worker binding by `asWorkerLikeId`.
    
    280
    -* W/W and SpecConstr further set OtherCon[] unfoldings on arguments which
    
    281
    -  represent contents of a strict fields.
    
    282
    -* During Tidy we look at all bindings.
    
    283
    -  For any callByValueLike Id and join point we mark arguments as cbv if they
    
    284
    -  Are strict. We don't do so for regular bindings.
    
    285
    -  See Note [Use CBV semantics only for join points and workers] for why.
    
    286
    -  We might have made some ids rhs *more* strict in order to make their arguments
    
    287
    -  be passed CBV. See Note [Call-by-value for worker args] for why.
    
    288
    -* During CorePrep calls to CallByValueFunIds are eta expanded.
    
    247
    +{- Note [CBV Function Ids: overview]
    
    248
    +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    
    249
    +GHC can decide to use a call-by-value (CBV) calling convention for
    
    250
    +(some arguments of) a function, implying that:
    
    251
    +
    
    252
    +* The caller /must/ pass an argument that is evaluated and properly
    
    253
    +  tagged (EPT).  See Note [Evaluated and Properly Tagged] in GHC.Stg.EnforceEpt.
    
    254
    +
    
    255
    +* The function may /assume/ that the argument is EPT, and thereby omit
    
    256
    +  evals that would otherwise be necessary.
    
    257
    +
    
    258
    +CBV-ness is part of the calling convention; it is not optional. If a function
    
    259
    +is compiled with CBV arguments, callers /must/ respect it, else seg-fault
    
    260
    +beckon.
    
    261
    +
    
    262
    +Apart from the more efficent calling convention, a compelling reason for
    
    263
    +a CBV calling conventions is worker-functions for strict data types.
    
    264
    +Example:
    
    265
    +   data T a = MkT ![a]
    
    266
    +   f :: T Int -> blah
    
    267
    +   f (MkT y) = ...
    
    268
    +We get a w/w split
    
    269
    +   $wf y = let x = MkT y in ...
    
    270
    +   f x = case x of MkT y -> $wf y
    
    271
    +But in `$wf`, in general, we'd need to evaluate `y`, becuase `MkT` is strict.
    
    272
    +With a CBV calling convention we can drop that stupid extra eval.
    
    273
    +
    
    274
    +Here's how it all works:
    
    275
    +
    
    276
    +* We identify some function Ids as "CBV candidates";
    
    277
    +  see Note [Which Ids should be CBV candidates?]
    
    278
    +
    
    279
    +* During W/W and SpecConstr: any worker/specialized binding we introduce
    
    280
    +  is marked as a CBV-candidate by `asCbvCandidate`.  This simply marks
    
    281
    +  the binding as a candidate for CBV-ness, using IdDetails `WorkerLikeId []`.
    
    282
    +  See Note [Which Ids should be CBV candidates?].
    
    283
    +
    
    284
    +  See also Note [Call-by-value for worker args] for how we build the worker RHS.
    
    285
    +
    
    286
    +* A CBV candidate may become a join point; we are careful to retain
    
    287
    +  its CBV-candidature; see `GHC.Types.Id.asJoinId`.  (Actually that hardly
    
    288
    +  matters because all join points are CBV-candidates.)  A join point can also
    
    289
    +  become an ordinary Id, due to floating (see `zapJoinId`); again we are
    
    290
    +  careful to retain CBV-candidature.
    
    291
    +
    
    292
    +* During Tidy, for CBV-candidate Ids, including join points, we mark any
    
    293
    +  /strict/ arguments as CBV. This is the point at which the CbvMarks inside a
    
    294
    +  WorkerLikeId are set.  See `GHC.Core.Tidy.computeCbvInfo`, and
    
    295
    +
    
    296
    +  This step is informed by a late demand analysis, performed just before tidying
    
    297
    +  to identify strict arguments.  See Note [Call-by-value for worker args] for
    
    298
    +  how a worker guarantees to be strict in strict datacon fields.
    
    299
    +
    
    300
    +  TODO: We currently don't do this for arguments that are unboxed sums or tuples,
    
    301
    +  because then we'd have to predict the result of unarisation. But it would be nice to
    
    302
    +  do so. See `computeCbvInfo`.
    
    303
    +
    
    304
    +* During CorePrep calls to CBV Ids are eta expanded.
    
    305
    +  See `GHC.CoreToStg.Prep.maybeSaturate`.
    
    306
    +
    
    289 307
     * During Stg CodeGen:
    
    290 308
       * When we see a call to a callByValueLike Id:
    
    291 309
         * We check if all arguments marked to be passed unlifted are already tagged.
    
    292 310
         * If they aren't we will wrap the call in case expressions which will evaluate+tag
    
    293 311
           these arguments before jumping to the function.
    
    312
    +  See (EPT-rewrite) in Note [EPT enforcement] in GHC.Stg.EnforceEpt
    
    313
    +
    
    294 314
     * During Cmm codeGen:
    
    295 315
       * When generating code for the RHS of a StrictWorker binding
    
    296 316
         we omit tag checks when using arguments marked as tagged.
    
    317
    +  See (EPT-codegen) in Note [EPT enforcement] in GHC.Stg.EnforceEpt
    
    318
    +
    
    319
    +* Imported functions may be CBV, and then there is no point in eta-reducing
    
    320
    +  them; we'll just have to eta-expand later; see GHC.Core.Opt.Arity.cantEtaReduceFun.
    
    297 321
     
    
    322
    +*** SPJ really? Andreas? ****
    
    298 323
     We only use this for workers and specialized versions of SpecConstr
    
    299 324
     But we also check other functions during tidy and potentially turn some of them into
    
    300 325
     call by value functions and mark some of their arguments as call-by-value by looking at
    
    301 326
     argument unfoldings.
    
    302 327
     
    
    303
    -NB: I choose to put the information into a new Id constructor since these are loaded
    
    304
    -at all optimization levels. This makes it trivial to ensure the additional
    
    305
    -calling convention demands are available at all call sites. Putting it into
    
    306
    -IdInfo would require us at the very least to always decode the IdInfo
    
    328
    +
    
    329
    +NB: I choose to put the CBV information into the IdDetails since these are
    
    330
    +loaded at all optimization levels. This makes it trivial to ensure the
    
    331
    +additional calling convention demands are available at all call sites. Putting
    
    332
    +it into IdInfo would require us at the very least to always decode the IdInfo
    
    307 333
     just to decide if we need to throw it away or not after.
    
    308 334
     
    
    309
    -Note [Use CBV semantics only for join points and workers]
    
    310
    -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    
    311
    -A function with cbv-semantics requires arguments to be visible
    
    312
    -and if no arguments are visible requires us to eta-expand it's
    
    313
    -call site. That is for a binding with three cbv arguments like
    
    314
    -`w[WorkerLikeId[!,!,!]]` we would need to eta expand undersaturated
    
    315
    -occurrences like `map w xs` into `map (\x1 x2 x3 -> w x1 x2 x3) xs.
    
    316
    -
    
    317
    -In experiments it turned out that the code size increase of doing so
    
    318
    -can outweigh the performance benefits of doing so.
    
    319
    -So we only do this for join points, workers and
    
    320
    -specialized functions (from SpecConstr).
    
    321
    -Join points are naturally always called saturated so
    
    322
    -this problem can't occur for them.
    
    323
    -For workers and specialized functions there are also always at least
    
    324
    -some applied arguments as we won't inline the wrapper/apply their rule
    
    325
    -if there are unapplied occurrences like `map f xs`.
    
    335
    +Note [Which Ids should be CBV candidates?]
    
    336
    +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    
    337
    +In principle, we could use a CBV calling convention for /any/ strict function.
    
    338
    +But when we use CBV semantics the caller must obey the EPT calling convention,
    
    339
    +and that may mean eta-expansion. For example, for a binding with three CBV
    
    340
    +arguments like `foo[WorkerLikeId[!,!,!]]` we would need to eta expand undersaturated
    
    341
    +occurrences like `map foo xs` into `map (\x1 x2 x3 -> w x1 x2 x3) xs.
    
    342
    +
    
    343
    +In experiments it turned out that the code size increase of doing so can
    
    344
    +outweigh the performance benefits of doing so.
    
    345
    +
    
    346
    +So we treat only certain functions as candidates for CBV treatment:
    
    347
    +  * Workers created by worker/wrapper.
    
    348
    +  * Specialised functions from SpecConstr
    
    349
    +  * Join points
    
    350
    +
    
    351
    +Reason:
    
    352
    +  * All of these are always called saturated (at birth anyway)
    
    353
    +  * For workers in particular we want to use CBV for strict
    
    354
    +    fields of data constructors
    
    326 355
     -}
    
    327 356
     
    
    328 357
     -- | Parent of a record selector function.
    

  • compiler/GHC/Types/Id/Make.hs
    ... ... @@ -1038,7 +1038,7 @@ until the final simplifier phase; see Note [Activation for data
    1038 1038
     constructor wrappers].
    
    1039 1039
     
    
    1040 1040
     For further reading, see:
    
    1041
    -  * Note [Conlike is interesting] in GHC.Core.Op.Simplify.Utils
    
    1041
    +  * (IA1) in Note [Interesting arguments] in GHC.Core.Op.Simplify.Utils
    
    1042 1042
       * Note [Lone variables] in GHC.Core.Unfold
    
    1043 1043
       * Note [exprIsConApp_maybe on data constructors with wrappers]
    
    1044 1044
         in GHC.Core.SimpleOpt
    

  • testsuite/tests/simplCore/should_compile/T18013.stderr
    ... ... @@ -143,14 +143,14 @@ T18013.$wmapMaybeRule [InlPrag=NOINLINE]
    143 143
      Unf=OtherCon []]
    
    144 144
     T18013.$wmapMaybeRule
    
    145 145
       = \ (@a) (@b) (@s) (ww :: s) (ww1 :: s -> a -> IO (Result s b)) ->
    
    146
    +      case ww of ww2 { __DEFAULT ->
    
    146 147
           case ww1 of wild { __DEFAULT ->
    
    147
    -      case ww of wild1 { __DEFAULT ->
    
    148 148
           T18013a.Rule
    
    149 149
             @IO
    
    150 150
             @(Maybe a)
    
    151 151
             @(Maybe b)
    
    152 152
             @s
    
    153
    -        wild1
    
    153
    +        ww2
    
    154 154
             ((\ (s2 :: s)
    
    155 155
                 (a1 :: Maybe a)
    
    156 156
                 (s1 :: GHC.Internal.Prim.State# GHC.Internal.Prim.RealWorld) ->
    
    ... ... @@ -158,7 +158,7 @@ T18013.$wmapMaybeRule
    158 158
                   Nothing ->
    
    159 159
                     (# s1,
    
    160 160
                        T18013a.Result
    
    161
    -                     @s @(Maybe b) wild1 (GHC.Internal.Maybe.Nothing @b) #);
    
    161
    +                     @s @(Maybe b) ww2 (GHC.Internal.Maybe.Nothing @b) #);
    
    162 162
                   Just x ->
    
    163 163
                     case ((wild s2 x)
    
    164 164
                           `cast` <Co:4> :: IO (Result s b)
    

  • testsuite/tests/simplCore/should_compile/T26722.hs
    1
    +module T26722 where
    
    2
    +
    
    3
    +data T = MkT ![Int]
    
    4
    +
    
    5
    +g s True  t = f s t t
    
    6
    +g s False t = g s True t
    
    7
    +
    
    8
    +f True  (MkT xs) t = f False (MkT xs) t
    
    9
    +f False (MkT xs) _ = xs

  • testsuite/tests/simplCore/should_compile/T26722.stderr
    1
    + 
    \ No newline at end of file

  • testsuite/tests/simplCore/should_compile/all.T
    ... ... @@ -575,3 +575,6 @@ test('T26682', normal, multimod_compile, ['T26682', '-O -v0'])
    575 575
     # In the bug report #26615, the overloaded calls were signalled by a dictionary
    
    576 576
     #    argument like fEqList_xxxx, so we grep for that. Not a very robust test
    
    577 577
     test('T26615',  [grep_errmsg(r'fEqList')], multimod_compile, ['T26615', '-O -fspec-constr -ddump-simpl -dsuppress-uniques'])
    
    578
    +
    
    579
    +# T26722: there should be no reboxing in $wg
    
    580
    +test('T26722', [grep_errmsg(r'SPEC')], compile, ['-O -dno-typeable-binds'])