[Git][ghc/ghc][wip/T26722] Add evals for strict data-con args in worker-functions

6 Jan 2026

Simon Peyton Jones pushed to branch wip/T26722 at Glasgow Haskell Compiler / GHC


Commits:
ee290e4b by Simon Peyton Jones at 2026-01-06T17:41:33+00:00
Add evals for strict data-con args in worker-functions

This fixes #26722, by adding an eval in a worker for
arguments of strict data constructors, even if the
function body uses them strictly.

See (WIS1) in Note [Which Ids should be strictified]

- - - - -


10 changed files:

- compiler/GHC/Core/Opt/SpecConstr.hs
- compiler/GHC/Core/Tidy.hs
- compiler/GHC/Core/Utils.hs
- compiler/GHC/Stg/EnforceEpt.hs
- compiler/GHC/StgToCmm/Closure.hs
- compiler/GHC/StgToCmm/Expr.hs
- hadrian/src/Flavour.hs
- + testsuite/tests/simplCore/should_compile/T26722.hs
- + testsuite/tests/simplCore/should_compile/T26722.stderr
- testsuite/tests/simplCore/should_compile/all.T


Changes:

=====================================
compiler/GHC/Core/Opt/SpecConstr.hs
=====================================
@@ -2062,7 +2062,7 @@ mkSeqs seqees res_ty rhs =
       addEval :: Var -> CoreExpr -> CoreExpr
       addEval arg_id rhs
         -- Argument representing strict field and it's worth passing via cbv
-        | shouldStrictifyIdForCbv arg_id
+        | wantCbvForId arg_id
         = Case (Var arg_id)
                (localiseId arg_id)  -- See (SCF1) in Note [SpecConstr and strict fields]
                res_ty


=====================================
compiler/GHC/Core/Tidy.hs
=====================================
@@ -38,7 +38,7 @@ import GHC.Utils.Outputable
 import GHC.Types.RepType (typePrimRep)
 import GHC.Utils.Panic
 import GHC.Types.Basic (isMarkedCbv, CbvMark (..))
-import GHC.Core.Utils (shouldUseCbvForId)
+import GHC.Core.Utils ( wantCbvForId )
 
 {-
 ************************************************************************
@@ -197,7 +197,7 @@ computeCbvInfo fun_id rhs
         isSimplePrimRep _   = False
 
     mkMark arg
-      | not $ shouldUseCbvForId arg = NotMarkedCbv
+      | not $ wantCbvForId arg = NotMarkedCbv
       -- We can only safely use cbv for strict arguments
       | (isStrUsedDmd (idDemandInfo arg))
       , not (isDeadEndId fun_id) = MarkedCbv


=====================================
compiler/GHC/Core/Utils.hs
=====================================
@@ -58,7 +58,7 @@ module GHC.Core.Utils (
         isJoinBind,
 
         -- * Tag inference
-        mkStrictFieldSeqs, shouldStrictifyIdForCbv, shouldUseCbvForId,
+        mkStrictFieldSeqs, wantCbvForId,
 
         -- * unsafeEqualityProof
         isUnsafeEqualityCase,
@@ -2902,18 +2902,24 @@ Here comes the tricky part: If we make $wloop strict in both x/y and we get:
             };
     end Rec }
 
-Here both x and y are known to be tagged in the function body since we pass strict worker args using unlifted cbv.
-This means the seqs on x and y both become no-ops and compared to the first version the seq on `y` disappears at runtime.
-
-The downside is that the caller of $wfoo potentially has to evaluate `y` once if we can't prove it isn't already evaluated.
-But y coming out of a strict field is in WHNF so safe to evaluated. And most of the time it will be properly tagged+evaluated
-already at the call site because of the EPT Invariant! See Note [EPT enforcement] for more in this.
-This makes GHC itself around 1% faster despite doing slightly more work! So this is generally quite good.
-
-We only apply this when we think there is a benefit in doing so however. There are a number of cases in which
-it would be useless to insert an extra seq. ShouldStrictifyIdForCbv tries to identify these to avoid churn in the
+Here both x and y are known to be tagged in the function body since we pass
+strict worker args using unlifted cbv.  This means the seqs on x and y both
+become no-ops (via (EPT-codegen) in Not [EPT enforcement]) and, compared to the
+first version, the seq on `y` disappears at runtime.
+
+The downside is that the caller of $wfoo potentially has to evaluate `y` once if
+we can't prove it isn't already evaluated.  But y coming out of a strict field
+is in WHNF so safe to evaluated. And most of the time it will be properly
+tagged+evaluated already at the call site because of the EPT Invariant! See Note
+[EPT enforcement] for more in this.  This makes GHC itself around 1% faster
+despite doing slightly more work! So this is generally quite good.
+
+We only apply this when we think there is a benefit in doing so however. There
+are a number of cases in which it would be useless to insert an extra
+seq. `wantCbvForId` tries to identify these to avoid churn in the
 simplifier. See Note [Which Ids should be strictified] for details on this.
 -}
+
 mkStrictFieldSeqs :: [(Id,StrictnessMark)] -> CoreExpr -> (CoreExpr)
 mkStrictFieldSeqs args rhs =
   foldr addEval rhs args
@@ -2923,7 +2929,7 @@ mkStrictFieldSeqs args rhs =
       addEval (arg_id,arg_cbv) (rhs)
         -- Argument representing strict field.
         | isMarkedStrict arg_cbv
-        , shouldStrictifyIdForCbv arg_id
+        , wantCbvForId arg_id
         -- Make sure to remove unfoldings here to avoid the simplifier dropping those for OtherCon[] unfoldings.
         = Case (Var $! zapIdUnfolding arg_id) arg_id case_ty ([Alt DEFAULT [] rhs])
         -- Normal argument
@@ -2943,87 +2949,99 @@ There are multiple reasons why we might not want to insert a seq in the rhs to
 strictify a functions argument:
 
 1) The argument doesn't exist at runtime.
-
-For zero width types (like Types) there is no benefit as we don't operate on them
-at runtime at all. This includes things like void#, coercions and state tokens.
+   For zero width types (like Types) there is no benefit as we don't operate on them
+   at runtime at all. This includes things like void#, coercions and state tokens.
 
 2) The argument is a unlifted type.
-
-If the argument is a unlifted type the calling convention already is explicitly
-cbv. This means inserting a seq on this argument wouldn't do anything as the seq
-would be a no-op *and* it wouldn't affect the calling convention.
+   If the argument is a unlifted type the calling convention already is explicitly
+   cbv. This means inserting a seq on this argument wouldn't do anything as the seq
+   would be a no-op *and* it wouldn't affect the calling convention.
 
 3) The argument is absent.
+   If the argument is absent in the body there is no advantage to it being passed as
+   cbv to the function. The function won't ever look at it so we don't save any work.
 
-If the argument is absent in the body there is no advantage to it being passed as
-cbv to the function. The function won't ever look at it so we don't safe any work.
-
-This mostly happens for join point. For example we might have:
-
-    data T = MkT ![Int] [Char]
-    f t = case t of MkT xs{strict} ys-> snd (xs,ys)
-
-and abstract the case alternative to:
+   This mostly happens for join points. For example we might have:
 
-    f t = join j1 = \xs ys -> snd (xs,ys)
-          in case t of MkT xs{strict} ys-> j1 xs xy
+       data T = MkT ![Int] [Char]
+       f t = case t of MkT xs{strict} ys-> snd (xs,ys)
 
-While we "use" xs inside `j1` it's not used inside the function `snd` we pass it to.
-In short a absent demand means neither our RHS, nor any function we pass the argument
-to will inspect it. So there is no work to be saved by forcing `xs` early.
+   and abstract the case alternative to:
 
-NB: There is an edge case where if we rebox we *can* end up seqing an absent value.
-Note [Absent fillers] has an example of this. However this is so rare it's not worth
-caring about here.
+       f t = join j1 = \xs ys -> snd (xs,ys)
+             in case t of MkT xs{strict} ys-> j1 xs xy
 
-4) The argument is already strict.
+   While we "use" xs inside `j1` it's not used inside the function `snd` we pass it to.
+   In short a absent demand means neither our RHS, nor any function we pass the argument
+   to will inspect it. So there is no work to be saved by forcing `xs` early.
 
-Consider this code:
-
-    data T = MkT ![Int]
-    f t = case t of MkT xs{strict} -> reverse xs
-
-The `xs{strict}` indicates that `xs` is used strictly by the `reverse xs`.
-If we do a w/w split, and add the extra eval on `xs`, we'll get
-
-    $wf xs =
-        case xs of xs1 ->
-            let t = MkT xs1 in
-            case t of MkT xs2 -> reverse xs2
-
-That's not wrong; but the w/w body will simplify to
-
-    $wf xs = case xs of xs1 -> reverse xs1
-
-and now we'll drop the `case xs` because `xs1` is used strictly in its scope.
-Adding that eval was a waste of time.  So don't add it for strictly-demanded Ids.
+   NB: There is an edge case where if we rebox we *can* end up seqing an absent value.
+   Note [Absent fillers] has an example of this. However this is so rare it's not worth
+   caring about here.
 
 5) Functions
-
-Functions are tricky (see Note [TagInfo of functions] in EnforceEpt).
-But the gist of it even if we make a higher order function argument strict
-we can't avoid the tag check when it's used later in the body.
-So there is no benefit.
+   Functions are tricky (see Note [TagInfo of functions] in EnforceEpt).
+   But the gist of it even if we make a higher order function argument strict
+   we can't avoid the tag check when it's used later in the body.
+   So there is no benefit.
+
+Wrinkles:
+
+(WIS1) You might have thought that we can omit the eval if the argument is used
+   strictly demanded in the body.  But you'd be wrong. Consider this code:
+          data T = MkT ![Int]
+          f t = case t of MkT xs{Dmd=STR} -> reverse xs
+
+   The `xs{Dmd=STR}` indicates that `xs` is used strictly by the `reverse xs`.
+   If we do a w/w split, and add the extra eval on `xs`, we'll get
+       $wf xs = case xs of xs1 ->
+                  let t = MkT xs1 in
+                  case t of MkT xs2 -> reverse xs2
+
+   That's not wrong; but you might wonder if the eval on `xs` is needed
+   when it is certainly evaluated by the `reverse`.  But yes, it is (#26722):
+       g s True  t = f s t t
+       g s False t = g s True t
+
+       f True  (MkT xs) t = f False (MkT xs) t
+       f False (MkT xs) _ = xs
+
+   After worker/wrapper we get:
+       g s b t = case t of MkT ww -> $wg s b ww
+       $wg s ds ww = case ds of {
+                      False -> case ww of wg { __DEFAULT -> Bar.$wg s True wg }
+                      True  -> let { t1 = MkT ww } in f s t1 t1 }
+
+   We must make `f` inline inside `$wg`, because `f` too is ww'd, and we
+   don't want to rebox `t1` before passing it to `f`.  BUT while `t1`
+   looks like a HNF, `exprIsHNF` will say False because `MkT` is strict
+   and `ww` isn't evaluated.  So `f` doesn't inline and we get lots of
+   reboxing.
+
+   The Right Thing to to is to add the eval for the data con argument:
+       $wg s ds ww = case ww of ww' { DEFAULT ->
+                     case ds of {
+                      False -> case ww of wg { __DEFAULT -> Bar.$wg s True wg }
+                      True  -> let { t1 = MkT ww' } in f s t1 t1 } }
+
+   Now `t1` will be a HNF, and `f` will inline, and we get
+       $wg s ds ww = case ww of ww' { DEFAULT ->
+                     case ds of {
+                      False -> Bar.$wg s True ww'
+                      True  -> $wf s ww'
+
+  (Ultimately `$wg` will be a CBV function, so that `case ww` will be a
+  no-op: see (EPT-codegen) in Note [EPT enforcement] in GHC.Stg.EnforceEpt.)
 
 -}

--- | Do we expect there to be any benefit if we make this var strict
--- in order for it to get treated as as cbv argument?
--- See Note [Which Ids should be strictified]
--- See Note [CBV Function Ids] for more background.
-shouldStrictifyIdForCbv :: Var -> Bool
-shouldStrictifyIdForCbv = wantCbvForId False
-
--- Like shouldStrictifyIdForCbv but also wants to use cbv for strict args.
-shouldUseCbvForId :: Var -> Bool
-shouldUseCbvForId = wantCbvForId True
 
 -- When we strictify we want to skip strict args otherwise the logic is the same
--- as for shouldUseCbvForId so we common up the logic here.
+-- as for wantCbvForId so we common up the logic here.
 -- Basically returns true if it would be beneficial for runtime to pass this argument
 -- as CBV independent of weither or not it's correct. E.g. it might return true for lazy args
 -- we are not allowed to force.
-wantCbvForId :: Bool -> Var -> Bool
-wantCbvForId cbv_for_strict v
+wantCbvForId :: Var -> Bool
+wantCbvForId v
   -- Must be a runtime var.
   -- See Note [Which Ids should be strictified] point 1)
   | isId v
@@ -3037,9 +3055,6 @@ wantCbvForId cbv_for_strict v
   , not $ isFunTy ty
   -- If the var is strict already a seq is redundant.
   -- See Note [Which Ids should be strictified] point 4)
-  , not (isStrictDmd dmd) || cbv_for_strict
-  -- If the var is absent a seq is almost always useless.
-  -- See Note [Which Ids should be strictified] point 3)
   , not (isAbsDmd dmd)
   = True
   | otherwise


=====================================
compiler/GHC/Stg/EnforceEpt.hs
=====================================
@@ -140,9 +140,11 @@ Afterwards, the *EPT rewriter* inserts the actual evals realising Upcasts.
 Implementation
 --------------
 
-* EPT analysis is implemented in GHC.Stg.EnforceEpt.inferTags.
+(EPT-anal) EPT analysis is implemented in `GHC.Stg.EnforceEpt.inferTags.`
   It attaches its result to /binders/, not occurrence sites.
-* The EPT rewriter establishes the EPT invariant by inserting evals. That is, if
+
+(EPT-rewrite) The EPT rewriter, `GHC.Stg.EnforceEpt.Rewrite.rewriteTopBinds`,
+   establishes the EPT invariant by inserting evals. That is, if
     (a) a binder x is used to
           * construct a strict field (`SP x y`), or
           * passed as a CBV argument (`$wf x`),
@@ -152,17 +154,27 @@ Implementation
     case x of x' { __ DEFAULT -> SP x' y }.
     case x of x' { __ DEFAULT -> $wf x' }.
   (Recall that the case binder x' is always EPT.)
-  This is implemented in GHC.Stg.EnforceEpt.Rewrite.rewriteTopBinds.
+
   This pass also propagates the EPTness from binders to occurrences.
+
   It is sound to insert evals on strict fields (Note [Strict fields in Core]),
   and on CBV arguments as well (Note [CBV Function Ids]).
-* We also export the EPTness of top level bindings to allow this optimisation
+
+(EPT-codegen) Finally, code generation for (case x of alts) skips the thunk check
+  when `x` is EPT. This is done (a bit indirectly) thus:
+   * GHC.StgToCmm.Expr.cgCase: builds a `sequel`, and recurses into `cgExpr` on `x`.
+   * When `cgExpr` sees a `x` goes to `cgIdApp`, which uses `getCallMethod`.
+   * Then `getCallMethod` sees that `x` is EPT (via `idTagSigMaybe`), and
+     returns `InferredReturnIt`.
+   * Now `cgIdApp` can jump straight to the case-alternative switch in the `sequel`
+     constructed by `cgCase`.
+
+(EPT-export) We also export the EPTness of top level bindings to allow this optimisation
   to work across module boundaries.
+
   NB: The EPT Invariant *must* be upheld, regardless of the optimisation level;
   hence EPTness is practically part of the internal ABI of a strict data
   constructor or CBV function. Note [CBV Function Ids] contains the details.
-* Finally, code generation skips the thunk check when branching on binders that
-  are EPT. This is done by `cgExpr`/`cgCase` in the backend.
 
 Evaluation
 ----------


=====================================
compiler/GHC/StgToCmm/Closure.hs
=====================================
@@ -617,12 +617,15 @@ getCallMethod cfg name id (LFThunk _ _ updatable std_form_info is_fun)
 getCallMethod cfg name id (LFUnknown might_be_a_function) n_args _cg_locs _self_loop_info
   | n_args == 0
   , Just sig <- idTagSig_maybe id
-  , isTaggedSig sig -- Infered to be already evaluated by EPT analysis
-  -- When profiling we must enter all potential functions to make sure we update the SCC
-  -- even if the function itself is already evaluated.
+  , isTaggedSig sig -- This `id` is evaluated and properly tagged; no need to enter it
+                    -- See (EPT-codegen) in Note [EPT enforcement] in GHC.Stg.EnforceEpt
+
+  -- When profiling we must enter all potential functions to make sure we update
+  -- the SCC even if the function itself is already evaluated.
   -- See Note [Evaluating functions with profiling] in rts/Apply.cmm
   , not (profileIsProfiling (stgToCmmProfile cfg) && might_be_a_function)
-  = InferedReturnIt -- See Note [EPT enforcement]
+
+  = InferedReturnIt -- See (EPT-codegen) in Note [EPT enforcement]
 
   | might_be_a_function = SlowCall
 


=====================================
compiler/GHC/StgToCmm/Expr.hs
=====================================
@@ -1053,6 +1053,7 @@ cgIdApp fun_id args = do
           | otherwise                -> emitReturn [fun]
 
         -- A value infered to be in WHNF, so we can just return it.
+        -- See (EPT-codegen) in Note [EPT enforcement] in GHC.Stg.EnforceEpt
         InferedReturnIt
           | isZeroBitTy (idType fun_id) -> trace >> emitReturn []
           | otherwise                   -> trace >> assertTag >>


=====================================
hadrian/src/Flavour.hs
=====================================
@@ -70,7 +70,8 @@ flavourTransformers = M.fromList
     , "fully_static"     =: fullyStatic
     , "host_fully_static" =: hostFullyStatic
     , "collect_timings"  =: collectTimings
-    , "assertions"       =: enableAssertions
+    , "assertions"        =: enableAssertions Stage2
+    , "assertions_stage1" =: enableAssertions Stage1
     , "debug_ghc"        =: debugGhc Stage2
     , "debug_stage1_ghc" =: debugGhc Stage1
     , "lint"             =: enableLinting
@@ -394,11 +395,11 @@ enableLateCCS = addArgs
   ? arg "-fprof-late"
 
 -- | Enable assertions for the stage2 compiler
-enableAssertions :: Flavour -> Flavour
-enableAssertions flav = flav { ghcDebugAssertions = f }
+enableAssertions :: Stage -> Flavour -> Flavour
+enableAssertions stage flav = flav { ghcDebugAssertions = f }
   where
-    f Stage2 = True
-    f st = ghcDebugAssertions flav st
+    f s | s == stage = True
+        | otherwise  = ghcDebugAssertions flav s
 
 -- | Build the stage3 compiler using the non-moving GC.
 enableBootNonmovingGc :: Flavour -> Flavour


=====================================
testsuite/tests/simplCore/should_compile/T26722.hs
=====================================
@@ -0,0 +1,9 @@
+module T26722 where
+
+data T = MkT ![Int]
+
+g s True  t = f s t t
+g s False t = g s True t
+
+f True  (MkT xs) t = f False (MkT xs) t
+f False (MkT xs) _ = xs


=====================================
testsuite/tests/simplCore/should_compile/T26722.stderr
=====================================
@@ -0,0 +1 @@
+ 
\ No newline at end of file


=====================================
testsuite/tests/simplCore/should_compile/all.T
=====================================
@@ -563,3 +563,6 @@ test('T26115', [grep_errmsg(r'DFun')], compile, ['-O -ddump-simpl -dsuppress-uni
 test('T26116', normal, compile, ['-O -ddump-rules'])
 test('T26117', [grep_errmsg(r'==')], compile, ['-O -ddump-simpl -dsuppress-uniques'])
 test('T26349',  normal, compile, ['-O -ddump-rules'])
+
+# T26722: there should be no reboxing in $wg
+test('T26722', [grep_errmsg(r'SPEC')], compile, ['-O -dno-typeable-binds'])



View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/ee290e4bfc327e60346f927f618cce42...

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/ee290e4bfc327e60346f927f618cce42...
You're receiving this email because of your account on gitlab.haskell.org.

    

Simon Peyton Jones (＠simonpj)

tags

participants (1)