[Git][ghc/ghc][wip/T26425] 2 commits: Fix a performance hole in the occurrence analyser

5 Nov 2025

Simon Peyton Jones pushed to branch wip/T26425 at Glasgow Haskell Compiler / GHC


Commits:
eba3bda4 by Simon Peyton Jones at 2025-11-05T08:32:37+00:00
Fix a performance hole in the occurrence analyser

As #26425 showed, the clever stuff in

   Note [Occurrence analysis for join points]

does a lot of duplication of usage details.  This patch
improved matters with a little fancy footwork.  It is
described in the new (W5) of the same Note.

- - - - -
fd2bb45c by Simon Peyton Jones at 2025-11-05T08:34:52+00:00
Add a fast-path for args=[] to occAnalApp

In the common case of having not arguments, occAnalApp
was doing redundant work.

- - - - -


5 changed files:

- compiler/GHC/Core/Opt/OccurAnal.hs
- compiler/GHC/Core/Opt/Simplify/Iteration.hs
- compiler/GHC/Types/Unique/FM.hs
- compiler/GHC/Types/Unique/Set.hs
- compiler/GHC/Types/Var/Env.hs


Changes:

=====================================
compiler/GHC/Core/Opt/OccurAnal.hs
=====================================
@@ -664,14 +664,14 @@ through A, so it should have ManyOcc.  Bear this case in mind!
     * maps a OneOcc to OneOcc{ occ_n_br = 0 }
   In our example, occ_join_points will be extended with
       [j :-> [v :-> OneOcc{occ_n_br=0}]]
-  See addJoinPoint.
+  See `addJoinPoint` and (W5) below.
 
 * At an occurrence of a join point, we do everything as normal, but add in the
   UsageDetails from the occ_join_points.  See mkOneOcc.
 
 * Crucially, at the NonRec binding of the join point, in `occAnalBind`, we use
-  `orUDs`, not `andUDs` to combine the usage from the RHS with the usage from
-  the body.
+  `combineJoinPointUDs`, not `andUDs` to combine the usage from the RHS with
+  the usage from the body.
 
 Here are the consequences
 
@@ -688,7 +688,7 @@ Here are the consequences
   These are `andUDs` together in `addOccInfo`, and hence
   `v` gets ManyOccs, just as it should.  Clever!
 
-There are a couple of tricky wrinkles
+There are, of course, some tricky wrinkles
 
 (W1) Consider this example which shadows `j`:
           join j = rhs in
@@ -718,6 +718,8 @@ There are a couple of tricky wrinkles
      * In `postprcess_uds`, we add the chucked-out join points to the
        returned UsageDetails, with `andUDs`.
 
+Wrinkles (W1) and (W2) are very similar to Note [Binder swap] (BS3).
+
 (W3) Consider this example, which shadows `j`, but this time in an argument
               join j = rhs
               in f (case x of { K j -> ...; ... })
@@ -734,10 +736,38 @@ There are a couple of tricky wrinkles
 
 (W4) What if the join point binding has a stable unfolding, or RULES?
      They are just alternative right-hand sides, and at each call site we
-     will use only one of them. So again, we can use `orUDs` to combine
-     usage info from all these alternatives RHSs.
-
-Wrinkles (W1) and (W2) are very similar to Note [Binder swap] (BS3).
+     will use only one of them. So again, we can use `combineJoinPointUDs`
+     to combine usage info from all these alternatives RHSs.
+
+(W5) Other things being equal, we want keep the OccInfoEnv in the range of
+  `occ_join_points` as small as possible, because it is /duplicated/ at
+  /every occurrence/ of the join point.  We really only want to include
+  OccInfo for
+       * Local, non-recursive let-bound Ids
+       * that occur just once in the RHS of the join point
+  particularly including
+       * thunks (that's the original point) and
+       * join points (so that the trick works recursively).
+  We call these the "tracked Ids of j".
+
+  Including lambda binders is pointless, and slows down the occurrence analyser.
+
+  e.g.    \x. let y = x+1 in
+              join j v = ..x..y..(f z z)..
+              in ...
+  In the `occ_join_points` binding for `j`, we want to track `y`, but
+  not `x` (lambda bound) nor `z` (occurs many times).
+
+  To exploit this:
+     * `occ_local_lets` tracks which Ids are local, non-recursive lets
+     * `addJoinPoint` only populates j's entry with occ-info for the "tracked Ids"
+       of `j`; that is, that are (a) in occ_local_lets and (b) have OneOcc.
+     * `combineJoinPointUDs` uses
+          orLocalOcc  for local-let Ids
+          andLocalOcc for non-local-let Ids
+
+  This fancy footwork can matter in extreme cases: it gave a 25% reduction in
+  total compiler allocation in #26425..
 
 Note [Finding join points]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -759,62 +789,62 @@ rest of 'OccInfo' until it goes on the binder.
 
 Note [Join arity prediction based on joinRhsArity]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-In general, the join arity from tail occurrences of a join point (O) may be
-higher or lower than the manifest join arity of the join body (M). E.g.,
+In general, the join arity from tail occurrences of a join point (OAr) may be
+higher or lower than the manifest join arity of the join body (MAr). E.g.,
 
-  -- M > O:
-  let f x y = x + y              -- M = 2
-  in if b then f 1 else f 2      -- O = 1
+  -- MAr > Oar:
+  let f x y = x + y              -- MAr = 2
+  in if b then f 1 else f 2      -- OAr = 1
   ==> { Contify for join arity 1 }
   join f x = \y -> x + y
   in if b then jump f 1 else jump f 2
 
-  -- M < O
-  let f = id                     -- M = 0
-  in if ... then f 12 else f 13  -- O = 1
+  -- MAr < Oar
+  let f = id                     -- MAr = 0
+  in if ... then f 12 else f 13  -- OAr = 1
   ==> { Contify for join arity 1, eta-expand f }
   join f x = id x
   in if b then jump f 12 else jump f 13
 
-But for *recursive* let, it is crucial that both arities match up, consider
+But for *recursive* let, it is crucial MAr=OAr.  Consider:
 
   letrec f x y = if ... then f x else True
   in f 42
 
-Here, M=2 but O=1. If we settled for a joinrec arity of 1, the recursive jump
+Here, MAr=2 but OAr=1. If we settled for a joinrec arity of 1, the recursive jump
 would not happen in a tail context! Contification is invalid here.
-So indeed it is crucial to demand that M=O.
+So indeed it is crucial to demand that MAr=OAr.
 
-(Side note: Actually, we could be more specific: Let O1 be the join arity of
-occurrences from the letrec RHS and O2 the join arity from the let body. Then
-we need M=O1 and M<=O2 and could simply eta-expand the RHS to match O2 later.
-M=O is the specific case where we don't want to eta-expand. Neither the join
+(Side note: Actually, we could be more specific: Let OAr1 be the join arity of
+occurrences from the letrec RHS and OAr2 the join arity from the let body. Then
+we need MAr=OAr1 and MAr<=OAr2 and could simply eta-expand the RHS to match OAr2 later.
+MAr=OAr is the specific case where we don't want to eta-expand. Neither the join
 points paper nor GHC does this at the moment.)
 
 We can capitalise on this observation and conclude that *if* f could become a
-joinrec (without eta-expansion), it will have join arity M.
-Now, M is just the result of 'joinRhsArity', a rather simple, local analysis.
+joinrec (without eta-expansion), it will have join arity MAr.
+Now, MAr is just the result of 'joinRhsArity', a rather simple, local analysis.
 It is also the join arity inside the 'TailUsageDetails' returned by
 'occAnalLamTail', so we can predict join arity without doing any fixed-point
 iteration or really doing any deep traversal of let body or RHS at all.
-We check for M in the 'adjustTailUsage' call inside 'tagRecBinders'.
+We check for MAr in the 'adjustTailUsage' call inside 'tagRecBinders'.
 
 All this is quite apparent if you look at the contification transformation in
 Fig. 5 of "Compiling without Continuations" (which does not account for
 eta-expansion at all, mind you). The letrec case looks like this
-
+n
   letrec f = /\as.\xs. L[us] in L'[es]
     ... and a bunch of conditions establishing that f only occurs
         in app heads of join arity (len as + len xs) inside us and es ...
 
-The syntactic form `/\as.\xs. L[us]` forces M=O iff `f` occurs in `us`. However,
+The syntactic form `/\as.\xs. L[us]` forces MAr=OAr iff `f` occurs in `us`. However,
 for non-recursive functions, this is the definition of contification from the
 paper:
 
   let f = /\as.\xs.u in L[es]     ... conditions ...
 
-Note that u could be a lambda itself, as we have seen. No relationship between M
-and O to exploit here.
+Note that u could be a lambda itself, as we have seen. No relationship between MAr
+and OAr to exploit here.
 
 Note [Join points and unfoldings/rules]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -992,23 +1022,29 @@ occAnalBind !env lvl ire (NonRec bndr rhs) thing_inside combine
   = -- Analyse the RHS and /then/ the body
     let -- Analyse the rhs first, generating rhs_uds
         !(rhs_uds_s, bndr', rhs') = occAnalNonRecRhs env lvl ire mb_join bndr rhs
-        rhs_uds = foldl1' orUDs rhs_uds_s   -- NB: orUDs.  See (W4) of
-                                           -- Note [Occurrence analysis for join points]
+        rhs_uds = foldl1' (combineJoinPointUDs env)
+                          rhs_uds_s   -- NB: combineJoinPointUDs.  See (W4) of
+                                      -- Note [Occurrence analysis for join points]
 
         -- Now analyse the body, adding the join point
         -- into the environment with addJoinPoint
-        !(WUD body_uds (occ, body)) = occAnalNonRecBody env bndr' $ \env ->
+        env_body = addLocalLet env lvl bndr
+        !(WUD body_uds (occ, body)) = occAnalNonRecBody env_body bndr' $ \env ->
                                       thing_inside (addJoinPoint env bndr' rhs_uds)
     in
     if isDeadOcc occ     -- Drop dead code; see Note [Dead code]
     then WUD body_uds body
-    else WUD (rhs_uds `orUDs` body_uds)    -- Note `orUDs`
+    else -- pprTrace "occAnal-nonrec" (vcat [ ppr bndr <+> ppr occ
+         --                                , text "rhs_uds" <+> ppr rhs_uds
+         --                                , text "body_uds" <+> ppr body_uds ]) $
+         WUD (combineJoinPointUDs env rhs_uds body_uds)    -- Note `orUDs`
              (combine [NonRec (fst (tagNonRecBinder lvl occ bndr')) rhs']
                       body)
 
   -- The normal case, including newly-discovered join points
   -- Analyse the body and /then/ the RHS
-  | WUD body_uds (occ,body) <- occAnalNonRecBody env bndr thing_inside
+  | let env_body = addLocalLet env lvl bndr
+  , WUD body_uds (occ,body) <- occAnalNonRecBody env_body bndr thing_inside
   = if isDeadOcc occ   -- Drop dead code; see Note [Dead code]
     then WUD body_uds body
     else let
@@ -1054,7 +1090,7 @@ occAnalNonRecRhs !env lvl imp_rule_edges mb_join bndr rhs
     rhs_ctxt = mkNonRecRhsCtxt lvl bndr unf
 
     -- See Note [Join arity prediction based on joinRhsArity]
-    -- Match join arity O from mb_join_arity with manifest join arity M as
+    -- Match join arity OAr from mb_join_arity with manifest join arity MAr as
     -- returned by of occAnalLamTail. It's totally OK for them to mismatch;
     -- hence adjust the UDs from the RHS
 
@@ -1764,7 +1800,7 @@ makeNode !env imp_rule_edges bndr_set (bndr, rhs)
                                -- here because that is what we are setting!
     WTUD unf_tuds unf' = occAnalUnfolding rhs_env unf
     adj_unf_uds = adjustTailArity (JoinPoint rhs_ja) unf_tuds
-      -- `rhs_ja` is `joinRhsArity rhs` and is the prediction for source M
+      -- `rhs_ja` is `joinRhsArity rhs` and is the prediction for source MAr
       -- of Note [Join arity prediction based on joinRhsArity]
 
     --------- IMP-RULES --------
@@ -1775,7 +1811,7 @@ makeNode !env imp_rule_edges bndr_set (bndr, rhs)
 
     --------- All rules --------
     -- See Note [Join points and unfoldings/rules]
-    -- `rhs_ja` is `joinRhsArity rhs'` and is the prediction for source M
+    -- `rhs_ja` is `joinRhsArity rhs'` and is the prediction for source MAr
     -- of Note [Join arity prediction based on joinRhsArity]
     rules_w_uds :: [(CoreRule, UsageDetails, UsageDetails)]
     rules_w_uds = [ (r,l,adjustTailArity (JoinPoint rhs_ja) rhs_wuds)
@@ -2177,7 +2213,9 @@ occAnalLamTail :: OccEnv -> CoreExpr -> WithTailUsageDetails CoreExpr
 -- See Note [Adjusting right-hand sides]
 occAnalLamTail env expr
   = let !(WUD usage expr') = occ_anal_lam_tail env expr
-    in WTUD (TUD (joinRhsArity expr) usage) expr'
+    in WTUD (TUD (joinRhsArity expr') usage) expr'
+       -- If expr looks like (\x. let dead = e in \y. blah), where `dead` is dead
+       -- then joinRhsArity expr' might exceed joinRhsArity expr
 
 occ_anal_lam_tail :: OccEnv -> CoreExpr -> WithUsageDetails CoreExpr
 -- Does not markInsideLam etc for the outmost batch of lambdas
@@ -2598,7 +2636,7 @@ occAnalArgs :: OccEnv -> CoreExpr -> [CoreExpr]
             -> WithUsageDetails CoreExpr
 -- The `fun` argument is just an accumulating parameter,
 -- the base for building the application we return
-occAnalArgs !env fun args !one_shots
+occAnalArgs env fun args one_shots
   = go emptyDetails fun args one_shots
   where
     env_args = setNonTailCtxt encl env
@@ -2657,8 +2695,19 @@ Constructors are rather like lambdas in this way.
 occAnalApp :: OccEnv
            -> (Expr CoreBndr, [Arg CoreBndr], [CoreTickish])
            -> WithUsageDetails (Expr CoreBndr)

--- Naked variables (not applied) end up here too
-occAnalApp !env (Var fun, args, ticks)
+occAnalApp !env (Var fun_id, [], ticks)
+  = -- Naked variables (not applied) end up here too, and it's worth giving
+    -- this common case special treatment, because there is so much less to do.
+    -- This is just a specialised copy of the (Var fun_id) case below
+    WUD fun_uds (mkTicks ticks fun')
+  where
+    !(fun', fun_id')  = lookupBndrSwap env fun_id
+    !fun_uds = mkOneOcc env fun_id' int_cxt 0
+    !int_cxt = case occ_encl env of
+                   OccScrut -> IsInteresting
+                   _other   -> NotInteresting
+
+occAnalApp env (Var fun, args, ticks)
   -- Account for join arity of runRW# continuation
   -- See Note [Simplification of runRW#]
   --
@@ -2863,7 +2912,11 @@ data OccEnv
              -- Invariant: no Id maps to an empty OccInfoEnv
              -- See Note [Occurrence analysis for join points]
            , occ_join_points :: !JoinPointInfo
-    }
+
+           , occ_local_lets :: IdSet    -- Non-top-level non-rec-bound lets
+                -- I tried making this field strict, but
+                -- doing so slightly increased allocation
+           }
 
 type JoinPointInfo = IdEnv OccInfoEnv
 
@@ -2914,7 +2967,8 @@ initOccEnv
 
            , occ_join_points = emptyVarEnv
            , occ_bs_env = emptyVarEnv
-           , occ_bs_rng = emptyVarSet }
+           , occ_bs_rng = emptyVarSet
+           , occ_local_lets = emptyVarSet }
 
 noBinderSwaps :: OccEnv -> Bool
 noBinderSwaps (OccEnv { occ_bs_env = bs_env }) = isEmptyVarEnv bs_env
@@ -3154,23 +3208,26 @@ postprocess_uds bndrs bad_joins uds
       | uniq `elemVarEnvByKey` env = plusVarEnv_C andLocalOcc env join_env
       | otherwise                  = env
 
+addLocalLet :: OccEnv -> TopLevelFlag -> Id -> OccEnv
+addLocalLet env@(OccEnv { occ_local_lets = ids }) top_lvl id
+  | isTopLevel top_lvl = env
+  | otherwise          = env { occ_local_lets = ids `extendVarSet` id }
+
 addJoinPoint :: OccEnv -> Id -> UsageDetails -> OccEnv
-addJoinPoint env bndr rhs_uds
+addJoinPoint env@(OccEnv { occ_join_points = join_points, occ_local_lets = local_lets })
+             join_bndr (UD { ud_env = rhs_occs })
   | isEmptyVarEnv zeroed_form
   = env
   | otherwise
-  = env { occ_join_points = extendVarEnv (occ_join_points env) bndr zeroed_form }
+  = env { occ_join_points = extendVarEnv join_points join_bndr zeroed_form }
   where
-    zeroed_form = mkZeroedForm rhs_uds
+    zeroed_form = mapMaybeUniqSetToUFM do_one local_lets
+     -- See Note [Occurrence analysis for join points] for "zeroed form"
 
-mkZeroedForm :: UsageDetails -> OccInfoEnv
--- See Note [Occurrence analysis for join points] for "zeroed form"
-mkZeroedForm (UD { ud_env = rhs_occs })
-  = mapMaybeUFM do_one rhs_occs
-  where
-    do_one :: LocalOcc -> Maybe LocalOcc
-    do_one (ManyOccL {})    = Nothing
-    do_one occ@(OneOccL {}) = Just (occ { lo_n_br = 0 })
+    do_one :: Var -> Maybe LocalOcc
+    do_one bndr = case lookupVarEnv rhs_occs bndr of
+                    Just occ@(OneOccL {}) -> Just (occ { lo_n_br = 0 })
+                    _                     -> Nothing
 
 --------------------
 transClosureFV :: VarEnv VarSet -> VarEnv VarSet
@@ -3628,7 +3685,12 @@ data LocalOcc  -- See Note [LocalOcc]
                    -- Combining (AlwaysTailCalled 2) and (AlwaysTailCalled 3)
                    -- gives NoTailCallInfo
               , lo_int_cxt :: !InterestingCxt }
+
     | ManyOccL !TailCallInfo
+       -- Why do we need TailCallInfo on ManyOccL?
+       -- Answer: recursive bindings are entered many times:
+       --    rec { j x = ...j x'... } in j y
+       -- See the uses of `andUDs` in `tagRecBinders`
 
 instance Outputable LocalOcc where
   ppr (OneOccL { lo_n_br = n, lo_tail = tci })
@@ -3663,7 +3725,7 @@ instance Outputable UsageDetails where
 -- | TailUsageDetails captures the result of applying 'occAnalLamTail'
 --   to a function `\xyz.body`. The TailUsageDetails pairs together
 --   * the number of lambdas (including type lambdas: a JoinArity)
---   * UsageDetails for the `body` of the lambda, unadjusted by `adjustTailUsage`.
+--   * UsageDetails for the `body` of the lambda, /unadjusted/ by `adjustTailUsage`.
 -- If the binding turns out to be a join point with the indicated join
 -- arity, this unadjusted usage details is just what we need; otherwise we
 -- need to discard tail calls. That's what `adjustTailUsage` does.
@@ -3681,8 +3743,17 @@ data WithTailUsageDetails a = WTUD !TailUsageDetails !a
 
 andUDs:: UsageDetails -> UsageDetails -> UsageDetails
 orUDs :: UsageDetails -> UsageDetails -> UsageDetails
-andUDs = combineUsageDetailsWith andLocalOcc
-orUDs  = combineUsageDetailsWith orLocalOcc
+andUDs = combineUsageDetailsWith (\_uniq -> andLocalOcc)
+orUDs  = combineUsageDetailsWith (\_uniq -> orLocalOcc)
+
+combineJoinPointUDs :: OccEnv -> UsageDetails -> UsageDetails -> UsageDetails
+-- See (W5) in Note [Occurrence analysis for join points]
+combineJoinPointUDs (OccEnv { occ_local_lets = local_lets }) uds1 uds2
+  = combineUsageDetailsWith combine uds1 uds2
+  where
+    combine uniq occ1 occ2
+      | uniq `elemVarSetByKey` local_lets = orLocalOcc  occ1 occ2
+      | otherwise                         = andLocalOcc occ1 occ2
 
 mkOneOcc :: OccEnv -> Id -> InterestingCxt -> JoinArity -> UsageDetails
 mkOneOcc !env id int_cxt arity
@@ -3699,7 +3770,8 @@ mkOneOcc !env id int_cxt arity
   = mkSimpleDetails (unitVarEnv id occ)
 
   where
-    occ = OneOccL { lo_n_br = 1, lo_int_cxt = int_cxt
+    occ = OneOccL { lo_n_br = 1
+                  , lo_int_cxt = int_cxt
                   , lo_tail = AlwaysTailCalled arity }
 
 -- Add several occurrences, assumed not to be tail calls
@@ -3786,7 +3858,7 @@ restrictFreeVars bndrs fvs = restrictUniqSetToUFM bndrs fvs
 -------------------
 -- Auxiliary functions for UsageDetails implementation
 
-combineUsageDetailsWith :: (LocalOcc -> LocalOcc -> LocalOcc)
+combineUsageDetailsWith :: (Unique -> LocalOcc -> LocalOcc -> LocalOcc)
                         -> UsageDetails -> UsageDetails -> UsageDetails
 {-# INLINE combineUsageDetailsWith #-}
 combineUsageDetailsWith plus_occ_info
@@ -3796,9 +3868,9 @@ combineUsageDetailsWith plus_occ_info
   | isEmptyVarEnv env2 = uds1
   | otherwise
   -- See Note [Strictness in the occurrence analyser]
-  -- Using strictPlusVarEnv here speeds up the test T26425 by about 10% by avoiding
-  -- intermediate thunks.
-  = UD { ud_env       = strictPlusVarEnv_C plus_occ_info env1 env2
+  -- Using strictPlusVarEnv here speeds up the test T26425
+  -- by about 10% by avoiding intermediate thunks.
+  = UD { ud_env       = strictPlusVarEnv_C_Directly plus_occ_info env1 env2
        , ud_z_many    = strictPlusVarEnv z_many1   z_many2
        , ud_z_in_lam  = plusVarEnv z_in_lam1 z_in_lam2
        , ud_z_tail    = strictPlusVarEnv z_tail1   z_tail2 }
@@ -3842,8 +3914,6 @@ lookupOccInfoByUnique (UD { ud_env       = env
         | uniq `elemVarEnvByKey` z_tail = NoTailCallInfo
         | otherwise                     = ti
 
-
-
 -------------------
 -- See Note [Adjusting right-hand sides]
 
@@ -3853,21 +3923,22 @@ adjustNonRecRhs :: JoinPointHood
 -- ^ This function concentrates shared logic between occAnalNonRecBind and the
 -- AcyclicSCC case of occAnalRec.
 -- It returns the adjusted rhs UsageDetails combined with the body usage
-adjustNonRecRhs mb_join_arity rhs_wuds@(WTUD _ rhs)
-  = WUD (adjustTailUsage mb_join_arity rhs_wuds) rhs
-
+adjustNonRecRhs mb_join_arity (WTUD (TUD rhs_ja uds) rhs)
+  = WUD (adjustTailUsage exact_join rhs uds) rhs
+  where
+    exact_join = mb_join_arity == JoinPoint rhs_ja
 
-adjustTailUsage :: JoinPointHood
-                -> WithTailUsageDetails CoreExpr    -- Rhs usage, AFTER occAnalLamTail
+adjustTailUsage :: Bool        -- True <=> Exactly-matching join point; don't do markNonTail
+                -> CoreExpr    -- Rhs usage, AFTER occAnalLamTail
                 -> UsageDetails
-adjustTailUsage mb_join_arity (WTUD (TUD rhs_ja uds) rhs)
+                -> UsageDetails
+adjustTailUsage exact_join rhs uds
   = -- c.f. occAnal (Lam {})
     markAllInsideLamIf (not one_shot) $
     markAllNonTailIf (not exact_join) $
     uds
   where
     one_shot   = isOneShotFun rhs
-    exact_join = mb_join_arity == JoinPoint rhs_ja
 
 adjustTailArity :: JoinPointHood -> TailUsageDetails -> UsageDetails
 adjustTailArity mb_rhs_ja (TUD ja usage)
@@ -3914,8 +3985,9 @@ tagNonRecBinder lvl occ bndr
 tagRecBinders :: TopLevelFlag           -- At top level?
               -> UsageDetails           -- Of body of let ONLY
               -> [NodeDetails]
-              -> WithUsageDetails       -- Adjusted details for whole scope,
-                                        -- with binders removed
+              -> WithUsageDetails       -- Adjusted details for whole scope
+                                        -- still including the binders;
+                                        -- (they are removed by `addInScope`)
                   [IdWithOccInfo]       -- Tagged binders
 -- Substantially more complicated than non-recursive case. Need to adjust RHS
 -- details *before* tagging binders (because the tags depend on the RHSes).
@@ -3925,32 +3997,21 @@ tagRecBinders lvl body_uds details_s
 
      -- 1. See Note [Join arity prediction based on joinRhsArity]
      --    Determine possible join-point-hood of whole group, by testing for
-     --    manifest join arity M.
-     --    This (re-)asserts that makeNode had made tuds for that same arity M!
+     --    manifest join arity MAr.
+     --    This (re-)asserts that makeNode had made tuds for that same arity MAr!
      unadj_uds = foldr (andUDs . test_manifest_arity) body_uds details_s
-     test_manifest_arity ND{nd_rhs = WTUD tuds rhs}
-       = adjustTailArity (JoinPoint (joinRhsArity rhs)) tuds
+     test_manifest_arity ND{nd_rhs = WTUD (TUD rhs_ja uds) rhs}
+       = assertPpr (rhs_ja == joinRhsArity rhs) (ppr rhs_ja $$ ppr uds $$ ppr rhs) $
+         uds
 
+     will_be_joins :: Bool
      will_be_joins = decideRecJoinPointHood lvl unadj_uds bndrs
 
-     mb_join_arity :: Id -> JoinPointHood
-     -- mb_join_arity: See Note [Join arity prediction based on joinRhsArity]
-     -- This is the source O
-     mb_join_arity bndr
-         -- Can't use willBeJoinId_maybe here because we haven't tagged
-         -- the binder yet (the tag depends on these adjustments!)
-       | will_be_joins
-       , AlwaysTailCalled arity <- lookupTailCallInfo unadj_uds bndr
-       = JoinPoint arity
-       | otherwise
-       = assert (not will_be_joins) -- Should be AlwaysTailCalled if
-         NotJoinPoint               -- we are making join points!
-
      -- 2. Adjust usage details of each RHS, taking into account the
      --    join-point-hood decision
-     rhs_udss' = [ adjustTailUsage (mb_join_arity bndr) rhs_wuds
+     rhs_udss' = [ adjustTailUsage will_be_joins rhs rhs_uds
                      -- Matching occAnalLamTail in makeNode
-                 | ND { nd_bndr = bndr, nd_rhs = rhs_wuds } <- details_s ]
+                 | ND { nd_rhs = WTUD (TUD _ rhs_uds) rhs } <- details_s ]
 
      -- 3. Compute final usage details from adjusted RHS details
      adj_uds = foldr andUDs body_uds rhs_udss'
@@ -3969,9 +4030,9 @@ setBinderOcc occ_info bndr
   | otherwise                  = setIdOccInfo bndr occ_info
 
 -- | Decide whether some bindings should be made into join points or not, based
--- on its occurrences. This is
+-- on its occurrences.
 -- Returns `False` if they can't be join points. Note that it's an
--- all-or-nothing decision, as if multiple binders are given, they're
+-- all-or-nothing decision: if multiple binders are given, they are
 -- assumed to be mutually recursive.
 --
 -- It must, however, be a final decision. If we say `True` for 'f',


=====================================
compiler/GHC/Core/Opt/Simplify/Iteration.hs
=====================================
@@ -4595,13 +4595,21 @@ mkLetUnfolding :: SimplEnv -> TopLevelFlag -> UnfoldingSource
                -> InId -> Bool    -- True <=> this is a join point
                -> OutExpr -> SimplM Unfolding
 mkLetUnfolding env top_lvl src id is_join new_rhs
-  = return (mkUnfolding uf_opts src is_top_lvl is_bottoming is_join new_rhs Nothing)
-            -- We make an  unfolding *even for loop-breakers*.
-            -- Reason: (a) It might be useful to know that they are WHNF
-            --         (b) In GHC.Iface.Tidy we currently assume that, if we want to
-            --             expose the unfolding then indeed we *have* an unfolding
-            --             to expose.  (We could instead use the RHS, but currently
-            --             we don't.)  The simple thing is always to have one.
+--  | is_join
+--  , UnfNever <- guidance
+--  = -- For large join points, don't keep an unfolding at all if it is large
+--    -- This is just an attempt to keep residency under control in
+--    -- deeply-nested join-point such as those arising in #26425
+--    return NoUnfolding
+
+  | otherwise
+  = return (mkCoreUnfolding src is_top_lvl new_rhs Nothing guidance)
+    -- We make an  unfolding *even for loop-breakers*.
+    -- Reason: (a) It might be useful to know that they are WHNF
+    --         (b) In GHC.Iface.Tidy we currently assume that, if we want to
+    --             expose the unfolding then indeed we *have* an unfolding
+    --             to expose.  (We could instead use the RHS, but currently
+    --             we don't.)  The simple thing is always to have one.
   where
     -- !opts: otherwise, we end up retaining all the SimpleEnv
     !uf_opts = seUnfoldingOpts env
@@ -4612,6 +4620,9 @@ mkLetUnfolding env top_lvl src id is_join new_rhs
     -- See Note [Force bottoming field]
     !is_bottoming = isDeadEndId id
 
+    is_top_bottoming = is_top_lvl && is_bottoming
+    guidance         = calcUnfoldingGuidance uf_opts is_top_bottoming is_join new_rhs
+
 -------------------
 simplStableUnfolding :: SimplEnv -> BindContext
                      -> InId


=====================================
compiler/GHC/Types/Unique/FM.hs
=====================================
@@ -53,7 +53,7 @@ module GHC.Types.Unique.FM (
         plusUFM,
         strictPlusUFM,
         plusUFM_C,
-        strictPlusUFM_C,
+        strictPlusUFM_C, strictPlusUFM_C_Directly,
         plusUFM_CD,
         plusUFM_CD2,
         mergeUFM,
@@ -281,6 +281,9 @@ plusUFM_C f (UFM x) (UFM y) = UFM (M.unionWith f x y)
 strictPlusUFM_C :: (elt -> elt -> elt) -> UniqFM key elt -> UniqFM key elt -> UniqFM key elt
 strictPlusUFM_C f (UFM x) (UFM y) = UFM (MS.unionWith f x y)
 
+strictPlusUFM_C_Directly :: (Unique -> elt -> elt -> elt) -> UniqFM key elt -> UniqFM key elt -> UniqFM key elt
+strictPlusUFM_C_Directly f (UFM x) (UFM y) = UFM (MS.unionWithKey (f . mkUniqueGrimily) x y)
+
 -- | `plusUFM_CD f m1 d1 m2 d2` merges the maps using `f` as the
 -- combinding function and `d1` resp. `d2` as the default value if
 -- there is no entry in `m1` reps. `m2`. The domain is the union of


=====================================
compiler/GHC/Types/Unique/Set.hs
=====================================
@@ -40,6 +40,7 @@ module GHC.Types.Unique.Set (
         lookupUniqSet_Directly,
         partitionUniqSet,
         mapUniqSet,
+        mapUniqSetToUFM, mapMaybeUniqSetToUFM,
         unsafeUFMToUniqSet,
         nonDetEltsUniqSet,
         nonDetKeysUniqSet,
@@ -211,6 +212,14 @@ mapUniqSet f = mkUniqSet . map f . nonDetEltsUniqSet
 mapMaybeUniqSet_sameUnique :: (a -> Maybe b) -> UniqSet a -> UniqSet b
 mapMaybeUniqSet_sameUnique f (UniqSet a) = UniqSet $ mapMaybeUFM_sameUnique f a
 
+mapUniqSetToUFM :: (a -> b) -> UniqSet a -> UniqFM a b
+-- Same keys, new values
+mapUniqSetToUFM f (UniqSet ufm) = mapUFM f ufm
+
+mapMaybeUniqSetToUFM :: (a -> Maybe b) -> UniqSet a -> UniqFM a b
+-- Same keys, new values
+mapMaybeUniqSetToUFM f (UniqSet ufm) = mapMaybeUFM f ufm
+
 -- Two 'UniqSet's are considered equal if they contain the same
 -- uniques.
 instance Eq (UniqSet a) where


=====================================
compiler/GHC/Types/Var/Env.hs
=====================================
@@ -12,7 +12,8 @@ module GHC.Types.Var.Env (
         elemVarEnv, disjointVarEnv, anyVarEnv,
         extendVarEnv, extendVarEnv_C, extendVarEnv_Acc,
         extendVarEnvList,
-        strictPlusVarEnv, plusVarEnv, plusVarEnv_C, strictPlusVarEnv_C,
+        strictPlusVarEnv, plusVarEnv, plusVarEnv_C,
+        strictPlusVarEnv_C, strictPlusVarEnv_C_Directly,
         plusVarEnv_CD, plusMaybeVarEnv_C,
         plusVarEnvList, alterVarEnv,
         delVarEnvList, delVarEnv,
@@ -525,6 +526,7 @@ delVarEnv         :: VarEnv a -> Var -> VarEnv a
 minusVarEnv       :: VarEnv a -> VarEnv b -> VarEnv a
 plusVarEnv_C      :: (a -> a -> a) -> VarEnv a -> VarEnv a -> VarEnv a
 strictPlusVarEnv_C :: (a -> a -> a) -> VarEnv a -> VarEnv a -> VarEnv a
+strictPlusVarEnv_C_Directly :: (Unique -> a -> a -> a) -> VarEnv a -> VarEnv a -> VarEnv a
 plusVarEnv_CD     :: (a -> a -> a) -> VarEnv a -> a -> VarEnv a -> a -> VarEnv a
 plusMaybeVarEnv_C :: (a -> a -> Maybe a) -> VarEnv a -> VarEnv a -> VarEnv a
 mapVarEnv         :: (a -> b) -> VarEnv a -> VarEnv b
@@ -552,6 +554,7 @@ extendVarEnv_Acc = addToUFM_Acc
 extendVarEnvList = addListToUFM
 plusVarEnv_C     = plusUFM_C
 strictPlusVarEnv_C = strictPlusUFM_C
+strictPlusVarEnv_C_Directly = strictPlusUFM_C_Directly
 plusVarEnv_CD    = plusUFM_CD
 plusMaybeVarEnv_C = plusMaybeUFM_C
 delVarEnvList    = delListFromUFM



View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/ec080d05d53af029018ee11db9aeab1...

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/ec080d05d53af029018ee11db9aeab1...
You're receiving this email because of your account on gitlab.haskell.org.

    

Simon Peyton Jones (＠simonpj)

tags

participants (1)