[Git][ghc/ghc][wip/T26425] 2 commits: Fix a performance hole in the occurrence analyser
Simon Peyton Jones pushed to branch wip/T26425 at Glasgow Haskell Compiler / GHC Commits: eba3bda4 by Simon Peyton Jones at 2025-11-05T08:32:37+00:00 Fix a performance hole in the occurrence analyser As #26425 showed, the clever stuff in Note [Occurrence analysis for join points] does a lot of duplication of usage details. This patch improved matters with a little fancy footwork. It is described in the new (W5) of the same Note. - - - - - fd2bb45c by Simon Peyton Jones at 2025-11-05T08:34:52+00:00 Add a fast-path for args=[] to occAnalApp In the common case of having not arguments, occAnalApp was doing redundant work. - - - - - 5 changed files: - compiler/GHC/Core/Opt/OccurAnal.hs - compiler/GHC/Core/Opt/Simplify/Iteration.hs - compiler/GHC/Types/Unique/FM.hs - compiler/GHC/Types/Unique/Set.hs - compiler/GHC/Types/Var/Env.hs Changes: ===================================== compiler/GHC/Core/Opt/OccurAnal.hs ===================================== @@ -664,14 +664,14 @@ through A, so it should have ManyOcc. Bear this case in mind! * maps a OneOcc to OneOcc{ occ_n_br = 0 } In our example, occ_join_points will be extended with [j :-> [v :-> OneOcc{occ_n_br=0}]] - See addJoinPoint. + See `addJoinPoint` and (W5) below. * At an occurrence of a join point, we do everything as normal, but add in the UsageDetails from the occ_join_points. See mkOneOcc. * Crucially, at the NonRec binding of the join point, in `occAnalBind`, we use - `orUDs`, not `andUDs` to combine the usage from the RHS with the usage from - the body. + `combineJoinPointUDs`, not `andUDs` to combine the usage from the RHS with + the usage from the body. Here are the consequences @@ -688,7 +688,7 @@ Here are the consequences These are `andUDs` together in `addOccInfo`, and hence `v` gets ManyOccs, just as it should. Clever! -There are a couple of tricky wrinkles +There are, of course, some tricky wrinkles (W1) Consider this example which shadows `j`: join j = rhs in @@ -718,6 +718,8 @@ There are a couple of tricky wrinkles * In `postprcess_uds`, we add the chucked-out join points to the returned UsageDetails, with `andUDs`. +Wrinkles (W1) and (W2) are very similar to Note [Binder swap] (BS3). + (W3) Consider this example, which shadows `j`, but this time in an argument join j = rhs in f (case x of { K j -> ...; ... }) @@ -734,10 +736,38 @@ There are a couple of tricky wrinkles (W4) What if the join point binding has a stable unfolding, or RULES? They are just alternative right-hand sides, and at each call site we - will use only one of them. So again, we can use `orUDs` to combine - usage info from all these alternatives RHSs. - -Wrinkles (W1) and (W2) are very similar to Note [Binder swap] (BS3). + will use only one of them. So again, we can use `combineJoinPointUDs` + to combine usage info from all these alternatives RHSs. + +(W5) Other things being equal, we want keep the OccInfoEnv in the range of + `occ_join_points` as small as possible, because it is /duplicated/ at + /every occurrence/ of the join point. We really only want to include + OccInfo for + * Local, non-recursive let-bound Ids + * that occur just once in the RHS of the join point + particularly including + * thunks (that's the original point) and + * join points (so that the trick works recursively). + We call these the "tracked Ids of j". + + Including lambda binders is pointless, and slows down the occurrence analyser. + + e.g. \x. let y = x+1 in + join j v = ..x..y..(f z z).. + in ... + In the `occ_join_points` binding for `j`, we want to track `y`, but + not `x` (lambda bound) nor `z` (occurs many times). + + To exploit this: + * `occ_local_lets` tracks which Ids are local, non-recursive lets + * `addJoinPoint` only populates j's entry with occ-info for the "tracked Ids" + of `j`; that is, that are (a) in occ_local_lets and (b) have OneOcc. + * `combineJoinPointUDs` uses + orLocalOcc for local-let Ids + andLocalOcc for non-local-let Ids + + This fancy footwork can matter in extreme cases: it gave a 25% reduction in + total compiler allocation in #26425.. Note [Finding join points] ~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -759,62 +789,62 @@ rest of 'OccInfo' until it goes on the binder. Note [Join arity prediction based on joinRhsArity] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -In general, the join arity from tail occurrences of a join point (O) may be -higher or lower than the manifest join arity of the join body (M). E.g., +In general, the join arity from tail occurrences of a join point (OAr) may be +higher or lower than the manifest join arity of the join body (MAr). E.g., - -- M > O: - let f x y = x + y -- M = 2 - in if b then f 1 else f 2 -- O = 1 + -- MAr > Oar: + let f x y = x + y -- MAr = 2 + in if b then f 1 else f 2 -- OAr = 1 ==> { Contify for join arity 1 } join f x = \y -> x + y in if b then jump f 1 else jump f 2 - -- M < O - let f = id -- M = 0 - in if ... then f 12 else f 13 -- O = 1 + -- MAr < Oar + let f = id -- MAr = 0 + in if ... then f 12 else f 13 -- OAr = 1 ==> { Contify for join arity 1, eta-expand f } join f x = id x in if b then jump f 12 else jump f 13 -But for *recursive* let, it is crucial that both arities match up, consider +But for *recursive* let, it is crucial MAr=OAr. Consider: letrec f x y = if ... then f x else True in f 42 -Here, M=2 but O=1. If we settled for a joinrec arity of 1, the recursive jump +Here, MAr=2 but OAr=1. If we settled for a joinrec arity of 1, the recursive jump would not happen in a tail context! Contification is invalid here. -So indeed it is crucial to demand that M=O. +So indeed it is crucial to demand that MAr=OAr. -(Side note: Actually, we could be more specific: Let O1 be the join arity of -occurrences from the letrec RHS and O2 the join arity from the let body. Then -we need M=O1 and M<=O2 and could simply eta-expand the RHS to match O2 later. -M=O is the specific case where we don't want to eta-expand. Neither the join +(Side note: Actually, we could be more specific: Let OAr1 be the join arity of +occurrences from the letrec RHS and OAr2 the join arity from the let body. Then +we need MAr=OAr1 and MAr<=OAr2 and could simply eta-expand the RHS to match OAr2 later. +MAr=OAr is the specific case where we don't want to eta-expand. Neither the join points paper nor GHC does this at the moment.) We can capitalise on this observation and conclude that *if* f could become a -joinrec (without eta-expansion), it will have join arity M. -Now, M is just the result of 'joinRhsArity', a rather simple, local analysis. +joinrec (without eta-expansion), it will have join arity MAr. +Now, MAr is just the result of 'joinRhsArity', a rather simple, local analysis. It is also the join arity inside the 'TailUsageDetails' returned by 'occAnalLamTail', so we can predict join arity without doing any fixed-point iteration or really doing any deep traversal of let body or RHS at all. -We check for M in the 'adjustTailUsage' call inside 'tagRecBinders'. +We check for MAr in the 'adjustTailUsage' call inside 'tagRecBinders'. All this is quite apparent if you look at the contification transformation in Fig. 5 of "Compiling without Continuations" (which does not account for eta-expansion at all, mind you). The letrec case looks like this - +n letrec f = /\as.\xs. L[us] in L'[es] ... and a bunch of conditions establishing that f only occurs in app heads of join arity (len as + len xs) inside us and es ... -The syntactic form `/\as.\xs. L[us]` forces M=O iff `f` occurs in `us`. However, +The syntactic form `/\as.\xs. L[us]` forces MAr=OAr iff `f` occurs in `us`. However, for non-recursive functions, this is the definition of contification from the paper: let f = /\as.\xs.u in L[es] ... conditions ... -Note that u could be a lambda itself, as we have seen. No relationship between M -and O to exploit here. +Note that u could be a lambda itself, as we have seen. No relationship between MAr +and OAr to exploit here. Note [Join points and unfoldings/rules] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -992,23 +1022,29 @@ occAnalBind !env lvl ire (NonRec bndr rhs) thing_inside combine = -- Analyse the RHS and /then/ the body let -- Analyse the rhs first, generating rhs_uds !(rhs_uds_s, bndr', rhs') = occAnalNonRecRhs env lvl ire mb_join bndr rhs - rhs_uds = foldl1' orUDs rhs_uds_s -- NB: orUDs. See (W4) of - -- Note [Occurrence analysis for join points] + rhs_uds = foldl1' (combineJoinPointUDs env) + rhs_uds_s -- NB: combineJoinPointUDs. See (W4) of + -- Note [Occurrence analysis for join points] -- Now analyse the body, adding the join point -- into the environment with addJoinPoint - !(WUD body_uds (occ, body)) = occAnalNonRecBody env bndr' $ \env -> + env_body = addLocalLet env lvl bndr + !(WUD body_uds (occ, body)) = occAnalNonRecBody env_body bndr' $ \env -> thing_inside (addJoinPoint env bndr' rhs_uds) in if isDeadOcc occ -- Drop dead code; see Note [Dead code] then WUD body_uds body - else WUD (rhs_uds `orUDs` body_uds) -- Note `orUDs` + else -- pprTrace "occAnal-nonrec" (vcat [ ppr bndr <+> ppr occ + -- , text "rhs_uds" <+> ppr rhs_uds + -- , text "body_uds" <+> ppr body_uds ]) $ + WUD (combineJoinPointUDs env rhs_uds body_uds) -- Note `orUDs` (combine [NonRec (fst (tagNonRecBinder lvl occ bndr')) rhs'] body) -- The normal case, including newly-discovered join points -- Analyse the body and /then/ the RHS - | WUD body_uds (occ,body) <- occAnalNonRecBody env bndr thing_inside + | let env_body = addLocalLet env lvl bndr + , WUD body_uds (occ,body) <- occAnalNonRecBody env_body bndr thing_inside = if isDeadOcc occ -- Drop dead code; see Note [Dead code] then WUD body_uds body else let @@ -1054,7 +1090,7 @@ occAnalNonRecRhs !env lvl imp_rule_edges mb_join bndr rhs rhs_ctxt = mkNonRecRhsCtxt lvl bndr unf -- See Note [Join arity prediction based on joinRhsArity] - -- Match join arity O from mb_join_arity with manifest join arity M as + -- Match join arity OAr from mb_join_arity with manifest join arity MAr as -- returned by of occAnalLamTail. It's totally OK for them to mismatch; -- hence adjust the UDs from the RHS @@ -1764,7 +1800,7 @@ makeNode !env imp_rule_edges bndr_set (bndr, rhs) -- here because that is what we are setting! WTUD unf_tuds unf' = occAnalUnfolding rhs_env unf adj_unf_uds = adjustTailArity (JoinPoint rhs_ja) unf_tuds - -- `rhs_ja` is `joinRhsArity rhs` and is the prediction for source M + -- `rhs_ja` is `joinRhsArity rhs` and is the prediction for source MAr -- of Note [Join arity prediction based on joinRhsArity] --------- IMP-RULES -------- @@ -1775,7 +1811,7 @@ makeNode !env imp_rule_edges bndr_set (bndr, rhs) --------- All rules -------- -- See Note [Join points and unfoldings/rules] - -- `rhs_ja` is `joinRhsArity rhs'` and is the prediction for source M + -- `rhs_ja` is `joinRhsArity rhs'` and is the prediction for source MAr -- of Note [Join arity prediction based on joinRhsArity] rules_w_uds :: [(CoreRule, UsageDetails, UsageDetails)] rules_w_uds = [ (r,l,adjustTailArity (JoinPoint rhs_ja) rhs_wuds) @@ -2177,7 +2213,9 @@ occAnalLamTail :: OccEnv -> CoreExpr -> WithTailUsageDetails CoreExpr -- See Note [Adjusting right-hand sides] occAnalLamTail env expr = let !(WUD usage expr') = occ_anal_lam_tail env expr - in WTUD (TUD (joinRhsArity expr) usage) expr' + in WTUD (TUD (joinRhsArity expr') usage) expr' + -- If expr looks like (\x. let dead = e in \y. blah), where `dead` is dead + -- then joinRhsArity expr' might exceed joinRhsArity expr occ_anal_lam_tail :: OccEnv -> CoreExpr -> WithUsageDetails CoreExpr -- Does not markInsideLam etc for the outmost batch of lambdas @@ -2598,7 +2636,7 @@ occAnalArgs :: OccEnv -> CoreExpr -> [CoreExpr] -> WithUsageDetails CoreExpr -- The `fun` argument is just an accumulating parameter, -- the base for building the application we return -occAnalArgs !env fun args !one_shots +occAnalArgs env fun args one_shots = go emptyDetails fun args one_shots where env_args = setNonTailCtxt encl env @@ -2657,8 +2695,19 @@ Constructors are rather like lambdas in this way. occAnalApp :: OccEnv -> (Expr CoreBndr, [Arg CoreBndr], [CoreTickish]) -> WithUsageDetails (Expr CoreBndr) --- Naked variables (not applied) end up here too -occAnalApp !env (Var fun, args, ticks) +occAnalApp !env (Var fun_id, [], ticks) + = -- Naked variables (not applied) end up here too, and it's worth giving + -- this common case special treatment, because there is so much less to do. + -- This is just a specialised copy of the (Var fun_id) case below + WUD fun_uds (mkTicks ticks fun') + where + !(fun', fun_id') = lookupBndrSwap env fun_id + !fun_uds = mkOneOcc env fun_id' int_cxt 0 + !int_cxt = case occ_encl env of + OccScrut -> IsInteresting + _other -> NotInteresting + +occAnalApp env (Var fun, args, ticks) -- Account for join arity of runRW# continuation -- See Note [Simplification of runRW#] -- @@ -2863,7 +2912,11 @@ data OccEnv -- Invariant: no Id maps to an empty OccInfoEnv -- See Note [Occurrence analysis for join points] , occ_join_points :: !JoinPointInfo - } + + , occ_local_lets :: IdSet -- Non-top-level non-rec-bound lets + -- I tried making this field strict, but + -- doing so slightly increased allocation + } type JoinPointInfo = IdEnv OccInfoEnv @@ -2914,7 +2967,8 @@ initOccEnv , occ_join_points = emptyVarEnv , occ_bs_env = emptyVarEnv - , occ_bs_rng = emptyVarSet } + , occ_bs_rng = emptyVarSet + , occ_local_lets = emptyVarSet } noBinderSwaps :: OccEnv -> Bool noBinderSwaps (OccEnv { occ_bs_env = bs_env }) = isEmptyVarEnv bs_env @@ -3154,23 +3208,26 @@ postprocess_uds bndrs bad_joins uds | uniq `elemVarEnvByKey` env = plusVarEnv_C andLocalOcc env join_env | otherwise = env +addLocalLet :: OccEnv -> TopLevelFlag -> Id -> OccEnv +addLocalLet env@(OccEnv { occ_local_lets = ids }) top_lvl id + | isTopLevel top_lvl = env + | otherwise = env { occ_local_lets = ids `extendVarSet` id } + addJoinPoint :: OccEnv -> Id -> UsageDetails -> OccEnv -addJoinPoint env bndr rhs_uds +addJoinPoint env@(OccEnv { occ_join_points = join_points, occ_local_lets = local_lets }) + join_bndr (UD { ud_env = rhs_occs }) | isEmptyVarEnv zeroed_form = env | otherwise - = env { occ_join_points = extendVarEnv (occ_join_points env) bndr zeroed_form } + = env { occ_join_points = extendVarEnv join_points join_bndr zeroed_form } where - zeroed_form = mkZeroedForm rhs_uds + zeroed_form = mapMaybeUniqSetToUFM do_one local_lets + -- See Note [Occurrence analysis for join points] for "zeroed form" -mkZeroedForm :: UsageDetails -> OccInfoEnv --- See Note [Occurrence analysis for join points] for "zeroed form" -mkZeroedForm (UD { ud_env = rhs_occs }) - = mapMaybeUFM do_one rhs_occs - where - do_one :: LocalOcc -> Maybe LocalOcc - do_one (ManyOccL {}) = Nothing - do_one occ@(OneOccL {}) = Just (occ { lo_n_br = 0 }) + do_one :: Var -> Maybe LocalOcc + do_one bndr = case lookupVarEnv rhs_occs bndr of + Just occ@(OneOccL {}) -> Just (occ { lo_n_br = 0 }) + _ -> Nothing -------------------- transClosureFV :: VarEnv VarSet -> VarEnv VarSet @@ -3628,7 +3685,12 @@ data LocalOcc -- See Note [LocalOcc] -- Combining (AlwaysTailCalled 2) and (AlwaysTailCalled 3) -- gives NoTailCallInfo , lo_int_cxt :: !InterestingCxt } + | ManyOccL !TailCallInfo + -- Why do we need TailCallInfo on ManyOccL? + -- Answer: recursive bindings are entered many times: + -- rec { j x = ...j x'... } in j y + -- See the uses of `andUDs` in `tagRecBinders` instance Outputable LocalOcc where ppr (OneOccL { lo_n_br = n, lo_tail = tci }) @@ -3663,7 +3725,7 @@ instance Outputable UsageDetails where -- | TailUsageDetails captures the result of applying 'occAnalLamTail' -- to a function `\xyz.body`. The TailUsageDetails pairs together -- * the number of lambdas (including type lambdas: a JoinArity) --- * UsageDetails for the `body` of the lambda, unadjusted by `adjustTailUsage`. +-- * UsageDetails for the `body` of the lambda, /unadjusted/ by `adjustTailUsage`. -- If the binding turns out to be a join point with the indicated join -- arity, this unadjusted usage details is just what we need; otherwise we -- need to discard tail calls. That's what `adjustTailUsage` does. @@ -3681,8 +3743,17 @@ data WithTailUsageDetails a = WTUD !TailUsageDetails !a andUDs:: UsageDetails -> UsageDetails -> UsageDetails orUDs :: UsageDetails -> UsageDetails -> UsageDetails -andUDs = combineUsageDetailsWith andLocalOcc -orUDs = combineUsageDetailsWith orLocalOcc +andUDs = combineUsageDetailsWith (\_uniq -> andLocalOcc) +orUDs = combineUsageDetailsWith (\_uniq -> orLocalOcc) + +combineJoinPointUDs :: OccEnv -> UsageDetails -> UsageDetails -> UsageDetails +-- See (W5) in Note [Occurrence analysis for join points] +combineJoinPointUDs (OccEnv { occ_local_lets = local_lets }) uds1 uds2 + = combineUsageDetailsWith combine uds1 uds2 + where + combine uniq occ1 occ2 + | uniq `elemVarSetByKey` local_lets = orLocalOcc occ1 occ2 + | otherwise = andLocalOcc occ1 occ2 mkOneOcc :: OccEnv -> Id -> InterestingCxt -> JoinArity -> UsageDetails mkOneOcc !env id int_cxt arity @@ -3699,7 +3770,8 @@ mkOneOcc !env id int_cxt arity = mkSimpleDetails (unitVarEnv id occ) where - occ = OneOccL { lo_n_br = 1, lo_int_cxt = int_cxt + occ = OneOccL { lo_n_br = 1 + , lo_int_cxt = int_cxt , lo_tail = AlwaysTailCalled arity } -- Add several occurrences, assumed not to be tail calls @@ -3786,7 +3858,7 @@ restrictFreeVars bndrs fvs = restrictUniqSetToUFM bndrs fvs ------------------- -- Auxiliary functions for UsageDetails implementation -combineUsageDetailsWith :: (LocalOcc -> LocalOcc -> LocalOcc) +combineUsageDetailsWith :: (Unique -> LocalOcc -> LocalOcc -> LocalOcc) -> UsageDetails -> UsageDetails -> UsageDetails {-# INLINE combineUsageDetailsWith #-} combineUsageDetailsWith plus_occ_info @@ -3796,9 +3868,9 @@ combineUsageDetailsWith plus_occ_info | isEmptyVarEnv env2 = uds1 | otherwise -- See Note [Strictness in the occurrence analyser] - -- Using strictPlusVarEnv here speeds up the test T26425 by about 10% by avoiding - -- intermediate thunks. - = UD { ud_env = strictPlusVarEnv_C plus_occ_info env1 env2 + -- Using strictPlusVarEnv here speeds up the test T26425 + -- by about 10% by avoiding intermediate thunks. + = UD { ud_env = strictPlusVarEnv_C_Directly plus_occ_info env1 env2 , ud_z_many = strictPlusVarEnv z_many1 z_many2 , ud_z_in_lam = plusVarEnv z_in_lam1 z_in_lam2 , ud_z_tail = strictPlusVarEnv z_tail1 z_tail2 } @@ -3842,8 +3914,6 @@ lookupOccInfoByUnique (UD { ud_env = env | uniq `elemVarEnvByKey` z_tail = NoTailCallInfo | otherwise = ti - - ------------------- -- See Note [Adjusting right-hand sides] @@ -3853,21 +3923,22 @@ adjustNonRecRhs :: JoinPointHood -- ^ This function concentrates shared logic between occAnalNonRecBind and the -- AcyclicSCC case of occAnalRec. -- It returns the adjusted rhs UsageDetails combined with the body usage -adjustNonRecRhs mb_join_arity rhs_wuds@(WTUD _ rhs) - = WUD (adjustTailUsage mb_join_arity rhs_wuds) rhs - +adjustNonRecRhs mb_join_arity (WTUD (TUD rhs_ja uds) rhs) + = WUD (adjustTailUsage exact_join rhs uds) rhs + where + exact_join = mb_join_arity == JoinPoint rhs_ja -adjustTailUsage :: JoinPointHood - -> WithTailUsageDetails CoreExpr -- Rhs usage, AFTER occAnalLamTail +adjustTailUsage :: Bool -- True <=> Exactly-matching join point; don't do markNonTail + -> CoreExpr -- Rhs usage, AFTER occAnalLamTail -> UsageDetails -adjustTailUsage mb_join_arity (WTUD (TUD rhs_ja uds) rhs) + -> UsageDetails +adjustTailUsage exact_join rhs uds = -- c.f. occAnal (Lam {}) markAllInsideLamIf (not one_shot) $ markAllNonTailIf (not exact_join) $ uds where one_shot = isOneShotFun rhs - exact_join = mb_join_arity == JoinPoint rhs_ja adjustTailArity :: JoinPointHood -> TailUsageDetails -> UsageDetails adjustTailArity mb_rhs_ja (TUD ja usage) @@ -3914,8 +3985,9 @@ tagNonRecBinder lvl occ bndr tagRecBinders :: TopLevelFlag -- At top level? -> UsageDetails -- Of body of let ONLY -> [NodeDetails] - -> WithUsageDetails -- Adjusted details for whole scope, - -- with binders removed + -> WithUsageDetails -- Adjusted details for whole scope + -- still including the binders; + -- (they are removed by `addInScope`) [IdWithOccInfo] -- Tagged binders -- Substantially more complicated than non-recursive case. Need to adjust RHS -- details *before* tagging binders (because the tags depend on the RHSes). @@ -3925,32 +3997,21 @@ tagRecBinders lvl body_uds details_s -- 1. See Note [Join arity prediction based on joinRhsArity] -- Determine possible join-point-hood of whole group, by testing for - -- manifest join arity M. - -- This (re-)asserts that makeNode had made tuds for that same arity M! + -- manifest join arity MAr. + -- This (re-)asserts that makeNode had made tuds for that same arity MAr! unadj_uds = foldr (andUDs . test_manifest_arity) body_uds details_s - test_manifest_arity ND{nd_rhs = WTUD tuds rhs} - = adjustTailArity (JoinPoint (joinRhsArity rhs)) tuds + test_manifest_arity ND{nd_rhs = WTUD (TUD rhs_ja uds) rhs} + = assertPpr (rhs_ja == joinRhsArity rhs) (ppr rhs_ja $$ ppr uds $$ ppr rhs) $ + uds + will_be_joins :: Bool will_be_joins = decideRecJoinPointHood lvl unadj_uds bndrs - mb_join_arity :: Id -> JoinPointHood - -- mb_join_arity: See Note [Join arity prediction based on joinRhsArity] - -- This is the source O - mb_join_arity bndr - -- Can't use willBeJoinId_maybe here because we haven't tagged - -- the binder yet (the tag depends on these adjustments!) - | will_be_joins - , AlwaysTailCalled arity <- lookupTailCallInfo unadj_uds bndr - = JoinPoint arity - | otherwise - = assert (not will_be_joins) -- Should be AlwaysTailCalled if - NotJoinPoint -- we are making join points! - -- 2. Adjust usage details of each RHS, taking into account the -- join-point-hood decision - rhs_udss' = [ adjustTailUsage (mb_join_arity bndr) rhs_wuds + rhs_udss' = [ adjustTailUsage will_be_joins rhs rhs_uds -- Matching occAnalLamTail in makeNode - | ND { nd_bndr = bndr, nd_rhs = rhs_wuds } <- details_s ] + | ND { nd_rhs = WTUD (TUD _ rhs_uds) rhs } <- details_s ] -- 3. Compute final usage details from adjusted RHS details adj_uds = foldr andUDs body_uds rhs_udss' @@ -3969,9 +4030,9 @@ setBinderOcc occ_info bndr | otherwise = setIdOccInfo bndr occ_info -- | Decide whether some bindings should be made into join points or not, based --- on its occurrences. This is +-- on its occurrences. -- Returns `False` if they can't be join points. Note that it's an --- all-or-nothing decision, as if multiple binders are given, they're +-- all-or-nothing decision: if multiple binders are given, they are -- assumed to be mutually recursive. -- -- It must, however, be a final decision. If we say `True` for 'f', ===================================== compiler/GHC/Core/Opt/Simplify/Iteration.hs ===================================== @@ -4595,13 +4595,21 @@ mkLetUnfolding :: SimplEnv -> TopLevelFlag -> UnfoldingSource -> InId -> Bool -- True <=> this is a join point -> OutExpr -> SimplM Unfolding mkLetUnfolding env top_lvl src id is_join new_rhs - = return (mkUnfolding uf_opts src is_top_lvl is_bottoming is_join new_rhs Nothing) - -- We make an unfolding *even for loop-breakers*. - -- Reason: (a) It might be useful to know that they are WHNF - -- (b) In GHC.Iface.Tidy we currently assume that, if we want to - -- expose the unfolding then indeed we *have* an unfolding - -- to expose. (We could instead use the RHS, but currently - -- we don't.) The simple thing is always to have one. +-- | is_join +-- , UnfNever <- guidance +-- = -- For large join points, don't keep an unfolding at all if it is large +-- -- This is just an attempt to keep residency under control in +-- -- deeply-nested join-point such as those arising in #26425 +-- return NoUnfolding + + | otherwise + = return (mkCoreUnfolding src is_top_lvl new_rhs Nothing guidance) + -- We make an unfolding *even for loop-breakers*. + -- Reason: (a) It might be useful to know that they are WHNF + -- (b) In GHC.Iface.Tidy we currently assume that, if we want to + -- expose the unfolding then indeed we *have* an unfolding + -- to expose. (We could instead use the RHS, but currently + -- we don't.) The simple thing is always to have one. where -- !opts: otherwise, we end up retaining all the SimpleEnv !uf_opts = seUnfoldingOpts env @@ -4612,6 +4620,9 @@ mkLetUnfolding env top_lvl src id is_join new_rhs -- See Note [Force bottoming field] !is_bottoming = isDeadEndId id + is_top_bottoming = is_top_lvl && is_bottoming + guidance = calcUnfoldingGuidance uf_opts is_top_bottoming is_join new_rhs + ------------------- simplStableUnfolding :: SimplEnv -> BindContext -> InId ===================================== compiler/GHC/Types/Unique/FM.hs ===================================== @@ -53,7 +53,7 @@ module GHC.Types.Unique.FM ( plusUFM, strictPlusUFM, plusUFM_C, - strictPlusUFM_C, + strictPlusUFM_C, strictPlusUFM_C_Directly, plusUFM_CD, plusUFM_CD2, mergeUFM, @@ -281,6 +281,9 @@ plusUFM_C f (UFM x) (UFM y) = UFM (M.unionWith f x y) strictPlusUFM_C :: (elt -> elt -> elt) -> UniqFM key elt -> UniqFM key elt -> UniqFM key elt strictPlusUFM_C f (UFM x) (UFM y) = UFM (MS.unionWith f x y) +strictPlusUFM_C_Directly :: (Unique -> elt -> elt -> elt) -> UniqFM key elt -> UniqFM key elt -> UniqFM key elt +strictPlusUFM_C_Directly f (UFM x) (UFM y) = UFM (MS.unionWithKey (f . mkUniqueGrimily) x y) + -- | `plusUFM_CD f m1 d1 m2 d2` merges the maps using `f` as the -- combinding function and `d1` resp. `d2` as the default value if -- there is no entry in `m1` reps. `m2`. The domain is the union of ===================================== compiler/GHC/Types/Unique/Set.hs ===================================== @@ -40,6 +40,7 @@ module GHC.Types.Unique.Set ( lookupUniqSet_Directly, partitionUniqSet, mapUniqSet, + mapUniqSetToUFM, mapMaybeUniqSetToUFM, unsafeUFMToUniqSet, nonDetEltsUniqSet, nonDetKeysUniqSet, @@ -211,6 +212,14 @@ mapUniqSet f = mkUniqSet . map f . nonDetEltsUniqSet mapMaybeUniqSet_sameUnique :: (a -> Maybe b) -> UniqSet a -> UniqSet b mapMaybeUniqSet_sameUnique f (UniqSet a) = UniqSet $ mapMaybeUFM_sameUnique f a +mapUniqSetToUFM :: (a -> b) -> UniqSet a -> UniqFM a b +-- Same keys, new values +mapUniqSetToUFM f (UniqSet ufm) = mapUFM f ufm + +mapMaybeUniqSetToUFM :: (a -> Maybe b) -> UniqSet a -> UniqFM a b +-- Same keys, new values +mapMaybeUniqSetToUFM f (UniqSet ufm) = mapMaybeUFM f ufm + -- Two 'UniqSet's are considered equal if they contain the same -- uniques. instance Eq (UniqSet a) where ===================================== compiler/GHC/Types/Var/Env.hs ===================================== @@ -12,7 +12,8 @@ module GHC.Types.Var.Env ( elemVarEnv, disjointVarEnv, anyVarEnv, extendVarEnv, extendVarEnv_C, extendVarEnv_Acc, extendVarEnvList, - strictPlusVarEnv, plusVarEnv, plusVarEnv_C, strictPlusVarEnv_C, + strictPlusVarEnv, plusVarEnv, plusVarEnv_C, + strictPlusVarEnv_C, strictPlusVarEnv_C_Directly, plusVarEnv_CD, plusMaybeVarEnv_C, plusVarEnvList, alterVarEnv, delVarEnvList, delVarEnv, @@ -525,6 +526,7 @@ delVarEnv :: VarEnv a -> Var -> VarEnv a minusVarEnv :: VarEnv a -> VarEnv b -> VarEnv a plusVarEnv_C :: (a -> a -> a) -> VarEnv a -> VarEnv a -> VarEnv a strictPlusVarEnv_C :: (a -> a -> a) -> VarEnv a -> VarEnv a -> VarEnv a +strictPlusVarEnv_C_Directly :: (Unique -> a -> a -> a) -> VarEnv a -> VarEnv a -> VarEnv a plusVarEnv_CD :: (a -> a -> a) -> VarEnv a -> a -> VarEnv a -> a -> VarEnv a plusMaybeVarEnv_C :: (a -> a -> Maybe a) -> VarEnv a -> VarEnv a -> VarEnv a mapVarEnv :: (a -> b) -> VarEnv a -> VarEnv b @@ -552,6 +554,7 @@ extendVarEnv_Acc = addToUFM_Acc extendVarEnvList = addListToUFM plusVarEnv_C = plusUFM_C strictPlusVarEnv_C = strictPlusUFM_C +strictPlusVarEnv_C_Directly = strictPlusUFM_C_Directly plusVarEnv_CD = plusUFM_CD plusMaybeVarEnv_C = plusMaybeUFM_C delVarEnvList = delListFromUFM View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/ec080d05d53af029018ee11db9aeab1... -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/ec080d05d53af029018ee11db9aeab1... You're receiving this email because of your account on gitlab.haskell.org.
participants (1)
-
Simon Peyton Jones (@simonpj)