[Git][ghc/ghc][master] Clean up join points, casts & ticks
Marge Bot pushed to branch master at Glasgow Haskell Compiler / GHC Commits: 08bc245b by sheaf at 2026-03-01T11:11:54-05:00 Clean up join points, casts & ticks This commit shores up the logic dealing with casts and ticks occurring in between a join point binding and a jump. Fixes #26642 #26929 #26693 Makes progress on #14610 #26157 #26422 Changes: - Remove 'GHC.Types.Tickish.TickishScoping' in favour of simpler predicates 'tickishHasNoScope'/'tickishHasSoftScope', as things were before commit 993975d3. This makes the code easier to read and document (fewer indirections). - Introduce 'canCollectArgsThroughTick' for consistent handling of ticks around PrimOps and other 'Id's that cannot be eta-reduced. See overhauled Note [Ticks and mandatory eta expansion]. - New Note [JoinId vs TailCallInfo] in GHC.Core.SimpleOpt that explains robustness of JoinId vs fragility of TailCallInfo. - Allow casts/non-soft-scoped ticks to occur in between a join point binder and a jump, but only in Core Prep. See Note [Join points, casts, and ticks] and Note [Join points, casts, and ticks... in Core Prep] in GHC.Core.Opt.Simplify.Iteration. Also update Core Lint to account for this. See Note [Linting join points with casts or ticks] in GHC.Core.Lint. - Update 'GHC.Core.Utils.mergeCaseAlts' to avoid pushing a cast in between a join point binding and its jumps. This fixes #26642. See the new (MC5) and (MC6) in Note [Merge Nested Cases]. - Update float out to properly handle source note ticks. They are now properly floated out instead of being discarded. This increases the number of ticks in certain tests with -g. Test cases: T26642 and TrickyJoins. Metric increase due to more source note ticks with -g: ------------------------- Metric Increase: libdir size_hello_artifact size_hello_unicode ------------------------- - - - - - 21 changed files: - compiler/GHC/Cmm/Node.hs - compiler/GHC/Core.hs - compiler/GHC/Core/Lint.hs - compiler/GHC/Core/Opt/Arity.hs - compiler/GHC/Core/Opt/FloatIn.hs - compiler/GHC/Core/Opt/FloatOut.hs - compiler/GHC/Core/Opt/OccurAnal.hs - compiler/GHC/Core/Opt/Simplify/Iteration.hs - compiler/GHC/Core/SimpleOpt.hs - compiler/GHC/Core/Utils.hs - compiler/GHC/CoreToStg/Prep.hs - compiler/GHC/Driver/Config/Core/Lint.hs - compiler/GHC/Iface/Tidy.hs - compiler/GHC/StgToCmm/Expr.hs - compiler/GHC/Types/Basic.hs - compiler/GHC/Types/Id/Info.hs - compiler/GHC/Types/Tickish.hs - testsuite/tests/codeGen/should_compile/debug.stdout - + testsuite/tests/simplCore/should_compile/T26642.hs - + testsuite/tests/simplCore/should_compile/TrickyJoins.hs - testsuite/tests/simplCore/should_compile/all.T Changes: ===================================== compiler/GHC/Cmm/Node.hs ===================================== @@ -819,8 +819,8 @@ data CmmTickScope | SubScope !U.Unique CmmTickScope -- ^ Constructs a new sub-scope to an existing scope. This allows - -- us to translate Core-style scoping rules (see @tickishScoped@) - -- into the Cmm world. Suppose the following code: + -- us to translate Core-style scoping rules (see Note [Scoping ticks and counting ticks] + -- in GHC.Types.Tickish) into the Cmm world. Suppose the following code: -- -- tick<1> case ... of -- A -> tick<2> ... ===================================== compiler/GHC/Core.hs ===================================== @@ -1035,6 +1035,143 @@ tail position: A cast changes the type, but the type must be the same. But operationally, casts are vacuous, so this is a bit unfortunate! See #14610 for ideas how to fix this. +Note [Join points, casts, and ticks] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Point (1) of Note [Invariants on join points] says that a join point +must always be tail called. But what precisely does "tail called" mean +in the presence of (a) casts and (b) ticks? + +Example (CAST) + let j x = rhs in + case y of { True -> j 1 |> co; False -> j 2 } + +Example (TICK) + let j x = rhs in + case y of { True -> <tick t> (j 1); False -> j 2 } + +Answer: in Core: + + (JCT1) A tail call cannot be under a cast. + + Thus, in (CAST), `j` is not a join point. + + (JCT2) A tail call cannot be under a cost-centre-scoped tick. + + Thus, in (TICK), `j` is a join point only if tick `t` has soft scope + (as per Note [Scoping ticks and counting ticks] in GHC.Tickish). + +The Big Reason for these choices is that the Simplifier moves the continuation +into the RHS of a join point, as explained in Note [Join points and case-of-case] +in GHC.Core.Opt.Simplify.Iteration: + + K[ join j x = rhs in body ] --> join j x = K[rhs] in K[body] + +and K then evaporates when it encounters the tail call: + + K[jump j v] --> jump j v + +These transformations: + * Are ill-typed if the tail is under a cast, hence (JCT1) + * Change cost semantics if the tick has cost-centre scope, hence (JCT2) + +The occurrence analyser is careful not to treat an occurrence as a tail call if +it falls under (JCT1) or (JCT2), by using 'markAllNonTail'. + +However, during /code generation/ the key thing about a join point is that + * The binding does no allocation + * A tail call can be implemented by "adjust stack pointer and jump". + +This code-gen strategy works fine even if the "tail call" occurs under +/arbitrary/ ticks and casts. Hence: + +(JCT3) In CorePrep, the occurrence analyser is called with a special flag that + /does/ treat `j` as tail-called in Example (CAST) and Example (TICK). + Core Prep then uses 'joinPointBinding_maybe', which turns always-tail-called + let bindings into join points, thus recovering join-point-hood. + +See also Note [Linting join points with casts or ticks] in GHC.Core.Lint. + +Examples +======== + + Join point jumps under ticks (#14242, #26157, #26642, #26693) + ============================ + In #26693 we had: + + join { j :: Bool -> Int -> IO (); j _ = guts } + in case b of + False -> scc<foo> jump j True + True -> jump j False + + If we try to push the application to an argument 'arg :: Int' into this + expression, we first get: + + join { j :: Bool -> IO (); j _ = guts arg ] } + in case b of + False -> (scc<foo> jump j True) arg + True -> jump j False arg + + We then rely on 'trimJoinCont' to remove the argument. In this case, this fails + for the first branch, because 'trimJoinCont' doesn't look through profiling + ticks. Were we to address this, it's still not clear what code we would want to + end up with, as we don't want to misattribute profiling costs. + We could plausibly transform to the following: + + join { j :: Bool -> IO (); j scc_or_null _ = (setSCC# scc_or_null guts) arg ] } + in case b of + False -> jump j <foo> True + True -> jump j null False + + where `setSCC#` is a new primop that would set the current cost centre pointer + (or no-op if the given pointer is null). However: + - this primop doesn't exist today, + - it requires adding an argument to the join point (hence changing its arity) + + Note that soft scope ticks are floated out by the simplifier (see the + 'tickishHasSoftScope' guard in 'GHC.Core.Opt.Simplify.Iteration.simplTick'), + so don't suffer from the same problem. + + Join point jumps under casts (#14610, #21716, #26422) + ============================ + Consider: + + newtype Age = MkAge Int -- axAge :: Age ~ Int + f :: Int -> ... + + f (join j :: Bool -> Age + j x = (rhs1 :: Age) + in case v of + Just x -> ((j x) |> axAge) :: Int + Nothing -> rhs2) + + If we try to use the case of case transformation to push 'f' inwards, we would + get: + + join j' x = f (rhs1 :: Age) + in case v of + Just x -> (j' x |> axAge) + Nothing -> f rhs2 + + which is utterly bogus, as we are now passing an argument of type 'Age' to + 'f', which expects an 'Int'. + + The alternative would be to implement a transformation of the form + + join { j x = blah } + in case e of + False -> j True |> co1 + True -> j False |> co2 + + ====> + + join { j x co = blah |> co } + in case e of + False -> j True co1 + True -> j False co2 + + by adding a coercion argument to the join point. We don't do this currently. + + Note [Strict fields in Core] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In Core, evaluating a data constructor worker evaluates its strict fields. ===================================== compiler/GHC/Core/Lint.hs ===================================== @@ -106,6 +106,7 @@ import Data.List.NonEmpty ( NonEmpty(..), groupWith, nonEmpty ) import Data.Maybe import Data.IntMap.Strict ( IntMap ) import qualified Data.IntMap.Strict as IntMap ( lookup, keys, empty, fromList ) +import GHC.Types.Unique.Map {- Note [Core Lint guarantee] @@ -914,8 +915,8 @@ lintCoreExpr (Lit lit) ; return (literalType lit, zeroUE) } lintCoreExpr (Cast expr co) - = do { (expr_ty, ue) <- markAllJoinsBad (lintCoreExpr expr) - -- markAllJoinsBad: see Note [Join points and casts] + = do { (expr_ty, ue) <- markAllJoinsUnderCast (lintCoreExpr expr) + -- markAllJoinsUnderCast: see Note [Linting join points with casts or ticks] ; lintCoercion co ; lintRole co Representational (coercionRole co) @@ -929,14 +930,7 @@ lintCoreExpr (Tick tickish expr) = do { case tickish of Breakpoint _ _ ids -> forM_ ids $ \id -> lintIdOcc id 0 _ -> return () - ; markAllJoinsBadIf block_joins $ lintCoreExpr expr } - where - block_joins = not (tickish `tickishScopesLike` SoftScope) - -- TODO Consider whether this is the correct rule. It is consistent with - -- the simplifier's behaviour - cost-centre-scoped ticks become part of - -- the continuation, and thus they behave like part of an evaluation - -- context, but soft-scoped and non-scoped ticks simply wrap the result - -- (see Simplify.simplTick). + ; markAllJoinsUnderTick tickish $ lintCoreExpr expr } lintCoreExpr (Let (NonRec tv (Type ty)) body) | isTyVar tv @@ -1017,22 +1011,16 @@ lintCoreExpr e@(App _ _) ; return app_pair} where - skipTick t = case collectFunSimple e of - (Var v) -> etaExpansionTick v t - _ -> tickishFloatable t - (fun, args, _source_ticks) = collectArgsTicks skipTick e - -- We must look through source ticks to avoid #21152, for example: - -- - -- reallyUnsafePtrEquality - -- = \ @a -> - -- (src<loc> reallyUnsafePtrEquality#) - -- @Lifted @a @Lifted @a + skipTick t = + case collectFunSimple e of + Var v -> canCollectArgsThroughTick v t + _ -> tickishFloatable t + (fun, args, _ticks) = collectArgsTicks skipTick e + -- We must look through ticks, otherwise we may fail to spot a + -- saturated application. We use 'canCollectArgsThroughTicks', which is + -- the same predicate that Core Prep uses. -- - -- To do this, we use `collectArgsTicks tickishFloatable` to match - -- the eta expansion behaviour, as per Note [Eta expansion and source notes] - -- in GHC.Core.Opt.Arity. - -- Sadly this was not quite enough. So we now also accept things that CorePrep will allow. - -- See Note [Ticks and mandatory eta expansion] + -- See Note [Ticks and mandatory eta expansion] in GHC.CoreToStg.Prep. lintCoreExpr (Lam var expr) = markAllJoinsBad $ @@ -1131,7 +1119,7 @@ checkDeadIdOcc id ------------------ lintJoinBndrType :: OutType -- Type of the body -> OutId -- Possibly a join Id - -> LintM () + -> LintM () -- Checks that the return type of a join Id matches the body -- E.g. join j x = rhs in body -- The type of 'rhs' must be the same as the type of 'body' @@ -1139,13 +1127,29 @@ lintJoinBndrType body_ty bndr | JoinPoint arity <- idJoinPointHood bndr , let bndr_ty = idType bndr , (bndrs, res) <- splitPiTys bndr_ty - = do let msg = - hang (text "Join point returns different type than body") - 2 (vcat [ text "Join bndr:" <+> ppr bndr <+> dcolon <+> ppr (idType bndr) - , text "Join arity:" <+> ppr arity - , text "Body type:" <+> ppr body_ty ]) - checkL (length bndrs >= arity) msg - ensureEqTys body_ty (mkPiTys (drop arity bndrs) res) msg + = do let + ty_msg = + hang (text "Join point returns different type than body") + 2 (vcat [ text "Join bndr:" <+> ppr bndr <+> dcolon <+> ppr (idType bndr) + , text "Join arity:" <+> ppr arity + , text "Body type:" <+> ppr body_ty ]) + arity_msg = + hang (text "Join point is not saturated") + 2 (vcat [ text "Join bndr:" <+> ppr bndr <+> dcolon <+> ppr (idType bndr) + , text "Join arity:" <+> ppr arity + , text "Arguments:" <+> ppr bndrs ]) + + mb_join_info <- lookupJoinId bndr + case mb_join_info of + Nothing -> + pprPanic "lintJoinBndrType: valid join marked bad" (ppr bndr) + Just (_, occ_info) -> do + checkL (length bndrs >= arity) arity_msg + + -- See Note [Linting join points with casts or ticks] for why + -- we skip this check if there is an intervening cast. + unless (occ_info == JoinOccUnderCast) $ + ensureEqTys body_ty (mkPiTys (drop arity bndrs) res) ty_msg | otherwise = return () @@ -1156,11 +1160,11 @@ checkJoinOcc var n_args | JoinPoint join_arity_occ <- idJoinPointHood var = do { mb_join_arity_bndr <- lookupJoinId var ; case mb_join_arity_bndr of { - NotJoinPoint -> do { join_set <- getValidJoins - ; addErrL (text "join set " <+> ppr join_set $$ - invalidJoinOcc var) } ; + Nothing -> do { valid_joins <- getValidJoins + ; addErrL (text "valid joins:" <+> ppr valid_joins $$ + invalidJoinOcc var) } ; - JoinPoint join_arity_bndr -> + Just (join_arity_bndr, _join_occ) -> do { checkL (join_arity_bndr == join_arity_occ) $ -- Arity differs at binding site and occurrence @@ -1333,39 +1337,34 @@ checkLinearity body_ue lam_var = return body_ue' Nothing -> return body_ue -- A type variable -{- Note [Join points and casts] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -You might think that this should be OK: - join j x = rhs - in (case e of - A -> alt1 - B x -> (jump j x) |> co) +{- Note [Linting join points with casts or ticks] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +As per Note [Join points, casts, and ticks] in GHC.Core, we have to be careful +when a cast or tick occurs in between a join point binding and a corresponding +join point occurrence. -You might think that, since the cast is ultimately erased, the jump to -`j` should still be OK as a join point. But no! See #21716. Suppose +Generally speaking: - newtype Age = MkAge Int -- axAge :: Age ~ Int - f :: Int -> ... -- f strict in it's first argument + - The simplifier cannot handle intervening casts or non-soft-scope ticks, so + we must check for that to avoid producing invalid Core. + - However, as per (JCT3), Core Prep **can** produce join points with + intervening casts or non-soft-scope ticks, which means we must expect them. -and consider the expression +Casts present an additional challenge. Consider for example: - f (join j :: Bool -> Age - j x = (rhs1 :: Age) - in case v of - Just x -> (j x |> axAge :: Int) - Nothing -> rhs2) + join { j :: Bool -> Age; j x = (blah :: Age) } + in case e of + False -> j True |> (co1 :: Age ~ Int) + True -> other :: Int -Then, if the Simplifier pushes the strict call into the join points -and alternatives we'll get +It is **not** the case that the type of 'blah' is the same as the type of +the body of the join point binding! Indeed: - join j' x = f (rhs1 :: Age) - in case v of - Just x -> j' x |> axAge - Nothing -> f rhs2 + - RHS of the join-point binding: blah :: Age + - The body of the join point has type Int. -Utterly bogus. `f` expects an `Int` and we are giving it an `Age`. -No no no. Casts destroy the tail-call property. Henc markAllJoinsBad -in the (Cast expr co) case of lintCoreExpr. +So we skip the 'exprType(join_rhs) == exprType(join_body)' check when casts +occur in between. Note [No alternatives lint check] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -2977,9 +2976,10 @@ data LintEnv -- type variables, and coercion variables) -- Used at an occurrence of the InVar - , le_joins :: IdSet -- Join points in scope that are valid - -- A subset of the InScopeSet in le_subst - -- See Note [Join points] + , le_joins :: UniqMap Id JoinOcc + -- ^ Join points in scope that are valid + -- A subset of the InScopeSet in le_subst + -- See Note [Join points] , le_ue_aliases :: NameEnv UsageEnv -- See Note [Linting linearity] @@ -2999,6 +2999,7 @@ data LintFlags , lf_check_linearity :: Bool -- ^ See Note [Linting linearity] , lf_check_fixed_rep :: Bool -- ^ See Note [Checking for representation polymorphism] , lf_check_rubbish_lits :: Bool -- ^ See Note [Checking for rubbish literals] + , lf_allow_weak_joins :: Bool -- ^ See Note [Linting join points with casts or ticks] } -- See Note [Checking StaticPtrs] @@ -3307,6 +3308,20 @@ data LintLocInfo | InCo Coercion -- Inside a coercion | InAxiom (CoAxiom Branched) -- Inside a CoAxiom +-- | Does this join point 'Id' occur inside a cast? +-- +-- See Note [Linting join points with casts or ticks]. +data JoinOcc + -- | A normal occurrence of a 'JoinId'. + = NormalJoinOcc + -- | An occurrence of a 'JoinId' with an intervening cast between the + -- join point binder definition and the jump. + | JoinOccUnderCast + deriving stock Eq +instance Outputable JoinOcc where + ppr NormalJoinOcc = text "Normal" + ppr JoinOccUnderCast = text "UnderCast" + data LintConfig = LintConfig { l_diagOpts :: !DiagOpts -- ^ Diagnostics opts , l_platform :: !Platform -- ^ Target platform @@ -3328,7 +3343,7 @@ initL cfg m env = LE { le_flags = l_flags cfg , le_subst = mkEmptySubst (mkInScopeSetList vars) , le_in_vars = mkVarEnv [ (v,(v, varType v)) | v <- vars ] - , le_joins = emptyVarSet + , le_joins = emptyUniqMap , le_loc = [] , le_ue_aliases = emptyNameEnv , le_platform = l_platform cfg @@ -3428,11 +3443,11 @@ addInScopeId in_id out_ty thing_inside in unLintM (thing_inside out_id) env' errs where - add env@(LE { le_in_vars = id_vars, le_joins = join_set + add env@(LE { le_in_vars = id_vars, le_joins = valid_joins , le_ue_aliases = aliases, le_subst = subst }) = (out_id, env1) where - env1 = env { le_in_vars = in_vars', le_joins = join_set', le_ue_aliases = aliases' } + env1 = env { le_in_vars = in_vars', le_joins = valid_joins', le_ue_aliases = aliases' } in_vars' = extendVarEnv id_vars in_id (in_id, out_ty) aliases' = delFromNameEnv aliases (idName in_id) @@ -3446,9 +3461,9 @@ addInScopeId in_id out_ty thing_inside out_id | isEmptyTCvSubst subst = in_id | otherwise = setIdType in_id out_ty - join_set' - | isJoinId out_id = extendVarSet join_set in_id -- Overwrite with new arity - | otherwise = delVarSet join_set in_id -- Remove any existing binding + valid_joins' + | isJoinId out_id = addToUniqMap valid_joins in_id NormalJoinOcc -- Overwrite with new arity + | otherwise = delFromUniqMap valid_joins in_id -- Remove any existing binding addInScopeTyCoVar :: InTyCoVar -> OutType -> (OutTyCoVar -> LintM a) -> LintM a -- This function clones to avoid shadowing of TyCoVars @@ -3485,13 +3500,35 @@ extendTvSubstL tv ty m markAllJoinsBad :: LintM a -> LintM a markAllJoinsBad m - = LintM $ \ env errs -> unLintM m (env { le_joins = emptyVarSet }) errs + = LintM $ \ env errs -> unLintM m (env { le_joins = emptyUniqMap }) errs + +-- | Mark all join points as occurring under a tick. +-- +-- See Note [Linting join points with casts or ticks]. +markAllJoinsUnderTick :: CoreTickish -> LintM a -> LintM a +markAllJoinsUnderTick tick m + = LintM $ \ env errs -> + let env' = if tickishHasSoftScope tick || lf_allow_weak_joins (le_flags env) + then env + else env { le_joins = emptyUniqMap } + in unLintM m env' errs + +-- | Mark all join points as occurring under a cast. +-- +-- See Note [Linting join points with casts or ticks]. +markAllJoinsUnderCast :: LintM a -> LintM a +markAllJoinsUnderCast m + = LintM $ \ env errs -> + let !env' = if lf_allow_weak_joins (le_flags env) + then env { le_joins = fmap (const JoinOccUnderCast) (le_joins env) } + else env { le_joins = emptyUniqMap } + in unLintM m env' errs markAllJoinsBadIf :: Bool -> LintM a -> LintM a markAllJoinsBadIf True m = markAllJoinsBad m markAllJoinsBadIf False m = m -getValidJoins :: LintM IdSet +getValidJoins :: LintM (UniqMap Id JoinOcc) getValidJoins = LintM (\ env errs -> fromBoxedLResult (Just (le_joins env), errs)) getSubst :: LintM Subst @@ -3552,14 +3589,14 @@ lintVarOcc v_occ | otherwise = return () -lookupJoinId :: Id -> LintM JoinPointHood +lookupJoinId :: Id -> LintM (Maybe (JoinArity, JoinOcc)) -- Look up an Id which should be a join point, valid here -- If so, return its arity, if not return Nothing lookupJoinId id - = do { join_set <- getValidJoins - ; case lookupVarSet join_set id of - Just id' -> return (idJoinPointHood id') - Nothing -> return NotJoinPoint } + = do { valid_joins <- getValidJoins + ; case lookupUniqMap valid_joins id of + Just join_occ -> return $ Just (idJoinArity id, join_occ) + Nothing -> return Nothing } addAliasUE :: OutId -> UsageEnv -> LintM a -> LintM a addAliasUE id ue thing_inside = LintM $ \ env errs -> ===================================== compiler/GHC/Core/Opt/Arity.hs ===================================== @@ -90,7 +90,6 @@ import GHC.Utils.Misc import Data.List.NonEmpty ( nonEmpty ) import qualified Data.List.NonEmpty as NE -import Data.Maybe( isJust ) {- ************************************************************************ @@ -2835,21 +2834,6 @@ tryEtaReduce rec_ids bndrs body eval_sd ok_arg _ _ _ _ = Nothing --- | Can we eta-reduce the given function --- See Note [Eta reduction soundness], criteria (B), (J), and (W). -cantEtaReduceFun :: Id -> Bool -cantEtaReduceFun fun - = hasNoBinding fun -- (B) - -- Don't undersaturate functions with no binding. - - || isJoinId fun -- (J) - -- Don't undersaturate join points. - -- See Note [Invariants on join points] in GHC.Core, and #20599 - - || (isJust (idCbvMarks_maybe fun)) -- (W) - -- Don't undersaturate StrictWorkerIds. - -- See Note [CBV Function Ids: overview] in GHC.Types.Id.Info. - {- ********************************************************************* * * ===================================== compiler/GHC/Core/Opt/FloatIn.hs ===================================== @@ -375,7 +375,7 @@ We don't float lets inwards past an SCC. -} fiExpr platform to_drop (_, AnnTick tickish expr) - | tickish `tickishScopesLike` SoftScope + | tickishHasSoftScope tickish = Tick tickish (fiExpr platform to_drop expr) | otherwise -- Wimp out for now - we could push values in ===================================== compiler/GHC/Core/Opt/FloatOut.hs ===================================== @@ -365,25 +365,28 @@ floatExpr lam@(Lam (TB _ lam_spec) _) (add_to_stats fs floats, floats, mkLams bndrs body') } floatExpr (Tick tickish expr) - | tickish `tickishScopesLike` SoftScope -- not scoped, can just float + -- If possible, float out past the tick + | let float_out_of_tick + -- See Note [Floating past breakpoints] + | Breakpoint{} <- tickish + = True + | otherwise + -- We can float code out of non-scoped ticks + = tickishHasNoScope tickish + , float_out_of_tick = case (floatExpr expr) of { (fs, floating_defns, expr') -> (fs, floating_defns, Tick tickish expr') } - | not (tickishCounts tickish) || tickishCanSplit tickish - = case (floatExpr expr) of { (fs, floating_defns, expr') -> - let -- Annotate bindings floated outwards past an scc expression - -- with the cc. We mark that cc as "duplicated", though. - annotated_defns = wrapTick (mkNoCount tickish) floating_defns + -- We can't move code out of the tick + | otherwise + = assert (not (tickishCounts tickish) || tickishCanSplit tickish) $ + case (floatExpr expr) of { (fs, floating_defns, expr') -> + -- Wrap floated code with the correct tick scope, but using 'mkNoCount' + -- to ensure we don't duplicate counters. + let annotated_defns = wrapTick (mkNoCount tickish) floating_defns in (fs, annotated_defns, Tick tickish expr') } - -- See Note [Floating past breakpoints] - | Breakpoint{} <- tickish - = case (floatExpr expr) of { (fs, floating_defns, expr') -> - (fs, floating_defns, Tick tickish expr') } - - | otherwise - = pprPanic "floatExpr tick" (ppr tickish) floatExpr (Cast expr co) = case (floatExpr expr) of { (fs, floating_defns, expr') -> @@ -661,7 +664,8 @@ partitionByLevel (Level major minor) (FB tops defns) wrapTick :: CoreTickish -> FloatBinds -> FloatBinds wrapTick t (FB tops defns) - = FB (mapBag wrap_bind tops) + = assert (not $ tickishCounts t) $ + FB (mapBag wrap_bind tops) (M.map (M.map wrap_defns) defns) where wrap_defns = mapBag wrap_one @@ -672,10 +676,13 @@ wrapTick t (FB tops defns) wrap_one (FloatLet bind) = FloatLet (wrap_bind bind) wrap_one (FloatCase e b c bs) = FloatCase (maybe_tick e) b c bs - maybe_tick e | exprIsHNF e = tickHNFArgs t e - | otherwise = mkTick t e - -- we don't need to wrap a tick around an HNF when we float it - -- outside a tick: that is an invariant of the tick semantics + maybe_tick + -- We don't need to wrap an SCC tick around HNFs that we floated out of + -- the SCC, as that is an invariant of the semantics for SCCs. -- Conversely, inlining of HNFs inside an SCC is allowed, and -- indeed the HNF we're floating here might well be inlined back -- again, and we don't want to end up with duplicate ticks. + | tickishPlace t == PlaceCostCentre + = mkTickNoHNF t + | otherwise + = mkTick t ===================================== compiler/GHC/Core/Opt/OccurAnal.hs ===================================== @@ -27,7 +27,7 @@ core expression with (hopefully) improved usage information. module GHC.Core.Opt.OccurAnal ( occurAnalysePgm, - occurAnalyseExpr, + occurAnalyseExpr, occurAnalyseExpr_Prep, zapLambdaBndrs ) where @@ -85,6 +85,15 @@ occurAnalyseExpr expr = expr' where WUD _ expr' = occAnal initOccEnv expr +-- | A version of 'occurAnalyseExpr' suitable for CorePrep. +-- +-- Different from 'occurAnalyseExpr' due to (JCT3) +-- in Note [Join points, casts, and ticks] in GHC.Core. +occurAnalyseExpr_Prep :: CoreExpr -> CoreExpr +occurAnalyseExpr_Prep expr = expr' + where + WUD _ expr' = occAnal (initOccEnv { occ_allow_weak_joins = True }) expr + occurAnalysePgm :: Module -- Used only in debug output -> (Id -> Bool) -- Active unfoldings -> (ActivationGhc -> Bool) -- Active rules @@ -2300,12 +2309,8 @@ occ_anal_lam_tail env (Cast expr co) Var {} | isRhsEnv env -> markAllMany usage1 _ -> usage1 - -- usage3: you might think this was not necessary, because of - -- the markAllNonTail in adjustTailUsage; but not so! For a - -- join point, adjustTailUsage doesn't do this; yet if there is - -- a cast, we must! Also: why markAllNonTail? See - -- GHC.Core.Lint: Note Note [Join points and casts] - usage3 = markAllNonTail usage2 + -- usage3: see (JCT1) in Note [Join points, casts, and ticks] in GHC.Core. + usage3 = markAllNonTail_CastOrTick env usage2 in WUD usage3 (Cast expr' co) @@ -2587,42 +2592,39 @@ But it is not necessary to gather CoVars from the types of other binders. -} occAnal env (Tick tickish body) - = WUD usage' (Tick tickish body') + = WUD usage2 (Tick tickish body') where WUD usage body' = occAnal env body - usage' - | tickish `tickishScopesLike` SoftScope - = usage -- For soft-scoped ticks (including SourceNotes) we don't want - -- to lose join-point-hood, so we don't mess with `usage` (#24078) + usage1 + -- We don't want to lose join-point-hood. We can move soft-scoped ticks + -- out of the way, so don't mess with `usage` (#24078). + | tickishHasSoftScope tickish + = usage - -- For a non-soft tick scope, we can inline lambdas only, so we - -- abandon tail calls, and do markAllInsideLam too: usage_lam + -- Otherwise, we can inline lambdas only, so use 'markAllInsideLam'. + | otherwise + = markAllNonTail_CastOrTick env $ markAllInsideLam usage + -- markAllNonTail_CastOrTick: abandon tail calls. + -- See (JCT2) in Note [Join points, casts, and ticks] in GHC.Core. + usage2 | Breakpoint _ _ ids <- tickish = -- Never substitute for any of the Ids in a Breakpoint - addManyOccs usage_lam (mkVarSet ids) + addManyOccs usage1 (mkVarSet ids) | otherwise - = usage_lam - - usage_lam = markAllNonTail (markAllInsideLam usage) - - -- TODO There may be ways to make ticks and join points play - -- nicer together, but right now there are problems: - -- let j x = ... in tick<t> (j 1) - -- Making j a join point may cause the simplifier to drop t - -- (if the tick is put into the continuation). So we don't - -- count j 1 as a tail call. - -- See #14242. + = usage1 occAnal env (Cast expr co) - = let (WUD usage expr') = occAnal env expr - usage1 = addManyOccs usage (coVarsOfCo co) - -- usage2: see Note [Gather occurrences of coercion variables] - usage2 = markAllNonTail usage1 - -- usage3: calls inside expr aren't tail calls any more - in WUD usage2 (Cast expr' co) + = let + WUD usage expr' = occAnal env expr + -- usage1: see Note [Gather occurrences of coercion variables] + usage1 = addManyOccs usage (coVarsOfCo co) + -- usage2: see (JCT1) in Note [Join points, casts, and ticks] in GHC.Core. + usage2 = markAllNonTail_CastOrTick env usage1 + in + WUD usage2 (Cast expr' co) occAnal env app@(App _ _) = occAnalApp env (collectArgsTicks tickishFloatable app) @@ -2936,6 +2938,11 @@ data OccEnv , occ_rule_act :: ActivationGhc -> Bool -- Which rules are active -- See Note [Finding rule RHS free vars] + , occ_allow_weak_joins :: !Bool + -- ^ Allow a join point jump to occur inside casts or profiling ticks? + -- + -- See (JCT3) in Note [Join points, casts, and ticks] in GHC.Core.Opt. + -- See Note [The binder-swap substitution] -- If x :-> (y, co) is in the env, -- then please replace x by (y |> mco) @@ -3003,6 +3010,8 @@ initOccEnv , occ_unf_act = \_ -> True , occ_rule_act = \_ -> True + , occ_allow_weak_joins = False + , occ_join_points = emptyVarEnv , occ_bs_env = emptyVarEnv , occ_bs_rng = emptyVarSet @@ -3026,6 +3035,15 @@ setScrutCtxt !env alts -- non-default alternative. That in turn influences -- pre/postInlineUnconditionally. Grep for "occ_int_cxt"! +-- | Mark occurrences under a cast/non-soft-scope tick as non-tail-called, +-- except if 'occ_allow_weak_joins = True'. +-- +-- See Note [Join points, casts, and ticks] in GHC.Core. +markAllNonTail_CastOrTick :: OccEnv -> UsageDetails -> UsageDetails +markAllNonTail_CastOrTick env = + markAllNonTailIf + (not $ occ_allow_weak_joins env) + {- Note [The OccEnv for a right hand side] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ How do we create the OccEnv for a RHS (in mkRhsOccEnv)? @@ -4075,7 +4093,10 @@ okForJoinPoint :: TopLevelFlag -> Id -> TailCallInfo -> Bool -- See Note [Invariants on join points]; invariants cited by number below. -- Invariant 2 is always satisfiable by the simplifier by eta expansion. okForJoinPoint lvl bndr tail_call_info - | isJoinId bndr -- A current join point should still be one! + -- A current join point should still be one! + -- + -- See Note [JoinId vs TailCallInfo] in GHC.Core.SimpleOpt. + | isJoinId bndr = warnPprTrace lost_join "Lost join point" lost_join_doc $ True | valid_join ===================================== compiler/GHC/Core/Opt/Simplify/Iteration.hs ===================================== @@ -814,9 +814,9 @@ prepareRhs env top_lvl occ rhs0 = return (emptyLetFloats, Var fun) anfise (Tick t rhs) - -- We want to be able to float bindings past this - -- tick. Non-scoping ticks don't care. - | tickishScoped t == NoScope + -- We want to be able to float bindings past this tick. + -- Non-scoping ticks don't care. + | tickishHasNoScope t = do { (floats, rhs') <- anfise rhs ; return (floats, Tick t rhs') } @@ -1413,7 +1413,7 @@ simplTick env tickish expr cont -- bottom, then rebuildCall will discard the continuation. -------------------------- --- | tickishScoped tickish && not (tickishCounts tickish) +-- | not (tickishHasNoScope tickish) && not (tickishCounts tickish) -- = simplExprF env expr (TickIt tickish cont) -- XXX: we cannot do this, because the simplifier assumes that -- the context can be pushed into a case with a single branch. e.g. @@ -1425,12 +1425,11 @@ simplTick env tickish expr cont -- simplifier iterations that necessary in some cases. -------------------------- - -- For unscoped or soft-scoped ticks, we are allowed to float in new - -- cost, so we simply push the continuation inside the tick. This - -- has the effect of moving the tick to the outside of a case or - -- application context, allowing the normal case and application - -- optimisations to fire. - | tickish `tickishScopesLike` SoftScope + -- For soft-scoped ticks, we are allowed to float in new cost, so we simply + -- push the continuation inside the tick. This has the effect of moving the + -- tick to the outside of a case or application context, allowing the normal + -- 'case' and 'application' optimisations to fire. + | tickishHasSoftScope tickish = do { (floats, expr') <- simplExprF env expr cont ; return (floats, mkTick tickish expr') } @@ -1459,14 +1458,14 @@ simplTick env tickish expr cont _other -> Nothing where (ticks, expr0) = stripTicksTop movable (Tick tickish expr) movable t = not (tickishCounts t) || - t `tickishScopesLike` NoScope || + tickishHasNoScope t || tickishCanSplit t tickScrut e = foldr mkTick e ticks -- Alternatives get annotated with all ticks that scope in some way, -- but we don't want to count entries. tickAlt (Alt c bs e) = Alt c bs (foldr mkTick e ts_scope) ts_scope = map mkNoCount $ - filter (not . (`tickishScopesLike` NoScope)) ticks + filter (not . tickishHasNoScope) ticks no_floating_past_tick = do { let (inc,outc) = splitCont cont @@ -2180,16 +2179,15 @@ evaluation context E): As is evident from the example, there are two components to this behavior: - 1. When entering the RHS of a join point, copy the context inside. - 2. When a join point is invoked, discard the outer context. + (wrapJoinCont) When entering the RHS of a join point, copy the context inside. + (trimJoinCont) When a join point is invoked, discard the outer context. We need to be very careful here to remain consistent---neither part is optional! -We need do make the continuation E duplicable (since we are duplicating it) +We need to make the continuation E duplicable (since we are duplicating it) with mkDupableCont. - Note [Join points with -fno-case-of-case] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Supose case-of-case is switched off, and we are simplifying @@ -2213,7 +2211,8 @@ case-of-case we may then end up with this totally bogus result This would be OK in the language of the paper, but not in GHC: j is no longer a join point. We can only do the "push continuation into the RHS of the join point j" if we also push the continuation right down to the /jumps/ to -j, so that it can evaporate there. If we are doing case-of-case, we'll get to +j, so that it can evaporate there (trimJoinCont). Then, if we are doing +case-of-case, we'll get to: join x = case <j-rhs> of <outer-alts> in case y of @@ -3656,9 +3655,11 @@ addBinderUnfolding env bndr unf = modifyInScope env (bndr `setIdUnfolding` unf) zapBndrOccInfo :: Bool -> Id -> Id --- Consider case e of b { (a,b) -> ... } --- Then if we bind b to (a,b) in "...", and b is not dead, --- then we must zap the deadness info on a,b +-- ^ Consider: +-- > case e of e' { (a,b) -> rhs } +-- +-- We bind @e'@ to @(a,b)@ in @rhs@. If @e'@ is not dead, +-- then we must zap the deadness info on @a@ and @b@. zapBndrOccInfo keep_occ_info pat_id | keep_occ_info = pat_id | otherwise = zapIdOccInfo pat_id ===================================== compiler/GHC/Core/SimpleOpt.hs ===================================== @@ -437,7 +437,7 @@ simple_app env e@(Lam {}) [] simple_app env (Tick t e) as -- Okay to do "(Tick t e) x ==> Tick t (e x)"? - | t `tickishScopesLike` SoftScope + | tickishHasSoftScope t = mkTick t $ simple_app env e as -- (let x = e in b) a1 .. an => let x = e in (b a1 .. an) @@ -1059,23 +1059,33 @@ and again its arity increases (#15517) -} --- | Returns Just (bndr,rhs) if the binding is a join point: --- If it's a JoinId, just return it --- If it's not yet a JoinId but is always tail-called, --- make it into a JoinId and return it. +-- | Returns @Just (bndr, rhs)@ if the binding is a join point, or can be made +-- into a join poin. Returns @Nothing@ otherwise. +-- +-- - If the input binder is a 'JoinId', just return it; +-- - if it's not yet a 'JoinId' but is always tail-called, +-- make it into a 'JoinId' and return that. +-- -- In the latter case, eta-expand the RHS if necessary, to make the --- lambdas explicit, as is required for join points +-- lambdas explicit, as is required for join points. +-- +-- Precondition: the 'TailCallInfo' of the 'InBndr' is conservative: -- --- Precondition: the InBndr has been occurrence-analysed, --- so its OccInfo is valid +-- - if it says 'AlwaysTailCalled', it is definitely always tail called, +-- - if it says 'NoTailCallInfo', then we're not sure. +-- +-- See Note [JoinId vs TailCallInfo]. joinPointBinding_maybe :: InBndr -> InExpr -> Maybe (InBndr, InExpr) joinPointBinding_maybe bndr rhs | not (isId bndr) = Nothing + -- Being a JoinId is robust: preserve that. See Note [JoinId vs TailCallInfo]. | isJoinId bndr = Just (bndr, rhs) + -- If the 'TailCallInfo' of 'bndr' says 'AlwaysTailCalled', then we know for + -- sure that it can be made into a join point. | AlwaysTailCalled join_arity <- tailCallInfo (idOccInfo bndr) , (bndrs, body) <- etaExpandToJoinPoint join_arity rhs , let str_sig = idDmdSig bndr @@ -1091,6 +1101,48 @@ joinPointBindings_maybe :: [(InBndr, InExpr)] -> Maybe [(InBndr, InExpr)] joinPointBindings_maybe bndrs = mapM (uncurry joinPointBinding_maybe) bndrs +{- Note [JoinId vs TailCallInfo] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +* Occurrence information is /fundamentally fragile/; that is, it may + be invalidated by the Simplifier. + Example 1: + \y -> let x = y in ...x..x... + Here `y` is marked "occurs exactly once" but, after inlining `x`, + `y` now occurs many times. + Example 2: + f (let h x = ... in case y of { True -> h 1; False -> h 2 }) + Here `h` is tail-called; but if `f` is strict we could transform to + let h x = ... in + case y of { True -> f (h 1); False -> f (h 2) } + Now `h` is not tail called any more. + + Exception: Dead things (with no occurrences) usually stay dead. + There are exceptions e.g. + case x of y { (a,b) -> case y of (p,q) -> p } + Here `a` and `b` look dead, but we may well transform to + case x of y { (a,b) -> a } + + Because occurrence info is fragile, we recompute occurrence info + (including tail call info) before each run of the Simplifier. + + Whenever the simplifier performs a transformation that **might** invalidate + occurrence information, it calls 'zapFragileIdInfo'. This sets the + 'TailCallInfo' to 'NoTailCallInfo' (among other things). + +* Being a JoinId is /robust/, and is rigorously maintained by the + Simplifier. In Example 2 above, if `h` was marked as a JoinId, + that transformation would not have happened. Instead we'd have + transformed to + let h x = f (...) in + case y of { True -> h 1; False -> h 2 } + + The Simplifier takes an Id whose occurrences are marked as + `AlwaysTailCalled` and turns it into robust `JoinId`. This is + done by `joinPointBinding_maybe`. + + There is one exception: float-out, the only caller of 'zapJoinId'. + See Note [Zapping JoinId when floating]. +-} {- ********************************************************************* * * ===================================== compiler/GHC/Core/Utils.hs ===================================== @@ -34,7 +34,8 @@ module GHC.Core.Utils ( exprIsTickedString, exprIsTickedString_maybe, exprIsTopLevelBindable, exprIsUnaryClassFun, isUnaryClassId, - altsAreExhaustive, etaExpansionTick, + altsAreExhaustive, + canCollectArgsThroughTick, cantEtaReduceFun, -- * Equality cheapEqExpr, cheapEqExpr', diffBinds, @@ -680,7 +681,7 @@ mergeCaseAlts :: CoreExpr -> Id -> [CoreAlt] -> Maybe ([CoreBind], [CoreAlt]) mergeCaseAlts scrut outer_bndr (Alt DEFAULT _ deflt_rhs : outer_alts) | Just (joins, inner_alts) <- go deflt_rhs , Just aux_binds <- mk_aux_binds joins - = Just ( aux_binds ++ joins, mergeAlts outer_alts inner_alts ) + = Just (aux_binds ++ joins, mergeAlts outer_alts inner_alts ) -- NB: mergeAlts gives priority to the left -- case x of -- A -> e1 @@ -727,7 +728,7 @@ mergeCaseAlts scrut outer_bndr (Alt DEFAULT _ deflt_rhs : outer_alts) , Just tc <- tyConAppTyCon_maybe type_arg , Just (dc1:dcs) <- tyConDataCons_maybe tc -- At least one data constructor , dcs `lengthAtMost` 3 -- Arbitrary - = return ( [], mk_alts dc1 dcs) + = return ([], mk_alts dc1 dcs) where mk_lit dc = mkLitIntUnchecked $ toInteger $ dataConTagZ dc mk_rhs dc = Var (dataConWorkId dc) @@ -748,11 +749,16 @@ mergeCaseAlts scrut outer_bndr (Alt DEFAULT _ deflt_rhs : outer_alts) | otherwise = Nothing - -- We don't want ticks to get in the way; just push them inwards. - -- (This happens when you add SourceTicks e.g. GHC.Num.Integer.integerLt#) + -- Push ticks **inwards** (when possible). + -- See (MC5) in Note [Merge Nested Cases]. go (Tick t body) - = do { (joins, alts) <- go body - ; return (joins, [Alt con bs (Tick t rhs) | Alt con bs rhs <- alts]) } + = do { (joins, alts) <- go body -- (MC4): any join points inside are floated out of the tick. + + -- Abort if this would put a non-soft-scope tick in between + -- a join point binding and its jumps. See (MC6). + ; guard $ null joins || tickishHasSoftScope t + ; return (joins, [Alt con bs (mkTick t rhs) | Alt con bs rhs <- alts]) + } go _ = Nothing @@ -974,12 +980,74 @@ Wrinkles So `mergeCaseAlts` floats out any join points. It doesn't float out non-join-points unless the /outer/ case has just one alternative; doing - so would risk more allocation + so would risk more allocation. + + Note also that `mergeCaseAlts` floats join points out of ticks, for which + we need to be extra careful; see (MC6). Floating out join points isn't entirely straightforward. See Note [Floating join points out of DEFAULT alternatives] -(MC5) See Note [Cascading case merge] +(MC5) We want to move ticks out of the way if possible, to prevent them from + inhibiting optimisation. For example, say we have: + + case expensive of r { + C1 -> rhs1; -- happy path + _ -> scctick<doEdgeCase> (case r of { C2 -> rhs2; C3 -> rhs3 }) + } + + In this situation, we push the "doEdgeCase" tick **inwards** and proceed + to merge cases, like so: + + case expensive of + C1 -> rhs1 + C2 -> scctick<doEdgeCase> rhs2 + C3 -> scctick<doEdgeCase> rhs3 + + This preserves the tick semantics (see Note [Scoping ticks and counting ticks] + in GHC.Types.Tickish), because this transformation: + + 1. preserves counts, + 2. does not move cost in or out of the tick scope. + + (1) is clear: we will tick 'doEdgeCase' exactly in the C2/C3 alternatives, + and we won't otherwise. + For (2), recall that case is strict in Core. We already evaluated 'expensive', + so re-scrutinising 'r' is free. + + This means that, perhaps surprisingly, this transformation is valid for + **all** ticks, including non-floatable ones. + + In contrast, we would not want to move the tick outwards, because this: + + - will lead to additional counting of 'doEdgeCase' in the 'C1' (happy path) case, + - risks attributing the cost of evaluating 'expensive' to 'doEdgeCase'. + +(MC6) There is a dangerous interaction between (MC4) and (MC5), which can lead + to invalid Core (as reported in #26642, #26929). Suppose we have: + + case f x of r -> + scctick<foo> + join j y = rhs in + case r of { C1 -> j 1; C2 -> bar } + + If we naively carried out (MC4) and (MC5) together, this would result in: + + join j y = rhs in + case f x of + C1 -> scctick<foo> (j 1) + C2 -> scctick<foo> bar + + This has moved the tick in between the join point binding 'j' and the + join point jump, which is invalid as per Note [Join points, casts, and ticks] + in GHC.Core. The simplifier cannot deal with such Core, resulting in #26642. + + The solution: abort whenever we would position a non-soft-scope tick + inside a join point in this manner. + An alternative would be to float the tick outwards, but as we saw in (MC5) + this risks a grave misattribution of profiling costs, so we don't do that. + +(MC7) See Note [Cascading case merge] See also Note [Example of case-merging and caseRules] in GHC.Core.Opt.Simplify.Utils @@ -2076,14 +2144,31 @@ altsAreExhaustive (Alt con1 _ _ : alts) -- we behave conservatively here -- I don't think it's important -- enough to deserve special treatment --- | Should we look past this tick when eta-expanding the given function? +-- | Should we look past this tick when collecting arguments +-- for the given function? -- -- See Note [Ticks and mandatory eta expansion] --- Takes the function we are applying as argument. -etaExpansionTick :: Id -> GenTickish pass -> Bool -etaExpansionTick id t - = hasNoBinding id && - ( tickishFloatable t || isProfTick t ) +canCollectArgsThroughTick + :: Id -- ^ function at the head of the application + -> GenTickish pass -- ^ tick we want to collect arguments past + -> Bool +canCollectArgsThroughTick id t + = tickishFloatable t || cantEtaReduceFun id + +-- | Can we eta-reduce the given function? +-- See Note [Eta reduction soundness], criteria (B), (J), and (W). +cantEtaReduceFun :: Id -> Bool +cantEtaReduceFun fun + = hasNoBinding fun -- (B) + -- Don't undersaturate functions with no binding. + + || isJoinId fun -- (J) + -- Don't undersaturate join points. + -- See Note [Invariants on join points] in GHC.Core, and #20599 + + || isJust (idCbvMarks_maybe fun) -- (W) + -- Don't undersaturate StrictWorkerIds. + -- See Note [CBV Function Ids: overview] in GHC.Types.Id.Info. {- Note [exprOkForSpeculation and type classes] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ===================================== compiler/GHC/CoreToStg/Prep.hs ===================================== @@ -39,7 +39,8 @@ import GHC.Core.Type import GHC.Core.Coercion import GHC.Core.TyCon import GHC.Core.DataCon -import GHC.Core.Opt.OccurAnal +import GHC.Core.Opt.OccurAnal ( occurAnalyseExpr_Prep ) +import GHC.Core.SimpleOpt ( joinPointBinding_maybe, joinPointBindings_maybe ) import GHC.Data.Maybe import GHC.Data.OrdList @@ -575,7 +576,18 @@ cpeBind :: TopLevelFlag -> CorePrepEnv -> CoreBind Maybe CoreBind) -- Just bind' <=> returned new bind; no float -- Nothing <=> added bind' to floats instead cpeBind top_lvl env (NonRec bndr rhs) - | not (isJoinId bndr) + -- A join point. + -- NB: use 'joinPointBinding_maybe' instead of 'isJoinId' as per the plan + -- described in (JCT3) in Note [Join points, casts, and ticks]. + | Just (bndr, rhs) <- joinPointBinding_maybe bndr rhs + = assert (not (isTopLevel top_lvl)) $ -- can't have top-level join point; see Note [Join points and floating] + do { (_, bndr1) <- cpCloneBndr env bndr + ; (bndr2, rhs1) <- cpeJoinPair env bndr1 rhs + ; return (extendCorePrepEnv env bndr bndr2, + emptyFloats, + Just (NonRec bndr2 rhs1)) } + + | otherwise = do { (env1, bndr1) <- cpCloneBndr env bndr ; let dmd = idDemandInfo bndr lev = typeLevity (idType bndr) @@ -594,16 +606,23 @@ cpeBind top_lvl env (NonRec bndr rhs) ; return (env2, floats1, Nothing) } - | otherwise -- A join point; see Note [Join points and floating] - = assert (not (isTopLevel top_lvl)) $ -- can't have top-level join point - do { (_, bndr1) <- cpCloneBndr env bndr - ; (bndr2, rhs1) <- cpeJoinPair env bndr1 rhs - ; return (extendCorePrepEnv env bndr bndr2, - emptyFloats, - Just (NonRec bndr2 rhs1)) } - cpeBind top_lvl env (Rec pairs) - | not (isJoinId (head bndrs)) + -- A recursive join point. + -- NB: use 'joinPointBindings_maybe' instead of 'isJoinId' as per the plan + -- described in (JCT3) in Note [Join points, casts, and ticks]. + | Just pairs <- joinPointBindings_maybe pairs + , let (bndrs, rhss) = unzip pairs + = do { (env, bndrs1) <- cpCloneBndrs env bndrs + ; let env' = enterRecGroupRHSs env bndrs1 + ; pairs1 <- zipWithM (cpeJoinPair env') bndrs1 rhss + + ; let bndrs2 = map fst pairs1 + -- use env below, so that we reset cpe_rec_ids + ; return (extendCorePrepEnvList env (bndrs `zip` bndrs2), + emptyFloats, + Just (Rec pairs1)) } + | otherwise + , let (bndrs, rhss) = unzip pairs = do { (env, bndrs1) <- cpCloneBndrs env bndrs ; let env' = enterRecGroupRHSs env bndrs1 ; stuff <- zipWithM (cpePair top_lvl Recursive topDmd Lifted env') @@ -626,19 +645,9 @@ cpeBind top_lvl env (Rec pairs) (Float (Rec all_pairs) LetBound TopLvlFloatable), Nothing) } - | otherwise -- See Note [Join points and floating] - = do { (env, bndrs1) <- cpCloneBndrs env bndrs - ; let env' = enterRecGroupRHSs env bndrs1 - ; pairs1 <- zipWithM (cpeJoinPair env') bndrs1 rhss - - ; let bndrs2 = map fst pairs1 - -- use env below, so that we reset cpe_rec_ids - ; return (extendCorePrepEnvList env (bndrs `zip` bndrs2), - emptyFloats, - Just (Rec pairs1)) } where - (bndrs, rhss) = unzip pairs - + -- See Note [Join points and floating] + -- -- Flatten all the floats, and the current -- group into a single giant Rec add_float (Float bind bound _) prs2 @@ -653,7 +662,6 @@ cpeBind top_lvl env (Rec pairs) Rec prs1 -> prs1 ++ prs2 add_float f _ = pprPanic "cpeBind" (ppr f) - --------------- cpePair :: TopLevelFlag -> RecFlag -> Demand -> Levity -> CorePrepEnv -> OutId -> CoreExpr @@ -661,7 +669,7 @@ cpePair :: TopLevelFlag -> RecFlag -> Demand -> Levity -- Used for all bindings -- The binder is already cloned, hence an OutId cpePair top_lvl is_rec dmd lev env0 bndr rhs - = assert (not (isJoinId bndr)) $ -- those should use cpeJoinPair + = assert (isNothing $ joinPointBinding_maybe bndr rhs) $ -- those should use cpeJoinPair do { (floats1, rhs1) <- cpeRhsE env rhs -- See if we are allowed to float this stuff out of the RHS @@ -926,7 +934,7 @@ rhsToBody :: CorePrepEnv -> CpeRhs -> UniqSM (Floats, CpeBody) -- Remove top level lambdas by let-binding rhsToBody env (Tick t expr) - | tickishScoped t == NoScope -- only float out of non-scoped annotations + | tickishHasNoScope t -- only float out of non-scoped annotations = do { (floats, expr') <- rhsToBody env expr ; return (floats, mkTick t expr') } @@ -984,43 +992,74 @@ instance Outputable ArgInfo where {- Note [Ticks and mandatory eta expansion] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Something like - `foo x = ({-# SCC foo #-} tagToEnum#) x :: Bool` -caused a compiler panic in #20938. Why did this happen? -The simplifier will eta-reduce the rhs giving us a partial -application of tagToEnum#. The tick is then pushed inside the -type argument. That is we get - `(Tick<foo> tagToEnum#) @Bool` +We must look through ticks when they get in the way of seeing the arguments to +'Id's that cannot be eta-reduced. + +For example, we may have + + myReallyUnsafePtrEquality + = \ @a x y -> + (src<loc> reallyUnsafePtrEquality#) + @Lifted @a @Lifted @a x y + +If we don't move the SourceNote out of the way, this looks like an unsaturated +occurrence of the PrimOp "reallyUnsafePtrEquality#", which we cannot generate +code for. + +Moreover, we must also move out non-floatable ticks. Case in point: #20938, +of the form: + + foo x = ({-# SCC foo #-} tagToEnum#) x :: Bool + +If we don't look past the tick "foo", the simplifier will eta-reduce the RHS, +giving us a partial application of 'tagToEnum#'. The tick is then pushed inside +the type argument, resulting in: + + (Tick<foo> tagToEnum#) @Bool + CorePrep would go on to see a undersaturated tagToEnum# application -and eta expand the expression under the tick. Giving us: +and eta-expand the expression under the tick. Giving us: + (Tick<scc> (\forall a. x -> tagToEnum# @a x) @Bool -Suddenly tagToEnum# is applied to a polymorphic type and the code generator + +Suddenly, 'tagToEnum#' is applied to a polymorphic type and the code generator panics as it needs a concrete type to determine the representation. -The problem in my eyes was that the tick covers a partial application -of a primop. There is no clear semantic for such a construct as we can't -partially apply a primop since they do not have bindings. -We fix this by expanding the scope of such ticks slightly to cover the body -of the eta-expanded expression. - -We do this by: -* Checking if an application is headed by a primOpish thing. -* If so we collect floatable ticks and usually but also profiling ticks - along with regular arguments. -* When rebuilding the application we check if any profiling ticks appear - before the primop is fully saturated. -* If the primop isn't fully satured we eta expand the primop application - and scope the tick to scope over the body of the saturated expression. - -Going back to #20938 this means starting with - `(Tick<foo> tagToEnum#) @Bool` -we check if the function head is a primop (yes). This means we collect the -profiling tick like if it was floatable. Giving us - (tagToEnum#, [CpeTick foo, CpeApp @Bool]). +The problem was that the tick covered a partial application of a primop. +There is no clear semantic for such a construct: we can't partially apply a +primop, since primops do not have bindings. + +To fix this, we expand the scope of ticks slightly to cover the body +of the eta-expanded expression, even when the tick isn't normally floatable. + +This is achieved by using 'GHC.Core.Utils.canCollectArgsThroughTick', which +responds 'True' in the following two situations: + + - The tick is floatable (i.e. satisfies 'tickishFloatable'), meaning that it + is OK to float it out slightly, moving in more code under it. + See also Note [Eta expansion and source notes] in GHC.Core.Opt.Arity. + - The tick is around an application that is headed by an 'Id' that cannot be + undersaturated, such as a PrimOp (see 'GHC.Core.Utils.cantEtaReduceFun'). + +This solves #20938. Indeed, starting with + + (scctick<foo> tagToEnum#) @Bool + +we see that the head of the application is 'tagToEnum#', which is a PrimOp and +thus satisfies 'hasNoBinding = True'. As a result, we collect the profiling tick +as if it was floatable, resulting in + + (tagToEnum#, [CpeTick foo, CpeApp @Bool]) + cpe_app filters out the tick as a underscoped tick on the expression -`tagToEnum# @Bool`. During eta expansion we then put that tick back onto the -body of the eta-expansion lambdas. Giving us `\x -> Tick<foo> (tagToEnum# @Bool x)`. +`tagToEnum# @Bool`. During eta-expansion, we put that tick back onto the +body of the eta-expansion lambda, resulting in + + \x -> scctick<foo> (tagToEnum# @Bool x) + +which is unproblematic. -} + cpeApp :: CorePrepEnv -> CoreExpr -> UniqSM (Floats, CpeRhs) -- May return a CpeRhs (instead of CpeApp) because of saturating primops cpeApp top_env expr @@ -1045,15 +1084,14 @@ cpeApp top_env expr go (Cast fun co) as = go fun (AICast co : as) go (Tick tickish fun) as - -- Profiling ticks are slightly less strict so we expand their scope - -- if they cover partial applications of things like primOps. - -- See Note [Ticks and mandatory eta expansion] - -- Here we look inside `fun` before we make the final decision about - -- floating the tick which isn't optimal for perf. But this only makes - -- a difference if we have a non-floatable tick which is somewhat rare. + -- Try to move a tick out of the way, if: + -- - the tick can be floated out of the way ('tickishFloatable'), or + -- - the tick must be moved out of the way because it stands in between + -- an 'Id' that must be saturated and some of its arguments; + -- see Note [Ticks and mandatory eta expansion]. | Var vh <- head - , Var head' <- lookupCorePrepEnv top_env vh - , etaExpansionTick head' tickish + , Just head' <- getIdFromTrivialExpr_maybe (lookupCorePrepEnv top_env vh) + , canCollectArgsThroughTick head' tickish = (head,as') where (head,as') = go fun (AITick tickish : as) @@ -1130,7 +1168,10 @@ cpeApp top_env expr hd = getIdFromTrivialExpr_maybe e2 -- Determine number of required arguments. See Note [Ticks and mandatory eta expansion] min_arity = case hd of - Just v_hd -> if hasNoBinding v_hd then Just $! (idArity v_hd) else Nothing + Just v_hd -> + if cantEtaReduceFun v_hd + then Just $! idArity v_hd + else Nothing Nothing -> Nothing -- ; pprTraceM "cpe_app:stricts:" (ppr v <+> ppr args $$ ppr stricts $$ ppr (idCbvMarks_maybe v)) ; (app, floats, unsat_ticks) <- rebuild_app env args e2 emptyFloats stricts min_arity @@ -2293,8 +2334,8 @@ deFloatTop floats get b _ = pprPanic "deFloatTop" (ppr b) -- See Note [Dead code in CorePrep] - get_bind (NonRec x e) = NonRec x (occurAnalyseExpr e) - get_bind (Rec xes) = Rec [(x, occurAnalyseExpr e) | (x, e) <- xes] + get_bind (NonRec x e) = NonRec x (occurAnalyseExpr_Prep e) + get_bind (Rec xes) = Rec [(x, occurAnalyseExpr_Prep e) | (x, e) <- xes] --------------------------------------------------------------------------- ===================================== compiler/GHC/Driver/Config/Core/Lint.hs ===================================== @@ -115,7 +115,8 @@ perPassFlags dflags pass , lf_check_inline_loop_breakers = check_lbs , lf_check_static_ptrs = check_static_ptrs , lf_check_linearity = check_linearity - , lf_check_rubbish_lits = check_rubbish } + , lf_check_rubbish_lits = check_rubbish + , lf_allow_weak_joins = allow_weak_joins } where -- See Note [Checking for global Ids] check_globals = case pass of @@ -152,6 +153,11 @@ perPassFlags dflags pass CorePrep -> True _ -> False + -- See Note [Linting join points with casts or ticks] in GHC.Core.Lint + allow_weak_joins = case pass of + CorePrep -> True + _ -> False + initLintConfig :: DynFlags -> [Var] -> LintConfig initLintConfig dflags vars =LintConfig { l_diagOpts = initDiagOpts dflags @@ -168,4 +174,5 @@ defaultLintFlags dflags = LF { lf_check_global_ids = False , lf_report_unsat_syns = True , lf_check_fixed_rep = True , lf_check_rubbish_lits = True + , lf_allow_weak_joins = False } ===================================== compiler/GHC/Iface/Tidy.hs ===================================== @@ -1272,7 +1272,7 @@ tidyTopIdInfo rhs_tidy_env name rhs_ty orig_rhs tidy_rhs idinfo show_unfold is_external = isExternalName name --------- OccInfo ------------ - robust_occ_info = zapFragileOcc (occInfo idinfo) + robust_occ_info = zapFragileOccInfo (occInfo idinfo) -- It's important to keep loop-breaker information -- when we are doing -fexpose-all-unfoldings ===================================== compiler/GHC/StgToCmm/Expr.hs ===================================== @@ -1273,5 +1273,5 @@ cgTick tick ProfNote cc t p -> emitSetCCC cc t p HpcTick m n -> emit (mkTickBox platform m n) SourceNote s n -> emitTick $ SourceNote s n - _other -> return () -- ignore + Breakpoint {} -> return () -- ignore } ===================================== compiler/GHC/Types/Basic.hs ===================================== @@ -66,7 +66,7 @@ module GHC.Types.Basic ( noOneShotInfo, hasNoOneShotInfo, isOneShotInfo, bestOneShot, worstOneShot, - OccInfo(..), noOccInfo, seqOccInfo, zapFragileOcc, isOneOcc, + OccInfo(..), noOccInfo, seqOccInfo, zapFragileOccInfo, isOneOcc, isDeadOcc, isStrongLoopBreaker, isWeakLoopBreaker, isManyOccs, isNoOccInfo, strongLoopBreaker, weakLoopBreaker, @@ -980,10 +980,13 @@ isOneOcc :: OccInfo -> Bool isOneOcc (OneOcc {}) = True isOneOcc _ = False -zapFragileOcc :: OccInfo -> OccInfo --- Keep only the most robust data: deadness, loop-breaker-hood -zapFragileOcc (OneOcc {}) = noOccInfo -zapFragileOcc occ = zapOccTailCallInfo occ +-- | Keep only the most robust occurrence info: deadness, loop-breaker-hood. +-- +-- In particular, it zaps 'TailCallInfo': see Note [JoinId vs TailCallInfo] +-- in 'GHC.Core.Opt.Simplify.Env'. +zapFragileOccInfo :: OccInfo -> OccInfo +zapFragileOccInfo (OneOcc {}) = noOccInfo +zapFragileOccInfo occ = zapOccTailCallInfo occ instance Outputable OccInfo where -- only used for debugging; never parsed. KSW 1999-07 ===================================== compiler/GHC/Types/Id/Info.hs ===================================== @@ -914,14 +914,15 @@ zapUsedOnceInfo info , demandInfo = zapUsedOnceDemand (demandInfo info) } zapFragileInfo :: IdInfo -> Maybe IdInfo --- ^ Zap info that depends on free variables +-- ^ Zap fragile 'IdInfo', such as info that depends on free variables +-- or fragile occurrence info (see 'zapFragileOccInfo'). zapFragileInfo info@(IdInfo { occInfo = occ, realUnfoldingInfo = unf }) = new_unf `seq` -- The unfolding field is not (currently) strict, so we -- force it here to avoid a (zapFragileUnfolding unf) thunk -- which might leak space Just (info `setRuleInfo` emptyRuleInfo `setUnfoldingInfo` new_unf - `setOccInfo` zapFragileOcc occ) + `setOccInfo` zapFragileOccInfo occ) where new_unf = zapFragileUnfolding unf ===================================== compiler/GHC/Types/Tickish.hs ===================================== @@ -6,9 +6,8 @@ module GHC.Types.Tickish ( CoreTickish, StgTickish, CmmTickish, XTickishId, tickishCounts, - TickishScoping(..), - tickishScoped, - tickishScopesLike, + tickishHasNoScope, + tickishHasSoftScope, tickishFloatable, tickishCanSplit, mkNoCount, @@ -206,103 +205,177 @@ instance Binary BreakpointId where -------------------------------------------------------------------------------- --- | A "counting tick" (where tickishCounts is True) is one that +-- | A "counting tick" (for which 'tickishCounts' is True) is one that -- counts evaluations in some way. We cannot discard a counting tick, --- and the compiler should preserve the number of counting ticks as --- far as possible. +-- and the compiler should preserve the number of counting ticks (as +-- far as possible). -- --- However, we still allow the simplifier to increase or decrease --- sharing, so in practice the actual number of ticks may vary, except --- that we never change the value from zero to non-zero or vice versa. +-- See Note [Counting ticks] tickishCounts :: GenTickish pass -> Bool -tickishCounts n@ProfNote{} = profNoteCount n -tickishCounts HpcTick{} = True -tickishCounts Breakpoint{} = True -tickishCounts _ = False - - --- | Specifies the scoping behaviour of ticks. This governs the --- behaviour of ticks that care about the covered code and the cost --- associated with it. Important for ticks relating to profiling. -data TickishScoping = - -- | No scoping: The tick does not care about what code it - -- covers. Transformations can freely move code inside as well as - -- outside without any additional annotation obligations - NoScope - - -- | Soft scoping: We want all code that is covered to stay - -- covered. Note that this scope type does not forbid - -- transformations from happening, as long as all results of - -- the transformations are still covered by this tick or a copy of - -- it. For example - -- - -- let x = tick<...> (let y = foo in bar) in baz - -- ===> - -- let x = tick<...> bar; y = tick<...> foo in baz - -- - -- Is a valid transformation as far as "bar" and "foo" is - -- concerned, because both still are scoped over by the tick. - -- - -- Note though that one might object to the "let" not being - -- covered by the tick any more. However, we are generally lax - -- with this - constant costs don't matter too much, and given - -- that the "let" was effectively merged we can view it as having - -- lost its identity anyway. - -- - -- Also note that this scoping behaviour allows floating a tick - -- "upwards" in pretty much any situation. For example: - -- - -- case foo of x -> tick<...> bar - -- ==> - -- tick<...> case foo of x -> bar - -- - -- While this is always legal, we want to make a best effort to - -- only make us of this where it exposes transformation - -- opportunities. - | SoftScope - - -- | Cost centre scoping: We don't want any costs to move to other - -- cost-centre stacks. This means we not only want no code or cost - -- to get moved out of their cost centres, but we also object to - -- code getting associated with new cost-centre ticks - or - -- changing the order in which they get applied. - -- - -- A rule of thumb is that we don't want any code to gain new - -- annotations. However, there are notable exceptions, for - -- example: - -- - -- let f = \y -> foo in tick<...> ... (f x) ... - -- ==> - -- tick<...> ... foo[x/y] ... - -- - -- In-lining lambdas like this is always legal, because inlining a - -- function does not change the cost-centre stack when the - -- function is called. - | CostCentreScope - - deriving (Eq) - --- | Returns the intended scoping rule for a Tickish -tickishScoped :: GenTickish pass -> TickishScoping -tickishScoped n@ProfNote{} - | profNoteScope n = CostCentreScope - | otherwise = NoScope -tickishScoped HpcTick{} = NoScope -tickishScoped Breakpoint{} = CostCentreScope - -- Breakpoints are scoped: eventually we're going to do call - -- stacks, but also this helps prevent the simplifier from moving - -- breakpoints around and changing their result type (see #1531). -tickishScoped SourceNote{} = SoftScope - --- | Returns whether the tick scoping rule is at least as permissive --- as the given scoping rule. -tickishScopesLike :: GenTickish pass -> TickishScoping -> Bool -tickishScopesLike t scope = tickishScoped t `like` scope - where NoScope `like` _ = True - _ `like` NoScope = False - SoftScope `like` _ = True - _ `like` SoftScope = False - CostCentreScope `like` _ = True +tickishCounts = \case + ProfNote { profNoteCount = counts } -> counts + HpcTick {} -> True + Breakpoint {} -> True + SourceNote {} -> False + +-- | Is this a non-scoping tick, for which we don't care about precisely +-- the extent of code that the tick encompasses? +-- +-- See Note [Scoped ticks] +tickishHasNoScope :: GenTickish pass -> Bool +tickishHasNoScope = \case + ProfNote { profNoteScope = scopes } -> not scopes + HpcTick {} -> True + Breakpoint {} -> False + SourceNote {} -> False + +-- | A "tick with soft scoping" (for which 'tickishHasSoftScope' is True) is +-- one that either does not scope at all (for which 'tickishHasNoScope' is True), +-- or that has a "soft" scope: we allow new code to be floated into to the scope, +-- as long as all code that was covered remains covered. +-- +-- See Note [Scoped ticks] +tickishHasSoftScope :: GenTickish pass -> Bool +tickishHasSoftScope = \case + ProfNote { profNoteScope = scopes } -> not scopes + HpcTick {} -> True + Breakpoint {} -> False + SourceNote {} -> True + +{- Note [Scoping ticks and counting ticks] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Ticks have two independent attributes: + + * Whether the tick /counts/. + Counting ticks are used when we want a counter to be bumped, e.g. counting + how many times a function is called. + + See Note [Counting ticks] + + * What kind of /scope/ the tick has: + * Cost-centre scope: you cannot move a redex into the scope of the tick, + nor can you float a redex out. + * Soft scope: you can move a redex /into/ the scope of a tick, + but you cannot float a redex /out/ + * No scope: there are no restrictions on floating in or out. + + See Note [Scoped ticks] + +Note [Counting ticks] +~~~~~~~~~~~~~~~~~~~~ +The following ticks count: + - ProfNote ticks with profNoteCounts = True + - HPC ticks + - Breakpoints + +Going past a counting tick implies bumping a counter. +Generally, the simplifier attempts to preserve counts when transforming +programs and moving ticks, for example by transforming: + + case <tick> e of + alt1 -> rhs1 + alt2 -> rhs2 + +to + + case e of + alt1 -> <tick> rhs1 + alt2 -> <tick> rhs2 + +which preserves the total count (as exactly one branch of the case +will be taken). + +However, we still allow the simplifier to increase or decrease +sharing, so in practice the actual number of ticks may vary, except +that we never change the value from zero to non-zero or vice-versa. + +Note [Scoped ticks] +~~~~~~~~~~~~~~~~~~~~ +The following ticks are scoped: + - ProfNote ticks with profNoteScope = True + - Breakpoints + - Source notes + +A scoped tick is one that scopes over a portion of code. For example, +an SCC anotation sets the cost centre for the code within; any allocations +within that piece of code should get attributed to that cost centre. + +When the simplifier deals with a scoping tick, it ensures that all code that +was covered remains covered. For example + + let x = tick<...> (let y = foo in bar) in baz + ===> + let x = tick<...> bar; y = tick<...> foo in baz + +is a valid transformation as far as "bar" and "foo" are concerned, because +both still are scoped over by the tick. One might object to the "let" not +being covered by the tick any more. However, we are generally lax with this; +constant costs don't matter too much, and given that the "let" was effectively +merged we can view it as having lost its identity anyway. + +Perhaps surprisingly, breakpoints are considered to be scoped, because we +don't want the simplifier to move them around, changing their result type (see #1531). + +We specifically forbid floating code outside of a scoping tick, as cost +associated with the floated-out code would no longer be attributed to the +appropriate scope. + +Whether we are allowed to float in additional cost depends on the tick: + + Cost-centre scope ticks + - ProfNote with profNoteScope = True + - Breakpoints + + A tick with cost-centre scope is one for which we can neither move + redexes into or move redexes outside of the tick. For example, we don't + want profiling costs to move to other cost-centre stacks. + Morever, we also object to changing the order in which such ticks + are applied. + + A rule of thumb is that we don't want any code to gain new + lexically-enclosing ticks. For example, we should not transform: + + f (scctick<foo> a) ==> scctick<foo> (f a) + + as this would attribute the cost of evaluating the application 'f a' + to the cost centre 'foo'. + + However, there are notable exceptions, for example: + + let f = \y -> foo in tick<...> ... (f x) ... + ==> + tick<...> ... foo[x/y] ... + + Inlining lambdas like this is always legal, because inlining a function + does not change the cost-centre stack when the function is called. + + Soft scope ticks + - Source notes + + A tick with soft scope is one for which we can move redexes inside the + tick, but cannot float redexes outside the tick. This is a slightly more + lenient notion of scoping than cost-centres, and is used only for source + note ticks (they are used to provide DWARF debug symbols, and for those + it matters less if code from outside gets moved under the tick). + + Examples: + + - FloatIn (GHC.Core.Opt.FloatIn.fiExpr) + + let x = rhs in <tick> body + ==> + <tick> (let x = rhs in body) + + - Moving a tick outside of a case or of an application + (GHC.Core.Opt.Simplify.Iteration.simplTick) + + case <tick> e of alts ==> <tick> case e of alts + + (<tick> e1) e2 ==> <tick> (e1 e2) + + While these transformations are legal, we want to make a best effort to + only make use of them where it exposes transformation opportunities. +-} -- | Returns @True@ for ticks that can be floated upwards easily even -- where it might change execution counts, such as: @@ -311,12 +384,11 @@ tickishScopesLike t scope = tickishScoped t `like` scope -- ==> -- tick<...> (Just foo) -- --- This is a combination of @tickishSoftScope@ and --- @tickishCounts@. Note that in principle splittable ticks can become --- floatable using @mkNoTick@ -- even though there's currently no --- tickish for which that is the case. +-- This is a combination of @tickishHasSoftScope@ and @tickishCounts@. +-- Note that in principle splittable ticks can become floatable using @mkNoTick@, +-- even though there's currently no tickish for which that is the case. tickishFloatable :: GenTickish pass -> Bool -tickishFloatable t = t `tickishScopesLike` SoftScope && not (tickishCounts t) +tickishFloatable t = tickishHasSoftScope t && not (tickishCounts t) -- | Returns @True@ for a tick that is both counting /and/ scoping and -- can be split into its (tick, scope) parts using 'mkNoScope' and @@ -334,7 +406,7 @@ mkNoCount n@ProfNote{} = let n' = n {profNoteCount = False} mkNoCount _ = panic "mkNoCount: Undefined split!" mkNoScope :: GenTickish pass -> GenTickish pass -mkNoScope n | tickishScoped n == NoScope = n +mkNoScope n | tickishHasNoScope n = n | not (tickishCanSplit n) = panic "mkNoScope: Cannot split!" mkNoScope n@ProfNote{} = let n' = n {profNoteScope = False} in assert (profNoteCount n) n' @@ -357,7 +429,9 @@ mkNoScope _ = panic "mkNoScope: Undefined split!" -- translate the code as if it found the latter. tickishIsCode :: GenTickish pass -> Bool tickishIsCode SourceNote{} = False -tickishIsCode _tickish = True -- all the rest for now +tickishIsCode ProfNote{} = True +tickishIsCode Breakpoint{} = True +tickishIsCode HpcTick{} = True isProfTick :: GenTickish pass -> Bool isProfTick ProfNote{} = True ===================================== testsuite/tests/codeGen/should_compile/debug.stdout ===================================== @@ -18,7 +18,6 @@ srcdebug.hs:4:9 srcdebug.hs:5:21-29 srcdebug.hs:5:9-29 srcdebug.hs:6:1-21 -srcdebug.hs:6:16-21 == CBE == srcdebug.hs:4:9 89 ===================================== testsuite/tests/simplCore/should_compile/T26642.hs ===================================== @@ -0,0 +1,46 @@ +module T26642 ( saveClobberedTemps ) where + +import Prelude ( IO, Bool(..), Int, (>>=), (==), return ) +import Data.Word ( Word64 ) + +------------------------------------------------------------------------------- + +data Word64Map a + = Bin (Word64Map a) (Word64Map a) + | Tip a + | Nil + +{-# NOINLINE myFoldr #-} +myFoldr :: (a -> b -> b) -> b -> Word64Map a -> b +myFoldr f = go + where + {-# NOINLINE go #-} + go z' Nil = z' + go z' (Tip x) = f x z' + go z' (Bin l r) = go (go z' r) l + +{-# NOINLINE nonDetFold #-} +nonDetFold :: (b -> elt -> IO b) -> b -> Word64Map elt -> IO b +nonDetFold f z0 xs = myFoldr c return xs z0 + where + {-# NOINLINE c #-} + c x k z = f z x >>= k + +{-# NOINLINE myFalse #-} +myFalse :: Bool +myFalse = False + +type RealReg = Int +data Loc = InReg RealReg | InMem + +saveClobberedTemps :: forall instr. [RealReg] -> IO [instr] +saveClobberedTemps clobbered = nonDetFold maybe_spill [] Nil + where + {-# NOINLINE maybe_spill #-} + maybe_spill :: [instr] -> Loc -> IO [instr] + maybe_spill instrs !loc = + case loc of + InReg reg + | myFalse + -> return [] + _ -> return instrs ===================================== testsuite/tests/simplCore/should_compile/TrickyJoins.hs ===================================== @@ -0,0 +1,154 @@ +{-# LANGUAGE DataKinds #-} +{-# LANGUAGE GADTs #-} +{-# LANGUAGE ScopedTypeVariables #-} +{-# LANGUAGE TypeApplications #-} +{-# LANGUAGE TypeFamilies #-} + +{-# LANGUAGE MagicHash #-} +{-# LANGUAGE UnboxedTuples #-} + +module TrickyJoinPoints where + +import Data.Coerce + ( coerce ) +import Data.Kind + ( Type ) + + +import Data.Map.Strict (Map) +import qualified Data.Map.Strict as Map +import qualified Data.Set as Set + +----------------------------------- +-- Join points and profiling ticks + +data ModGuts2 = MkModGuts2 + +runCorePasses3 :: Bool -> ModGuts2 -> IO ModGuts2 +runCorePasses3 pass guts = doCorePass3 pass guts + +doCorePass3 :: Bool -> ModGuts2 -> IO ModGuts2 +doCorePass3 pass guts = do + _ <- putStrLn "hi" + + let + updateBinds _ = return guts + + case pass of + True -> {-# SCC "XXX3" #-} updateBinds False + _ -> {-# SCC "YYY3" #-} updateBinds True + +-------------------------- +-- Join points & casts + +newtype AdjacencyMap a = AM { + adjacencyMap :: Map a (Set.Set a) } + +overlays :: Ord a => [AdjacencyMap a] -> AdjacencyMap a +overlays = AM . Map.unionsWith Set.union . map adjacencyMap + + +type SBool :: Bool -> Type +data SBool b where + SFalse :: SBool False + STrue :: SBool True + +type N :: Bool -> Type +data family N b +newtype instance N False = NF ( Int -> Int ) +newtype instance N True = NT ( Int -> Int ) + +testCast :: forall b. SBool b -> Int -> Int +testCast b n = + case + ( let + {-# NOINLINE juliet #-} + juliet :: Int -> Int -> Int + juliet x = \ y -> x + y + n + in + case b of + SFalse -> NF (juliet 1) + STrue -> NT (juliet 2) + ) :: N b of + n | SFalse <- b + , NF f <- n + -> f 100 + | STrue <- b + , NT g <- n + -> g 200 + + +------------------------------------------ +-- Join points, profiling ticks and casts + +newtype M = M ( Int -> Int -> Int ) + +testCastTick :: forall b. SBool b -> Int -> Int +testCastTick b n = + case + ( let + {-# NOINLINE j #-} + j :: Int -> Int -> Int + j x = \ y -> x + y + n + {-# NOINLINE k #-} + k :: M + k = coerce j + in + case b of + SFalse -> {-# SCC "ticked" #-} NF ( coerce @M @( Int -> Int -> Int ) k 1 ) + STrue -> NT ( coerce @M @( Int -> Int -> Int ) k 2 ) + ) :: N b of + n | SFalse <- b + , NF f <- n + -> f 100 + | STrue <- b + , NT g <- n + -> g 200 + +------------------------------------------ + +{-# NOINLINE testJoinTransitivity #-} +testJoinTransitivity :: Bool -> Int -> Int +testJoinTransitivity b n = + let + f x = x ^ ( 99 :: Int ) + 7 * ( x - 19 ) + {-# NOINLINE f #-} + in + f ( + let + j1 :: Int -> Int + j1 x = x + n + {-# NOINLINE j1 #-} + + j2 :: Int -> Int + j2 y = j1 (y * 2) + {-# NOINLINE j2 #-} + + j3 :: Int -> Int + j3 z = j2 (z * 3) + {-# NOINLINE j3 #-} + + in case b of + True -> {-# SCC "ticked" #-} j3 10 + False -> j3 20 + ) + +-------------------------------------------------------------------------------- +-- Test relating to Note [JoinId vs TailCallInfo] + +expt :: Int -> Int +expt _ = 3 +{-# NOINLINE expt #-} + +repro :: (Int, Int) -> (Int, Int) +repro (f0,e0) = + let + (f,e) = + let n = e0 + in + case n > 0 of + True -> (f0, e0 + n) + False -> (f0, e0) + r = let be = expt e in f * be + in + (r, 7) ===================================== testsuite/tests/simplCore/should_compile/all.T ===================================== @@ -470,6 +470,9 @@ test('T22272', normal, multimod_compile, ['T22272', '-O -fexpose-all-unfoldings # go should become a join point test('T22428', [grep_errmsg(r'jump go') ], compile, ['-O -ddump-simpl -dsuppress-uniques -dno-typeable-binds -dsuppress-unfoldings']) +test('TrickyJoins', normal, compile, ['']) +test('T26642', [unless(have_profiling(), skip)], compile, ['-O -prof -fprof-auto-calls']) + test('T22459', normal, compile, ['']) test('T22623', normal, multimod_compile, ['T22623', '-O -v0']) test('T22662', normal, compile, ['']) View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/08bc245be70d95801bc1138804ed1de9... -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/08bc245be70d95801bc1138804ed1de9... You're receiving this email because of your account on gitlab.haskell.org.
participants (1)
-
Marge Bot (@marge-bot)