September 2025 - ghc-commits

[Git][ghc/ghc][wip/T23162-spj] More refactoring
by Simon Peyton Jones (@simonpj) 11 Sep '25

Simon Peyton Jones pushed to branch wip/T23162-spj at Glasgow Haskell Compiler / GHC Commits: dc8360ef by Simon Peyton Jones at 2025-09-11T13:59:30+01:00 More refactoring - - - - - 10 changed files: - compiler/GHC/Core/TyCon.hs - compiler/GHC/Tc/Instance/FunDeps.hs - compiler/GHC/Tc/Solver/Default.hs - compiler/GHC/Tc/Solver/Dict.hs - compiler/GHC/Tc/Solver/Equality.hs - compiler/GHC/Tc/Solver/FunDeps.hs - compiler/GHC/Tc/Solver/Monad.hs - compiler/GHC/Tc/Solver/Solve.hs - compiler/GHC/Tc/Utils/Monad.hs - compiler/GHC/Tc/Utils/Unify.hs Changes: ===================================== compiler/GHC/Core/TyCon.hs ===================================== @@ -1282,6 +1282,7 @@ isNoParent _ = False data Injectivity = NotInjective | Injective [Bool] -- 1-1 with tyConTyVars (incl kind vars) + -- INVARIANT: not all False deriving( Eq ) -- | Information pertaining to the expansion of a type synonym (@type@) ===================================== compiler/GHC/Tc/Instance/FunDeps.hs ===================================== @@ -94,7 +94,7 @@ an equality for the RHS. Wrinkles: -(1) meta_tvs: sometimes the instance mentions variables in the RHS that +(IMP1) fd_qtvs: sometimes the instance mentions variables in the RHS that are not bound in the LHS. For example class C a b | a -> b @@ -109,7 +109,7 @@ Wrinkles: Note that the fd_qtvs can be free in the /first/ component of the Pair, but not in the second (which comes from the [W] constraint). -(2) Multi-range fundeps. When these meta_tvs are involved, there is a subtle +(IMP2) Multi-range fundeps. When these meta_tvs are involved, there is a subtle difference between the fundep (a -> b c) and the two fundeps (a->b, a->c). Consider class D a b c | a -> b c @@ -125,15 +125,15 @@ Wrinkles: FDEqn { fd_qtvs = [x2], fd_eqs = [ Maybe x2 ~ ty ] } with two FDEqns, generating two separate unification variables. -(3) improveFromInstEnv doesn't return any equations that already hold. 
- Reason: then we know if any actual improvement has happened, in - which case we need to iterate the solver +(IMP3) improveFromInstEnv doesn't return any equations that already hold. + Reason: just an optimisation; the caller does the same thing, but + with a bit more ceremony. -} data FunDepEqn = FDEqn { fd_qtvs :: [TyVar] -- Instantiate these type and kind vars -- to fresh unification vars, - -- Non-empty only for FunDepEqns arising from instance decls + -- See (IMP2) in Note [Improving against instances] , fd_eqs :: [TypeEqn] -- Make these pairs of types equal -- Invariant: In each (Pair ty1 ty2), the fd_qtvs may be @@ -193,7 +193,8 @@ zipAndComputeFDEqs :: (Type -> Type -> Bool) -- Discard this FDEq if true -- Create a list of (Type,Type) pairs from two lists of types, -- making sure that the types are not already equal zipAndComputeFDEqs discard (ty1:tys1) (ty2:tys2) - | discard ty1 ty2 = zipAndComputeFDEqs discard tys1 tys2 + | discard ty1 ty2 = -- See (IMP3) in Note [Improving against instances] + zipAndComputeFDEqs discard tys1 tys2 | otherwise = Pair ty1 ty2 : zipAndComputeFDEqs discard tys1 tys2 zipAndComputeFDEqs _ _ _ = [] ===================================== compiler/GHC/Tc/Solver/Default.hs ===================================== @@ -543,10 +543,9 @@ defaultEquality ct -- This handles cases such as @IO alpha[tau] ~R# IO Int@ -- by defaulting @alpha := Int@, which is useful in practice -- (see Note [Defaulting representational equalities]). - ; (co, new_eqs, _unifs) <- - wrapUnifierX (ctEvidence ct) Nominal $ \uenv -> - -- NB: nominal equality! - uType uenv z_ty1 z_ty2 + ; (co, new_eqs) <- wrapUnifier (ctEvidence ct) Nominal $ \uenv -> + -- NB: nominal equality! + uType uenv z_ty1 z_ty2 -- Only accept this solution if no new equalities are produced -- by the unifier. 
===================================== compiler/GHC/Tc/Solver/Dict.hs ===================================== @@ -473,8 +473,8 @@ solveEqualityDict ev cls tys do { let (role, t1, t2) = matchEqualityInst cls tys -- Unify t1~t2, putting anything that can't be solved -- immediately into the work list - ; (co, _, _) <- wrapUnifierTcS ev role $ \uenv -> - uType uenv t1 t2 + ; co <- wrapUnifierAndEmit ev role $ \uenv -> + uType uenv t1 t2 -- Set d :: (t1~t2) = Eq# co ; setWantedEvTerm dest EvCanonical $ evDictApp cls tys [Coercion co] ===================================== compiler/GHC/Tc/Solver/Equality.hs ===================================== @@ -544,7 +544,7 @@ can_eq_nc_forall ev eq_rel s1 s2 -- Generate the constraints that live in the body of the implication -- See (SF5) in Note [Solving forall equalities] ; (lvl, (all_co, wanteds)) <- pushLevelNoWorkList (ppr skol_info) $ - unifyForAllBody ev (eqRelRole eq_rel) $ \uenv -> + wrapUnifier ev (eqRelRole eq_rel) $ \uenv -> go uenv skol_tvs init_subst2 bndrs1 bndrs2 -- Solve the implication right away, using `trySolveImplication` @@ -634,9 +634,9 @@ There are lots of wrinkles of course: (SF5) Rather than manually gather the constraints needed in the body of the implication, we use `uType`. That way we can solve some of them on the fly, - especially Refl ones. We use the `unifyForAllBody` wrapper for `uType`, + especially Refl ones. We use the `wrapUnifier` wrapper for `uType`, because we want to /gather/ the equality constraint (to put in the implication) - rather than /emit/ them into the monad, as `wrapUnifierTcS` does. + rather than /emit/ them into the monad, as `wrapUnifierAndEmit` does. (SF6) We solve the implication on the spot, using `trySolveImplication`. In the past we instead generated an `Implication` to be solved later. 
Nice in @@ -808,7 +808,7 @@ can_eq_app ev s1 t1 s2 t2 = do { traceTcS "can_eq_app" (vcat [ text "s1:" <+> ppr s1, text "t1:" <+> ppr t1 , text "s2:" <+> ppr s2, text "t2:" <+> ppr t2 , text "vis:" <+> ppr (isNextArgVisible s1) ]) - ; (co,_,_) <- wrapUnifierTcS ev Nominal $ \uenv -> + ; co <- wrapUnifierAndEmit ev Nominal $ \uenv -> -- Unify arguments t1/t2 before function s1/s2, because -- the former have smaller kinds, and hence simpler error messages -- c.f. GHC.Tc.Utils.Unify.uType (go_app) @@ -966,7 +966,7 @@ then we will just decompose s1~s2, and it might be better to do so on the spot. An important special case is where s1=s2, and we get just Refl. -So canDecomposableTyConAppOK uses wrapUnifierTcS etc to short-cut +So canDecomposableTyConAppOK uses wrapUnifierAndEmit etc to short-cut that work. See also Note [Work-list ordering]. Note [Decomposing TyConApp equalities] @@ -1090,7 +1090,7 @@ up in the complexities of canEqLHSHetero. To do this: * `uType` keeps the bag of emitted constraints in the same left-to-right order. See the use of `snocBag` in `uType_defer`. -* `wrapUnifierTcS` adds the bag of deferred constraints from +* `wrapUnifierAndEmit` adds the bag of deferred constraints from `do_unifications` to the work-list using `extendWorkListChildEqs`. 
* `extendWorkListChildEqs` and `selectWorkItem` together arrange that the @@ -1394,7 +1394,7 @@ canDecomposableTyConAppOK ev eq_rel tc (ty1,tys1) (ty2,tys2) -- new_locs and tc_roles are both infinite, so we are -- guaranteed that cos has the same length as tys1 and tys2 -- See Note [Fast path when decomposing TyConApps] - -> do { (co, _, _) <- wrapUnifierTcS ev role $ \uenv -> + -> do { co <- wrapUnifierAndEmit ev role $ \uenv -> do { cos <- zipWith4M (u_arg uenv) new_locs tc_roles tys1 tys2 -- zipWith4M: see Note [Work-list ordering] ; return (mkTyConAppCo role tc cos) } @@ -1449,7 +1449,7 @@ canDecomposableFunTy ev eq_rel af f1@(ty1,m1,a1,r1) f2@(ty2,m2,a2,r2) (ppr ev $$ ppr eq_rel $$ ppr f1 $$ ppr f2) ; case ev of CtWanted (WantedCt { ctev_dest = dest }) - -> do { (co, _, _) <- wrapUnifierTcS ev Nominal $ \ uenv -> + -> do { co <- wrapUnifierAndEmit ev Nominal $ \ uenv -> do { let mult_env = uenv `updUEnvLoc` toInvisibleLoc `setUEnvRole` funRole role SelMult ; mult <- uType mult_env m1 m2 @@ -1694,12 +1694,18 @@ canEqCanLHSHetero ev eq_rel swapped lhs1 ps_xi1 ki1 xi2 ps_xi2 ki2 ; finish emptyRewriterSet (givenCtEvCoercion kind_ev) } CtWanted {} - -> do { (kind_co, cts, unifs) <- wrapUnifierTcS ev Nominal $ \uenv -> - let uenv' = updUEnvLoc uenv (mkKindEqLoc xi1 xi2) - in unSwap swapped (uType uenv') ki1 ki2 + -> do { (unifs, (kind_co, cts)) <- reportUnifications $ + wrapUnifier ev Nominal $ \uenv -> + let uenv' = updUEnvLoc uenv (mkKindEqLoc xi1 xi2) + in unSwap swapped (uType uenv') ki1 ki2 -- mkKindEqLoc: any new constraints, arising from the kind -- unification, say they thay come from unifying xi1~xi2 - ; if not (null unifs) + + -- Emit any unsolved kind equalities + ; unless (isEmptyBag cts) $ + updWorkListTcS (extendWorkListChildEqs ev cts) + + ; if unifs then -- Unifications happened, so start again to do the zonking -- Otherwise we might put something in the inert set that isn't inert startAgainWith (mkNonCanonical ev) @@ -2037,9 +2043,6 @@ 
canEqCanLHSFinish_try_unification ev eq_rel swapped lhs rhs ; setEvBindIfWanted new_ev EvCanonical $ evCoercion (mkNomReflCo final_rhs) - -- Kick out any constraints that can now be rewritten - ; recordUnification tv - ; return (Stop new_ev (text "Solved by unification")) } --------------------------- @@ -2405,7 +2408,7 @@ FamAppBreaker. Why TauTvs? See [Why TauTvs] below. Critically, we emit the two new constraints (the last two above) -directly instead of calling wrapUnifierTcS. (Otherwise, we'd end up +directly instead of calling wrapUnifier. (Otherwise, we'd end up unifying cbv1 and cbv2 immediately, achieving nothing.) Next, we unify alpha := cbv1 -> cbv2, having eliminated the occurs check. This unification happens immediately following a successful call to ===================================== compiler/GHC/Tc/Solver/FunDeps.hs ===================================== @@ -20,7 +20,6 @@ import GHC.Tc.Utils.Unify( UnifyEnv(..) ) import GHC.Tc.Utils.Monad as TcM import GHC.Tc.Types.Evidence import GHC.Tc.Types.Constraint -import GHC.Tc.Types.CtLoc import GHC.Core.FamInstEnv import GHC.Core.Coercion @@ -39,27 +38,57 @@ import GHC.Utils.Misc( filterOut ) import GHC.Data.Pair -{- ********************************************************************* -* * -* Functional dependencies for dictionaries -* * -************************************************************************ +{- Note [Overview of fundeps] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Here is our plan for dealing with functional dependencies -When we spot an equality arising from a functional dependency, -we now use that equality (a "wanted") to rewrite the work-item -constraint right away. This avoids two dangers +* When we have failed to solve a Wanted constraint, do this + 1. Generate any fundep-equalities [FunDepEqn] from that constraint. + 2. Try to solve that [FunDepEqn] + 3. 
If any unifications happened, send the constraint back to the + start of the pipeline - Danger 1: If we send the original constraint on down the pipeline - it may react with an instance declaration, and in delicate - situations (when a Given overlaps with an instance) that - may produce new insoluble goals: see #4952 +* Step (1) How we generate those [FunDepEqn] varies: + - tryDictFunDeps: for class constraints (C t1 .. tn) + we look at top-level instances and inert Givens + - tryEqFunDeps: for type-family equalities (F t1 .. tn ~ ty) + we look at top-level family instances + and inert Given family equalities - Danger 2: If we don't rewrite the constraint, it may re-react - with the same thing later, and produce the same equality - again --> termination worries. +* Step (2). We use `solveFunDeps` to solve the [FunDepEqn] in a nested + solver. Key property: + + The ONLY effect of `solveFunDeps` is possibly to perform unifications: -To achieve this required some refactoring of GHC.Tc.Instance.FunDeps (nicer -now!). + - It entirely discards any unsolved fundep equalities. + + - Ite entirely discards any evidence arising from solving fundep equalities + +* Step (3) if we did any unifications in Step (2), we start again with the + current unsolved Wanted. It might now be soluble! + +* For Given constraints, things are different: + - tryDictFunDeps: we do nothing + - tryEqFunDeps: for type-family equalities, we can produce new + actual evidence for built-in type families. E.g. + [G] co : 3 ~ x + 1 + We can produce new evidence + [G] co' : x ~ 2 + So we generate and emit fresh Givens. See + `improveGivenTopFunEqs` and `improveGivenLocalFunEqs` + No unification is involved here, just emitting new Givens. + +(FD1) Consequences for error messages. + Because we discard any unsolved FunDepEqns, we get better error messages. 
+ Consider class C a b | a -> b + instance C Int Bool + and [W] C Int Char + We'll get an insoluble fundep-equality (Char ~ Bool), but it's very + unhelpful to report it. Much better just to say + No instance for C Int Bool + + Similarly if had [W] C Int S, [W] C Int T, it is not helpful to + complain about insoluble (S ~ T). Note [FunDep and implicit parameter reactions] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -107,141 +136,65 @@ Then it is solvable, but its very hard to detect this on the spot. It's exactly the same with implicit parameters, except that the "aggressive" approach would be much easier to implement. -Note [Fundeps with instances, and equality orientation] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -This Note describes a delicate interaction that constrains the orientation of -equalities. This one is about fundeps, but the /exact/ same thing arises for -type-family injectivity constraints: see Note [Improvement orientation]. - -doTopFunDepImprovement compares the constraint with all the instance -declarations, to see if we can produce any equalities. E.g - class C2 a b | a -> b - instance C Int Bool -Then the constraint (C Int ty) generates the equality [W] ty ~ Bool. - -There is a nasty corner in #19415 which led to the typechecker looping: - class C s t b | s -> t - instance ... => C (T kx x) (T ky y) Int - T :: forall k. k -> Type - - work_item: dwrk :: C (T @ka (a::ka)) (T @kb0 (b0::kb0)) Char - where kb0, b0 are unification vars - - ==> {doTopFunDepImprovement: compare work_item with instance, - generate /fresh/ unification variables kfresh0, yfresh0, - emit a new Wanted, and add dwrk to inert set} - - Suppose we emit this new Wanted from the fundep: - [W] T kb0 (b0::kb0) ~ T kfresh0 (yfresh0::kfresh0) - - ==> {solve that equality kb0 := kfresh0, b0 := yfresh0} - Now kick out dwrk, since it mentions kb0 - But now we are back to the start! Loop! 
- -NB1: This example relies on an instance that does not satisfy the - coverage condition (although it may satisfy the weak coverage - condition), and hence whose fundeps generate fresh unification - variables. Not satisfying the coverage condition is known to - lead to termination trouble, but in this case it's plain silly. - -NB2: In this example, the third parameter to C ensures that the - instance doesn't actually match the Wanted, so we can't use it to - solve the Wanted - -We solve the problem by (#21703): +Note [Partial functional dependencies] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Consider this (#12522): + type family F x = t | t -> x + type instance F (a, Int) = (Int, G a) +where G is injective; and wanted constraints + [W] F (alpha, beta) ~ (Int, <some type>) - carefully orienting the new Wanted so that all the - freshly-generated unification variables are on the LHS. +The injectivity will give rise to fundep equalities + [W] gamma1 ~ alpha + [W] Int ~ beta - Thus we call unifyWanteds on - T kfresh0 (yfresh0::kfresh0) ~ T kb0 (b0::kb0) - and /NOT/ - T kb0 (b0::kb0) ~ T kfresh0 (yfresh0::kfresh0) +The fresh unification variable `gamma1` comes from the fact that we can only do +"partial improvement" here; see Section 5.2 of "Injective type families for +Haskell" (HS'15). -Now we'll unify kfresh0:=kb0, yfresh0:=b0, and all is well. The general idea -is that we want to preferentially eliminate those freshly-generated -unification variables, rather than unifying older variables, which causes -kick-out etc. +Now it is crucial that, when solving, + we unify gamma1 := alpha (YES) + and not alpha := gamma1 (NO) -Keeping younger variables on the left also gives very minor improvement in -the compiler performance by having less kick-outs and allocations (-0.1% on -average). Indeed Historical Note [Eliminate younger unification variables] -in GHC.Tc.Utils.Unify describes an earlier attempt to do so systematically, -apparently now in abeyance. +Why? 
Because if we do (YES) we'll think we have made some progress +(some unification has happened), and hence go round again; but actually all we +have done is to replace `alpha` with `gamma1`. -But this is is a delicate solution. We must take care to /preserve/ -orientation during solving. Wrinkles: +These "fresh unification variables" in fundep-equalities are ubituitous. +For example + class C a b | a -> b + instance .. => C Int [x] +If we see + [W] C Int alpha +we'll generate a fundep-equality [W] alpha ~ [beta1] +where `beta1` is one of those "fresh unification variables -(W1) We start with - [W] T kfresh0 (yfresh0::kfresh0) ~ T kb0 (b0::kb0) - Decompose to - [W] kfresh0 ~ kb0 - [W] (yfresh0::kfresh0) ~ (b0::kb0) - Preserve orientation when decomposing!! +This problem shows up in several guises; see (at the bottom) + * Historical Note [Improvement orientation] + * Historical Note [Fundeps with instances, and equality orientation] -(W2) Suppose we happen to tackle the second Wanted from (W1) - first. Then in canEqCanLHSHetero we emit a /kind/ equality, as - well as a now-homogeneous type equality - [W] kco : kfresh0 ~ kb0 - [W] (yfresh0::kfresh0) ~ (b0::kb0) |> (sym kco) - Preserve orientation in canEqCanLHSHetero!! (Failing to - preserve orientation here was the immediate cause of #21703.) +The solution is super-simple: -(W3) There is a potential interaction with the swapping done by - GHC.Tc.Utils.Unify.swapOverTyVars. We think it's fine, but it's - a slight worry. See especially Note [TyVar/TyVar orientation] in - that module. + * A fundep-equality is described by `FunDepEqn`, whose `fd_qtvs` field explicitly + lists the "fresh variables" -The trouble is that "preserving orientation" is a rather global invariant, -and sometimes we definitely do want to swap (e.g. Int ~ alpha), so we don't -even have a precise statement of what the invariant is. 
The advantage -of the preserve-orientation plan is that it is extremely cheap to implement, -and apparently works beautifully. + * Function `instantiateFunDepEqn` instantiates a `FunDepEqn`, and CRUCIALLY + gives the new unification variables a level one deeper than the current + level. ---- Alternative plan (1) --- -Rather than have an ill-defined invariant, another possiblity is to -elminate those fresh unification variables at birth, when generating -the new fundep-inspired equalities. + * Now, given `alpha ~ beta`, all the unification machinery guarantees, to + unify the variable with the deeper level. See GHC.Tc.Utils.Unify + Note [Deeper level on the left]. That ensures that the fresh `gamma1` + will be eliminated in favour of `alpha`. Hooray. -The key idea is to call `instFlexiX` in `emitFunDepWanteds` on only those -type variables that are guaranteed to give us some progress. This means we -have to locally (without calling emitWanteds) identify the type variables -that do not give us any progress. In the above example, we _know_ that -emitting the two wanteds `kco` and `co` is fruitless. + * Better still, we solve the [FunDepEqn] with + solveFunDeps :: CtEvidence -> [FunDepEqn] -> TcS Bool + It uses `reportUnifications` to see if any unification happened at this + level or outside -- that is, it does NOT report unifications to the fresh + unification variables. So `solveFunDeps` returns True only if it + unifies a variable /other than/ the fresh ones. Bingo. - Q: How do we identify such no-ops? - - 1. Generate a matching substitution from LHS to RHS - ɸ = [kb0 :-> k0, b0 :-> y0] - 2. Call `instFlexiX` on only those type variables that do not appear in the domain of ɸ - ɸ' = instFlexiX ɸ (tvs - domain ɸ) - 3. Apply ɸ' on LHS and then call emitWanteds - unifyWanteds ... (subst ɸ' LHS) RHS - -Why will this work? The matching substitution ɸ will be a best effort -substitution that gives us all the easy solutions. 
It can be generated with -modified version of `Core/Unify.unify_tys` where we run it in a matching mode -and never generate `SurelyApart` and always return a `MaybeApart Subst` -instead. - -The same alternative plan would work for type-family injectivity constraints: -see Note [Improvement orientation] in GHC.Tc.Solver.Equality. ---- End of Alternative plan (1) --- - ---- Alternative plan (2) --- -We could have a new flavour of TcTyVar (like `TauTv`, `TyVarTv` etc; see GHC.Tc.Utils.TcType.MetaInfo) -for the fresh unification variables introduced by functional dependencies. Say `FunDepTv`. Then in -GHC.Tc.Utils.Unify.swapOverTyVars we could arrange to keep a `FunDepTv` on the left if possible. -Looks possible, but it's one more complication. ---- End of Alternative plan (2) --- - - ---- Historical note: Failed Alternative Plan (3) --- -Previously we used a flag `cc_fundeps` in `CDictCan`. It would flip to False -once we used a fun dep to hint the solver to break and to stop emitting more -wanteds. This solution was not complete, and caused a failures while trying -to solve for transitive functional dependencies (test case: T21703) --- End of Historical note: Failed Alternative Plan (3) -- +Another victory for levels numbers! Note [Do fundeps last] ~~~~~~~~~~~~~~~~~~~~~~ @@ -260,7 +213,7 @@ Consider T4254b: If we interact that Wanted with /both/ the top-level instance, /and/ the local Given, we'll get beta ~ Int and beta ~ b - respectively. That would generate (b~Bool), which would fai. I think + respectively. That would generate (b~Bool), which would fail. I think it doesn't matter which of the two we pick, but historically we have picked the local-fundeps first. @@ -273,7 +226,6 @@ Consider T4254b: (DFL2) is achieved by trying fundeps only on /unsolved/ Wanteds. - Note [Weird fundeps] ~~~~~~~~~~~~~~~~~~~~ Consider class Het a b | a -> b where @@ -296,6 +248,13 @@ as the fundeps. #7875 is a case in point. 
-} + +{- ********************************************************************* +* * +* Functional dependencies for dictionaries +* * +********************************************************************* -} + tryDictFunDeps :: DictCt -> SolverStage () -- (tryDictFunDeps inst_envs cts) -- * Generate the fundeps from interacting the @@ -334,6 +293,7 @@ tryDictFunDepsLocal dict_ct@(DictCt { di_cls = cls, di_ev = work_ev }) text "imp =" <+> ppr imp $$ text "eqns = " <+> ppr eqns ; if imp then startAgainWith (CDictCan dict_ct) + -- See (DFL1) of Note [Do fundeps last] else continueWith () } where work_pred = ctEvPred work_ev @@ -436,88 +396,6 @@ and Given/instance fundeps entirely. Functional dependencies for type families * * ********************************************************************** - -Note [Reverse order of fundep equations] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Consider this scenario (from dependent/should_fail/T13135_simple): - - type Sig :: Type -> Type - data Sig a = SigFun a (Sig a) - - type SmartFun :: forall (t :: Type). Sig t -> Type - type family SmartFun sig = r | r -> sig where - SmartFun @Type (SigFun @Type a sig) = a -> SmartFun @Type sig - - [W] SmartFun @kappa sigma ~ (Int -> Bool) - -The injectivity of SmartFun allows us to produce two new equalities: - - [W] w1 :: Type ~ kappa - [W] w2 :: SigFun @Type Int beta ~ sigma - -for some fresh (beta :: SigType). The second Wanted here is actually -heterogeneous: the LHS has type Sig Type while the RHS has type Sig kappa. -Of course, if we solve the first wanted first, the second becomes homogeneous. - -When looking for injectivity-inspired equalities, we work left-to-right, -producing the two equalities in the order written above. However, these -equalities are then passed into wrapUnifierTcS, which will fail, adding these -to the work list. However, crucially, the work list operates like a *stack*. -So, because we add w1 and then w2, we process w2 first. 
This is silly: solving -w1 would unlock w2. So we make sure to add equalities to the work -list in left-to-right order, which requires a few key calls to 'reverse'. - -This treatment is also used for class-based functional dependencies, although -we do not have a program yet known to exhibit a loop there. It just seems -like the right thing to do. - -When this was originally conceived, it was necessary to avoid a loop in T13135. -That loop is now avoided by continuing with the kind equality (not the type -equality) in canEqCanLHSHetero (see Note [Equalities with heterogeneous kinds]). -However, the idea of working left-to-right still seems worthwhile, and so the calls -to 'reverse' remain. - -Note [Improvement orientation] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -See also Note [Fundeps with instances, and equality orientation], which describes -the Exact Same Problem, with the same solution, but for functional dependencies. - -A very delicate point is the orientation of equalities -arising from injectivity improvement (#12522). Suppose we have - type family F x = t | t -> x - type instance F (a, Int) = (Int, G a) -where G is injective; and wanted constraints - - [W] F (alpha, beta) ~ (Int, <some type>) - -The injectivity will give rise to constraints - - [W] gamma1 ~ alpha - [W] Int ~ beta - -The fresh unification variable gamma1 comes from the fact that we -can only do "partial improvement" here; see Section 5.2 of -"Injective type families for Haskell" (HS'15). - -Now, it's very important to orient the equations this way round, -so that the fresh unification variable will be eliminated in -favour of alpha. If we instead had - [W] alpha ~ gamma1 -then we would unify alpha := gamma1; and kick out the wanted -constraint. But when we substitute it back in, it'd look like - [W] F (gamma1, beta) ~ fuv -and exactly the same thing would happen again! Infinite loop. - ----> ToDo: all this fragility has gone away! Fix the Note! 
<--- - -This all seems fragile, and it might seem more robust to avoid -introducing gamma1 in the first place, in the case where the -actual argument (alpha, beta) partly matches the improvement -template. But that's a bit tricky, esp when we remember that the -kinds much match too; so it's easier to let the normal machinery -handle it. Instead we are careful to orient the new -equality with the template on the left. Delicate, but it works. - -} -------------------- @@ -562,27 +440,18 @@ improveWantedTopFunEqs :: TyCon -> [TcType] -> CtEvidence -> Xi -> TcS Bool -- TyCon is definitely a type family -- Work-item is a Wanted improveWantedTopFunEqs fam_tc args ev rhs_ty - = do { eqns <- improve_wanted_top_fun_eqs fam_tc args rhs_ty + = do { fd_eqns <- improve_wanted_top_fun_eqs fam_tc args rhs_ty ; traceTcS "improveTopFunEqs" (vcat [ text "lhs:" <+> ppr fam_tc <+> ppr args , text "rhs:" <+> ppr rhs_ty - , text "eqns:" <+> ppr eqns ]) - ; unifyFunDeps ev Nominal $ \uenv -> - uPairsTcM (bump_depth uenv) (reverse eqns) } - -- Missing that `reverse` causes T13135 and T13135_simple to loop. - -- See Note [Reverse order of fundep equations] - -- ToDo: is this still a problem? 
+ , text "eqns:" <+> ppr fd_eqns ]) + ; solveFunDeps ev fd_eqns } - where - bump_depth env = env { u_loc = bumpCtLocDepth (u_loc env) } - -- ToDo: this location is wrong; it should be FunDepOrigin2 - -- See #14778 - -improve_wanted_top_fun_eqs :: TyCon -> [TcType] -> Xi - -> TcS [TypeEqn] +improve_wanted_top_fun_eqs :: TyCon -> [TcType] -> Xi -> TcS [FunDepEqn] -- TyCon is definitely a type family improve_wanted_top_fun_eqs fam_tc lhs_tys rhs_ty | Just ops <- isBuiltInSynFamTyCon_maybe fam_tc - = return (map snd $ tryInteractTopFam ops fam_tc lhs_tys rhs_ty) + = return [FDEqn { fd_qtvs = [] + , fd_eqs = map snd $ tryInteractTopFam ops fam_tc lhs_tys rhs_ty }] -- ToDo: use ideas in #23162 for closed type families; injectivity only for open @@ -593,16 +462,20 @@ improve_wanted_top_fun_eqs fam_tc lhs_tys rhs_ty ; top_eqns <- improve_injective_wanted_top fam_envs inj_args fam_tc lhs_tys rhs_ty ; let local_eqns = improve_injective_wanted_famfam inj_args fam_tc lhs_tys rhs_ty ; traceTcS "improve_wanted_top_fun_eqs" $ - vcat [ ppr fam_tc, text "local_eqns" <+> ppr local_eqns, text "top_eqns" <+> ppr top_eqns ] - -- xxx ToDo: this does both local and top => bug? + vcat [ ppr fam_tc + , text "local_eqns" <+> ppr local_eqns + , text "top_eqns" <+> ppr top_eqns ] + -- xxx ToDo: this does both local and top => bug? 
; return (local_eqns ++ top_eqns) } | otherwise -- No injectivity = return [] -improve_injective_wanted_top :: FamInstEnvs -> [Bool] -> TyCon -> [TcType] -> Xi -> TcS [TypeEqn] +improve_injective_wanted_top :: FamInstEnvs -> [Bool] -> TyCon + -> [TcType] -> Xi -> TcS [FunDepEqn] -- Interact with top-level instance declarations -- See Section 5.2 in the Injective Type Families paper +-- The injectivity flags [Bool] will not all be False, but nothing goes wrong if they are improve_injective_wanted_top fam_envs inj_args fam_tc lhs_tys rhs_ty = concatMapM do_one branches where @@ -617,7 +490,7 @@ improve_injective_wanted_top fam_envs inj_args fam_tc lhs_tys rhs_ty | otherwise = [] - do_one :: CoAxBranch -> TcS [TypeEqn] + do_one :: CoAxBranch -> TcS [FunDepEqn] do_one branch@(CoAxBranch { cab_tvs = branch_tvs, cab_lhs = branch_lhs_tys, cab_rhs = branch_rhs }) | let in_scope1 = in_scope `extendInScopeSetList` branch_tvs , Just subst <- tcUnifyTyForInjectivity False in_scope1 branch_rhs rhs_ty @@ -638,9 +511,10 @@ improve_injective_wanted_top fam_envs inj_args fam_tc lhs_tys rhs_ty , text "rhs_ty" <+> ppr rhs_ty , text "subst" <+> ppr subst , text "subst1" <+> ppr subst1 ] - ; if apartnessCheck (substTys subst1 branch_lhs_tys) branch - then do { traceTcS "improv_inj_top1" (ppr branch_lhs_tys) - ; return (mkInjectivityEqns inj_args (map (substTy subst1) branch_lhs_tys) lhs_tys) } + ; let branch_lhs_tys' = substTys subst1 branch_lhs_tys + ; if apartnessCheck branch_lhs_tys' branch + then do { traceTcS "improv_inj_top1" (ppr branch_lhs_tys') + ; return [mkInjectivityFDEqn inj_args branch_lhs_tys' lhs_tys] } -- NB: The fresh unification variables (from unsubstTvs) are on the left -- See Note [Improvement orientation] else do { traceTcS "improve_inj_top2" empty; return [] } } @@ -651,20 +525,25 @@ improve_injective_wanted_top fam_envs inj_args fam_tc lhs_tys rhs_ty in_scope = mkInScopeSet (tyCoVarsOfType rhs_ty) -improve_injective_wanted_famfam :: [Bool] -> TyCon -> [TcType] 
-> Xi -> [TypeEqn] +improve_injective_wanted_famfam :: [Bool] -> TyCon -> [TcType] -> Xi -> [FunDepEqn] -- Interact with itself, specifically F s1 s2 ~ F t1 t2 +-- The injectivity flags [Bool] will not all be False, but nothing goes wrong if they are improve_injective_wanted_famfam inj_args fam_tc lhs_tys rhs_ty | Just (tc, rhs_tys) <- tcSplitTyConApp_maybe rhs_ty , tc == fam_tc - = mkInjectivityEqns inj_args lhs_tys rhs_tys + = [mkInjectivityFDEqn inj_args lhs_tys rhs_tys] | otherwise = [] -mkInjectivityEqns :: [Bool] -> [TcType] -> [TcType] -> [TypeEqn] +mkInjectivityFDEqn :: [Bool] -> [TcType] -> [TcType] -> FunDepEqn -- When F s1 s2 s3 ~ F t1 t2 t3, and F has injectivity info [True,False,True] --- return the equations [Pair s1 t1, Pair s3 t3] -mkInjectivityEqns inj_args lhs_args rhs_args - = [ Pair lhs_arg rhs_arg | (True, lhs_arg, rhs_arg) <- zip3 inj_args lhs_args rhs_args ] +-- return the FDEqn { fd_eqs = [Pair s1 t1, Pair s3 t3] } +-- The injectivity flags [Bool] will not all be False, but nothing goes wrong if they are +mkInjectivityFDEqn inj_args lhs_args rhs_args + = FDEqn { fd_qtvs = [], fd_eqs = eqs } + where + eqs = [ Pair lhs_arg rhs_arg + | (True, lhs_arg, rhs_arg) <- zip3 inj_args lhs_args rhs_args ] --------------------------------------------- improveLocalFunEqs :: TyCon -> [TcType] -> EqCt -- F args ~ rhs @@ -765,30 +644,23 @@ improveWantedLocalFunEqs funeqs_for_tc fam_tc args work_ev rhs = [] -------------------- - do_one_built_in ops rhs (EqCt { eq_lhs = TyFamLHS _ iargs, eq_rhs = irhs, eq_ev = inert_ev }) + do_one_built_in ops rhs (EqCt { eq_lhs = TyFamLHS _ iargs, eq_rhs = irhs }) | irhs `tcEqType` rhs - = mk_fd_eqns inert_ev (map snd $ tryInteractInertFam ops fam_tc args iargs) + = [FDEqn { fd_qtvs = [], fd_eqs = map snd $ tryInteractInertFam ops fam_tc args iargs }] | otherwise = [] do_one_built_in _ _ _ = pprPanic "interactFunEq 1" (ppr fam_tc) -- TyVarLHS -------------------- -- See Note [Type inference for type families with 
injectivity] - do_one_injective inj_args rhs (EqCt { eq_lhs = TyFamLHS _ inert_args - , eq_rhs = irhs, eq_ev = inert_ev }) + do_one_injective inj_args rhs (EqCt { eq_lhs = TyFamLHS _ inert_args, eq_rhs = irhs }) | rhs `tcEqType` irhs - = mk_fd_eqns inert_ev $ mkInjectivityEqns inj_args args inert_args + = [mkInjectivityFDEqn inj_args args inert_args] | otherwise = [] do_one_injective _ _ _ = pprPanic "interactFunEq 2" (ppr fam_tc) -- TyVarLHS - -------------------- - -- ToDO: fix me - mk_fd_eqns :: CtEvidence -> [TypeEqn] -> [FunDepEqn] - mk_fd_eqns _inert_ev eqns - | null eqns = [] - | otherwise = [ FDEqn { fd_qtvs = [], fd_eqs = eqns } ] {- Note [Type inference for type families with injectivity] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -870,13 +742,11 @@ just an optimization so we don't lose anything in terms of completeness of solving. -} -{- -************************************************************************ +{- ********************************************************************* * * Emitting equalities arising from fundeps * * -************************************************************************ --} +********************************************************************* -} solveFunDeps :: CtEvidence -- The work item -> [FunDepEqn] @@ -885,16 +755,18 @@ solveFunDeps :: CtEvidence -- The work item -- By "solve" we mean: (only) do unifications. 
We do not generate evidence, and -- other than unifications there should be no effects whatsoever -- --- Return True if some unifications happened --- See Note [FunDep and implicit parameter reactions] +-- The returned Bool is True if some unifications happened +-- +-- See Note [Overview of fundeps] solveFunDeps work_ev fd_eqns | null fd_eqns - = return False -- common case noop + = return False -- Common case no-op | otherwise = do { (unif_happened, _res) - <- nestFunDepsTcS $ - do { (_, eqs) <- unifyForAllBody work_ev Nominal do_fundeps + <- reportUnifications $ + nestFunDepsTcS $ + do { (_, eqs) <- wrapUnifier work_ev Nominal do_fundeps ; solveSimpleWanteds eqs } -- ToDo: why solveSimpleWanteds? Answer -- (a) don't rely on eager unifier @@ -920,6 +792,7 @@ instantiateFunDepEqn (FDEqn { fd_qtvs = tvs, fd_eqs = eqs }) where rev_eqs = reverse eqs -- (reverse eqs): See Note [Reverse order of fundep equations] + -- ToDo: is this still a problem? subst_pair subst (Pair ty1 ty2) = Pair (substTyUnchecked subst' ty1) ty2 @@ -934,3 +807,257 @@ instantiateFunDepEqn (FDEqn { fd_qtvs = tvs, fd_eqs = eqs }) -- though ty1 will never (currently) be a poytype, so this -- InScopeSet will never be looked at. + +{- Note [Reverse order of fundep equations] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Consider this scenario (from dependent/should_fail/T13135_simple): + + type Sig :: Type -> Type + data Sig a = SigFun a (Sig a) + + type SmartFun :: forall (t :: Type). Sig t -> Type + type family SmartFun sig = r | r -> sig where + SmartFun @Type (SigFun @Type a sig) = a -> SmartFun @Type sig + + [W] SmartFun @kappa sigma ~ (Int -> Bool) + +The injectivity of SmartFun allows us to produce two new equalities: + + [W] w1 :: Type ~ kappa + [W] w2 :: SigFun @Type Int beta ~ sigma + +for some fresh (beta :: SigType). The second Wanted here is actually +heterogeneous: the LHS has type Sig Type while the RHS has type Sig kappa. 
+Of course, if we solve the first wanted first, the second becomes homogeneous. + +When looking for injectivity-inspired equalities, we work left-to-right, +producing the two equalities in the order written above. However, these +equalities are then passed into wrapUnifierAndEmit, which will fail, adding these +to the work list. However, the work list operates like a *stack*. +So, because we add w1 and then w2, we process w2 first. This is silly: solving +w1 would unlock w2. So we make sure to add equalities to the work +list in left-to-right order, which requires a few key calls to 'reverse'. + +When this was originally conceived, it was necessary to avoid a loop in T13135. +That loop is now avoided by continuing with the kind equality (not the type +equality) in canEqCanLHSHetero (see Note [Equalities with heterogeneous kinds]). +However, the idea of working left-to-right still seems worthwhile, and so the calls +to 'reverse' remain. + +This treatment is also used for class-based functional dependencies, although +we do not have a program yet known to exhibit a loop there. It just seems +like the right thing to do. + +In general, I believe this is (now, anyway) just an optimisation, not required +to avoid loops. +-} + +{- ********************************************************************* +* * + Historical notes + + Here are a bunch of Notes that are rendered obsolete by + Note [Partial functional dependencies] + +* * +********************************************************************* -} + +{- +Historical Note [Improvement orientation] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +See also Note [Fundeps with instances, and equality orientation], which describes +the Exact Same Problem, with the same solution, but for functional dependencies. + +A very delicate point is the orientation of equalities +arising from injectivity improvement (#12522). 
Suppose we have + type family F x = t | t -> x + type instance F (a, Int) = (Int, G a) +where G is injective; and wanted constraints + + [W] F (alpha, beta) ~ (Int, <some type>) + +The injectivity will give rise to constraints + + [W] gamma1 ~ alpha + [W] Int ~ beta + +The fresh unification variable gamma1 comes from the fact that we +can only do "partial improvement" here; see Section 5.2 of +"Injective type families for Haskell" (HS'15). + +Now, it's very important to orient the equations this way round, +so that the fresh unification variable will be eliminated in +favour of alpha. If we instead had + [W] alpha ~ gamma1 +then we would unify alpha := gamma1; and kick out the wanted +constraint. But when we substitute it back in, it'd look like + [W] F (gamma1, beta) ~ fuv +and exactly the same thing would happen again! Infinite loop. + +---> ToDo: all this fragility has gone away! Fix the Note! <--- + +This all seems fragile, and it might seem more robust to avoid +introducing gamma1 in the first place, in the case where the +actual argument (alpha, beta) partly matches the improvement +template. But that's a bit tricky, esp when we remember that the +kinds must match too; so it's easier to let the normal machinery +handle it. Instead we are careful to orient the new +equality with the template on the left. Delicate, but it works. + +Historical Note [Fundeps with instances, and equality orientation] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +This Note describes a delicate interaction that constrains the orientation of +equalities. This one is about fundeps, but the /exact/ same thing arises for +type-family injectivity constraints: see Note [Improvement orientation]. + +doTopFunDepImprovement compares the constraint with all the instance +declarations, to see if we can produce any equalities. E.g + class C2 a b | a -> b + instance C Int Bool +Then the constraint (C Int ty) generates the equality [W] ty ~ Bool. 
+ +There is a nasty corner in #19415 which led to the typechecker looping: + class C s t b | s -> t + instance ... => C (T kx x) (T ky y) Int + T :: forall k. k -> Type + + work_item: dwrk :: C (T @ka (a::ka)) (T @kb0 (b0::kb0)) Char + where kb0, b0 are unification vars + + ==> {doTopFunDepImprovement: compare work_item with instance, + generate /fresh/ unification variables kfresh0, yfresh0, + emit a new Wanted, and add dwrk to inert set} + + Suppose we emit this new Wanted from the fundep: + [W] T kb0 (b0::kb0) ~ T kfresh0 (yfresh0::kfresh0) + + ==> {solve that equality kb0 := kfresh0, b0 := yfresh0} + Now kick out dwrk, since it mentions kb0 + But now we are back to the start! Loop! + +NB1: This example relies on an instance that does not satisfy the + coverage condition (although it may satisfy the weak coverage + condition), and hence whose fundeps generate fresh unification + variables. Not satisfying the coverage condition is known to + lead to termination trouble, but in this case it's plain silly. + +NB2: In this example, the third parameter to C ensures that the + instance doesn't actually match the Wanted, so we can't use it to + solve the Wanted + +We solve the problem by (#21703): + + carefully orienting the new Wanted so that all the + freshly-generated unification variables are on the LHS. + + Thus we call unifyWanteds on + T kfresh0 (yfresh0::kfresh0) ~ T kb0 (b0::kb0) + and /NOT/ + T kb0 (b0::kb0) ~ T kfresh0 (yfresh0::kfresh0) + +Now we'll unify kfresh0:=kb0, yfresh0:=b0, and all is well. The general idea +is that we want to preferentially eliminate those freshly-generated +unification variables, rather than unifying older variables, which causes +kick-out etc. + +Keeping younger variables on the left also gives very minor improvement in +the compiler performance by having less kick-outs and allocations (-0.1% on +average). 
Indeed Historical Note [Eliminate younger unification variables] +in GHC.Tc.Utils.Unify describes an earlier attempt to do so systematically, +apparently now in abeyance. + +But this is a delicate solution. We must take care to /preserve/ +orientation during solving. Wrinkles: + +(W1) We start with + [W] T kfresh0 (yfresh0::kfresh0) ~ T kb0 (b0::kb0) + Decompose to + [W] kfresh0 ~ kb0 + [W] (yfresh0::kfresh0) ~ (b0::kb0) + Preserve orientation when decomposing!! + +(W2) Suppose we happen to tackle the second Wanted from (W1) + first. Then in canEqCanLHSHetero we emit a /kind/ equality, as + well as a now-homogeneous type equality + [W] kco : kfresh0 ~ kb0 + [W] (yfresh0::kfresh0) ~ (b0::kb0) |> (sym kco) + Preserve orientation in canEqCanLHSHetero!! (Failing to + preserve orientation here was the immediate cause of #21703.) + +(W3) There is a potential interaction with the swapping done by + GHC.Tc.Utils.Unify.swapOverTyVars. We think it's fine, but it's + a slight worry. See especially Note [TyVar/TyVar orientation] in + that module. + +The trouble is that "preserving orientation" is a rather global invariant, +and sometimes we definitely do want to swap (e.g. Int ~ alpha), so we don't +even have a precise statement of what the invariant is. The advantage +of the preserve-orientation plan is that it is extremely cheap to implement, +and apparently works beautifully. + +--- Alternative plan (1) --- +Rather than have an ill-defined invariant, another possibility is to +eliminate those fresh unification variables at birth, when generating +the new fundep-inspired equalities. + +The key idea is to call `instFlexiX` in `emitFunDepWanteds` on only those +type variables that are guaranteed to give us some progress. This means we +have to locally (without calling emitWanteds) identify the type variables +that do not give us any progress. In the above example, we _know_ that +emitting the two wanteds `kco` and `co` is fruitless. + + Q: How do we identify such no-ops? 
+ + 1. Generate a matching substitution from LHS to RHS + ɸ = [kb0 :-> k0, b0 :-> y0] + 2. Call `instFlexiX` on only those type variables that do not appear in the domain of ɸ + ɸ' = instFlexiX ɸ (tvs - domain ɸ) + 3. Apply ɸ' on LHS and then call emitWanteds + unifyWanteds ... (subst ɸ' LHS) RHS + +Why will this work? The matching substitution ɸ will be a best-effort +substitution that gives us all the easy solutions. It can be generated with +a modified version of `Core/Unify.unify_tys` where we run it in a matching mode +and never generate `SurelyApart` and always return a `MaybeApart Subst` +instead. + +The same alternative plan would work for type-family injectivity constraints: +see Note [Improvement orientation] in GHC.Tc.Solver.Equality. +--- End of Alternative plan (1) --- + +--- Alternative plan (2) --- +We could have a new flavour of TcTyVar (like `TauTv`, `TyVarTv` etc; see GHC.Tc.Utils.TcType.MetaInfo) +for the fresh unification variables introduced by functional dependencies. Say `FunDepTv`. Then in +GHC.Tc.Utils.Unify.swapOverTyVars we could arrange to keep a `FunDepTv` on the left if possible. +Looks possible, but it's one more complication. +--- End of Alternative plan (2) --- + + +--- Historical note: Failed Alternative Plan (3) --- +Previously we used a flag `cc_fundeps` in `CDictCan`. It would flip to False +once we used a fun dep to hint the solver to break and to stop emitting more +wanteds. 
This solution was not complete, and caused failures while trying +to solve for transitive functional dependencies (test case: T21703) +-- End of Historical note: Failed Alternative Plan (3) -- + + +Historical Note +~~~~~~~~~~~~~~~ +This Note (anonymous, but related to dict-solving) is rendered obsolete by + - Danger 1: solved by Note [Instance and Given overlap] + - Danger 2: solved by fundeps being idempotent + +When we spot an equality arising from a functional dependency, +we now use that equality (a "wanted") to rewrite the work-item +constraint right away. This avoids two dangers + + Danger 1: If we send the original constraint on down the pipeline + it may react with an instance declaration, and in delicate + situations (when a Given overlaps with an instance) that + may produce new insoluble goals: see #4952 + + Danger 2: If we don't rewrite the constraint, it may re-react + with the same thing later, and produce the same equality + again --> termination worries. + +-} ===================================== compiler/GHC/Tc/Solver/Monad.hs ===================================== @@ -81,7 +81,7 @@ module GHC.Tc.Solver.Monad ( lookupInertDict, -- The Model - recordUnification, recordUnifications, kickOutRewritable, + recordUnification, kickOutRewritable, -- Inert Safe Haskell safe-overlap failures insertSafeOverlapFailureTcS, @@ -102,7 +102,7 @@ module GHC.Tc.Solver.Monad ( instDFunType, -- Unification - wrapUnifierX, wrapUnifierTcS, unifyFunDeps, uPairsTcM, unifyForAllBody, + wrapUnifier, wrapUnifierAndEmit, uPairsTcM, -- MetaTyVars newFlexiTcSTy, instFlexiX, instFlexiXTcM, @@ -908,21 +908,19 @@ data TcSEnv = TcSEnv { tcs_ev_binds :: EvBindsVar, - tcs_unif_lvl :: IORef (Maybe TcLevel), - -- The Unification Level Flag - -- Outermost level at which we have unified a meta tyvar - -- Starts at Nothing, then (Just i), then (Just j) where j<i - -- See Note [The Unification Level Flag] + tcs_unif_lvl :: TcRef WhatUnifications, + -- Level of the outermost meta-tyvar 
that we have unified + -- See Note [WhatUnifications] in GHC.Tc.Utils.Unify - tcs_count :: IORef Int, -- Global step count + tcs_count :: TcRef Int, -- Global step count - tcs_inerts :: IORef InertSet, -- Current inert set + tcs_inerts :: TcRef InertSet, -- Current inert set -- | The mode of operation for the constraint solver. -- See Note [TcSMode] tcs_mode :: TcSMode, - tcs_worklist :: IORef WorkList + tcs_worklist :: TcRef WorkList } --------------- @@ -1103,7 +1101,7 @@ runTcSWithEvBinds' mode ev_binds_var thing_inside ; inert_var <- TcM.newTcRef (emptyInertSet tc_lvl) ; wl_var <- TcM.newTcRef emptyWorkList - ; unif_lvl_var <- TcM.newTcRef Nothing + ; unif_lvl_var <- TcM.newTcRef NoUnificationsYet ; let env = TcSEnv { tcs_ev_binds = ev_binds_var , tcs_unif_lvl = unif_lvl_var , tcs_count = step_count @@ -1202,10 +1200,9 @@ nestImplicTcS ev_binds_var inner_tclvl (TcS thing_inside) #endif ; return res } -nestFunDepsTcS :: TcS a -> TcS (Bool, a) +nestFunDepsTcS :: TcS a -> TcS a nestFunDepsTcS (TcS thing_inside) - = reportUnifications $ - TcS $ \ env@(TcSEnv { tcs_inerts = inerts_var }) -> + = TcS $ \ env@(TcSEnv { tcs_inerts = inerts_var }) -> TcM.pushTcLevelM_ $ -- pushTcLevelTcM: increase the level so that unification variables -- allocated by the fundep-creation itself don't count as useful unifications @@ -1220,6 +1217,10 @@ nestFunDepsTcS (TcS thing_inside) ; TcM.traceTc "nestFunDepsTcS {" empty ; res <- thing_inside nest_env ; TcM.traceTc "nestFunDepsTcS }" empty + + -- Unlike nestTcS, do /not/ do `updateInertsWith`; we are going to + -- abandon everything about this sub-computation except its unifications + ; return res } nestTcS :: TcS a -> TcS a @@ -1733,72 +1734,22 @@ pushLevelNoWorkList _ (TcS thing_inside) * * ********************************************************************* -} -{- Note [The Unification Level Flag] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Consider a deep tree of implication constraints - forall[1] a. 
-- Outer-implic - C alpha[1] -- Simple - forall[2] c. ....(C alpha[1]).... -- Implic-1 - forall[2] b. ....(alpha[1] ~ Int).... -- Implic-2 - -The (C alpha) is insoluble until we know alpha. We solve alpha -by unifying alpha:=Int somewhere deep inside Implic-2. But then we -must try to solve the Outer-implic all over again. This time we can -solve (C alpha) both in Outer-implic, and nested inside Implic-1. - -When should we iterate solving a level-n implication? -Answer: if any unification of a tyvar at level n takes place - in the ic_implics of that implication. - -* What if a unification takes place at level n-1? Then don't iterate - level n, because we'll iterate level n-1, and that will in turn iterate - level n. - -* What if a unification takes place at level n, in the ic_simples of - level n? No need to track this, because the kick-out mechanism deals - with it. (We can't drop kick-out in favour of iteration, because kick-out - works for skolem-equalities, not just unifications.) - -So the monad-global Unification Level Flag, kept in tcs_unif_lvl keeps -track of - - Whether any unifications at all have taken place (Nothing => no unifications) - - If so, what is the outermost level that has seen a unification (Just lvl) - -The iteration is done in the simplify_loop/maybe_simplify_again loop in GHC.Tc.Solver. - -It is helpful not to iterate unless there is a chance of progress. #8474 is -an example: - - * There's a deeply-nested chain of implication constraints. - ?x:alpha => ?y1:beta1 => ... ?yn:betan => [W] ?x:Int - - * From the innermost one we get a [W] alpha[1] ~ Int, - so we can unify. - - * It's better not to iterate the inner implications, but go all the - way out to level 1 before iterating -- because iterating level 1 - will iterate the inner levels anyway. - -(In the olden days when we "floated" thse Derived constraints, this was -much, much more important -- we got exponential behaviour, as each iteration -produced the same Derived constraint.) 
--} - - unifyTyVar :: TcTyVar -> TcType -> TcS () -- Unify a meta-tyvar with a type -- We should never unify the same variable twice! +-- C.f. GHC.Tc.Utils.Unify.unifyTyVar unifyTyVar tv ty = assertPpr (isMetaTyVar tv) (ppr tv) $ do { liftZonkTcS (TcM.writeMetaTyVar tv ty) -- Produces a trace message - ; recordUnification tv } + ; uni_ref <- getWhatUnifications + ; wrapTcS $ recordUnification uni_ref tv } reportUnifications :: TcS a -> TcS (Bool, a) --- Record whether any unifications are done by thing_inside +-- Record whether any useful unifications are done by thing_inside -- Remember to propagate the information to the enclosing context reportUnifications (TcS thing_inside) = TcS $ \ env@(TcSEnv { tcs_unif_lvl = outer_ul_var }) -> - do { inner_ul_var <- TcM.newTcRef Nothing + do { inner_ul_var <- TcM.newTcRef NoUnificationsYet ; res <- thing_inside (env { tcs_unif_lvl = inner_ul_var }) @@ -1806,25 +1757,19 @@ reportUnifications (TcS thing_inside) ; mb_inner_lvl <- TcM.readTcRef inner_ul_var ; case mb_inner_lvl of - Just unif_lvl + UnificationsDone unif_lvl | ambient_lvl `deeperThanOrSame` unif_lvl -> -- Some useful unifications took place - do { mb_outer_lvl <- TcM.readTcRef outer_ul_var - ; TcM.traceTc "reportUnifications" $ - vcat [ text "ambient =" <+> ppr ambient_lvl - , text "unif_lvl =" <+> ppr unif_lvl - , text "mb_outer =" <+> ppr mb_outer_lvl ] - ; case mb_outer_lvl of - Just outer_unif_lvl | unif_lvl `deeperThanOrSame` outer_unif_lvl - -> -- No need to update: outer_unif_lvl is already shallower - return () - _ -> -- Update the outer level - TcM.writeTcRef outer_ul_var (Just unif_lvl) + do { recordUnificationLevel outer_ul_var unif_lvl ; return (True, res) } _ -> -- No useful unifications return (False, res) } +getWhatUnifications :: TcS (TcRef WhatUnifications) +getWhatUnifications + = TcS $ \env -> return (tcs_unif_lvl env) + traceUnificationFlag :: String -> TcS () traceUnificationFlag str = TcS $ \env -> @@ -1837,7 +1782,8 @@ traceUnificationFlag 
str getUnificationFlag :: TcS Bool -- We are at ambient level i --- If the unification flag = Just i, reset it to Nothing and return True +-- If the unification flag = UnificationsDone i, +-- reset it to NoUnificationsYet, and return True -- Otherwise leave it unchanged and return False getUnificationFlag = TcS $ \env -> @@ -1848,39 +1794,13 @@ getUnificationFlag vcat [ text "ambient:" <+> ppr ambient_lvl , text "unif_lvl:" <+> ppr mb_lvl ] ; case mb_lvl of - Nothing -> return False - Just unif_lvl | ambient_lvl `strictlyDeeperThan` unif_lvl - -> return False - | otherwise - -> do { TcM.writeTcRef ref Nothing - ; return True } } - -recordUnification :: TcTyVar -> TcS () -recordUnification tv = setUnificationFlagTo (tcTyVarLevel tv) - -recordUnifications :: [TcTyVar] -> TcS () -recordUnifications tvs - = case tvs of - [] -> return () - (tv:tvs) -> do { traceTcS "recordUnifications" (ppr min_tv_lvl $$ ppr tvs) - ; setUnificationFlagTo min_tv_lvl } - where - min_tv_lvl = foldr (minTcLevel . 
tcTyVarLevel) (tcTyVarLevel tv) tvs - -setUnificationFlagTo :: TcLevel -> TcS () --- (setUnificationFlag i) sets the unification level to (Just i) --- unless it already is (Just j) where j <= i -setUnificationFlagTo lvl - = TcS $ \env -> - do { let ref = tcs_unif_lvl env - ; mb_lvl <- TcM.readTcRef ref - ; case mb_lvl of - Just unif_lvl | lvl `deeperThanOrSame` unif_lvl - -> do { TcM.traceTc "set-uni-flag skip" $ - vcat [ text "lvl" <+> ppr lvl, text "unif_lvl" <+> ppr unif_lvl ] - ; return () } - _ -> do { TcM.traceTc "set-uni-flag" (ppr lvl) - ; TcM.writeTcRef ref (Just lvl) } } + NoUnificationsYet -> return False + UnificationsDone unif_lvl + | ambient_lvl `strictlyDeeperThan` unif_lvl + -> return False + | otherwise + -> do { TcM.writeTcRef ref NoUnificationsYet + ; return True } } {- ********************************************************************* @@ -2182,77 +2102,30 @@ solverDepthError loc ty * * ************************************************************************ -Note [wrapUnifierTcS] -~~~~~~~~~~~~~~~~~~~ +Note [wrapUnifier] +~~~~~~~~~~~~~~~~~~ When decomposing equalities we often create new wanted constraints for (s ~ t). But what if s=t? Then it'd be faster to return Refl right away. Rather than making an equality test (which traverses the structure of the type, -perhaps fruitlessly), we call uType (via wrapUnifierTcS) to traverse the common +perhaps fruitlessly), we call uType (via wrapUnifier) to traverse the common structure, and bales out when it finds a difference by creating a new deferred Wanted constraint. But where it succeeds in finding common structure, it just builds a coercion to reflect it. This is all much faster than creating a new constraint, putting it in the work list, picking it out, canonicalising it, etc etc. 
- -Note [unifyFunDeps] -~~~~~~~~~~~~~~~~~~~ -The Bool returned by `unifyFunDeps` is True if we have unified a variable -that occurs in the constraint we are trying to solve; it is not in the -inert set so `wrapUnifierTcS` won't kick it out. Instead we want to send it -back to the start of the pipeline. Hence the Bool. - -It's vital that we don't return (not (null unified)) because the fundeps -may create fresh variables; unifying them (alone) should not make us send -the constraint back to the start, or we'll get an infinite loop. See -Note [Fundeps with instances, and equality orientation] in GHC.Tc.Solver.Dict -and Note [Improvement orientation] in GHC.Tc.Solver.Equality. -} uPairsTcM :: UnifyEnv -> [TypeEqn] -> TcM () uPairsTcM uenv eqns = mapM_ (\(Pair ty1 ty2) -> uType uenv ty1 ty2) eqns -unifyFunDeps :: CtEvidence -> Role - -> (UnifyEnv -> TcM ()) - -> TcS Bool -unifyFunDeps ev role do_unifications - = do { (_, _, unified) <- wrapUnifierTcS ev role do_unifications - ; return (any (`elemVarSet` fvs) unified) } - -- See Note [unifyFunDeps] - where - fvs = tyCoVarsOfType (ctEvPred ev) - -unifyForAllBody :: CtEvidence -> Role -> (UnifyEnv -> TcM a) - -> TcS (a, Cts) --- We /return/ the equality constraints we generate, --- rather than emitting them into the monad. --- See See (SF5) in Note [Solving forall equalities] in GHC.Tc.Solver.Equality -unifyForAllBody ev role unify_body - = do { (res, cts, unified) <- wrapUnifierX ev role unify_body - - -- Record the unificaions we have done - ; recordUnifications unified - - ; return (res, cts) } - -wrapUnifierTcS :: CtEvidence -> Role - -> (UnifyEnv -> TcM a) -- Some calls to uType - -> TcS (a, Bag Ct, [TcTyVar]) --- Invokes the do_unifications argument, with a suitable UnifyEnv. --- Emit deferred equalities and kick-out from the inert set as a --- result of any unifications. 
--- Very good short-cut when the two types are equal, or nearly so --- See Note [wrapUnifierTcS] --- --- The [TcTyVar] is the list of unification variables that were --- unified the process; the (Bag Ct) are the deferred constraints. - -wrapUnifierTcS ev role do_unifications - = do { (res, cts, unified) <- wrapUnifierX ev role do_unifications - - -- Record the unificaions we have done - ; recordUnifications unified +wrapUnifierAndEmit :: CtEvidence -> Role + -> (UnifyEnv -> TcM a) -- Some calls to uType + -> TcS a +-- Like wrapUnifier, but emits any unsolved equalities into the work-list +wrapUnifierAndEmit ev role do_unifications + = do { (res, cts) <- wrapUnifier ev role do_unifications -- Emit the deferred constraints -- See Note [Work-list ordering] in GHC.Tc.Solver.Equality @@ -2263,31 +2136,40 @@ wrapUnifierTcS ev role do_unifications ; unless (isEmptyBag cts) $ updWorkListTcS (extendWorkListChildEqs ev cts) - ; return (res, cts, unified) } + ; return res } -wrapUnifierX :: CtEvidence -> Role +wrapUnifier :: CtEvidence -> Role -> (UnifyEnv -> TcM a) -- Some calls to uType - -> TcS (a, Bag Ct, [TcTyVar]) -wrapUnifierX ev role do_unifications + -> TcS (a, Bag Ct) +-- Invokes the do_unifications argument, with a suitable UnifyEnv. 
+-- Very good short-cut when the two types are equal, or nearly so +-- See Note [wrapUnifier] +-- The (Bag Ct) are the deferred constraints; we return them +-- rather than emitting them (cf. wrapUnifierAndEmit) +wrapUnifier ev role do_unifications = do { given_eq_lvl <- getInnermostGivenEqLevel + ; what_uni_ref <- getWhatUnifications + ; wrapTcS $ - do { defer_ref <- TcM.newTcRef emptyBag - ; unified_ref <- TcM.newTcRef [] + do { defer_ref <- TcM.newTcRef emptyBag ; let env = UE { u_role = role , u_given_eq_lvl = given_eq_lvl , u_rewriters = ctEvRewriters ev , u_loc = ctEvLoc ev , u_defer = defer_ref - , u_unified = Just unified_ref} + , u_what = Just what_uni_ref } -- u_rewriters: the rewriter set and location from -- the parent constraint `ev` are inherited in any -- new constraints spat out by the unifier + -- + -- u_what: likewise inherit the WhatUnifications flag, + -- so that unifications done here are visible + -- to the caller ; res <- do_unifications env ; cts <- TcM.readTcRef defer_ref - ; unified <- TcM.readTcRef unified_ref - ; return (res, cts, unified) } } + ; return (res, cts) } } {- ===================================== compiler/GHC/Tc/Solver/Solve.hs ===================================== @@ -132,9 +132,10 @@ simplify_loop n limit definitely_redo_implications ; return (wc { wc_simple = simples1 , wc_impl = implics1 }) } + -- See Note [When to iterate the solver: unifications] ; unif_happened <- getUnificationFlag ; csTraceTcS $ text "unif_happened" <+> ppr unif_happened - -- Note [The Unification Level Flag] in GHC.Tc.Solver.Monad + ; maybe_simplify_again (n+1) limit unif_happened wc2 } data NextAction @@ -225,10 +226,59 @@ any new unifications, and iterate the implications only if so. "RAE": Add comment here about fundeps also using this mechanism. And probably update name of Note. 
--} -{- Note [Expanding Recursive Superclasses and ExpansionFuel] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Note [When to iterate the solver: unifications] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Consider a deep tree of implication constraints + forall[1] a. -- Outer-implic + C alpha[1] -- Simple + forall[2] c. ....(C alpha[1]).... -- Implic-1 + forall[2] b. ....(alpha[1] ~ Int).... -- Implic-2 + +The (C alpha) is insoluble until we know alpha. We solve alpha +by unifying alpha:=Int somewhere deep inside Implic-2. But then we +must try to solve the Outer-implic all over again. This time we can +solve (C alpha) both in Outer-implic, and nested inside Implic-1. + +When should we iterate solving a level-n implication? +Answer: if any unification of a tyvar at level n takes place + in the ic_implics of that implication. + +* What if a unification takes place at level n-1? Then don't iterate + level n, because we'll iterate level n-1, and that will in turn iterate + level n. + +* What if a unification takes place at level n, in the ic_simples of + level n? No need to track this, because the kick-out mechanism deals + with it. (We can't drop kick-out in favour of iteration, because kick-out + works for skolem-equalities, not just unifications.) + +So the monad-global `WhatUnifications` flag, kept in `tcs_unif_lvl`, keeps +track of whether any unifications at all have taken place, and if so, what +is the outermost level that has seen a unification. See +Note [WhatUnifications] in GHC.Tc.Utils.Unify. + +The iteration is done in the simplify_loop/maybe_simplify_again loop. + +It is helpful not to iterate unless there is a chance of progress. #8474 is +an example: + + * There's a deeply-nested chain of implication constraints. + ?x:alpha => ?y1:beta1 => ... ?yn:betan => [W] ?x:Int + + * From the innermost one we get a [W] alpha[1] ~ Int, + so we can unify. 
+ + * It's better not to iterate the inner implications, but go all the + way out to level 1 before iterating -- because iterating level 1 + will iterate the inner levels anyway. + +(In the olden days when we "floated" these Derived constraints, this was +much, much more important -- we got exponential behaviour, as each iteration +produced the same Derived constraint.) + +Note [Expanding Recursive Superclasses and ExpansionFuel] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Consider the class declaration (T21909) class C [a] => C a where ===================================== compiler/GHC/Tc/Utils/Monad.hs ===================================== @@ -1907,6 +1907,9 @@ emitSimple ct emitSimples :: Cts -> TcM () emitSimples cts + | null cts + = return () + | otherwise = do { lie_var <- getConstraintVar ; updTcRef lie_var (`addSimples` cts) } ===================================== compiler/GHC/Tc/Utils/Unify.hs ===================================== @@ -30,14 +30,15 @@ module GHC.Tc.Utils.Unify ( dsInstantiate, -- Various unifications - unifyType, unifyKind, unifyInvisibleType, + uType, unifyType, unifyKind, unifyInvisibleType, unifyExprType, unifyTypeAndEmit, promoteTcType, swapOverTyVars, touchabilityTest, checkTopShape, lhsPriority, - UnifyEnv(..), updUEnvLoc, setUEnvRole, - uType, mightEqualLater, makeTypeConcrete, + UnifyEnv(..), updUEnvLoc, setUEnvRole, + WhatUnifications(..), recordUnification, recordUnificationLevel, + -------------------------------- -- Holes matchExpectedListTy, @@ -2296,15 +2297,75 @@ unifyTypeAndEmit t_or_k orig ty1 ty2 ; let env = UE { u_loc = loc, u_role = Nominal , u_given_eq_lvl = cur_lvl , u_rewriters = emptyRewriterSet -- ToDo: check this - , u_defer = ref, u_unified = Nothing } + , u_defer = ref, u_what = Nothing } -- The hard work happens here ; co <- uType env ty1 ty2 + -- Emit any deferred constraints ; cts <- readTcRef ref - ; unless (null cts) (emitSimples cts) + ; emitSimples cts + ; return co } + +{- 
********************************************************************* +* * + WhatUnifications +* * +**********************************************************************-} + +data WhatUnifications + = NoUnificationsYet + | UnificationsDone TcLevel + +{- Note [WhatUnifications] +~~~~~~~~~~~~~~~~~~~~~~~~~~ +We record, in mutable variable carried by the monad, the `WhatUnifications` flag. + +* In the eager unifier (this module) it is held the + u_what :: Maybe (TcRef WhatUnificatons) + field of `UnifyEnv` + +* In TcS monad, it is held in the + tcs_unif_lvl :: IORef WhatUnifications + field of `TcSEnv`. + +In all cases the idea is this: + + --------------------------------------- + `WhatUnifications` records the level of the + outermost meta-tyvar that we have unified + ---------------------------------------- + +It starts life as `NoUnificationsYet`. Then when we unify a tyvar at level j, +we set the flag to `UnificationsDone j`, unless the flag is /already/ set to +`UnificationsDone i` where i<=j. + +Why do all this? 
+ * See Note [When to iterate the solver: unifications] in GHC.Tc.Solver.Solve +-} + +recordUnification :: TcRef WhatUnifications -> TcTyVar -> TcM () +recordUnification what_ref tv = recordUnificationLevel what_ref (tcTyVarLevel tv) + +recordUnificationLevel :: TcRef WhatUnifications -> TcLevel -> TcM () +recordUnificationLevel what_ref tv_lvl + = do { what <- readTcRef what_ref + ; case what of + UnificationsDone unif_lvl + | tv_lvl `deeperThanOrSame` unif_lvl + -> do { traceTc "set-uni-flag: no-op" $ + vcat [ text "lvl" <+> ppr tv_lvl, text "unif_lvl" <+> ppr unif_lvl ] + ; return () } + _ -> do { traceTc "set-uni-flag" (ppr tv_lvl) + ; writeTcRef what_ref (UnificationsDone tv_lvl) } } + + +instance Outputable WhatUnifications where + ppr NoUnificationsYet = text "NoUniYet" + ppr (UnificationsDone lvl) = text "UniDone" <> braces (ppr lvl) + {- %************************************************************************ %* * @@ -2320,7 +2381,7 @@ The eager unifier, `uType`, is called by via the wrappers `unifyType`, `unifyKind` etc * The constraint solver (e.g. in GHC.Tc.Solver.Equality), - via `GHC.Tc.Solver.Monad.wrapUnifierTcS`. + via `GHC.Tc.Solver.Monad.wrapUnifie`. `uType` runs in the TcM monad, but it carries a UnifyEnv that tells it what to do when unifying a variable or deferring a constraint. 
Specifically, @@ -2355,7 +2416,7 @@ data UnifyEnv -- Which variables are unified; -- if Nothing, we don't care - , u_unified :: Maybe (TcRef [TcTyVar]) + , u_what :: Maybe (TcRef WhatUnifications) } setUEnvRole :: UnifyEnv -> Role -> UnifyEnv @@ -2752,10 +2813,7 @@ uUnfilledVar2 env@(UE { u_defer = def_eq_ref, u_given_eq_lvl = given_eq_lvl }) -- Only proceed if the kinds match -- NB: tv1 should still be unfilled, despite the kind unification -- because tv1 is not free in ty2' (or, hence, in its kind) - then do { liftZonkM $ writeMetaTyVar tv1 ty2 - ; case u_unified env of - Nothing -> return () - Just uref -> updTcRef uref (tv1 :) + then do { unifyTyVar env tv1 ty2 ; return (mkNomReflCo ty2) } -- Unification is always Nominal else -- The kinds don't match yet, so defer instead. @@ -2770,6 +2828,14 @@ uUnfilledVar2 env@(UE { u_defer = def_eq_ref, u_given_eq_lvl = given_eq_lvl }) ty1 = mkTyVarTy tv1 defer = unSwap swapped (uType_defer env) ty1 ty2 +unifyTyVar :: UnifyEnv -> TcTyVar -> TcType -> TcM () +-- Actually do the unification, and record it in WhatUnifications +unifyTyVar (UE { u_what = mb_what_unifications }) tv ty + = do { liftZonkM $ writeMetaTyVar tv ty + ; case mb_what_unifications of + Nothing -> return () + Just wu -> recordUnification wu tv } + swapOverTyVars :: Bool -> TcTyVar -> TcTyVar -> Bool swapOverTyVars is_given tv1 tv2 -- See Note [Unification variables on the left] @@ -3011,8 +3077,14 @@ The most important thing is that we want to put tyvars with the deepest level on the left. The reason to do so differs for Wanteds and Givens, but either way, deepest wins! Simple. -* Wanteds. Putting the deepest variable on the left maximise the +* Wanteds. Putting the deepest variable on the left maximises the chances that it's a touchable meta-tyvar which can be solved. + It also /crucial/ for skolem escape. Consider + [W] alpha[7] ~ beta[8] + [W] beta[8] ~ a[8] -- `a` is a skolem + If we unify alpha[7]:=beta[8], we will then happily unify + beta[8]:=a[8]. 
But that's wrong because now alpha[7] + is unified with an inner skolem a[8]. Disaster. * Givens. Suppose we have something like forall a[2]. b[1] ~ a[2] => beta[1] ~ a[2] View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/dc8360efa255fb74ed2b90567657409… You're receiving this email because of your account on gitlab.haskell.org.
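
For readers skimming the diff above: the update rule described in Note [WhatUnifications] (remember only the outermost level at which a unification happened) can be sketched in isolation. This is a minimal, self-contained illustration, not GHC's code. `Level`, `noteUnification` and `recordLevel` are hypothetical names; `Level` stands in for `TcLevel` (smaller number = further out) and a plain `IORef` stands in for `TcRef`.

```haskell
import Data.IORef

-- Hypothetical stand-in for GHC's TcLevel: a smaller number is further out.
newtype Level = Level Int deriving (Eq, Ord, Show)

data WhatUnifications
  = NoUnificationsYet
  | UnificationsDone Level
  deriving (Eq, Show)

-- Pure core of the update: keep the outermost (smallest) level seen so far.
noteUnification :: Level -> WhatUnifications -> WhatUnifications
noteUnification lvl NoUnificationsYet = UnificationsDone lvl
noteUnification lvl (UnificationsDone seen)
  | lvl >= seen = UnificationsDone seen  -- deeper than or same: no-op
  | otherwise   = UnificationsDone lvl   -- strictly further out: overwrite

-- Mutable wrapper, mirroring the shape of recordUnificationLevel in the diff.
recordLevel :: IORef WhatUnifications -> Level -> IO ()
recordLevel ref lvl = modifyIORef' ref (noteUnification lvl)

main :: IO ()
main = do
  ref <- newIORef NoUnificationsYet
  -- Unifications at levels 3, 2 and 5: only the outermost one is retained.
  mapM_ (recordLevel ref . Level) [3, 2, 5]
  readIORef ref >>= print
```

Under this reading, a solver working at level n would consult the flag after solving its nested implications and iterate only when the recorded level is relevant to n, as Note [When to iterate the solver: unifications] describes.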

[Git][ghc/ghc][wip/9.12.3-backports] Fix for alex-3.5.2.0 (#25623)
by Zubin (＠wz1000) 11 Sep '25

Zubin pushed to branch wip/9.12.3-backports at Glasgow Haskell Compiler / GHC

Commits:
91047c03 by Brandon Chinn at 2025-09-11T17:58:18+05:30
Fix for alex-3.5.2.0 (#25623)

This INLINE pragma for alexScanUser was added in 9.12, but then I
ported the change to alex in 3.5.2.0 (https://github.com/haskell/alex/pull/262)

I didn't realize that GHC errors on duplicate INLINE pragmas, so
this ended up being a breaking change.

This change should be backported into 9.12

(cherry picked from commit a1d923786baed5b001c523fd2a76f133be510b04)

- - - - -

1 changed file:

- compiler/GHC/Parser/Lexer.x

Changes:

=====================================
compiler/GHC/Parser/Lexer.x
=====================================
@@ -41,6 +41,7 @@
 -- Alex "Haskell code fragment top"
 
 {
+{-# LANGUAGE CPP #-}
 {-# LANGUAGE ViewPatterns #-}
 {-# LANGUAGE LambdaCase #-}
 {-# LANGUAGE MultiWayIf #-}
@@ -3370,11 +3371,15 @@ topNoLayoutContainsCommas [] = False
 topNoLayoutContainsCommas (ALRLayout _ _ : ls) = topNoLayoutContainsCommas ls
 topNoLayoutContainsCommas (ALRNoLayout b _ : _) = b
 
+#ifdef MIN_TOOL_VERSION_alex
+#if !MIN_TOOL_VERSION_alex(3,5,2)
 -- If the generated alexScan/alexScanUser functions are called multiple times
 -- in this file, alexScanUser gets broken out into a separate function and
 -- increases memory usage. Make sure GHC inlines this function and optimizes it.
 -- https://github.com/haskell/alex/pull/262
 {-# INLINE alexScanUser #-}
+#endif
+#endif
 
 lexToken :: P (PsLocated Token)
 lexToken = do

View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/91047c0326939af6fc174cd5de42df7…

You're receiving this email because of your account on gitlab.haskell.org.

[Git][ghc/ghc][wip/ubxsumtag] Use slots smaller than word as tag for smaller unboxed sums
by Luite Stegeman (＠luite) 11 Sep '25

Luite Stegeman pushed to branch wip/ubxsumtag at Glasgow Haskell Compiler / GHC Commits: 80d50227 by Luite Stegeman at 2025-09-11T14:25:56+02:00 Use slots smaller than word as tag for smaller unboxed sums This packs unboxed sums more efficiently by allowing Word8, Word16 and Word32 for the tag field if the number of constructors is small enough - - - - - 10 changed files: - compiler/GHC/Cmm/Utils.hs - compiler/GHC/Stg/Unarise.hs - compiler/GHC/Types/RepType.hs - testsuite/tests/codeGen/should_compile/T25166.stdout → testsuite/tests/codeGen/should_compile/T25166.stdout-ws-32 - + testsuite/tests/codeGen/should_compile/T25166.stdout-ws-64 - + testsuite/tests/unboxedsums/UbxSumUnpackedSize.hs - + testsuite/tests/unboxedsums/UbxSumUnpackedSize.stdout - + testsuite/tests/unboxedsums/UbxSumUnpackedSize.stdout-ws-32 - testsuite/tests/unboxedsums/all.T - testsuite/tests/unboxedsums/unboxedsums_unit_tests.hs Changes: ===================================== compiler/GHC/Cmm/Utils.hs ===================================== @@ -115,6 +115,9 @@ slotCmmType platform = \case PtrUnliftedSlot -> gcWord platform PtrLiftedSlot -> gcWord platform WordSlot -> bWord platform + Word8Slot -> b8 + Word16Slot -> b16 + Word32Slot -> b32 Word64Slot -> b64 FloatSlot -> f32 DoubleSlot -> f64 ===================================== compiler/GHC/Stg/Unarise.hs ===================================== @@ -404,7 +404,6 @@ import GHC.Stg.Syntax import GHC.Stg.Utils import GHC.Stg.Make import GHC.Core.Type -import GHC.Builtin.Types.Prim (intPrimTy) import GHC.Builtin.Types import GHC.Types.Unique.Supply import GHC.Types.Unique @@ -681,15 +680,15 @@ elimCase rho args bndr (MultiValAlt _) [GenStgAlt{ alt_con = _ elimCase rho args@(tag_arg : real_args) bndr (MultiValAlt _) alts | isUnboxedSumBndr bndr - = do tag_bndr <- mkId (mkFastString "tag") tagTy + = do tag_bndr <- mkId (mkFastString "tag") (tagTyArg tag_arg) -- this won't be used but we need a binder anyway let rho1 = extendRho rho bndr (MultiVal args) 
scrut' = case tag_arg of StgVarArg v -> StgApp v [] StgLitArg l -> StgLit l - - alts' <- unariseSumAlts rho1 real_args alts - return (StgCase scrut' tag_bndr tagAltTy alts') + alt_ty = (tagAltTyArg tag_arg) + alts' <- unariseSumAlts rho1 alt_ty real_args alts + return (StgCase scrut' tag_bndr alt_ty alts') elimCase _ args bndr alt_ty alts = pprPanic "elimCase - unhandled case" @@ -732,8 +731,9 @@ unariseAlts rho (MultiValAlt _) bndr [GenStgAlt{ alt_con = DEFAULT unariseAlts rho (MultiValAlt _) bndr alts | isUnboxedSumBndr bndr = do (rho_sum_bndrs, scrt_bndrs@(tag_bndr : real_bndrs)) <- unariseConArgBinder rho bndr - alts' <- unariseSumAlts rho_sum_bndrs (map StgVarArg real_bndrs) alts - let inner_case = StgCase (StgApp tag_bndr []) tag_bndr tagAltTy alts' + let alt_ty = tagAltTy tag_bndr + alts' <- unariseSumAlts rho_sum_bndrs alt_ty (map StgVarArg real_bndrs) alts + let inner_case = StgCase (StgApp tag_bndr []) tag_bndr alt_ty alts' return [GenStgAlt{ alt_con = DataAlt (tupleDataCon Unboxed (length scrt_bndrs)) , alt_bndrs = scrt_bndrs , alt_rhs = inner_case @@ -753,21 +753,23 @@ unariseAlt rho alt@GenStgAlt{alt_con=_,alt_bndrs=xs,alt_rhs=e} -- | Make alternatives that match on the tag of a sum -- (i.e. generate LitAlts for the tag) unariseSumAlts :: UnariseEnv + -> AltType -> [StgArg] -- sum components _excluding_ the tag bit. -> [StgAlt] -- original alternative with sum LHS -> UniqSM [StgAlt] -unariseSumAlts env args alts - = do alts' <- mapM (unariseSumAlt env args) alts +unariseSumAlts env tag_slot args alts + = do alts' <- mapM (unariseSumAlt env tag_slot args) alts return (mkDefaultLitAlt alts') unariseSumAlt :: UnariseEnv + -> AltType -> [StgArg] -- sum components _excluding_ the tag bit. 
-> StgAlt -- original alternative with sum LHS -> UniqSM StgAlt -unariseSumAlt rho _ GenStgAlt{alt_con=DEFAULT,alt_bndrs=_,alt_rhs=e} +unariseSumAlt rho _ _ GenStgAlt{alt_con=DEFAULT,alt_bndrs=_,alt_rhs=e} = GenStgAlt DEFAULT mempty <$> unariseExpr rho e -unariseSumAlt rho args alt@GenStgAlt{ alt_con = DataAlt sumCon +unariseSumAlt rho tag_slot args alt@GenStgAlt{ alt_con = DataAlt sumCon , alt_bndrs = bs , alt_rhs = e } @@ -776,10 +778,18 @@ unariseSumAlt rho args alt@GenStgAlt{ alt_con = DataAlt sumCon [b] -> mapSumIdBinders b args e rho -- Sums must have one binder _ -> pprPanic "unariseSumAlt2" (ppr args $$ pprPanicAlt alt) - let lit_case = LitAlt (LitNumber LitNumInt (fromIntegral (dataConTag sumCon))) + let num_ty = + case tag_slot of + PrimAlt Word8Rep -> LitNumWord8 + PrimAlt Word16Rep -> LitNumWord16 + PrimAlt Word32Rep -> LitNumWord32 + PrimAlt WordRep -> LitNumWord + _ -> pprPanic "unariseSumAlt: unexpected tag slot type" (ppr tag_slot) + + lit_case = LitAlt (LitNumber num_ty (fromIntegral (dataConTag sumCon))) GenStgAlt lit_case mempty <$> unariseExpr rho' e' -unariseSumAlt _ scrt alt +unariseSumAlt _ _ scrt alt = pprPanic "unariseSumAlt3" (ppr scrt $$ pprPanicAlt alt) -------------------------------------------------------------------------------- @@ -865,12 +875,6 @@ mapSumIdBinders alt_bndr args rhs rho0 typed_id_args = map StgVarArg typed_ids - -- pprTrace "mapSumIdBinders" - -- (text "fld_reps" <+> ppr fld_reps $$ - -- text "id_args" <+> ppr id_arg_exprs $$ - -- text "rhs" <+> ppr rhs $$ - -- text "rhs_with_casts" <+> ppr rhs_with_casts - -- ) $ if isMultiValBndr alt_bndr then return (extendRho rho0 alt_bndr (MultiVal typed_id_args), rhs_with_casts rhs) else assert (typed_id_args `lengthIs` 1) $ @@ -921,13 +925,19 @@ mkUbxSum ) mkUbxSum dc ty_args args0 us = let - _ :| sum_slots = ubxSumRepType ty_args + tag_slot :| sum_slots = ubxSumRepType ty_args -- drop tag slot field_slots = (mapMaybe (repSlotTy . 
stgArgRep) args0) tag = dataConTag dc layout' = layoutUbxSum sum_slots field_slots - tag_arg = StgLitArg (LitNumber LitNumInt (fromIntegral tag)) + tag_arg = + case tag_slot of + Word8Slot -> StgLitArg (LitNumber LitNumWord8 (fromIntegral tag)) + Word16Slot -> StgLitArg (LitNumber LitNumWord16 (fromIntegral tag)) + Word32Slot -> StgLitArg (LitNumber LitNumWord32 (fromIntegral tag)) + WordSlot -> StgLitArg (LitNumber LitNumWord (fromIntegral tag)) + _ -> pprPanic "mkUbxSum: unexpected tag slot type" (ppr tag_slot) arg_idxs = IM.fromList (zipEqual layout' args0) ((_idx,_idx_map,_us,wrapper),slot_args) @@ -990,6 +1000,9 @@ ubxSumRubbishArg :: SlotTy -> StgArg ubxSumRubbishArg PtrLiftedSlot = StgVarArg aBSENT_SUM_FIELD_ERROR_ID ubxSumRubbishArg PtrUnliftedSlot = StgVarArg aBSENT_SUM_FIELD_ERROR_ID ubxSumRubbishArg WordSlot = StgLitArg (LitNumber LitNumWord 0) +ubxSumRubbishArg Word8Slot = StgLitArg (LitNumber LitNumWord8 0) +ubxSumRubbishArg Word16Slot = StgLitArg (LitNumber LitNumWord16 0) +ubxSumRubbishArg Word32Slot = StgLitArg (LitNumber LitNumWord32 0) ubxSumRubbishArg Word64Slot = StgLitArg (LitNumber LitNumWord64 0) ubxSumRubbishArg FloatSlot = StgLitArg (LitFloat 0) ubxSumRubbishArg DoubleSlot = StgLitArg (LitDouble 0) @@ -1166,11 +1179,18 @@ isUnboxedTupleBndr = isUnboxedTupleType . 
idType mkTuple :: [StgArg] -> StgExpr mkTuple args = StgConApp (tupleDataCon Unboxed (length args)) NoNumber args [] -tagAltTy :: AltType -tagAltTy = PrimAlt IntRep +tagAltTyArg :: StgArg -> AltType +tagAltTyArg a + | [pr] <- typePrimRep (stgArgType a) = PrimAlt pr + | otherwise = pprPanic "tagAltTyArg" (ppr a) + +tagAltTy :: Id -> AltType +tagAltTy i + | [pr] <- typePrimRep (idType i) = PrimAlt pr + | otherwise = pprPanic "tagAltTy" (ppr $ idType i) -tagTy :: Type -tagTy = intPrimTy +tagTyArg :: StgArg -> Type +tagTyArg x = stgArgType x voidArg :: StgArg voidArg = StgVarArg voidPrimId ===================================== compiler/GHC/Types/RepType.hs ===================================== @@ -197,12 +197,12 @@ type SortedSlotTys = [SlotTy] -- of the list we have the slot for the tag. ubxSumRepType :: [[PrimRep]] -> NonEmpty SlotTy ubxSumRepType constrs0 - -- These first two cases never classify an actual unboxed sum, which always + -- This first case never classifies an actual unboxed sum, which always -- has at least two disjuncts. But it could happen if a user writes, e.g., -- forall (a :: TYPE (SumRep [IntRep])). ... -- which could never be instantiated. We still don't want to panic. | constrs0 `lengthLessThan` 2 - = WordSlot :| [] + = Word8Slot :| [] | otherwise = let @@ -230,8 +230,17 @@ ubxSumRepType constrs0 rep :: [PrimRep] -> SortedSlotTys rep ty = sort (map primRepSlot ty) - sumRep = WordSlot :| combine_alts (map rep constrs0) - -- WordSlot: for the tag of the sum + -- constructors start at 1, pick an appropriate slot size for the tag + tag_slot | length constrs0 < 256 = Word8Slot + | length constrs0 < 65536 = Word16Slot + -- we use 2147483647 instead of 4294967296 to avoid + -- overflow when building a 32 bit GHC. Please fix the + -- overflow if you encounter a type with more than 2147483646 + -- constructors and need the tag to be 32 bits. 
+ | length constrs0 < 2147483647 = Word32Slot + | otherwise = WordSlot + + sumRep = tag_slot :| combine_alts (map rep constrs0) in sumRep @@ -275,22 +284,32 @@ layoutUbxSum sum_slots0 arg_slots0 = -- - Float slots: Shared between floating point types. -- -- - Void slots: Shared between void types. Not used in sums. --- --- TODO(michalt): We should probably introduce `SlotTy`s for 8-/16-/32-bit --- values, so that we can pack things more tightly. -data SlotTy = PtrLiftedSlot | PtrUnliftedSlot | WordSlot | Word64Slot | FloatSlot | DoubleSlot | VecSlot Int PrimElemRep + +data SlotTy = PtrLiftedSlot + | PtrUnliftedSlot + | Word8Slot + | Word16Slot + | Word32Slot + | WordSlot + | Word64Slot + | FloatSlot + | DoubleSlot + | VecSlot Int PrimElemRep deriving (Eq, Ord) -- Constructor order is important! If slot A could fit into slot B -- then slot A must occur first. E.g. FloatSlot before DoubleSlot -- - -- We are assuming that WordSlot is smaller than or equal to Word64Slot - -- (would not be true on a 128-bit machine) + -- We are assuming that Word32Slot <= WordSlot <= Word64Slot + -- (would not be true on a 16-bit or 128-bit machine) instance Outputable SlotTy where ppr PtrLiftedSlot = text "PtrLiftedSlot" ppr PtrUnliftedSlot = text "PtrUnliftedSlot" ppr Word64Slot = text "Word64Slot" ppr WordSlot = text "WordSlot" + ppr Word32Slot = text "Word32Slot" + ppr Word16Slot = text "Word16Slot" + ppr Word8Slot = text "Word8Slot" ppr DoubleSlot = text "DoubleSlot" ppr FloatSlot = text "FloatSlot" ppr (VecSlot n e) = text "VecSlot" <+> ppr n <+> ppr e @@ -307,14 +326,14 @@ primRepSlot (BoxedRep mlev) = case mlev of Just Lifted -> PtrLiftedSlot Just Unlifted -> PtrUnliftedSlot primRepSlot IntRep = WordSlot -primRepSlot Int8Rep = WordSlot -primRepSlot Int16Rep = WordSlot -primRepSlot Int32Rep = WordSlot +primRepSlot Int8Rep = Word8Slot +primRepSlot Int16Rep = Word16Slot +primRepSlot Int32Rep = Word32Slot primRepSlot Int64Rep = Word64Slot primRepSlot WordRep = WordSlot -primRepSlot 
Word8Rep = WordSlot -primRepSlot Word16Rep = WordSlot -primRepSlot Word32Rep = WordSlot +primRepSlot Word8Rep = Word8Slot +primRepSlot Word16Rep = Word16Slot +primRepSlot Word32Rep = Word32Slot primRepSlot Word64Rep = Word64Slot primRepSlot AddrRep = WordSlot primRepSlot FloatRep = FloatSlot @@ -325,6 +344,9 @@ slotPrimRep :: SlotTy -> PrimRep slotPrimRep PtrLiftedSlot = BoxedRep (Just Lifted) slotPrimRep PtrUnliftedSlot = BoxedRep (Just Unlifted) slotPrimRep Word64Slot = Word64Rep +slotPrimRep Word32Slot = Word32Rep +slotPrimRep Word16Slot = Word16Rep +slotPrimRep Word8Slot = Word8Rep slotPrimRep WordSlot = WordRep slotPrimRep DoubleSlot = DoubleRep slotPrimRep FloatSlot = FloatRep @@ -349,11 +371,12 @@ fitsIn ty1 ty2 -- See Note [Casting slot arguments] where isWordSlot Word64Slot = True + isWordSlot Word32Slot = True + isWordSlot Word16Slot = True + isWordSlot Word8Slot = True isWordSlot WordSlot = True isWordSlot _ = False - - {- ********************************************************************** * * PrimRep ===================================== testsuite/tests/codeGen/should_compile/T25166.stdout → testsuite/tests/codeGen/should_compile/T25166.stdout-ws-32 ===================================== @@ -2,5 +2,7 @@ Test.foo_closure: const Test.D_con_info; const GHC.Internal.Types.True_closure+2; - const 2; + const 2 :: W8; + const 0 :: W16; + const 0 :: W8; const 3; ===================================== testsuite/tests/codeGen/should_compile/T25166.stdout-ws-64 ===================================== @@ -0,0 +1,9 @@ +[section ""data" . 
Test.foo_closure" { + Test.foo_closure: + const Test.D_con_info; + const GHC.Internal.Types.True_closure+2; + const 2 :: W8; + const 0 :: W32; + const 0 :: W16; + const 0 :: W8; + const 3; ===================================== testsuite/tests/unboxedsums/UbxSumUnpackedSize.hs ===================================== @@ -0,0 +1,254 @@ +module Main where + +import GHC.Exts.Heap.Closures +import Control.Exception (evaluate) +import Data.Word (Word32) +import Data.Int (Int8, Int16) + +-- this should get a Word8 tag +data E1 + = E1_1 | E1_2 | E1_3 | E1_4 | E1_5 | E1_6 | E1_7 | E1_8 + | E1_9 | E1_10 | E1_11 | E1_12 | E1_13 | E1_14 | E1_15 | E1_16 + | E1_17 | E1_18 | E1_19 | E1_20 | E1_21 | E1_22 | E1_23 | E1_24 + | E1_25 | E1_26 | E1_27 | E1_28 | E1_29 | E1_30 | E1_31 | E1_32 + | E1_33 | E1_34 | E1_35 | E1_36 | E1_37 | E1_38 | E1_39 | E1_40 + | E1_41 | E1_42 | E1_43 | E1_44 | E1_45 | E1_46 | E1_47 | E1_48 + | E1_49 | E1_50 | E1_51 | E1_52 | E1_53 | E1_54 | E1_55 | E1_56 + | E1_57 | E1_58 | E1_59 | E1_60 | E1_61 | E1_62 | E1_63 | E1_64 + | E1_65 | E1_66 | E1_67 | E1_68 | E1_69 | E1_70 | E1_71 | E1_72 + | E1_73 | E1_74 | E1_75 | E1_76 | E1_77 | E1_78 | E1_79 | E1_80 + | E1_81 | E1_82 | E1_83 | E1_84 | E1_85 | E1_86 | E1_87 | E1_88 + | E1_89 | E1_90 | E1_91 | E1_92 | E1_93 | E1_94 | E1_95 | E1_96 + | E1_97 | E1_98 | E1_99 | E1_100 | E1_101 | E1_102 | E1_103 | E1_104 + | E1_105 | E1_106 | E1_107 | E1_108 | E1_109 | E1_110 | E1_111 | E1_112 + | E1_113 | E1_114 | E1_115 | E1_116 | E1_117 | E1_118 | E1_119 | E1_120 + | E1_121 | E1_122 | E1_123 | E1_124 | E1_125 | E1_126 | E1_127 | E1_128 + | E1_129 | E1_130 | E1_131 | E1_132 | E1_133 | E1_134 | E1_135 | E1_136 + | E1_137 | E1_138 | E1_139 | E1_140 | E1_141 | E1_142 | E1_143 | E1_144 + | E1_145 | E1_146 | E1_147 | E1_148 | E1_149 | E1_150 | E1_151 | E1_152 + | E1_153 | E1_154 | E1_155 | E1_156 | E1_157 | E1_158 | E1_159 | E1_160 + | E1_161 | E1_162 | E1_163 | E1_164 | E1_165 | E1_166 | E1_167 | E1_168 + | E1_169 | E1_170 | E1_171 | 
E1_172 | E1_173 | E1_174 | E1_175 | E1_176 + | E1_177 | E1_178 | E1_179 | E1_180 | E1_181 | E1_182 | E1_183 | E1_184 + | E1_185 | E1_186 | E1_187 | E1_188 | E1_189 | E1_190 | E1_191 | E1_192 + | E1_193 | E1_194 | E1_195 | E1_196 | E1_197 | E1_198 | E1_199 | E1_200 + | E1_201 | E1_202 | E1_203 | E1_204 | E1_205 | E1_206 | E1_207 | E1_208 + | E1_209 | E1_210 | E1_211 | E1_212 | E1_213 | E1_214 | E1_215 | E1_216 + | E1_217 | E1_218 | E1_219 | E1_220 | E1_221 | E1_222 | E1_223 | E1_224 + | E1_225 | E1_226 | E1_227 | E1_228 | E1_229 | E1_230 | E1_231 | E1_232 + | E1_233 | E1_234 | E1_235 | E1_236 | E1_237 | E1_238 | E1_239 | E1_240 + | E1_241 | E1_242 | E1_243 | E1_244 | E1_245 | E1_246 | E1_247 | E1_248 + | E1_249 | E1_250 | E1_251 | E1_252 | E1_253 | E1_254 + deriving (Enum, Bounded, Show) + +-- this should get a Word8 tag +data E2 + = E2_1 | E2_2 | E2_3 | E2_4 | E2_5 | E2_6 | E2_7 | E2_8 + | E2_9 | E2_10 | E2_11 | E2_12 | E2_13 | E2_14 | E2_15 | E2_16 + | E2_17 | E2_18 | E2_19 | E2_20 | E2_21 | E2_22 | E2_23 | E2_24 + | E2_25 | E2_26 | E2_27 | E2_28 | E2_29 | E2_30 | E2_31 | E2_32 + | E2_33 | E2_34 | E2_35 | E2_36 | E2_37 | E2_38 | E2_39 | E2_40 + | E2_41 | E2_42 | E2_43 | E2_44 | E2_45 | E2_46 | E2_47 | E2_48 + | E2_49 | E2_50 | E2_51 | E2_52 | E2_53 | E2_54 | E2_55 | E2_56 + | E2_57 | E2_58 | E2_59 | E2_60 | E2_61 | E2_62 | E2_63 | E2_64 + | E2_65 | E2_66 | E2_67 | E2_68 | E2_69 | E2_70 | E2_71 | E2_72 + | E2_73 | E2_74 | E2_75 | E2_76 | E2_77 | E2_78 | E2_79 | E2_80 + | E2_81 | E2_82 | E2_83 | E2_84 | E2_85 | E2_86 | E2_87 | E2_88 + | E2_89 | E2_90 | E2_91 | E2_92 | E2_93 | E2_94 | E2_95 | E2_96 + | E2_97 | E2_98 | E2_99 | E2_100 | E2_101 | E2_102 | E2_103 | E2_104 + | E2_105 | E2_106 | E2_107 | E2_108 | E2_109 | E2_110 | E2_111 | E2_112 + | E2_113 | E2_114 | E2_115 | E2_116 | E2_117 | E2_118 | E2_119 | E2_120 + | E2_121 | E2_122 | E2_123 | E2_124 | E2_125 | E2_126 | E2_127 | E2_128 + | E2_129 | E2_130 | E2_131 | E2_132 | E2_133 | E2_134 | E2_135 | E2_136 + | 
E2_137 | E2_138 | E2_139 | E2_140 | E2_141 | E2_142 | E2_143 | E2_144 + | E2_145 | E2_146 | E2_147 | E2_148 | E2_149 | E2_150 | E2_151 | E2_152 + | E2_153 | E2_154 | E2_155 | E2_156 | E2_157 | E2_158 | E2_159 | E2_160 + | E2_161 | E2_162 | E2_163 | E2_164 | E2_165 | E2_166 | E2_167 | E2_168 + | E2_169 | E2_170 | E2_171 | E2_172 | E2_173 | E2_174 | E2_175 | E2_176 + | E2_177 | E2_178 | E2_179 | E2_180 | E2_181 | E2_182 | E2_183 | E2_184 + | E2_185 | E2_186 | E2_187 | E2_188 | E2_189 | E2_190 | E2_191 | E2_192 + | E2_193 | E2_194 | E2_195 | E2_196 | E2_197 | E2_198 | E2_199 | E2_200 + | E2_201 | E2_202 | E2_203 | E2_204 | E2_205 | E2_206 | E2_207 | E2_208 + | E2_209 | E2_210 | E2_211 | E2_212 | E2_213 | E2_214 | E2_215 | E2_216 + | E2_217 | E2_218 | E2_219 | E2_220 | E2_221 | E2_222 | E2_223 | E2_224 + | E2_225 | E2_226 | E2_227 | E2_228 | E2_229 | E2_230 | E2_231 | E2_232 + | E2_233 | E2_234 | E2_235 | E2_236 | E2_237 | E2_238 | E2_239 | E2_240 + | E2_241 | E2_242 | E2_243 | E2_244 | E2_245 | E2_246 | E2_247 | E2_248 + | E2_249 | E2_250 | E2_251 | E2_252 | E2_253 | E2_254 | E2_255 + deriving (Enum, Bounded, Show) + +-- this needs a Word16 tag +data E3 + = E3_1 | E3_2 | E3_3 | E3_4 | E3_5 | E3_6 | E3_7 | E3_8 + | E3_9 | E3_10 | E3_11 | E3_12 | E3_13 | E3_14 | E3_15 | E3_16 + | E3_17 | E3_18 | E3_19 | E3_20 | E3_21 | E3_22 | E3_23 | E3_24 + | E3_25 | E3_26 | E3_27 | E3_28 | E3_29 | E3_30 | E3_31 | E3_32 + | E3_33 | E3_34 | E3_35 | E3_36 | E3_37 | E3_38 | E3_39 | E3_40 + | E3_41 | E3_42 | E3_43 | E3_44 | E3_45 | E3_46 | E3_47 | E3_48 + | E3_49 | E3_50 | E3_51 | E3_52 | E3_53 | E3_54 | E3_55 | E3_56 + | E3_57 | E3_58 | E3_59 | E3_60 | E3_61 | E3_62 | E3_63 | E3_64 + | E3_65 | E3_66 | E3_67 | E3_68 | E3_69 | E3_70 | E3_71 | E3_72 + | E3_73 | E3_74 | E3_75 | E3_76 | E3_77 | E3_78 | E3_79 | E3_80 + | E3_81 | E3_82 | E3_83 | E3_84 | E3_85 | E3_86 | E3_87 | E3_88 + | E3_89 | E3_90 | E3_91 | E3_92 | E3_93 | E3_94 | E3_95 | E3_96 + | E3_97 | E3_98 | E3_99 | E3_100 | E3_101 | 
E3_102 | E3_103 | E3_104 + | E3_105 | E3_106 | E3_107 | E3_108 | E3_109 | E3_110 | E3_111 | E3_112 + | E3_113 | E3_114 | E3_115 | E3_116 | E3_117 | E3_118 | E3_119 | E3_120 + | E3_121 | E3_122 | E3_123 | E3_124 | E3_125 | E3_126 | E3_127 | E3_128 + | E3_129 | E3_130 | E3_131 | E3_132 | E3_133 | E3_134 | E3_135 | E3_136 + | E3_137 | E3_138 | E3_139 | E3_140 | E3_141 | E3_142 | E3_143 | E3_144 + | E3_145 | E3_146 | E3_147 | E3_148 | E3_149 | E3_150 | E3_151 | E3_152 + | E3_153 | E3_154 | E3_155 | E3_156 | E3_157 | E3_158 | E3_159 | E3_160 + | E3_161 | E3_162 | E3_163 | E3_164 | E3_165 | E3_166 | E3_167 | E3_168 + | E3_169 | E3_170 | E3_171 | E3_172 | E3_173 | E3_174 | E3_175 | E3_176 + | E3_177 | E3_178 | E3_179 | E3_180 | E3_181 | E3_182 | E3_183 | E3_184 + | E3_185 | E3_186 | E3_187 | E3_188 | E3_189 | E3_190 | E3_191 | E3_192 + | E3_193 | E3_194 | E3_195 | E3_196 | E3_197 | E3_198 | E3_199 | E3_200 + | E3_201 | E3_202 | E3_203 | E3_204 | E3_205 | E3_206 | E3_207 | E3_208 + | E3_209 | E3_210 | E3_211 | E3_212 | E3_213 | E3_214 | E3_215 | E3_216 + | E3_217 | E3_218 | E3_219 | E3_220 | E3_221 | E3_222 | E3_223 | E3_224 + | E3_225 | E3_226 | E3_227 | E3_228 | E3_229 | E3_230 | E3_231 | E3_232 + | E3_233 | E3_234 | E3_235 | E3_236 | E3_237 | E3_238 | E3_239 | E3_240 + | E3_241 | E3_242 | E3_243 | E3_244 | E3_245 | E3_246 | E3_247 | E3_248 + | E3_249 | E3_250 | E3_251 | E3_252 | E3_253 | E3_254 | E3_255 | E3_256 + deriving (Enum, Bounded, Show) + +data U_Bool = U_Bool {-# UNPACK #-} !Bool + {-# UNPACK #-} !Bool + {-# UNPACK #-} !Bool + {-# UNPACK #-} !Bool + {-# UNPACK #-} !Bool + {-# UNPACK #-} !Bool + {-# UNPACK #-} !Bool + {-# UNPACK #-} !Bool + deriving (Show) + +data U_E1 = U_E1 {-# UNPACK #-} !E1 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + deriving (Show) + +data U_E2 = U_E2 {-# UNPACK #-} !E2 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# 
UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + deriving (Show) + +{- + disabled to reduce memory consumption of test + +data U_E3 = U_E3 {-# UNPACK #-} !E3 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + deriving (Show) + +data U_Mixed = U_Mixed {-# UNPACK #-} !E1 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !E2 + {-# UNPACK #-} !Int16 + {-# UNPACK #-} !Int16 + {-# UNPACK #-} !Int16 + {-# UNPACK #-} !Bool + {-# UNPACK #-} !Bool + deriving (Show) +-} + +data U_Maybe = U_Maybe {-# UNPACK #-} !(Maybe Bool) + {-# UNPACK #-} !(Maybe Bool) + {-# UNPACK #-} !(Maybe Bool) + {-# UNPACK #-} !(Maybe Bool) + {-# UNPACK #-} !(Maybe Bool) + {-# UNPACK #-} !(Maybe Bool) + {-# UNPACK #-} !(Maybe Bool) + {-# UNPACK #-} !(Maybe Bool) + deriving (Show) + + +data MaybeW32 = NothingW32 + | JustW32 {-# UNPACK #-} !Word32 + deriving (Show) + +data U_MaybeW32 = U_MaybeW32 {-# UNPACK #-} !MaybeW32 + {-# UNPACK #-} !MaybeW32 + {-# UNPACK #-} !MaybeW32 + {-# UNPACK #-} !MaybeW32 + {-# UNPACK #-} !MaybeW32 + {-# UNPACK #-} !MaybeW32 + {-# UNPACK #-} !MaybeW32 + {-# UNPACK #-} !MaybeW32 + deriving (Show) + +u_ba :: U_Bool +u_ba = U_Bool minBound maxBound minBound maxBound + minBound maxBound minBound maxBound + +u_e1a :: U_E1 +u_e1a = U_E1 minBound maxBound minBound maxBound + minBound maxBound minBound maxBound + +u_e1b :: U_E1 +u_e1b = U_E1 maxBound minBound maxBound minBound + maxBound minBound maxBound minBound + +u_e1c :: U_E1 +u_e1c = U_E1 E1_1 126 127 0 1 2 3 4 + +u_e1d :: U_E1 +u_e1d = U_E1 E1_254 126 127 0 1 2 3 4 + +u_e2a :: U_E2 +u_e2a = U_E2 minBound maxBound minBound maxBound + minBound maxBound minBound maxBound +{- +u_e3a :: U_E3 +u_e3a = U_E3 minBound maxBound minBound maxBound + minBound maxBound minBound maxBound + +u_mixed :: U_Mixed +u_mixed = U_Mixed maxBound minBound maxBound minBound + maxBound 
minBound maxBound minBound +-} + +u_maybe :: U_Maybe +u_maybe = U_Maybe Nothing (Just False) Nothing (Just True) + Nothing (Just False) Nothing (Just True) + +u_maybeW32 :: U_MaybeW32 +u_maybeW32 = U_MaybeW32 NothingW32 (JustW32 minBound) + NothingW32 (JustW32 maxBound) + NothingW32 (JustW32 minBound) + NothingW32 (JustW32 maxBound) + +test :: Show a => String -> a -> IO () +test name value = do + putStrLn $ "\n### " ++ name + value' <- evaluate value + print value' + putStrLn ("size: " ++ show (closureSize $ asBox value')) + +main :: IO () +main = do + test "u_ba" u_ba + test "u_e1a" u_e1a + test "u_e1b" u_e1b + test "u_e1c" u_e1c + test "u_e1d" u_e1d + test "u_e2a" u_e2a + -- test "u_e3a" u_e3a + -- test "u_mixed" u_mixed + test "u_maybe" u_maybe + test "u_maybeW32" u_maybeW32 ===================================== testsuite/tests/unboxedsums/UbxSumUnpackedSize.stdout ===================================== @@ -0,0 +1,32 @@ + +### u_ba +U_Bool False True False True False True False True +size: 2 + +### u_e1a +U_E1 E1_1 127 (-128) 127 (-128) 127 (-128) 127 +size: 2 + +### u_e1b +U_E1 E1_254 (-128) 127 (-128) 127 (-128) 127 (-128) +size: 2 + +### u_e1c +U_E1 E1_1 126 127 0 1 2 3 4 +size: 2 + +### u_e1d +U_E1 E1_254 126 127 0 1 2 3 4 +size: 2 + +### u_e2a +U_E2 E2_1 127 (-128) 127 (-128) 127 (-128) 127 +size: 2 + +### u_maybe +U_Maybe Nothing (Just False) Nothing (Just True) Nothing (Just False) Nothing (Just True) +size: 10 + +### u_maybeW32 +U_MaybeW32 NothingW32 (JustW32 0) NothingW32 (JustW32 4294967295) NothingW32 (JustW32 0) NothingW32 (JustW32 4294967295) +size: 9 ===================================== testsuite/tests/unboxedsums/UbxSumUnpackedSize.stdout-ws-32 ===================================== @@ -0,0 +1,32 @@ + +### u_ba +U_Bool False True False True False True False True +size: 3 + +### u_e1a +U_E1 E1_1 127 (-128) 127 (-128) 127 (-128) 127 +size: 3 + +### u_e1b +U_E1 E1_254 (-128) 127 (-128) 127 (-128) 127 (-128) +size: 3 + +### u_e1c +U_E1 E1_1 126 127 0 
1 2 3 4 +size: 3 + +### u_e1d +U_E1 E1_254 126 127 0 1 2 3 4 +size: 3 + +### u_e2a +U_E2 E2_1 127 (-128) 127 (-128) 127 (-128) 127 +size: 3 + +### u_maybe +U_Maybe Nothing (Just False) Nothing (Just True) Nothing (Just False) Nothing (Just True) +size: 11 + +### u_maybeW32 +U_MaybeW32 NothingW32 (JustW32 0) NothingW32 (JustW32 4294967295) NothingW32 (JustW32 0) NothingW32 (JustW32 4294967295) +size: 17 ===================================== testsuite/tests/unboxedsums/all.T ===================================== @@ -62,3 +62,5 @@ test('ManyUbxSums', ['ManyUbxSums', [('ManyUbxSums_Addr.hs','')] , '-v0 -dstg-lint -dcmm-lint']) + +test('UbxSumUnpackedSize', [js_broken(22374)], compile_and_run, ['-O']) ===================================== testsuite/tests/unboxedsums/unboxedsums_unit_tests.hs ===================================== @@ -63,33 +63,33 @@ layout_tests = sequence_ assert_layout "layout1" [ ubxtup [ intTy, intPrimTy ] , ubxtup [ intPrimTy, intTy ] ] - [ WordSlot, PtrLiftedSlot, WordSlot ] + [ Word8Slot, PtrLiftedSlot, WordSlot ] layout2 = assert_layout "layout2" [ ubxtup [ intTy ] , intTy ] - [ WordSlot, PtrLiftedSlot ] + [ Word8Slot, PtrLiftedSlot ] layout3 = assert_layout "layout3" [ ubxtup [ intTy, intPrimTy, intTy, intPrimTy ] , ubxtup [ intPrimTy, intTy, intPrimTy, intTy ] ] - [ WordSlot, PtrLiftedSlot, PtrLiftedSlot, WordSlot, WordSlot ] + [ Word8Slot, PtrLiftedSlot, PtrLiftedSlot, WordSlot, WordSlot ] layout4 = assert_layout "layout4" [ ubxtup [ floatPrimTy, floatPrimTy ] , ubxtup [ intPrimTy, intPrimTy ] ] - [ WordSlot, WordSlot, WordSlot, FloatSlot, FloatSlot ] + [ Word8Slot, WordSlot, WordSlot, FloatSlot, FloatSlot ] layout5 = assert_layout "layout5" [ ubxtup [ intPrimTy, intPrimTy ] , ubxtup [ floatPrimTy, floatPrimTy ] ] - [ WordSlot, WordSlot, WordSlot, FloatSlot, FloatSlot ] + [ Word8Slot, WordSlot, WordSlot, FloatSlot, FloatSlot ] enum_layout = assert_layout "enum" (replicate 10 (ubxtup [])) - [ WordSlot ] + [ Word8Slot ] View it on GitLab: 
https://gitlab.haskell.org/ghc/ghc/-/commit/80d50227e35b54b15d869e42379d01a…


[Git][ghc/ghc][wip/ubxsumtag] Use slots smaller than word as tag for smaller unboxed sums
by Luite Stegeman (＠luite) 11 Sep '25

Luite Stegeman pushed to branch wip/ubxsumtag at Glasgow Haskell Compiler / GHC

Commits:
dc719288 by Luite Stegeman at 2025-09-11T14:23:07+02:00
Use slots smaller than word as tag for smaller unboxed sums

This packs unboxed sums more efficiently by allowing Word8, Word16 and Word32 for the tag field if the number of constructors is small enough

- - - - -

10 changed files:

- compiler/GHC/Cmm/Utils.hs
- compiler/GHC/Stg/Unarise.hs
- compiler/GHC/Types/RepType.hs
- testsuite/tests/codeGen/should_compile/T25166.stdout → testsuite/tests/codeGen/should_compile/T25166.stdout-ws-32
- + testsuite/tests/codeGen/should_compile/T25166.stdout-ws-64
- + testsuite/tests/unboxedsums/UbxSumUnpackedSize.hs
- + testsuite/tests/unboxedsums/UbxSumUnpackedSize.stdout
- + testsuite/tests/unboxedsums/UbxSumUnpackedSize.stdout-ws-32
- testsuite/tests/unboxedsums/all.T
- testsuite/tests/unboxedsums/unboxedsums_unit_tests.hs

Changes:

=====================================
compiler/GHC/Cmm/Utils.hs
=====================================
@@ -115,6 +115,9 @@ slotCmmType platform = \case
   PtrUnliftedSlot -> gcWord platform
   PtrLiftedSlot -> gcWord platform
   WordSlot -> bWord platform
+  Word8Slot -> b8
+  Word16Slot -> b16
+  Word32Slot -> b32
   Word64Slot -> b64
   FloatSlot -> f32
   DoubleSlot -> f64

=====================================
compiler/GHC/Stg/Unarise.hs
=====================================
@@ -404,7 +404,6 @@ import GHC.Stg.Syntax
 import GHC.Stg.Utils
 import GHC.Stg.Make
 import GHC.Core.Type
-import GHC.Builtin.Types.Prim (intPrimTy)
 import GHC.Builtin.Types
 import GHC.Types.Unique.Supply
 import GHC.Types.Unique
@@ -681,15 +680,15 @@ elimCase rho args bndr (MultiValAlt _) [GenStgAlt{ alt_con = _
 elimCase rho args@(tag_arg : real_args) bndr (MultiValAlt _) alts
   | isUnboxedSumBndr bndr
-  = do tag_bndr <- mkId (mkFastString "tag") tagTy
+  = do tag_bndr <- mkId (mkFastString "tag") (tagTyArg tag_arg)
           -- this won't be used but we need a binder anyway
        let rho1 = extendRho rho bndr (MultiVal args)
scrut' = case tag_arg of StgVarArg v -> StgApp v [] StgLitArg l -> StgLit l - - alts' <- unariseSumAlts rho1 real_args alts - return (StgCase scrut' tag_bndr tagAltTy alts') + alt_ty = (tagAltTyArg tag_arg) + alts' <- unariseSumAlts rho1 alt_ty real_args alts + return (StgCase scrut' tag_bndr alt_ty alts') elimCase _ args bndr alt_ty alts = pprPanic "elimCase - unhandled case" @@ -732,8 +731,9 @@ unariseAlts rho (MultiValAlt _) bndr [GenStgAlt{ alt_con = DEFAULT unariseAlts rho (MultiValAlt _) bndr alts | isUnboxedSumBndr bndr = do (rho_sum_bndrs, scrt_bndrs@(tag_bndr : real_bndrs)) <- unariseConArgBinder rho bndr - alts' <- unariseSumAlts rho_sum_bndrs (map StgVarArg real_bndrs) alts - let inner_case = StgCase (StgApp tag_bndr []) tag_bndr tagAltTy alts' + let alt_ty = tagAltTy tag_bndr + alts' <- unariseSumAlts rho_sum_bndrs alt_ty (map StgVarArg real_bndrs) alts + let inner_case = StgCase (StgApp tag_bndr []) tag_bndr alt_ty alts' return [GenStgAlt{ alt_con = DataAlt (tupleDataCon Unboxed (length scrt_bndrs)) , alt_bndrs = scrt_bndrs , alt_rhs = inner_case @@ -753,21 +753,23 @@ unariseAlt rho alt@GenStgAlt{alt_con=_,alt_bndrs=xs,alt_rhs=e} -- | Make alternatives that match on the tag of a sum -- (i.e. generate LitAlts for the tag) unariseSumAlts :: UnariseEnv + -> AltType -> [StgArg] -- sum components _excluding_ the tag bit. -> [StgAlt] -- original alternative with sum LHS -> UniqSM [StgAlt] -unariseSumAlts env args alts - = do alts' <- mapM (unariseSumAlt env args) alts +unariseSumAlts env tag_slot args alts + = do alts' <- mapM (unariseSumAlt env tag_slot args) alts return (mkDefaultLitAlt alts') unariseSumAlt :: UnariseEnv + -> AltType -> [StgArg] -- sum components _excluding_ the tag bit. 
-> StgAlt -- original alternative with sum LHS -> UniqSM StgAlt -unariseSumAlt rho _ GenStgAlt{alt_con=DEFAULT,alt_bndrs=_,alt_rhs=e} +unariseSumAlt rho _ _ GenStgAlt{alt_con=DEFAULT,alt_bndrs=_,alt_rhs=e} = GenStgAlt DEFAULT mempty <$> unariseExpr rho e -unariseSumAlt rho args alt@GenStgAlt{ alt_con = DataAlt sumCon +unariseSumAlt rho tag_slot args alt@GenStgAlt{ alt_con = DataAlt sumCon , alt_bndrs = bs , alt_rhs = e } @@ -776,10 +778,18 @@ unariseSumAlt rho args alt@GenStgAlt{ alt_con = DataAlt sumCon [b] -> mapSumIdBinders b args e rho -- Sums must have one binder _ -> pprPanic "unariseSumAlt2" (ppr args $$ pprPanicAlt alt) - let lit_case = LitAlt (LitNumber LitNumInt (fromIntegral (dataConTag sumCon))) + let num_ty = + case tag_slot of + PrimAlt Word8Rep -> LitNumWord8 + PrimAlt Word16Rep -> LitNumWord16 + PrimAlt Word32Rep -> LitNumWord32 + PrimAlt WordRep -> LitNumWord + _ -> pprPanic "Unexpected tag slot type" (ppr tag_slot) + + lit_case = LitAlt (LitNumber num_ty (fromIntegral (dataConTag sumCon))) GenStgAlt lit_case mempty <$> unariseExpr rho' e' -unariseSumAlt _ scrt alt +unariseSumAlt _ _ scrt alt = pprPanic "unariseSumAlt3" (ppr scrt $$ pprPanicAlt alt) -------------------------------------------------------------------------------- @@ -865,12 +875,6 @@ mapSumIdBinders alt_bndr args rhs rho0 typed_id_args = map StgVarArg typed_ids - -- pprTrace "mapSumIdBinders" - -- (text "fld_reps" <+> ppr fld_reps $$ - -- text "id_args" <+> ppr id_arg_exprs $$ - -- text "rhs" <+> ppr rhs $$ - -- text "rhs_with_casts" <+> ppr rhs_with_casts - -- ) $ if isMultiValBndr alt_bndr then return (extendRho rho0 alt_bndr (MultiVal typed_id_args), rhs_with_casts rhs) else assert (typed_id_args `lengthIs` 1) $ @@ -921,13 +925,19 @@ mkUbxSum ) mkUbxSum dc ty_args args0 us = let - _ :| sum_slots = ubxSumRepType ty_args + tag_slot :| sum_slots = ubxSumRepType ty_args -- drop tag slot field_slots = (mapMaybe (repSlotTy . 
stgArgRep) args0) tag = dataConTag dc layout' = layoutUbxSum sum_slots field_slots - tag_arg = StgLitArg (LitNumber LitNumInt (fromIntegral tag)) + tag_arg = + case tag_slot of + Word8Slot -> StgLitArg (LitNumber LitNumWord8 (fromIntegral tag)) + Word16Slot -> StgLitArg (LitNumber LitNumWord16 (fromIntegral tag)) + Word32Slot -> StgLitArg (LitNumber LitNumWord32 (fromIntegral tag)) + WordSlot -> StgLitArg (LitNumber LitNumWord (fromIntegral tag)) + _ -> pprPanic "mkUbxSum: unexpected tag slot: " (ppr tag_slot) arg_idxs = IM.fromList (zipEqual layout' args0) ((_idx,_idx_map,_us,wrapper),slot_args) @@ -990,6 +1000,9 @@ ubxSumRubbishArg :: SlotTy -> StgArg ubxSumRubbishArg PtrLiftedSlot = StgVarArg aBSENT_SUM_FIELD_ERROR_ID ubxSumRubbishArg PtrUnliftedSlot = StgVarArg aBSENT_SUM_FIELD_ERROR_ID ubxSumRubbishArg WordSlot = StgLitArg (LitNumber LitNumWord 0) +ubxSumRubbishArg Word8Slot = StgLitArg (LitNumber LitNumWord8 0) +ubxSumRubbishArg Word16Slot = StgLitArg (LitNumber LitNumWord16 0) +ubxSumRubbishArg Word32Slot = StgLitArg (LitNumber LitNumWord32 0) ubxSumRubbishArg Word64Slot = StgLitArg (LitNumber LitNumWord64 0) ubxSumRubbishArg FloatSlot = StgLitArg (LitFloat 0) ubxSumRubbishArg DoubleSlot = StgLitArg (LitDouble 0) @@ -1166,11 +1179,18 @@ isUnboxedTupleBndr = isUnboxedTupleType . 
idType mkTuple :: [StgArg] -> StgExpr mkTuple args = StgConApp (tupleDataCon Unboxed (length args)) NoNumber args [] -tagAltTy :: AltType -tagAltTy = PrimAlt IntRep +tagAltTyArg :: StgArg -> AltType +tagAltTyArg a + | [pr] <- typePrimRep (stgArgType a) = PrimAlt pr + | otherwise = pprPanic "tagAltTyArg" (ppr a) + +tagAltTy :: Id -> AltType +tagAltTy i + | [pr] <- typePrimRep (idType i) = PrimAlt pr + | otherwise = pprPanic "tagAltTy" (ppr $ idType i) -tagTy :: Type -tagTy = intPrimTy +tagTyArg :: StgArg -> Type +tagTyArg x = stgArgType x voidArg :: StgArg voidArg = StgVarArg voidPrimId ===================================== compiler/GHC/Types/RepType.hs ===================================== @@ -197,12 +197,12 @@ type SortedSlotTys = [SlotTy] -- of the list we have the slot for the tag. ubxSumRepType :: [[PrimRep]] -> NonEmpty SlotTy ubxSumRepType constrs0 - -- These first two cases never classify an actual unboxed sum, which always + -- This first case never classifies an actual unboxed sum, which always -- has at least two disjuncts. But it could happen if a user writes, e.g., -- forall (a :: TYPE (SumRep [IntRep])). ... -- which could never be instantiated. We still don't want to panic. | constrs0 `lengthLessThan` 2 - = WordSlot :| [] + = Word8Slot :| [] | otherwise = let @@ -230,8 +230,17 @@ ubxSumRepType constrs0 rep :: [PrimRep] -> SortedSlotTys rep ty = sort (map primRepSlot ty) - sumRep = WordSlot :| combine_alts (map rep constrs0) - -- WordSlot: for the tag of the sum + -- constructors start at 1, pick an appropriate slot size for the tag + tag_slot | length constrs0 < 256 = Word8Slot + | length constrs0 < 65536 = Word16Slot + -- we use 2147483647 instead of 4294967296 to avoid + -- overflow when building a 32 bit GHC. Please fix the + -- overflow if you encounter a type with more than 2147483646 + -- constructors and need the tag to be 32 bits. 
+ | length constrs0 < 2147483647 = Word32Slot + | otherwise = WordSlot + + sumRep = tag_slot :| combine_alts (map rep constrs0) in sumRep @@ -275,22 +284,32 @@ layoutUbxSum sum_slots0 arg_slots0 = -- - Float slots: Shared between floating point types. -- -- - Void slots: Shared between void types. Not used in sums. --- --- TODO(michalt): We should probably introduce `SlotTy`s for 8-/16-/32-bit --- values, so that we can pack things more tightly. -data SlotTy = PtrLiftedSlot | PtrUnliftedSlot | WordSlot | Word64Slot | FloatSlot | DoubleSlot | VecSlot Int PrimElemRep + +data SlotTy = PtrLiftedSlot + | PtrUnliftedSlot + | Word8Slot + | Word16Slot + | Word32Slot + | WordSlot + | Word64Slot + | FloatSlot + | DoubleSlot + | VecSlot Int PrimElemRep deriving (Eq, Ord) -- Constructor order is important! If slot A could fit into slot B -- then slot A must occur first. E.g. FloatSlot before DoubleSlot -- - -- We are assuming that WordSlot is smaller than or equal to Word64Slot - -- (would not be true on a 128-bit machine) + -- We are assuming that Word32Slot <= WordSlot <= Word64Slot + -- (would not be true on a 16-bit or 128-bit machine) instance Outputable SlotTy where ppr PtrLiftedSlot = text "PtrLiftedSlot" ppr PtrUnliftedSlot = text "PtrUnliftedSlot" ppr Word64Slot = text "Word64Slot" ppr WordSlot = text "WordSlot" + ppr Word32Slot = text "Word32Slot" + ppr Word16Slot = text "Word16Slot" + ppr Word8Slot = text "Word8Slot" ppr DoubleSlot = text "DoubleSlot" ppr FloatSlot = text "FloatSlot" ppr (VecSlot n e) = text "VecSlot" <+> ppr n <+> ppr e @@ -307,14 +326,14 @@ primRepSlot (BoxedRep mlev) = case mlev of Just Lifted -> PtrLiftedSlot Just Unlifted -> PtrUnliftedSlot primRepSlot IntRep = WordSlot -primRepSlot Int8Rep = WordSlot -primRepSlot Int16Rep = WordSlot -primRepSlot Int32Rep = WordSlot +primRepSlot Int8Rep = Word8Slot +primRepSlot Int16Rep = Word16Slot +primRepSlot Int32Rep = Word32Slot primRepSlot Int64Rep = Word64Slot primRepSlot WordRep = WordSlot -primRepSlot 
Word8Rep = WordSlot -primRepSlot Word16Rep = WordSlot -primRepSlot Word32Rep = WordSlot +primRepSlot Word8Rep = Word8Slot +primRepSlot Word16Rep = Word16Slot +primRepSlot Word32Rep = Word32Slot primRepSlot Word64Rep = Word64Slot primRepSlot AddrRep = WordSlot primRepSlot FloatRep = FloatSlot @@ -325,6 +344,9 @@ slotPrimRep :: SlotTy -> PrimRep slotPrimRep PtrLiftedSlot = BoxedRep (Just Lifted) slotPrimRep PtrUnliftedSlot = BoxedRep (Just Unlifted) slotPrimRep Word64Slot = Word64Rep +slotPrimRep Word32Slot = Word32Rep +slotPrimRep Word16Slot = Word16Rep +slotPrimRep Word8Slot = Word8Rep slotPrimRep WordSlot = WordRep slotPrimRep DoubleSlot = DoubleRep slotPrimRep FloatSlot = FloatRep @@ -349,11 +371,12 @@ fitsIn ty1 ty2 -- See Note [Casting slot arguments] where isWordSlot Word64Slot = True + isWordSlot Word32Slot = True + isWordSlot Word16Slot = True + isWordSlot Word8Slot = True isWordSlot WordSlot = True isWordSlot _ = False - - {- ********************************************************************** * * PrimRep ===================================== testsuite/tests/codeGen/should_compile/T25166.stdout → testsuite/tests/codeGen/should_compile/T25166.stdout-ws-32 ===================================== @@ -2,5 +2,7 @@ Test.foo_closure: const Test.D_con_info; const GHC.Internal.Types.True_closure+2; - const 2; + const 2 :: W8; + const 0 :: W16; + const 0 :: W8; const 3; ===================================== testsuite/tests/codeGen/should_compile/T25166.stdout-ws-64 ===================================== @@ -0,0 +1,9 @@ +[section ""data" . 
Test.foo_closure" { + Test.foo_closure: + const Test.D_con_info; + const GHC.Internal.Types.True_closure+2; + const 2 :: W8; + const 0 :: W32; + const 0 :: W16; + const 0 :: W8; + const 3; ===================================== testsuite/tests/unboxedsums/UbxSumUnpackedSize.hs ===================================== @@ -0,0 +1,254 @@ +module Main where + +import GHC.Exts.Heap.Closures +import Control.Exception (evaluate) +import Data.Word (Word32) +import Data.Int (Int8, Int16) + +-- this should get a Word8 tag +data E1 + = E1_1 | E1_2 | E1_3 | E1_4 | E1_5 | E1_6 | E1_7 | E1_8 + | E1_9 | E1_10 | E1_11 | E1_12 | E1_13 | E1_14 | E1_15 | E1_16 + | E1_17 | E1_18 | E1_19 | E1_20 | E1_21 | E1_22 | E1_23 | E1_24 + | E1_25 | E1_26 | E1_27 | E1_28 | E1_29 | E1_30 | E1_31 | E1_32 + | E1_33 | E1_34 | E1_35 | E1_36 | E1_37 | E1_38 | E1_39 | E1_40 + | E1_41 | E1_42 | E1_43 | E1_44 | E1_45 | E1_46 | E1_47 | E1_48 + | E1_49 | E1_50 | E1_51 | E1_52 | E1_53 | E1_54 | E1_55 | E1_56 + | E1_57 | E1_58 | E1_59 | E1_60 | E1_61 | E1_62 | E1_63 | E1_64 + | E1_65 | E1_66 | E1_67 | E1_68 | E1_69 | E1_70 | E1_71 | E1_72 + | E1_73 | E1_74 | E1_75 | E1_76 | E1_77 | E1_78 | E1_79 | E1_80 + | E1_81 | E1_82 | E1_83 | E1_84 | E1_85 | E1_86 | E1_87 | E1_88 + | E1_89 | E1_90 | E1_91 | E1_92 | E1_93 | E1_94 | E1_95 | E1_96 + | E1_97 | E1_98 | E1_99 | E1_100 | E1_101 | E1_102 | E1_103 | E1_104 + | E1_105 | E1_106 | E1_107 | E1_108 | E1_109 | E1_110 | E1_111 | E1_112 + | E1_113 | E1_114 | E1_115 | E1_116 | E1_117 | E1_118 | E1_119 | E1_120 + | E1_121 | E1_122 | E1_123 | E1_124 | E1_125 | E1_126 | E1_127 | E1_128 + | E1_129 | E1_130 | E1_131 | E1_132 | E1_133 | E1_134 | E1_135 | E1_136 + | E1_137 | E1_138 | E1_139 | E1_140 | E1_141 | E1_142 | E1_143 | E1_144 + | E1_145 | E1_146 | E1_147 | E1_148 | E1_149 | E1_150 | E1_151 | E1_152 + | E1_153 | E1_154 | E1_155 | E1_156 | E1_157 | E1_158 | E1_159 | E1_160 + | E1_161 | E1_162 | E1_163 | E1_164 | E1_165 | E1_166 | E1_167 | E1_168 + | E1_169 | E1_170 | E1_171 | 
E1_172 | E1_173 | E1_174 | E1_175 | E1_176 + | E1_177 | E1_178 | E1_179 | E1_180 | E1_181 | E1_182 | E1_183 | E1_184 + | E1_185 | E1_186 | E1_187 | E1_188 | E1_189 | E1_190 | E1_191 | E1_192 + | E1_193 | E1_194 | E1_195 | E1_196 | E1_197 | E1_198 | E1_199 | E1_200 + | E1_201 | E1_202 | E1_203 | E1_204 | E1_205 | E1_206 | E1_207 | E1_208 + | E1_209 | E1_210 | E1_211 | E1_212 | E1_213 | E1_214 | E1_215 | E1_216 + | E1_217 | E1_218 | E1_219 | E1_220 | E1_221 | E1_222 | E1_223 | E1_224 + | E1_225 | E1_226 | E1_227 | E1_228 | E1_229 | E1_230 | E1_231 | E1_232 + | E1_233 | E1_234 | E1_235 | E1_236 | E1_237 | E1_238 | E1_239 | E1_240 + | E1_241 | E1_242 | E1_243 | E1_244 | E1_245 | E1_246 | E1_247 | E1_248 + | E1_249 | E1_250 | E1_251 | E1_252 | E1_253 | E1_254 + deriving (Enum, Bounded, Show) + +-- this should get a Word8 tag +data E2 + = E2_1 | E2_2 | E2_3 | E2_4 | E2_5 | E2_6 | E2_7 | E2_8 + | E2_9 | E2_10 | E2_11 | E2_12 | E2_13 | E2_14 | E2_15 | E2_16 + | E2_17 | E2_18 | E2_19 | E2_20 | E2_21 | E2_22 | E2_23 | E2_24 + | E2_25 | E2_26 | E2_27 | E2_28 | E2_29 | E2_30 | E2_31 | E2_32 + | E2_33 | E2_34 | E2_35 | E2_36 | E2_37 | E2_38 | E2_39 | E2_40 + | E2_41 | E2_42 | E2_43 | E2_44 | E2_45 | E2_46 | E2_47 | E2_48 + | E2_49 | E2_50 | E2_51 | E2_52 | E2_53 | E2_54 | E2_55 | E2_56 + | E2_57 | E2_58 | E2_59 | E2_60 | E2_61 | E2_62 | E2_63 | E2_64 + | E2_65 | E2_66 | E2_67 | E2_68 | E2_69 | E2_70 | E2_71 | E2_72 + | E2_73 | E2_74 | E2_75 | E2_76 | E2_77 | E2_78 | E2_79 | E2_80 + | E2_81 | E2_82 | E2_83 | E2_84 | E2_85 | E2_86 | E2_87 | E2_88 + | E2_89 | E2_90 | E2_91 | E2_92 | E2_93 | E2_94 | E2_95 | E2_96 + | E2_97 | E2_98 | E2_99 | E2_100 | E2_101 | E2_102 | E2_103 | E2_104 + | E2_105 | E2_106 | E2_107 | E2_108 | E2_109 | E2_110 | E2_111 | E2_112 + | E2_113 | E2_114 | E2_115 | E2_116 | E2_117 | E2_118 | E2_119 | E2_120 + | E2_121 | E2_122 | E2_123 | E2_124 | E2_125 | E2_126 | E2_127 | E2_128 + | E2_129 | E2_130 | E2_131 | E2_132 | E2_133 | E2_134 | E2_135 | E2_136 + | 
E2_137 | E2_138 | E2_139 | E2_140 | E2_141 | E2_142 | E2_143 | E2_144 + | E2_145 | E2_146 | E2_147 | E2_148 | E2_149 | E2_150 | E2_151 | E2_152 + | E2_153 | E2_154 | E2_155 | E2_156 | E2_157 | E2_158 | E2_159 | E2_160 + | E2_161 | E2_162 | E2_163 | E2_164 | E2_165 | E2_166 | E2_167 | E2_168 + | E2_169 | E2_170 | E2_171 | E2_172 | E2_173 | E2_174 | E2_175 | E2_176 + | E2_177 | E2_178 | E2_179 | E2_180 | E2_181 | E2_182 | E2_183 | E2_184 + | E2_185 | E2_186 | E2_187 | E2_188 | E2_189 | E2_190 | E2_191 | E2_192 + | E2_193 | E2_194 | E2_195 | E2_196 | E2_197 | E2_198 | E2_199 | E2_200 + | E2_201 | E2_202 | E2_203 | E2_204 | E2_205 | E2_206 | E2_207 | E2_208 + | E2_209 | E2_210 | E2_211 | E2_212 | E2_213 | E2_214 | E2_215 | E2_216 + | E2_217 | E2_218 | E2_219 | E2_220 | E2_221 | E2_222 | E2_223 | E2_224 + | E2_225 | E2_226 | E2_227 | E2_228 | E2_229 | E2_230 | E2_231 | E2_232 + | E2_233 | E2_234 | E2_235 | E2_236 | E2_237 | E2_238 | E2_239 | E2_240 + | E2_241 | E2_242 | E2_243 | E2_244 | E2_245 | E2_246 | E2_247 | E2_248 + | E2_249 | E2_250 | E2_251 | E2_252 | E2_253 | E2_254 | E2_255 + deriving (Enum, Bounded, Show) + +-- this needs a Word16 tag +data E3 + = E3_1 | E3_2 | E3_3 | E3_4 | E3_5 | E3_6 | E3_7 | E3_8 + | E3_9 | E3_10 | E3_11 | E3_12 | E3_13 | E3_14 | E3_15 | E3_16 + | E3_17 | E3_18 | E3_19 | E3_20 | E3_21 | E3_22 | E3_23 | E3_24 + | E3_25 | E3_26 | E3_27 | E3_28 | E3_29 | E3_30 | E3_31 | E3_32 + | E3_33 | E3_34 | E3_35 | E3_36 | E3_37 | E3_38 | E3_39 | E3_40 + | E3_41 | E3_42 | E3_43 | E3_44 | E3_45 | E3_46 | E3_47 | E3_48 + | E3_49 | E3_50 | E3_51 | E3_52 | E3_53 | E3_54 | E3_55 | E3_56 + | E3_57 | E3_58 | E3_59 | E3_60 | E3_61 | E3_62 | E3_63 | E3_64 + | E3_65 | E3_66 | E3_67 | E3_68 | E3_69 | E3_70 | E3_71 | E3_72 + | E3_73 | E3_74 | E3_75 | E3_76 | E3_77 | E3_78 | E3_79 | E3_80 + | E3_81 | E3_82 | E3_83 | E3_84 | E3_85 | E3_86 | E3_87 | E3_88 + | E3_89 | E3_90 | E3_91 | E3_92 | E3_93 | E3_94 | E3_95 | E3_96 + | E3_97 | E3_98 | E3_99 | E3_100 | E3_101 | 
E3_102 | E3_103 | E3_104 + | E3_105 | E3_106 | E3_107 | E3_108 | E3_109 | E3_110 | E3_111 | E3_112 + | E3_113 | E3_114 | E3_115 | E3_116 | E3_117 | E3_118 | E3_119 | E3_120 + | E3_121 | E3_122 | E3_123 | E3_124 | E3_125 | E3_126 | E3_127 | E3_128 + | E3_129 | E3_130 | E3_131 | E3_132 | E3_133 | E3_134 | E3_135 | E3_136 + | E3_137 | E3_138 | E3_139 | E3_140 | E3_141 | E3_142 | E3_143 | E3_144 + | E3_145 | E3_146 | E3_147 | E3_148 | E3_149 | E3_150 | E3_151 | E3_152 + | E3_153 | E3_154 | E3_155 | E3_156 | E3_157 | E3_158 | E3_159 | E3_160 + | E3_161 | E3_162 | E3_163 | E3_164 | E3_165 | E3_166 | E3_167 | E3_168 + | E3_169 | E3_170 | E3_171 | E3_172 | E3_173 | E3_174 | E3_175 | E3_176 + | E3_177 | E3_178 | E3_179 | E3_180 | E3_181 | E3_182 | E3_183 | E3_184 + | E3_185 | E3_186 | E3_187 | E3_188 | E3_189 | E3_190 | E3_191 | E3_192 + | E3_193 | E3_194 | E3_195 | E3_196 | E3_197 | E3_198 | E3_199 | E3_200 + | E3_201 | E3_202 | E3_203 | E3_204 | E3_205 | E3_206 | E3_207 | E3_208 + | E3_209 | E3_210 | E3_211 | E3_212 | E3_213 | E3_214 | E3_215 | E3_216 + | E3_217 | E3_218 | E3_219 | E3_220 | E3_221 | E3_222 | E3_223 | E3_224 + | E3_225 | E3_226 | E3_227 | E3_228 | E3_229 | E3_230 | E3_231 | E3_232 + | E3_233 | E3_234 | E3_235 | E3_236 | E3_237 | E3_238 | E3_239 | E3_240 + | E3_241 | E3_242 | E3_243 | E3_244 | E3_245 | E3_246 | E3_247 | E3_248 + | E3_249 | E3_250 | E3_251 | E3_252 | E3_253 | E3_254 | E3_255 | E3_256 + deriving (Enum, Bounded, Show) + +data U_Bool = U_Bool {-# UNPACK #-} !Bool + {-# UNPACK #-} !Bool + {-# UNPACK #-} !Bool + {-# UNPACK #-} !Bool + {-# UNPACK #-} !Bool + {-# UNPACK #-} !Bool + {-# UNPACK #-} !Bool + {-# UNPACK #-} !Bool + deriving (Show) + +data U_E1 = U_E1 {-# UNPACK #-} !E1 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + deriving (Show) + +data U_E2 = U_E2 {-# UNPACK #-} !E2 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# 
UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + deriving (Show) + +{- + disabled to reduce memory consumption of test + +data U_E3 = U_E3 {-# UNPACK #-} !E3 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + deriving (Show) + +data U_Mixed = U_Mixed {-# UNPACK #-} !E1 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !E2 + {-# UNPACK #-} !Int16 + {-# UNPACK #-} !Int16 + {-# UNPACK #-} !Int16 + {-# UNPACK #-} !Bool + {-# UNPACK #-} !Bool + deriving (Show) +-} + +data U_Maybe = U_Maybe {-# UNPACK #-} !(Maybe Bool) + {-# UNPACK #-} !(Maybe Bool) + {-# UNPACK #-} !(Maybe Bool) + {-# UNPACK #-} !(Maybe Bool) + {-# UNPACK #-} !(Maybe Bool) + {-# UNPACK #-} !(Maybe Bool) + {-# UNPACK #-} !(Maybe Bool) + {-# UNPACK #-} !(Maybe Bool) + deriving (Show) + + +data MaybeW32 = NothingW32 + | JustW32 {-# UNPACK #-} !Word32 + deriving (Show) + +data U_MaybeW32 = U_MaybeW32 {-# UNPACK #-} !MaybeW32 + {-# UNPACK #-} !MaybeW32 + {-# UNPACK #-} !MaybeW32 + {-# UNPACK #-} !MaybeW32 + {-# UNPACK #-} !MaybeW32 + {-# UNPACK #-} !MaybeW32 + {-# UNPACK #-} !MaybeW32 + {-# UNPACK #-} !MaybeW32 + deriving (Show) + +u_ba :: U_Bool +u_ba = U_Bool minBound maxBound minBound maxBound + minBound maxBound minBound maxBound + +u_e1a :: U_E1 +u_e1a = U_E1 minBound maxBound minBound maxBound + minBound maxBound minBound maxBound + +u_e1b :: U_E1 +u_e1b = U_E1 maxBound minBound maxBound minBound + maxBound minBound maxBound minBound + +u_e1c :: U_E1 +u_e1c = U_E1 E1_1 126 127 0 1 2 3 4 + +u_e1d :: U_E1 +u_e1d = U_E1 E1_254 126 127 0 1 2 3 4 + +u_e2a :: U_E2 +u_e2a = U_E2 minBound maxBound minBound maxBound + minBound maxBound minBound maxBound +{- +u_e3a :: U_E3 +u_e3a = U_E3 minBound maxBound minBound maxBound + minBound maxBound minBound maxBound + +u_mixed :: U_Mixed +u_mixed = U_Mixed maxBound minBound maxBound minBound + maxBound 
minBound maxBound minBound +-} + +u_maybe :: U_Maybe +u_maybe = U_Maybe Nothing (Just False) Nothing (Just True) + Nothing (Just False) Nothing (Just True) + +u_maybeW32 :: U_MaybeW32 +u_maybeW32 = U_MaybeW32 NothingW32 (JustW32 minBound) + NothingW32 (JustW32 maxBound) + NothingW32 (JustW32 minBound) + NothingW32 (JustW32 maxBound) + +test :: Show a => String -> a -> IO () +test name value = do + putStrLn $ "\n### " ++ name + value' <- evaluate value + print value' + putStrLn ("size: " ++ show (closureSize $ asBox value')) + +main :: IO () +main = do + test "u_ba" u_ba + test "u_e1a" u_e1a + test "u_e1b" u_e1b + test "u_e1c" u_e1c + test "u_e1d" u_e1d + test "u_e2a" u_e2a + -- test "u_e3a" u_e3a + -- test "u_mixed" u_mixed + test "u_maybe" u_maybe + test "u_maybeW32" u_maybeW32 ===================================== testsuite/tests/unboxedsums/UbxSumUnpackedSize.stdout ===================================== @@ -0,0 +1,32 @@ + +### u_ba +U_Bool False True False True False True False True +size: 2 + +### u_e1a +U_E1 E1_1 127 (-128) 127 (-128) 127 (-128) 127 +size: 2 + +### u_e1b +U_E1 E1_254 (-128) 127 (-128) 127 (-128) 127 (-128) +size: 2 + +### u_e1c +U_E1 E1_1 126 127 0 1 2 3 4 +size: 2 + +### u_e1d +U_E1 E1_254 126 127 0 1 2 3 4 +size: 2 + +### u_e2a +U_E2 E2_1 127 (-128) 127 (-128) 127 (-128) 127 +size: 2 + +### u_maybe +U_Maybe Nothing (Just False) Nothing (Just True) Nothing (Just False) Nothing (Just True) +size: 10 + +### u_maybeW32 +U_MaybeW32 NothingW32 (JustW32 0) NothingW32 (JustW32 4294967295) NothingW32 (JustW32 0) NothingW32 (JustW32 4294967295) +size: 9 ===================================== testsuite/tests/unboxedsums/UbxSumUnpackedSize.stdout-ws-32 ===================================== @@ -0,0 +1,32 @@ + +### u_ba +U_Bool False True False True False True False True +size: 3 + +### u_e1a +U_E1 E1_1 127 (-128) 127 (-128) 127 (-128) 127 +size: 3 + +### u_e1b +U_E1 E1_254 (-128) 127 (-128) 127 (-128) 127 (-128) +size: 3 + +### u_e1c +U_E1 E1_1 126 127 0 
1 2 3 4 +size: 3 + +### u_e1d +U_E1 E1_254 126 127 0 1 2 3 4 +size: 3 + +### u_e2a +U_E2 E2_1 127 (-128) 127 (-128) 127 (-128) 127 +size: 3 + +### u_maybe +U_Maybe Nothing (Just False) Nothing (Just True) Nothing (Just False) Nothing (Just True) +size: 11 + +### u_maybeW32 +U_MaybeW32 NothingW32 (JustW32 0) NothingW32 (JustW32 4294967295) NothingW32 (JustW32 0) NothingW32 (JustW32 4294967295) +size: 17 ===================================== testsuite/tests/unboxedsums/all.T ===================================== @@ -62,3 +62,5 @@ test('ManyUbxSums', ['ManyUbxSums', [('ManyUbxSums_Addr.hs','')] , '-v0 -dstg-lint -dcmm-lint']) + +test('UbxSumUnpackedSize', [js_broken(22374)], compile_and_run, ['-O']) ===================================== testsuite/tests/unboxedsums/unboxedsums_unit_tests.hs ===================================== @@ -63,33 +63,33 @@ layout_tests = sequence_ assert_layout "layout1" [ ubxtup [ intTy, intPrimTy ] , ubxtup [ intPrimTy, intTy ] ] - [ WordSlot, PtrLiftedSlot, WordSlot ] + [ Word8Slot, PtrLiftedSlot, WordSlot ] layout2 = assert_layout "layout2" [ ubxtup [ intTy ] , intTy ] - [ WordSlot, PtrLiftedSlot ] + [ Word8Slot, PtrLiftedSlot ] layout3 = assert_layout "layout3" [ ubxtup [ intTy, intPrimTy, intTy, intPrimTy ] , ubxtup [ intPrimTy, intTy, intPrimTy, intTy ] ] - [ WordSlot, PtrLiftedSlot, PtrLiftedSlot, WordSlot, WordSlot ] + [ Word8Slot, PtrLiftedSlot, PtrLiftedSlot, WordSlot, WordSlot ] layout4 = assert_layout "layout4" [ ubxtup [ floatPrimTy, floatPrimTy ] , ubxtup [ intPrimTy, intPrimTy ] ] - [ WordSlot, WordSlot, WordSlot, FloatSlot, FloatSlot ] + [ Word8Slot, WordSlot, WordSlot, FloatSlot, FloatSlot ] layout5 = assert_layout "layout5" [ ubxtup [ intPrimTy, intPrimTy ] , ubxtup [ floatPrimTy, floatPrimTy ] ] - [ WordSlot, WordSlot, WordSlot, FloatSlot, FloatSlot ] + [ Word8Slot, WordSlot, WordSlot, FloatSlot, FloatSlot ] enum_layout = assert_layout "enum" (replicate 10 (ubxtup [])) - [ WordSlot ] + [ Word8Slot ] View it on GitLab: 
https://gitlab.haskell.org/ghc/ghc/-/commit/dc71928805cc67285011d71b89d2184…
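
For readers skimming the diff, the tag-slot selection added to ubxSumRepType can be sketched on its own. This is a minimal standalone illustration, not the GHC code: TagSlot and tagSlotFor are hypothetical names introduced here, and only the thresholds (256, 65536, 2147483647) and the fallback to a Word8 slot for fewer than two alternatives are taken from the patch above.

```haskell
-- Hypothetical stand-in for the tag-capable constructors of GHC's SlotTy.
-- The real SlotTy also has pointer, float, double and vector slots.
data TagSlot = Word8Slot | Word16Slot | Word32Slot | WordSlot
  deriving (Eq, Show)

-- Pick the narrowest slot that can hold the tag of a sum with n alternatives.
-- Tags start at 1, so a sum with n alternatives needs to represent 1..n.
-- The guards copy the thresholds used in the patch; the upper bound
-- 2147483647 (rather than 4294967296) keeps the comparison valid when
-- Int is only 32 bits wide.
tagSlotFor :: Int -> TagSlot
tagSlotFor n
  | n < 2          = Word8Slot  -- degenerate case; never a real unboxed sum
  | n < 256        = Word8Slot
  | n < 65536      = Word16Slot
  | n < 2147483647 = Word32Slot
  | otherwise      = WordSlot
```

Under this sketch, the E1 (254 constructors) and E2 (255 constructors) types from UbxSumUnpackedSize.hs map to Word8Slot, while E3 (256 constructors) maps to Word16Slot, which agrees with the comments in that test file.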


[Git][ghc/ghc][wip/ubxsumtag] Use slots smaller than word as tag for smaller unboxed sums
by Luite Stegeman (＠luite) 11 Sep '25

Luite Stegeman pushed to branch wip/ubxsumtag at Glasgow Haskell Compiler / GHC

Commits:
8466b595 by Luite Stegeman at 2025-09-11T14:14:53+02:00
Use slots smaller than word as tag for smaller unboxed sums

This packs unboxed sums more efficiently by allowing Word8, Word16 and Word32 for the tag field if the number of constructors is small enough

- - - - -

10 changed files:

- compiler/GHC/Cmm/Utils.hs
- compiler/GHC/Stg/Unarise.hs
- compiler/GHC/Types/RepType.hs
- testsuite/tests/codeGen/should_compile/T25166.stdout → testsuite/tests/codeGen/should_compile/T25166.stdout-ws-32
- + testsuite/tests/codeGen/should_compile/T25166.stdout-ws-64
- + testsuite/tests/unboxedsums/UbxSumUnpackedSize.hs
- + testsuite/tests/unboxedsums/UbxSumUnpackedSize.stdout
- + testsuite/tests/unboxedsums/UbxSumUnpackedSize.stdout-ws-32
- testsuite/tests/unboxedsums/all.T
- testsuite/tests/unboxedsums/unboxedsums_unit_tests.hs

Changes:

=====================================
compiler/GHC/Cmm/Utils.hs
=====================================
@@ -115,6 +115,9 @@ slotCmmType platform = \case
   PtrUnliftedSlot -> gcWord platform
   PtrLiftedSlot -> gcWord platform
   WordSlot -> bWord platform
+  Word8Slot -> b8
+  Word16Slot -> b16
+  Word32Slot -> b32
   Word64Slot -> b64
   FloatSlot -> f32
   DoubleSlot -> f64

=====================================
compiler/GHC/Stg/Unarise.hs
=====================================
@@ -404,7 +404,6 @@ import GHC.Stg.Syntax
 import GHC.Stg.Utils
 import GHC.Stg.Make
 import GHC.Core.Type
-import GHC.Builtin.Types.Prim (intPrimTy)
 import GHC.Builtin.Types
 import GHC.Types.Unique.Supply
 import GHC.Types.Unique
@@ -681,15 +680,15 @@ elimCase rho args bndr (MultiValAlt _) [GenStgAlt{ alt_con = _
 elimCase rho args@(tag_arg : real_args) bndr (MultiValAlt _) alts
   | isUnboxedSumBndr bndr
-  = do tag_bndr <- mkId (mkFastString "tag") tagTy
+  = do tag_bndr <- mkId (mkFastString "tag") (tagTyArg tag_arg)
           -- this won't be used but we need a binder anyway
        let rho1 = extendRho rho bndr (MultiVal args)
scrut' = case tag_arg of StgVarArg v -> StgApp v [] StgLitArg l -> StgLit l - - alts' <- unariseSumAlts rho1 real_args alts - return (StgCase scrut' tag_bndr tagAltTy alts') + alt_ty = (tagAltTyArg tag_arg) + alts' <- unariseSumAlts rho1 alt_ty real_args alts + return (StgCase scrut' tag_bndr alt_ty alts') elimCase _ args bndr alt_ty alts = pprPanic "elimCase - unhandled case" @@ -732,8 +731,9 @@ unariseAlts rho (MultiValAlt _) bndr [GenStgAlt{ alt_con = DEFAULT unariseAlts rho (MultiValAlt _) bndr alts | isUnboxedSumBndr bndr = do (rho_sum_bndrs, scrt_bndrs@(tag_bndr : real_bndrs)) <- unariseConArgBinder rho bndr - alts' <- unariseSumAlts rho_sum_bndrs (map StgVarArg real_bndrs) alts - let inner_case = StgCase (StgApp tag_bndr []) tag_bndr tagAltTy alts' + let alt_ty = tagAltTy tag_bndr + alts' <- unariseSumAlts rho_sum_bndrs alt_ty (map StgVarArg real_bndrs) alts + let inner_case = StgCase (StgApp tag_bndr []) tag_bndr alt_ty alts' return [GenStgAlt{ alt_con = DataAlt (tupleDataCon Unboxed (length scrt_bndrs)) , alt_bndrs = scrt_bndrs , alt_rhs = inner_case @@ -753,21 +753,23 @@ unariseAlt rho alt@GenStgAlt{alt_con=_,alt_bndrs=xs,alt_rhs=e} -- | Make alternatives that match on the tag of a sum -- (i.e. generate LitAlts for the tag) unariseSumAlts :: UnariseEnv + -> AltType -> [StgArg] -- sum components _excluding_ the tag bit. -> [StgAlt] -- original alternative with sum LHS -> UniqSM [StgAlt] -unariseSumAlts env args alts - = do alts' <- mapM (unariseSumAlt env args) alts +unariseSumAlts env tag_slot args alts + = do alts' <- mapM (unariseSumAlt env tag_slot args) alts return (mkDefaultLitAlt alts') unariseSumAlt :: UnariseEnv + -> AltType -> [StgArg] -- sum components _excluding_ the tag bit. 
-> StgAlt -- original alternative with sum LHS -> UniqSM StgAlt -unariseSumAlt rho _ GenStgAlt{alt_con=DEFAULT,alt_bndrs=_,alt_rhs=e} +unariseSumAlt rho _ _ GenStgAlt{alt_con=DEFAULT,alt_bndrs=_,alt_rhs=e} = GenStgAlt DEFAULT mempty <$> unariseExpr rho e -unariseSumAlt rho args alt@GenStgAlt{ alt_con = DataAlt sumCon +unariseSumAlt rho tag_slot args alt@GenStgAlt{ alt_con = DataAlt sumCon , alt_bndrs = bs , alt_rhs = e } @@ -776,10 +778,18 @@ unariseSumAlt rho args alt@GenStgAlt{ alt_con = DataAlt sumCon [b] -> mapSumIdBinders b args e rho -- Sums must have one binder _ -> pprPanic "unariseSumAlt2" (ppr args $$ pprPanicAlt alt) - let lit_case = LitAlt (LitNumber LitNumInt (fromIntegral (dataConTag sumCon))) + let num_ty = + case tag_slot of + PrimAlt Word8Rep -> LitNumWord8 + PrimAlt Word16Rep -> LitNumWord16 + PrimAlt Word32Rep -> LitNumWord32 + PrimAlt WordRep -> LitNumWord + _ -> pprPanic "Unexpected tag slot type" (ppr tag_slot) + + lit_case = LitAlt (LitNumber num_ty (fromIntegral (dataConTag sumCon))) GenStgAlt lit_case mempty <$> unariseExpr rho' e' -unariseSumAlt _ scrt alt +unariseSumAlt _ _ scrt alt = pprPanic "unariseSumAlt3" (ppr scrt $$ pprPanicAlt alt) -------------------------------------------------------------------------------- @@ -865,12 +875,6 @@ mapSumIdBinders alt_bndr args rhs rho0 typed_id_args = map StgVarArg typed_ids - -- pprTrace "mapSumIdBinders" - -- (text "fld_reps" <+> ppr fld_reps $$ - -- text "id_args" <+> ppr id_arg_exprs $$ - -- text "rhs" <+> ppr rhs $$ - -- text "rhs_with_casts" <+> ppr rhs_with_casts - -- ) $ if isMultiValBndr alt_bndr then return (extendRho rho0 alt_bndr (MultiVal typed_id_args), rhs_with_casts rhs) else assert (typed_id_args `lengthIs` 1) $ @@ -921,13 +925,19 @@ mkUbxSum ) mkUbxSum dc ty_args args0 us = let - _ :| sum_slots = ubxSumRepType ty_args + tag_slot :| sum_slots = ubxSumRepType ty_args -- drop tag slot field_slots = (mapMaybe (repSlotTy . 
stgArgRep) args0) tag = dataConTag dc layout' = layoutUbxSum sum_slots field_slots - tag_arg = StgLitArg (LitNumber LitNumInt (fromIntegral tag)) + tag_arg = + case tag_slot of + Word8Slot -> StgLitArg (LitNumber LitNumWord8 (fromIntegral tag)) + Word16Slot -> StgLitArg (LitNumber LitNumWord16 (fromIntegral tag)) + Word32Slot -> StgLitArg (LitNumber LitNumWord32 (fromIntegral tag)) + WordSlot -> StgLitArg (LitNumber LitNumWord (fromIntegral tag)) + _ -> pprPanic "mkUbxSum: unexpected tag slot: " (ppr tag_slot) arg_idxs = IM.fromList (zipEqual layout' args0) ((_idx,_idx_map,_us,wrapper),slot_args) @@ -990,6 +1000,9 @@ ubxSumRubbishArg :: SlotTy -> StgArg ubxSumRubbishArg PtrLiftedSlot = StgVarArg aBSENT_SUM_FIELD_ERROR_ID ubxSumRubbishArg PtrUnliftedSlot = StgVarArg aBSENT_SUM_FIELD_ERROR_ID ubxSumRubbishArg WordSlot = StgLitArg (LitNumber LitNumWord 0) +ubxSumRubbishArg Word8Slot = StgLitArg (LitNumber LitNumWord8 0) +ubxSumRubbishArg Word16Slot = StgLitArg (LitNumber LitNumWord16 0) +ubxSumRubbishArg Word32Slot = StgLitArg (LitNumber LitNumWord32 0) ubxSumRubbishArg Word64Slot = StgLitArg (LitNumber LitNumWord64 0) ubxSumRubbishArg FloatSlot = StgLitArg (LitFloat 0) ubxSumRubbishArg DoubleSlot = StgLitArg (LitDouble 0) @@ -1166,11 +1179,18 @@ isUnboxedTupleBndr = isUnboxedTupleType . 
idType mkTuple :: [StgArg] -> StgExpr mkTuple args = StgConApp (tupleDataCon Unboxed (length args)) NoNumber args [] -tagAltTy :: AltType -tagAltTy = PrimAlt IntRep +tagAltTyArg :: StgArg -> AltType +tagAltTyArg a + | [pr] <- typePrimRep (stgArgType a) = PrimAlt pr + | otherwise = pprPanic "tagAltTyArg" (ppr a) + +tagAltTy :: Id -> AltType +tagAltTy i + | [pr] <- typePrimRep (idType i) = PrimAlt pr + | otherwise = pprPanic "tagAltTy" (ppr $ idType i) -tagTy :: Type -tagTy = intPrimTy +tagTyArg :: StgArg -> Type +tagTyArg x = stgArgType x voidArg :: StgArg voidArg = StgVarArg voidPrimId ===================================== compiler/GHC/Types/RepType.hs ===================================== @@ -197,12 +197,12 @@ type SortedSlotTys = [SlotTy] -- of the list we have the slot for the tag. ubxSumRepType :: [[PrimRep]] -> NonEmpty SlotTy ubxSumRepType constrs0 - -- These first two cases never classify an actual unboxed sum, which always + -- This first case never classifies an actual unboxed sum, which always -- has at least two disjuncts. But it could happen if a user writes, e.g., -- forall (a :: TYPE (SumRep [IntRep])). ... -- which could never be instantiated. We still don't want to panic. | constrs0 `lengthLessThan` 2 - = WordSlot :| [] + = Word8Slot :| [] | otherwise = let @@ -230,8 +230,17 @@ ubxSumRepType constrs0 rep :: [PrimRep] -> SortedSlotTys rep ty = sort (map primRepSlot ty) - sumRep = WordSlot :| combine_alts (map rep constrs0) - -- WordSlot: for the tag of the sum + -- constructors start at 1, pick an appropriate slot size for the tag + tag_slot | length constrs0 < 256 = Word8Slot + | length constrs0 < 65536 = Word16Slot + -- we use 2147483647 instead of 4294967296 to avoid + -- overflow when building a 32 bit GHC. Please fix the + -- overflow if you encounter a type with more than 2147483646 + -- constructors and need the tag to be 32 bits. 
+ | length constrs0 < 2147483647 = Word32Slot + | otherwise = WordSlot + + sumRep = tag_slot :| combine_alts (map rep constrs0) in sumRep @@ -275,22 +284,32 @@ layoutUbxSum sum_slots0 arg_slots0 = -- - Float slots: Shared between floating point types. -- -- - Void slots: Shared between void types. Not used in sums. --- --- TODO(michalt): We should probably introduce `SlotTy`s for 8-/16-/32-bit --- values, so that we can pack things more tightly. -data SlotTy = PtrLiftedSlot | PtrUnliftedSlot | WordSlot | Word64Slot | FloatSlot | DoubleSlot | VecSlot Int PrimElemRep + +data SlotTy = PtrLiftedSlot + | PtrUnliftedSlot + | Word8Slot + | Word16Slot + | Word32Slot + | WordSlot + | Word64Slot + | FloatSlot + | DoubleSlot + | VecSlot Int PrimElemRep deriving (Eq, Ord) -- Constructor order is important! If slot A could fit into slot B -- then slot A must occur first. E.g. FloatSlot before DoubleSlot -- - -- We are assuming that WordSlot is smaller than or equal to Word64Slot - -- (would not be true on a 128-bit machine) + -- We are assuming that Word32Slot <= WordSlot <= Word64Slot + -- (would not be true on a 16-bit or 128-bit machine) instance Outputable SlotTy where ppr PtrLiftedSlot = text "PtrLiftedSlot" ppr PtrUnliftedSlot = text "PtrUnliftedSlot" ppr Word64Slot = text "Word64Slot" ppr WordSlot = text "WordSlot" + ppr Word32Slot = text "Word32Slot" + ppr Word16Slot = text "Word16Slot" + ppr Word8Slot = text "Word8Slot" ppr DoubleSlot = text "DoubleSlot" ppr FloatSlot = text "FloatSlot" ppr (VecSlot n e) = text "VecSlot" <+> ppr n <+> ppr e @@ -307,14 +326,14 @@ primRepSlot (BoxedRep mlev) = case mlev of Just Lifted -> PtrLiftedSlot Just Unlifted -> PtrUnliftedSlot primRepSlot IntRep = WordSlot -primRepSlot Int8Rep = WordSlot -primRepSlot Int16Rep = WordSlot -primRepSlot Int32Rep = WordSlot +primRepSlot Int8Rep = Word8Slot +primRepSlot Int16Rep = Word16Slot +primRepSlot Int32Rep = Word32Slot primRepSlot Int64Rep = Word64Slot primRepSlot WordRep = WordSlot -primRepSlot 
Word8Rep = WordSlot -primRepSlot Word16Rep = WordSlot -primRepSlot Word32Rep = WordSlot +primRepSlot Word8Rep = Word8Slot +primRepSlot Word16Rep = Word16Slot +primRepSlot Word32Rep = Word32Slot primRepSlot Word64Rep = Word64Slot primRepSlot AddrRep = WordSlot primRepSlot FloatRep = FloatSlot @@ -325,6 +344,9 @@ slotPrimRep :: SlotTy -> PrimRep slotPrimRep PtrLiftedSlot = BoxedRep (Just Lifted) slotPrimRep PtrUnliftedSlot = BoxedRep (Just Unlifted) slotPrimRep Word64Slot = Word64Rep +slotPrimRep Word32Slot = Word32Rep +slotPrimRep Word16Slot = Word16Rep +slotPrimRep Word8Slot = Word8Rep slotPrimRep WordSlot = WordRep slotPrimRep DoubleSlot = DoubleRep slotPrimRep FloatSlot = FloatRep @@ -349,11 +371,12 @@ fitsIn ty1 ty2 -- See Note [Casting slot arguments] where isWordSlot Word64Slot = True + isWordSlot Word32Slot = True + isWordSlot Word16Slot = True + isWordSlot Word8Slot = True isWordSlot WordSlot = True isWordSlot _ = False - - {- ********************************************************************** * * PrimRep ===================================== testsuite/tests/codeGen/should_compile/T25166.stdout → testsuite/tests/codeGen/should_compile/T25166.stdout-ws-32 ===================================== @@ -2,5 +2,7 @@ Test.foo_closure: const Test.D_con_info; const GHC.Internal.Types.True_closure+2; - const 2; + const 2 :: W8; + const 0 :: W16; + const 0 :: W8; const 3; ===================================== testsuite/tests/codeGen/should_compile/T25166.stdout-ws-64 ===================================== @@ -0,0 +1,9 @@ +[section ""data" . 
Test.foo_closure" { + Test.foo_closure: + const Test.D_con_info; + const GHC.Internal.Types.True_closure+2; + const 2 :: W8; + const 0 :: W32; + const 0 :: W16; + const 0 :: W8; + const 3; ===================================== testsuite/tests/unboxedsums/UbxSumUnpackedSize.hs ===================================== @@ -0,0 +1,254 @@ +module Main where + +import GHC.Exts.Heap.Closures +import Control.Exception (evaluate) +import Data.Word (Word32) +import Data.Int (Int8, Int16) + +-- this should get a Word8 tag +data E1 + = E1_1 | E1_2 | E1_3 | E1_4 | E1_5 | E1_6 | E1_7 | E1_8 + | E1_9 | E1_10 | E1_11 | E1_12 | E1_13 | E1_14 | E1_15 | E1_16 + | E1_17 | E1_18 | E1_19 | E1_20 | E1_21 | E1_22 | E1_23 | E1_24 + | E1_25 | E1_26 | E1_27 | E1_28 | E1_29 | E1_30 | E1_31 | E1_32 + | E1_33 | E1_34 | E1_35 | E1_36 | E1_37 | E1_38 | E1_39 | E1_40 + | E1_41 | E1_42 | E1_43 | E1_44 | E1_45 | E1_46 | E1_47 | E1_48 + | E1_49 | E1_50 | E1_51 | E1_52 | E1_53 | E1_54 | E1_55 | E1_56 + | E1_57 | E1_58 | E1_59 | E1_60 | E1_61 | E1_62 | E1_63 | E1_64 + | E1_65 | E1_66 | E1_67 | E1_68 | E1_69 | E1_70 | E1_71 | E1_72 + | E1_73 | E1_74 | E1_75 | E1_76 | E1_77 | E1_78 | E1_79 | E1_80 + | E1_81 | E1_82 | E1_83 | E1_84 | E1_85 | E1_86 | E1_87 | E1_88 + | E1_89 | E1_90 | E1_91 | E1_92 | E1_93 | E1_94 | E1_95 | E1_96 + | E1_97 | E1_98 | E1_99 | E1_100 | E1_101 | E1_102 | E1_103 | E1_104 + | E1_105 | E1_106 | E1_107 | E1_108 | E1_109 | E1_110 | E1_111 | E1_112 + | E1_113 | E1_114 | E1_115 | E1_116 | E1_117 | E1_118 | E1_119 | E1_120 + | E1_121 | E1_122 | E1_123 | E1_124 | E1_125 | E1_126 | E1_127 | E1_128 + | E1_129 | E1_130 | E1_131 | E1_132 | E1_133 | E1_134 | E1_135 | E1_136 + | E1_137 | E1_138 | E1_139 | E1_140 | E1_141 | E1_142 | E1_143 | E1_144 + | E1_145 | E1_146 | E1_147 | E1_148 | E1_149 | E1_150 | E1_151 | E1_152 + | E1_153 | E1_154 | E1_155 | E1_156 | E1_157 | E1_158 | E1_159 | E1_160 + | E1_161 | E1_162 | E1_163 | E1_164 | E1_165 | E1_166 | E1_167 | E1_168 + | E1_169 | E1_170 | E1_171 | 
E1_172 | E1_173 | E1_174 | E1_175 | E1_176 + | E1_177 | E1_178 | E1_179 | E1_180 | E1_181 | E1_182 | E1_183 | E1_184 + | E1_185 | E1_186 | E1_187 | E1_188 | E1_189 | E1_190 | E1_191 | E1_192 + | E1_193 | E1_194 | E1_195 | E1_196 | E1_197 | E1_198 | E1_199 | E1_200 + | E1_201 | E1_202 | E1_203 | E1_204 | E1_205 | E1_206 | E1_207 | E1_208 + | E1_209 | E1_210 | E1_211 | E1_212 | E1_213 | E1_214 | E1_215 | E1_216 + | E1_217 | E1_218 | E1_219 | E1_220 | E1_221 | E1_222 | E1_223 | E1_224 + | E1_225 | E1_226 | E1_227 | E1_228 | E1_229 | E1_230 | E1_231 | E1_232 + | E1_233 | E1_234 | E1_235 | E1_236 | E1_237 | E1_238 | E1_239 | E1_240 + | E1_241 | E1_242 | E1_243 | E1_244 | E1_245 | E1_246 | E1_247 | E1_248 + | E1_249 | E1_250 | E1_251 | E1_252 | E1_253 | E1_254 + deriving (Enum, Bounded, Show) + +-- this should get a Word8 tag +data E2 + = E2_1 | E2_2 | E2_3 | E2_4 | E2_5 | E2_6 | E2_7 | E2_8 + | E2_9 | E2_10 | E2_11 | E2_12 | E2_13 | E2_14 | E2_15 | E2_16 + | E2_17 | E2_18 | E2_19 | E2_20 | E2_21 | E2_22 | E2_23 | E2_24 + | E2_25 | E2_26 | E2_27 | E2_28 | E2_29 | E2_30 | E2_31 | E2_32 + | E2_33 | E2_34 | E2_35 | E2_36 | E2_37 | E2_38 | E2_39 | E2_40 + | E2_41 | E2_42 | E2_43 | E2_44 | E2_45 | E2_46 | E2_47 | E2_48 + | E2_49 | E2_50 | E2_51 | E2_52 | E2_53 | E2_54 | E2_55 | E2_56 + | E2_57 | E2_58 | E2_59 | E2_60 | E2_61 | E2_62 | E2_63 | E2_64 + | E2_65 | E2_66 | E2_67 | E2_68 | E2_69 | E2_70 | E2_71 | E2_72 + | E2_73 | E2_74 | E2_75 | E2_76 | E2_77 | E2_78 | E2_79 | E2_80 + | E2_81 | E2_82 | E2_83 | E2_84 | E2_85 | E2_86 | E2_87 | E2_88 + | E2_89 | E2_90 | E2_91 | E2_92 | E2_93 | E2_94 | E2_95 | E2_96 + | E2_97 | E2_98 | E2_99 | E2_100 | E2_101 | E2_102 | E2_103 | E2_104 + | E2_105 | E2_106 | E2_107 | E2_108 | E2_109 | E2_110 | E2_111 | E2_112 + | E2_113 | E2_114 | E2_115 | E2_116 | E2_117 | E2_118 | E2_119 | E2_120 + | E2_121 | E2_122 | E2_123 | E2_124 | E2_125 | E2_126 | E2_127 | E2_128 + | E2_129 | E2_130 | E2_131 | E2_132 | E2_133 | E2_134 | E2_135 | E2_136 + | 
E2_137 | E2_138 | E2_139 | E2_140 | E2_141 | E2_142 | E2_143 | E2_144 + | E2_145 | E2_146 | E2_147 | E2_148 | E2_149 | E2_150 | E2_151 | E2_152 + | E2_153 | E2_154 | E2_155 | E2_156 | E2_157 | E2_158 | E2_159 | E2_160 + | E2_161 | E2_162 | E2_163 | E2_164 | E2_165 | E2_166 | E2_167 | E2_168 + | E2_169 | E2_170 | E2_171 | E2_172 | E2_173 | E2_174 | E2_175 | E2_176 + | E2_177 | E2_178 | E2_179 | E2_180 | E2_181 | E2_182 | E2_183 | E2_184 + | E2_185 | E2_186 | E2_187 | E2_188 | E2_189 | E2_190 | E2_191 | E2_192 + | E2_193 | E2_194 | E2_195 | E2_196 | E2_197 | E2_198 | E2_199 | E2_200 + | E2_201 | E2_202 | E2_203 | E2_204 | E2_205 | E2_206 | E2_207 | E2_208 + | E2_209 | E2_210 | E2_211 | E2_212 | E2_213 | E2_214 | E2_215 | E2_216 + | E2_217 | E2_218 | E2_219 | E2_220 | E2_221 | E2_222 | E2_223 | E2_224 + | E2_225 | E2_226 | E2_227 | E2_228 | E2_229 | E2_230 | E2_231 | E2_232 + | E2_233 | E2_234 | E2_235 | E2_236 | E2_237 | E2_238 | E2_239 | E2_240 + | E2_241 | E2_242 | E2_243 | E2_244 | E2_245 | E2_246 | E2_247 | E2_248 + | E2_249 | E2_250 | E2_251 | E2_252 | E2_253 | E2_254 | E2_255 + deriving (Enum, Bounded, Show) + +-- this needs a Word16 tag +data E3 + = E3_1 | E3_2 | E3_3 | E3_4 | E3_5 | E3_6 | E3_7 | E3_8 + | E3_9 | E3_10 | E3_11 | E3_12 | E3_13 | E3_14 | E3_15 | E3_16 + | E3_17 | E3_18 | E3_19 | E3_20 | E3_21 | E3_22 | E3_23 | E3_24 + | E3_25 | E3_26 | E3_27 | E3_28 | E3_29 | E3_30 | E3_31 | E3_32 + | E3_33 | E3_34 | E3_35 | E3_36 | E3_37 | E3_38 | E3_39 | E3_40 + | E3_41 | E3_42 | E3_43 | E3_44 | E3_45 | E3_46 | E3_47 | E3_48 + | E3_49 | E3_50 | E3_51 | E3_52 | E3_53 | E3_54 | E3_55 | E3_56 + | E3_57 | E3_58 | E3_59 | E3_60 | E3_61 | E3_62 | E3_63 | E3_64 + | E3_65 | E3_66 | E3_67 | E3_68 | E3_69 | E3_70 | E3_71 | E3_72 + | E3_73 | E3_74 | E3_75 | E3_76 | E3_77 | E3_78 | E3_79 | E3_80 + | E3_81 | E3_82 | E3_83 | E3_84 | E3_85 | E3_86 | E3_87 | E3_88 + | E3_89 | E3_90 | E3_91 | E3_92 | E3_93 | E3_94 | E3_95 | E3_96 + | E3_97 | E3_98 | E3_99 | E3_100 | E3_101 | 
E3_102 | E3_103 | E3_104 + | E3_105 | E3_106 | E3_107 | E3_108 | E3_109 | E3_110 | E3_111 | E3_112 + | E3_113 | E3_114 | E3_115 | E3_116 | E3_117 | E3_118 | E3_119 | E3_120 + | E3_121 | E3_122 | E3_123 | E3_124 | E3_125 | E3_126 | E3_127 | E3_128 + | E3_129 | E3_130 | E3_131 | E3_132 | E3_133 | E3_134 | E3_135 | E3_136 + | E3_137 | E3_138 | E3_139 | E3_140 | E3_141 | E3_142 | E3_143 | E3_144 + | E3_145 | E3_146 | E3_147 | E3_148 | E3_149 | E3_150 | E3_151 | E3_152 + | E3_153 | E3_154 | E3_155 | E3_156 | E3_157 | E3_158 | E3_159 | E3_160 + | E3_161 | E3_162 | E3_163 | E3_164 | E3_165 | E3_166 | E3_167 | E3_168 + | E3_169 | E3_170 | E3_171 | E3_172 | E3_173 | E3_174 | E3_175 | E3_176 + | E3_177 | E3_178 | E3_179 | E3_180 | E3_181 | E3_182 | E3_183 | E3_184 + | E3_185 | E3_186 | E3_187 | E3_188 | E3_189 | E3_190 | E3_191 | E3_192 + | E3_193 | E3_194 | E3_195 | E3_196 | E3_197 | E3_198 | E3_199 | E3_200 + | E3_201 | E3_202 | E3_203 | E3_204 | E3_205 | E3_206 | E3_207 | E3_208 + | E3_209 | E3_210 | E3_211 | E3_212 | E3_213 | E3_214 | E3_215 | E3_216 + | E3_217 | E3_218 | E3_219 | E3_220 | E3_221 | E3_222 | E3_223 | E3_224 + | E3_225 | E3_226 | E3_227 | E3_228 | E3_229 | E3_230 | E3_231 | E3_232 + | E3_233 | E3_234 | E3_235 | E3_236 | E3_237 | E3_238 | E3_239 | E3_240 + | E3_241 | E3_242 | E3_243 | E3_244 | E3_245 | E3_246 | E3_247 | E3_248 + | E3_249 | E3_250 | E3_251 | E3_252 | E3_253 | E3_254 | E3_255 | E3_256 + deriving (Enum, Bounded, Show) + +data U_Bool = U_Bool {-# UNPACK #-} !Bool + {-# UNPACK #-} !Bool + {-# UNPACK #-} !Bool + {-# UNPACK #-} !Bool + {-# UNPACK #-} !Bool + {-# UNPACK #-} !Bool + {-# UNPACK #-} !Bool + {-# UNPACK #-} !Bool + deriving (Show) + +data U_E1 = U_E1 {-# UNPACK #-} !E1 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + deriving (Show) + +data U_E2 = U_E2 {-# UNPACK #-} !E2 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# 
UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + deriving (Show) + +{- + disabled to reduce memory consumption of test + +data U_E3 = U_E3 {-# UNPACK #-} !E3 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + deriving (Show) + +data U_Mixed = U_Mixed {-# UNPACK #-} !E1 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !E2 + {-# UNPACK #-} !Int16 + {-# UNPACK #-} !Int16 + {-# UNPACK #-} !Int16 + {-# UNPACK #-} !Bool + {-# UNPACK #-} !Bool + deriving (Show) +-} + +data U_Maybe = U_Maybe {-# UNPACK #-} !(Maybe Bool) + {-# UNPACK #-} !(Maybe Bool) + {-# UNPACK #-} !(Maybe Bool) + {-# UNPACK #-} !(Maybe Bool) + {-# UNPACK #-} !(Maybe Bool) + {-# UNPACK #-} !(Maybe Bool) + {-# UNPACK #-} !(Maybe Bool) + {-# UNPACK #-} !(Maybe Bool) + deriving (Show) + + +data MaybeW32 = NothingW32 + | JustW32 {-# UNPACK #-} !Word32 + deriving (Show) + +data U_MaybeW32 = U_MaybeW32 {-# UNPACK #-} !MaybeW32 + {-# UNPACK #-} !MaybeW32 + {-# UNPACK #-} !MaybeW32 + {-# UNPACK #-} !MaybeW32 + {-# UNPACK #-} !MaybeW32 + {-# UNPACK #-} !MaybeW32 + {-# UNPACK #-} !MaybeW32 + {-# UNPACK #-} !MaybeW32 + deriving (Show) + +u_ba :: U_Bool +u_ba = U_Bool minBound maxBound minBound maxBound + minBound maxBound minBound maxBound + +u_e1a :: U_E1 +u_e1a = U_E1 minBound maxBound minBound maxBound + minBound maxBound minBound maxBound + +u_e1b :: U_E1 +u_e1b = U_E1 maxBound minBound maxBound minBound + maxBound minBound maxBound minBound + +u_e1c :: U_E1 +u_e1c = U_E1 E1_1 126 127 0 1 2 3 4 + +u_e1d :: U_E1 +u_e1d = U_E1 E1_254 126 127 0 1 2 3 4 + +u_e2a :: U_E2 +u_e2a = U_E2 minBound maxBound minBound maxBound + minBound maxBound minBound maxBound +{- +u_e3a :: U_E3 +u_e3a = U_E3 minBound maxBound minBound maxBound + minBound maxBound minBound maxBound + +u_mixed :: U_Mixed +u_mixed = U_Mixed maxBound minBound maxBound minBound + maxBound 
minBound maxBound minBound +-} + +u_maybe :: U_Maybe +u_maybe = U_Maybe Nothing (Just False) Nothing (Just True) + Nothing (Just False) Nothing (Just True) + +u_maybeW32 :: U_MaybeW32 +u_maybeW32 = U_MaybeW32 NothingW32 (JustW32 minBound) + NothingW32 (JustW32 maxBound) + NothingW32 (JustW32 minBound) + NothingW32 (JustW32 maxBound) + +test :: Show a => String -> a -> IO () +test name value = do + putStrLn $ "\n### " ++ name + value' <- evaluate value + print value' + putStrLn ("size: " ++ show (closureSize $ asBox value')) + +main :: IO () +main = do + test "u_ba" u_ba + test "u_e1a" u_e1a + test "u_e1b" u_e1b + test "u_e1c" u_e1c + test "u_e1d" u_e1d + test "u_e2a" u_e2a + -- test "u_e3a" u_e3a + -- test "u_mixed" u_mixed + test "u_maybe" u_maybe + test "u_maybeW32" u_maybeW32 ===================================== testsuite/tests/unboxedsums/UbxSumUnpackedSize.stdout ===================================== @@ -0,0 +1,32 @@ + +### u_ba +U_Bool False True False True False True False True +size: 2 + +### u_e1a +U_E1 E1_1 127 (-128) 127 (-128) 127 (-128) 127 +size: 2 + +### u_e1b +U_E1 E1_254 (-128) 127 (-128) 127 (-128) 127 (-128) +size: 2 + +### u_e1c +U_E1 E1_1 126 127 0 1 2 3 4 +size: 2 + +### u_e1d +U_E1 E1_254 126 127 0 1 2 3 4 +size: 2 + +### u_e2a +U_E2 E2_1 127 (-128) 127 (-128) 127 (-128) 127 +size: 2 + +### u_maybe +U_Maybe Nothing (Just False) Nothing (Just True) Nothing (Just False) Nothing (Just True) +size: 10 + +### u_maybeW32 +U_MaybeW32 NothingW32 (JustW32 0) NothingW32 (JustW32 4294967295) NothingW32 (JustW32 0) NothingW32 (JustW32 4294967295) +size: 9 ===================================== testsuite/tests/unboxedsums/UbxSumUnpackedSize.stdout-ws-32 ===================================== @@ -0,0 +1,32 @@ + +### u_ba +U_Bool False True False True False True False True +size: 3 + +### u_e1a +U_E1 E1_1 127 (-128) 127 (-128) 127 (-128) 127 +size: 3 + +### u_e1b +U_E1 E1_254 (-128) 127 (-128) 127 (-128) 127 (-128) +size: 3 + +### u_e1c +U_E1 E1_1 126 127 0 
1 2 3 4 +size: 3 + +### u_e1d +U_E1 E1_254 126 127 0 1 2 3 4 +size: 3 + +### u_e2a +U_E2 E2_1 127 (-128) 127 (-128) 127 (-128) 127 +size: 3 + +### u_maybe +U_Maybe Nothing (Just False) Nothing (Just True) Nothing (Just False) Nothing (Just True) +size: 11 + +### u_maybeW32 +U_MaybeW32 NothingW32 (JustW32 0) NothingW32 (JustW32 4294967295) NothingW32 (JustW32 0) NothingW32 (JustW32 4294967295) +size: 17 ===================================== testsuite/tests/unboxedsums/all.T ===================================== @@ -62,3 +62,5 @@ test('ManyUbxSums', ['ManyUbxSums', [('ManyUbxSums_Addr.hs','')] , '-v0 -dstg-lint -dcmm-lint']) + +test('UbxSumUnpackedSize', [js_broken(22374)], compile_and_run, ['-O']) ===================================== testsuite/tests/unboxedsums/unboxedsums_unit_tests.hs ===================================== @@ -63,33 +63,33 @@ layout_tests = sequence_ assert_layout "layout1" [ ubxtup [ intTy, intPrimTy ] , ubxtup [ intPrimTy, intTy ] ] - [ WordSlot, PtrLiftedSlot, WordSlot ] + [ Word8Slot, PtrLiftedSlot, WordSlot ] layout2 = assert_layout "layout2" [ ubxtup [ intTy ] , intTy ] - [ WordSlot, PtrLiftedSlot ] + [ Word8Slot, PtrLiftedSlot ] layout3 = assert_layout "layout3" [ ubxtup [ intTy, intPrimTy, intTy, intPrimTy ] , ubxtup [ intPrimTy, intTy, intPrimTy, intTy ] ] - [ WordSlot, PtrLiftedSlot, PtrLiftedSlot, WordSlot, WordSlot ] + [ Word8Slot, PtrLiftedSlot, PtrLiftedSlot, WordSlot, WordSlot ] layout4 = assert_layout "layout4" [ ubxtup [ floatPrimTy, floatPrimTy ] , ubxtup [ intPrimTy, intPrimTy ] ] - [ WordSlot, WordSlot, WordSlot, FloatSlot, FloatSlot ] + [ Word8Slot, WordSlot, WordSlot, FloatSlot, FloatSlot ] layout5 = assert_layout "layout5" [ ubxtup [ intPrimTy, intPrimTy ] , ubxtup [ floatPrimTy, floatPrimTy ] ] - [ WordSlot, WordSlot, WordSlot, FloatSlot, FloatSlot ] + [ Word8Slot, WordSlot, WordSlot, FloatSlot, FloatSlot ] enum_layout = assert_layout "enum" (replicate 10 (ubxtup [])) - [ WordSlot ] + [ Word8Slot ] View it on GitLab: 
https://gitlab.haskell.org/ghc/ghc/-/commit/8466b59599ead28cf3f9cae22cf9487…
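
The tag-slot choice in the ubxSumRepType hunk above can be illustrated outside of GHC. The following is only a minimal sketch: TagSlot and tagSlotFor are hypothetical names introduced here (the commit uses the real SlotTy type and a local tag_slot binding), and only the guard thresholds are copied from the diff.

```haskell
module Main where

-- Hypothetical stand-in for the tag-capable constructors of GHC's SlotTy.
-- The real type has more constructors (pointer, float, vector slots).
data TagSlot = Word8Slot | Word16Slot | Word32Slot | WordSlot
  deriving (Eq, Ord, Show)

-- Pick the narrowest slot that can hold the tag of a sum with n alternatives.
-- The thresholds mirror the guards in the diff; tags start at 1, so n < 256
-- means every tag fits in 8 bits. 2147483647 is used instead of 4294967296
-- for the same reason the commit gives: it does not overflow a 32-bit Int.
tagSlotFor :: Int -> TagSlot
tagSlotFor n
  | n < 256        = Word8Slot
  | n < 65536      = Word16Slot
  | n < 2147483647 = Word32Slot
  | otherwise      = WordSlot

main :: IO ()
main = mapM_ report [2, 254, 255, 256, 65535, 65536]
  where
    report n = putStrLn (show n ++ " alternatives -> " ++ show (tagSlotFor n))
```

Under these assumptions, 254 and 255 alternatives both select Word8Slot and 256 selects Word16Slot, which agrees with the comments on E1, E2 and E3 in UbxSumUnpackedSize.hs.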

[Git][ghc/ghc][wip/ubxsumtag] 2 commits: Decompose padding smallest-first in Cmm toplevel data constructors
by Luite Stegeman (＠luite) 11 Sep '25

11 Sep '25

Luite Stegeman pushed to branch wip/ubxsumtag at Glasgow Haskell Compiler / GHC Commits: e3961ed7 by Luite Stegeman at 2025-09-11T14:08:16+02:00 Decompose padding smallest-first in Cmm toplevel data constructors This makes each individual padding value aligned - - - - - 8d272ba0 by Luite Stegeman at 2025-09-11T14:08:16+02:00 Use slots smaller than word as tag for smaller unboxed sums This packs unboxed sums more efficiently by allowing Word8, Word16 and Word32 for the tag field if the number of constructors is small enough - - - - - 11 changed files: - compiler/GHC/Cmm/Utils.hs - compiler/GHC/Stg/Unarise.hs - compiler/GHC/StgToCmm/DataCon.hs - compiler/GHC/Types/RepType.hs - testsuite/tests/codeGen/should_compile/T25166.stdout → testsuite/tests/codeGen/should_compile/T25166.stdout-ws-32 - + testsuite/tests/codeGen/should_compile/T25166.stdout-ws-64 - + testsuite/tests/unboxedsums/UbxSumUnpackedSize.hs - + testsuite/tests/unboxedsums/UbxSumUnpackedSize.stdout - + testsuite/tests/unboxedsums/UbxSumUnpackedSize.stdout-ws-32 - testsuite/tests/unboxedsums/all.T - testsuite/tests/unboxedsums/unboxedsums_unit_tests.hs Changes: ===================================== compiler/GHC/Cmm/Utils.hs ===================================== @@ -115,6 +115,9 @@ slotCmmType platform = \case PtrUnliftedSlot -> gcWord platform PtrLiftedSlot -> gcWord platform WordSlot -> bWord platform + Word8Slot -> b8 + Word16Slot -> b16 + Word32Slot -> b32 Word64Slot -> b64 FloatSlot -> f32 DoubleSlot -> f64 ===================================== compiler/GHC/Stg/Unarise.hs ===================================== @@ -404,7 +404,6 @@ import GHC.Stg.Syntax import GHC.Stg.Utils import GHC.Stg.Make import GHC.Core.Type -import GHC.Builtin.Types.Prim (intPrimTy) import GHC.Builtin.Types import GHC.Types.Unique.Supply import GHC.Types.Unique @@ -681,15 +680,15 @@ elimCase rho args bndr (MultiValAlt _) [GenStgAlt{ alt_con = _ elimCase rho args@(tag_arg : real_args) bndr (MultiValAlt _) alts | isUnboxedSumBndr 
bndr - = do tag_bndr <- mkId (mkFastString "tag") tagTy + = do tag_bndr <- mkId (mkFastString "tag") (tagTyArg tag_arg) -- this won't be used but we need a binder anyway let rho1 = extendRho rho bndr (MultiVal args) scrut' = case tag_arg of StgVarArg v -> StgApp v [] StgLitArg l -> StgLit l - - alts' <- unariseSumAlts rho1 real_args alts - return (StgCase scrut' tag_bndr tagAltTy alts') + alt_ty = (tagAltTyArg tag_arg) + alts' <- unariseSumAlts rho1 alt_ty real_args alts + return (StgCase scrut' tag_bndr alt_ty alts') elimCase _ args bndr alt_ty alts = pprPanic "elimCase - unhandled case" @@ -732,8 +731,9 @@ unariseAlts rho (MultiValAlt _) bndr [GenStgAlt{ alt_con = DEFAULT unariseAlts rho (MultiValAlt _) bndr alts | isUnboxedSumBndr bndr = do (rho_sum_bndrs, scrt_bndrs@(tag_bndr : real_bndrs)) <- unariseConArgBinder rho bndr - alts' <- unariseSumAlts rho_sum_bndrs (map StgVarArg real_bndrs) alts - let inner_case = StgCase (StgApp tag_bndr []) tag_bndr tagAltTy alts' + let alt_ty = tagAltTy tag_bndr + alts' <- unariseSumAlts rho_sum_bndrs alt_ty (map StgVarArg real_bndrs) alts + let inner_case = StgCase (StgApp tag_bndr []) tag_bndr alt_ty alts' return [GenStgAlt{ alt_con = DataAlt (tupleDataCon Unboxed (length scrt_bndrs)) , alt_bndrs = scrt_bndrs , alt_rhs = inner_case @@ -753,21 +753,23 @@ unariseAlt rho alt@GenStgAlt{alt_con=_,alt_bndrs=xs,alt_rhs=e} -- | Make alternatives that match on the tag of a sum -- (i.e. generate LitAlts for the tag) unariseSumAlts :: UnariseEnv + -> AltType -> [StgArg] -- sum components _excluding_ the tag bit. -> [StgAlt] -- original alternative with sum LHS -> UniqSM [StgAlt] -unariseSumAlts env args alts - = do alts' <- mapM (unariseSumAlt env args) alts +unariseSumAlts env tag_slot args alts + = do alts' <- mapM (unariseSumAlt env tag_slot args) alts return (mkDefaultLitAlt alts') unariseSumAlt :: UnariseEnv + -> AltType -> [StgArg] -- sum components _excluding_ the tag bit. 
-> StgAlt -- original alternative with sum LHS -> UniqSM StgAlt -unariseSumAlt rho _ GenStgAlt{alt_con=DEFAULT,alt_bndrs=_,alt_rhs=e} +unariseSumAlt rho _ _ GenStgAlt{alt_con=DEFAULT,alt_bndrs=_,alt_rhs=e} = GenStgAlt DEFAULT mempty <$> unariseExpr rho e -unariseSumAlt rho args alt@GenStgAlt{ alt_con = DataAlt sumCon +unariseSumAlt rho tag_slot args alt@GenStgAlt{ alt_con = DataAlt sumCon , alt_bndrs = bs , alt_rhs = e } @@ -776,10 +778,19 @@ unariseSumAlt rho args alt@GenStgAlt{ alt_con = DataAlt sumCon [b] -> mapSumIdBinders b args e rho -- Sums must have one binder _ -> pprPanic "unariseSumAlt2" (ppr args $$ pprPanicAlt alt) - let lit_case = LitAlt (LitNumber LitNumInt (fromIntegral (dataConTag sumCon))) + let num_ty = + case tag_slot of + PrimAlt Int8Rep -> LitNumInt8 + PrimAlt Word8Rep -> LitNumWord8 + PrimAlt Int16Rep -> LitNumInt16 + PrimAlt Word16Rep -> LitNumWord16 + PrimAlt Int32Rep -> LitNumInt32 + PrimAlt Word32Rep -> LitNumWord32 + _ -> LitNumInt + lit_case = LitAlt (LitNumber num_ty (fromIntegral (dataConTag sumCon))) GenStgAlt lit_case mempty <$> unariseExpr rho' e' -unariseSumAlt _ scrt alt +unariseSumAlt _ _ scrt alt = pprPanic "unariseSumAlt3" (ppr scrt $$ pprPanicAlt alt) -------------------------------------------------------------------------------- @@ -865,12 +876,6 @@ mapSumIdBinders alt_bndr args rhs rho0 typed_id_args = map StgVarArg typed_ids - -- pprTrace "mapSumIdBinders" - -- (text "fld_reps" <+> ppr fld_reps $$ - -- text "id_args" <+> ppr id_arg_exprs $$ - -- text "rhs" <+> ppr rhs $$ - -- text "rhs_with_casts" <+> ppr rhs_with_casts - -- ) $ if isMultiValBndr alt_bndr then return (extendRho rho0 alt_bndr (MultiVal typed_id_args), rhs_with_casts rhs) else assert (typed_id_args `lengthIs` 1) $ @@ -921,13 +926,19 @@ mkUbxSum ) mkUbxSum dc ty_args args0 us = let - _ :| sum_slots = ubxSumRepType ty_args + tag_slot :| sum_slots = ubxSumRepType ty_args -- drop tag slot field_slots = (mapMaybe (repSlotTy . 
stgArgRep) args0) tag = dataConTag dc layout' = layoutUbxSum sum_slots field_slots - tag_arg = StgLitArg (LitNumber LitNumInt (fromIntegral tag)) + tag_arg = + case tag_slot of + Word8Slot -> StgLitArg (LitNumber LitNumWord8 (fromIntegral tag)) + Word16Slot -> StgLitArg (LitNumber LitNumWord16 (fromIntegral tag)) + Word32Slot -> StgLitArg (LitNumber LitNumWord32 (fromIntegral tag)) + WordSlot -> StgLitArg (LitNumber LitNumWord (fromIntegral tag)) + _ -> pprPanic "mkUbxSum: unexpected tag slot: " (ppr tag_slot) arg_idxs = IM.fromList (zipEqual layout' args0) ((_idx,_idx_map,_us,wrapper),slot_args) @@ -990,6 +1001,9 @@ ubxSumRubbishArg :: SlotTy -> StgArg ubxSumRubbishArg PtrLiftedSlot = StgVarArg aBSENT_SUM_FIELD_ERROR_ID ubxSumRubbishArg PtrUnliftedSlot = StgVarArg aBSENT_SUM_FIELD_ERROR_ID ubxSumRubbishArg WordSlot = StgLitArg (LitNumber LitNumWord 0) +ubxSumRubbishArg Word8Slot = StgLitArg (LitNumber LitNumWord8 0) +ubxSumRubbishArg Word16Slot = StgLitArg (LitNumber LitNumWord16 0) +ubxSumRubbishArg Word32Slot = StgLitArg (LitNumber LitNumWord32 0) ubxSumRubbishArg Word64Slot = StgLitArg (LitNumber LitNumWord64 0) ubxSumRubbishArg FloatSlot = StgLitArg (LitFloat 0) ubxSumRubbishArg DoubleSlot = StgLitArg (LitDouble 0) @@ -1166,11 +1180,18 @@ isUnboxedTupleBndr = isUnboxedTupleType . 
idType mkTuple :: [StgArg] -> StgExpr mkTuple args = StgConApp (tupleDataCon Unboxed (length args)) NoNumber args [] -tagAltTy :: AltType -tagAltTy = PrimAlt IntRep +tagAltTyArg :: StgArg -> AltType +tagAltTyArg a + | [pr] <- typePrimRep (stgArgType a) = PrimAlt pr + | otherwise = pprPanic "tagAltTyArg" (ppr a) + +tagAltTy :: Id -> AltType +tagAltTy i + | [pr] <- typePrimRep (idType i) = PrimAlt pr + | otherwise = pprPanic "tagAltTy" (ppr $ idType i) -tagTy :: Type -tagTy = intPrimTy +tagTyArg :: StgArg -> Type +tagTyArg x = stgArgType x voidArg :: StgArg voidArg = StgVarArg voidPrimId ===================================== compiler/GHC/StgToCmm/DataCon.hs ===================================== @@ -107,10 +107,10 @@ cgTopRhsCon cfg id con mn args fix_padding (x@(Padding n off) : rest) | n == 0 = fix_padding rest | n `elem` [1,2,4,8] = x : fix_padding rest - | n > 8 = add_pad 8 - | n > 4 = add_pad 4 - | n > 2 = add_pad 2 - | otherwise = add_pad 1 + | testBit n 0 = add_pad 1 + | testBit n 1 = add_pad 2 + | testBit n 2 = add_pad 4 + | otherwise = add_pad 8 where add_pad m = Padding m off : fix_padding (Padding (n-m) (off+m) : rest) fix_padding (x : rest) = x : fix_padding rest fix_padding [] = [] ===================================== compiler/GHC/Types/RepType.hs ===================================== @@ -197,12 +197,12 @@ type SortedSlotTys = [SlotTy] -- of the list we have the slot for the tag. ubxSumRepType :: [[PrimRep]] -> NonEmpty SlotTy ubxSumRepType constrs0 - -- These first two cases never classify an actual unboxed sum, which always + -- This first case never classifies an actual unboxed sum, which always -- has at least two disjuncts. But it could happen if a user writes, e.g., -- forall (a :: TYPE (SumRep [IntRep])). ... -- which could never be instantiated. We still don't want to panic. 
| constrs0 `lengthLessThan` 2 - = WordSlot :| [] + = Word8Slot :| [] | otherwise = let @@ -230,8 +230,17 @@ ubxSumRepType constrs0 rep :: [PrimRep] -> SortedSlotTys rep ty = sort (map primRepSlot ty) - sumRep = WordSlot :| combine_alts (map rep constrs0) - -- WordSlot: for the tag of the sum + -- constructors start at 1, pick an appropriate slot size for the tag + tag_slot | length constrs0 < 256 = Word8Slot + | length constrs0 < 65536 = Word16Slot + -- we use 2147483647 instead of 4294967296 to avoid + -- overflow when building a 32 bit GHC. Please fix the + -- overflow if you encounter a type with more than 2147483646 + -- constructors and need the tag to be 32 bits. + | length constrs0 < 2147483647 = Word32Slot + | otherwise = WordSlot + + sumRep = tag_slot :| combine_alts (map rep constrs0) in sumRep @@ -275,22 +284,32 @@ layoutUbxSum sum_slots0 arg_slots0 = -- - Float slots: Shared between floating point types. -- -- - Void slots: Shared between void types. Not used in sums. --- --- TODO(michalt): We should probably introduce `SlotTy`s for 8-/16-/32-bit --- values, so that we can pack things more tightly. -data SlotTy = PtrLiftedSlot | PtrUnliftedSlot | WordSlot | Word64Slot | FloatSlot | DoubleSlot | VecSlot Int PrimElemRep + +data SlotTy = PtrLiftedSlot + | PtrUnliftedSlot + | Word8Slot + | Word16Slot + | Word32Slot + | WordSlot + | Word64Slot + | FloatSlot + | DoubleSlot + | VecSlot Int PrimElemRep deriving (Eq, Ord) -- Constructor order is important! If slot A could fit into slot B -- then slot A must occur first. E.g. 
FloatSlot before DoubleSlot -- - -- We are assuming that WordSlot is smaller than or equal to Word64Slot - -- (would not be true on a 128-bit machine) + -- We are assuming that Word32Slot <= WordSlot <= Word64Slot + -- (would not be true on a 16-bit or 128-bit machine) instance Outputable SlotTy where ppr PtrLiftedSlot = text "PtrLiftedSlot" ppr PtrUnliftedSlot = text "PtrUnliftedSlot" ppr Word64Slot = text "Word64Slot" ppr WordSlot = text "WordSlot" + ppr Word32Slot = text "Word32Slot" + ppr Word16Slot = text "Word16Slot" + ppr Word8Slot = text "Word8Slot" ppr DoubleSlot = text "DoubleSlot" ppr FloatSlot = text "FloatSlot" ppr (VecSlot n e) = text "VecSlot" <+> ppr n <+> ppr e @@ -307,14 +326,14 @@ primRepSlot (BoxedRep mlev) = case mlev of Just Lifted -> PtrLiftedSlot Just Unlifted -> PtrUnliftedSlot primRepSlot IntRep = WordSlot -primRepSlot Int8Rep = WordSlot -primRepSlot Int16Rep = WordSlot -primRepSlot Int32Rep = WordSlot +primRepSlot Int8Rep = Word8Slot +primRepSlot Int16Rep = Word16Slot +primRepSlot Int32Rep = Word32Slot primRepSlot Int64Rep = Word64Slot primRepSlot WordRep = WordSlot -primRepSlot Word8Rep = WordSlot -primRepSlot Word16Rep = WordSlot -primRepSlot Word32Rep = WordSlot +primRepSlot Word8Rep = Word8Slot +primRepSlot Word16Rep = Word16Slot +primRepSlot Word32Rep = Word32Slot primRepSlot Word64Rep = Word64Slot primRepSlot AddrRep = WordSlot primRepSlot FloatRep = FloatSlot @@ -325,6 +344,9 @@ slotPrimRep :: SlotTy -> PrimRep slotPrimRep PtrLiftedSlot = BoxedRep (Just Lifted) slotPrimRep PtrUnliftedSlot = BoxedRep (Just Unlifted) slotPrimRep Word64Slot = Word64Rep +slotPrimRep Word32Slot = Word32Rep +slotPrimRep Word16Slot = Word16Rep +slotPrimRep Word8Slot = Word8Rep slotPrimRep WordSlot = WordRep slotPrimRep DoubleSlot = DoubleRep slotPrimRep FloatSlot = FloatRep @@ -349,11 +371,12 @@ fitsIn ty1 ty2 -- See Note [Casting slot arguments] where isWordSlot Word64Slot = True + isWordSlot Word32Slot = True + isWordSlot Word16Slot = True + isWordSlot 
Word8Slot = True isWordSlot WordSlot = True isWordSlot _ = False - - {- ********************************************************************** * * PrimRep ===================================== testsuite/tests/codeGen/should_compile/T25166.stdout → testsuite/tests/codeGen/should_compile/T25166.stdout-ws-32 ===================================== @@ -2,5 +2,7 @@ Test.foo_closure: const Test.D_con_info; const GHC.Internal.Types.True_closure+2; - const 2; + const 2 :: W8; + const 0 :: W16; + const 0 :: W8; const 3; ===================================== testsuite/tests/codeGen/should_compile/T25166.stdout-ws-64 ===================================== @@ -0,0 +1,9 @@ +[section ""data" . Test.foo_closure" { + Test.foo_closure: + const Test.D_con_info; + const GHC.Internal.Types.True_closure+2; + const 2 :: W8; + const 0 :: W32; + const 0 :: W16; + const 0 :: W8; + const 3; ===================================== testsuite/tests/unboxedsums/UbxSumUnpackedSize.hs ===================================== @@ -0,0 +1,254 @@ +module Main where + +import GHC.Exts.Heap.Closures +import Control.Exception (evaluate) +import Data.Word (Word32) +import Data.Int (Int8, Int16) + +-- this should get a Word8 tag +data E1 + = E1_1 | E1_2 | E1_3 | E1_4 | E1_5 | E1_6 | E1_7 | E1_8 + | E1_9 | E1_10 | E1_11 | E1_12 | E1_13 | E1_14 | E1_15 | E1_16 + | E1_17 | E1_18 | E1_19 | E1_20 | E1_21 | E1_22 | E1_23 | E1_24 + | E1_25 | E1_26 | E1_27 | E1_28 | E1_29 | E1_30 | E1_31 | E1_32 + | E1_33 | E1_34 | E1_35 | E1_36 | E1_37 | E1_38 | E1_39 | E1_40 + | E1_41 | E1_42 | E1_43 | E1_44 | E1_45 | E1_46 | E1_47 | E1_48 + | E1_49 | E1_50 | E1_51 | E1_52 | E1_53 | E1_54 | E1_55 | E1_56 + | E1_57 | E1_58 | E1_59 | E1_60 | E1_61 | E1_62 | E1_63 | E1_64 + | E1_65 | E1_66 | E1_67 | E1_68 | E1_69 | E1_70 | E1_71 | E1_72 + | E1_73 | E1_74 | E1_75 | E1_76 | E1_77 | E1_78 | E1_79 | E1_80 + | E1_81 | E1_82 | E1_83 | E1_84 | E1_85 | E1_86 | E1_87 | E1_88 + | E1_89 | E1_90 | E1_91 | E1_92 | E1_93 | E1_94 | E1_95 | E1_96 + | 
E1_97 | E1_98 | E1_99 | E1_100 | E1_101 | E1_102 | E1_103 | E1_104 + | E1_105 | E1_106 | E1_107 | E1_108 | E1_109 | E1_110 | E1_111 | E1_112 + | E1_113 | E1_114 | E1_115 | E1_116 | E1_117 | E1_118 | E1_119 | E1_120 + | E1_121 | E1_122 | E1_123 | E1_124 | E1_125 | E1_126 | E1_127 | E1_128 + | E1_129 | E1_130 | E1_131 | E1_132 | E1_133 | E1_134 | E1_135 | E1_136 + | E1_137 | E1_138 | E1_139 | E1_140 | E1_141 | E1_142 | E1_143 | E1_144 + | E1_145 | E1_146 | E1_147 | E1_148 | E1_149 | E1_150 | E1_151 | E1_152 + | E1_153 | E1_154 | E1_155 | E1_156 | E1_157 | E1_158 | E1_159 | E1_160 + | E1_161 | E1_162 | E1_163 | E1_164 | E1_165 | E1_166 | E1_167 | E1_168 + | E1_169 | E1_170 | E1_171 | E1_172 | E1_173 | E1_174 | E1_175 | E1_176 + | E1_177 | E1_178 | E1_179 | E1_180 | E1_181 | E1_182 | E1_183 | E1_184 + | E1_185 | E1_186 | E1_187 | E1_188 | E1_189 | E1_190 | E1_191 | E1_192 + | E1_193 | E1_194 | E1_195 | E1_196 | E1_197 | E1_198 | E1_199 | E1_200 + | E1_201 | E1_202 | E1_203 | E1_204 | E1_205 | E1_206 | E1_207 | E1_208 + | E1_209 | E1_210 | E1_211 | E1_212 | E1_213 | E1_214 | E1_215 | E1_216 + | E1_217 | E1_218 | E1_219 | E1_220 | E1_221 | E1_222 | E1_223 | E1_224 + | E1_225 | E1_226 | E1_227 | E1_228 | E1_229 | E1_230 | E1_231 | E1_232 + | E1_233 | E1_234 | E1_235 | E1_236 | E1_237 | E1_238 | E1_239 | E1_240 + | E1_241 | E1_242 | E1_243 | E1_244 | E1_245 | E1_246 | E1_247 | E1_248 + | E1_249 | E1_250 | E1_251 | E1_252 | E1_253 | E1_254 + deriving (Enum, Bounded, Show) + +-- this should get a Word8 tag +data E2 + = E2_1 | E2_2 | E2_3 | E2_4 | E2_5 | E2_6 | E2_7 | E2_8 + | E2_9 | E2_10 | E2_11 | E2_12 | E2_13 | E2_14 | E2_15 | E2_16 + | E2_17 | E2_18 | E2_19 | E2_20 | E2_21 | E2_22 | E2_23 | E2_24 + | E2_25 | E2_26 | E2_27 | E2_28 | E2_29 | E2_30 | E2_31 | E2_32 + | E2_33 | E2_34 | E2_35 | E2_36 | E2_37 | E2_38 | E2_39 | E2_40 + | E2_41 | E2_42 | E2_43 | E2_44 | E2_45 | E2_46 | E2_47 | E2_48 + | E2_49 | E2_50 | E2_51 | E2_52 | E2_53 | E2_54 | E2_55 | E2_56 + | E2_57 | 
E2_58 | E2_59 | E2_60 | E2_61 | E2_62 | E2_63 | E2_64 + | E2_65 | E2_66 | E2_67 | E2_68 | E2_69 | E2_70 | E2_71 | E2_72 + | E2_73 | E2_74 | E2_75 | E2_76 | E2_77 | E2_78 | E2_79 | E2_80 + | E2_81 | E2_82 | E2_83 | E2_84 | E2_85 | E2_86 | E2_87 | E2_88 + | E2_89 | E2_90 | E2_91 | E2_92 | E2_93 | E2_94 | E2_95 | E2_96 + | E2_97 | E2_98 | E2_99 | E2_100 | E2_101 | E2_102 | E2_103 | E2_104 + | E2_105 | E2_106 | E2_107 | E2_108 | E2_109 | E2_110 | E2_111 | E2_112 + | E2_113 | E2_114 | E2_115 | E2_116 | E2_117 | E2_118 | E2_119 | E2_120 + | E2_121 | E2_122 | E2_123 | E2_124 | E2_125 | E2_126 | E2_127 | E2_128 + | E2_129 | E2_130 | E2_131 | E2_132 | E2_133 | E2_134 | E2_135 | E2_136 + | E2_137 | E2_138 | E2_139 | E2_140 | E2_141 | E2_142 | E2_143 | E2_144 + | E2_145 | E2_146 | E2_147 | E2_148 | E2_149 | E2_150 | E2_151 | E2_152 + | E2_153 | E2_154 | E2_155 | E2_156 | E2_157 | E2_158 | E2_159 | E2_160 + | E2_161 | E2_162 | E2_163 | E2_164 | E2_165 | E2_166 | E2_167 | E2_168 + | E2_169 | E2_170 | E2_171 | E2_172 | E2_173 | E2_174 | E2_175 | E2_176 + | E2_177 | E2_178 | E2_179 | E2_180 | E2_181 | E2_182 | E2_183 | E2_184 + | E2_185 | E2_186 | E2_187 | E2_188 | E2_189 | E2_190 | E2_191 | E2_192 + | E2_193 | E2_194 | E2_195 | E2_196 | E2_197 | E2_198 | E2_199 | E2_200 + | E2_201 | E2_202 | E2_203 | E2_204 | E2_205 | E2_206 | E2_207 | E2_208 + | E2_209 | E2_210 | E2_211 | E2_212 | E2_213 | E2_214 | E2_215 | E2_216 + | E2_217 | E2_218 | E2_219 | E2_220 | E2_221 | E2_222 | E2_223 | E2_224 + | E2_225 | E2_226 | E2_227 | E2_228 | E2_229 | E2_230 | E2_231 | E2_232 + | E2_233 | E2_234 | E2_235 | E2_236 | E2_237 | E2_238 | E2_239 | E2_240 + | E2_241 | E2_242 | E2_243 | E2_244 | E2_245 | E2_246 | E2_247 | E2_248 + | E2_249 | E2_250 | E2_251 | E2_252 | E2_253 | E2_254 | E2_255 + deriving (Enum, Bounded, Show) + +-- this needs a Word16 tag +data E3 + = E3_1 | E3_2 | E3_3 | E3_4 | E3_5 | E3_6 | E3_7 | E3_8 + | E3_9 | E3_10 | E3_11 | E3_12 | E3_13 | E3_14 | E3_15 | E3_16 + | E3_17 | E3_18 
| E3_19 | E3_20 | E3_21 | E3_22 | E3_23 | E3_24 + | E3_25 | E3_26 | E3_27 | E3_28 | E3_29 | E3_30 | E3_31 | E3_32 + | E3_33 | E3_34 | E3_35 | E3_36 | E3_37 | E3_38 | E3_39 | E3_40 + | E3_41 | E3_42 | E3_43 | E3_44 | E3_45 | E3_46 | E3_47 | E3_48 + | E3_49 | E3_50 | E3_51 | E3_52 | E3_53 | E3_54 | E3_55 | E3_56 + | E3_57 | E3_58 | E3_59 | E3_60 | E3_61 | E3_62 | E3_63 | E3_64 + | E3_65 | E3_66 | E3_67 | E3_68 | E3_69 | E3_70 | E3_71 | E3_72 + | E3_73 | E3_74 | E3_75 | E3_76 | E3_77 | E3_78 | E3_79 | E3_80 + | E3_81 | E3_82 | E3_83 | E3_84 | E3_85 | E3_86 | E3_87 | E3_88 + | E3_89 | E3_90 | E3_91 | E3_92 | E3_93 | E3_94 | E3_95 | E3_96 + | E3_97 | E3_98 | E3_99 | E3_100 | E3_101 | E3_102 | E3_103 | E3_104 + | E3_105 | E3_106 | E3_107 | E3_108 | E3_109 | E3_110 | E3_111 | E3_112 + | E3_113 | E3_114 | E3_115 | E3_116 | E3_117 | E3_118 | E3_119 | E3_120 + | E3_121 | E3_122 | E3_123 | E3_124 | E3_125 | E3_126 | E3_127 | E3_128 + | E3_129 | E3_130 | E3_131 | E3_132 | E3_133 | E3_134 | E3_135 | E3_136 + | E3_137 | E3_138 | E3_139 | E3_140 | E3_141 | E3_142 | E3_143 | E3_144 + | E3_145 | E3_146 | E3_147 | E3_148 | E3_149 | E3_150 | E3_151 | E3_152 + | E3_153 | E3_154 | E3_155 | E3_156 | E3_157 | E3_158 | E3_159 | E3_160 + | E3_161 | E3_162 | E3_163 | E3_164 | E3_165 | E3_166 | E3_167 | E3_168 + | E3_169 | E3_170 | E3_171 | E3_172 | E3_173 | E3_174 | E3_175 | E3_176 + | E3_177 | E3_178 | E3_179 | E3_180 | E3_181 | E3_182 | E3_183 | E3_184 + | E3_185 | E3_186 | E3_187 | E3_188 | E3_189 | E3_190 | E3_191 | E3_192 + | E3_193 | E3_194 | E3_195 | E3_196 | E3_197 | E3_198 | E3_199 | E3_200 + | E3_201 | E3_202 | E3_203 | E3_204 | E3_205 | E3_206 | E3_207 | E3_208 + | E3_209 | E3_210 | E3_211 | E3_212 | E3_213 | E3_214 | E3_215 | E3_216 + | E3_217 | E3_218 | E3_219 | E3_220 | E3_221 | E3_222 | E3_223 | E3_224 + | E3_225 | E3_226 | E3_227 | E3_228 | E3_229 | E3_230 | E3_231 | E3_232 + | E3_233 | E3_234 | E3_235 | E3_236 | E3_237 | E3_238 | E3_239 | E3_240 + | E3_241 | E3_242 | E3_243 
| E3_244 | E3_245 | E3_246 | E3_247 | E3_248 + | E3_249 | E3_250 | E3_251 | E3_252 | E3_253 | E3_254 | E3_255 | E3_256 + deriving (Enum, Bounded, Show) + +data U_Bool = U_Bool {-# UNPACK #-} !Bool + {-# UNPACK #-} !Bool + {-# UNPACK #-} !Bool + {-# UNPACK #-} !Bool + {-# UNPACK #-} !Bool + {-# UNPACK #-} !Bool + {-# UNPACK #-} !Bool + {-# UNPACK #-} !Bool + deriving (Show) + +data U_E1 = U_E1 {-# UNPACK #-} !E1 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + deriving (Show) + +data U_E2 = U_E2 {-# UNPACK #-} !E2 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + deriving (Show) + +{- + disabled to reduce memory consumption of test + +data U_E3 = U_E3 {-# UNPACK #-} !E3 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !Int8 + deriving (Show) + +data U_Mixed = U_Mixed {-# UNPACK #-} !E1 + {-# UNPACK #-} !Int8 + {-# UNPACK #-} !E2 + {-# UNPACK #-} !Int16 + {-# UNPACK #-} !Int16 + {-# UNPACK #-} !Int16 + {-# UNPACK #-} !Bool + {-# UNPACK #-} !Bool + deriving (Show) +-} + +data U_Maybe = U_Maybe {-# UNPACK #-} !(Maybe Bool) + {-# UNPACK #-} !(Maybe Bool) + {-# UNPACK #-} !(Maybe Bool) + {-# UNPACK #-} !(Maybe Bool) + {-# UNPACK #-} !(Maybe Bool) + {-# UNPACK #-} !(Maybe Bool) + {-# UNPACK #-} !(Maybe Bool) + {-# UNPACK #-} !(Maybe Bool) + deriving (Show) + + +data MaybeW32 = NothingW32 + | JustW32 {-# UNPACK #-} !Word32 + deriving (Show) + +data U_MaybeW32 = U_MaybeW32 {-# UNPACK #-} !MaybeW32 + {-# UNPACK #-} !MaybeW32 + {-# UNPACK #-} !MaybeW32 + {-# UNPACK #-} !MaybeW32 + {-# UNPACK #-} !MaybeW32 + {-# UNPACK #-} !MaybeW32 + {-# UNPACK #-} !MaybeW32 + {-# UNPACK #-} !MaybeW32 + deriving (Show) + +u_ba :: U_Bool +u_ba = U_Bool minBound 
maxBound minBound maxBound + minBound maxBound minBound maxBound + +u_e1a :: U_E1 +u_e1a = U_E1 minBound maxBound minBound maxBound + minBound maxBound minBound maxBound + +u_e1b :: U_E1 +u_e1b = U_E1 maxBound minBound maxBound minBound + maxBound minBound maxBound minBound + +u_e1c :: U_E1 +u_e1c = U_E1 E1_1 126 127 0 1 2 3 4 + +u_e1d :: U_E1 +u_e1d = U_E1 E1_254 126 127 0 1 2 3 4 + +u_e2a :: U_E2 +u_e2a = U_E2 minBound maxBound minBound maxBound + minBound maxBound minBound maxBound +{- +u_e3a :: U_E3 +u_e3a = U_E3 minBound maxBound minBound maxBound + minBound maxBound minBound maxBound + +u_mixed :: U_Mixed +u_mixed = U_Mixed maxBound minBound maxBound minBound + maxBound minBound maxBound minBound +-} + +u_maybe :: U_Maybe +u_maybe = U_Maybe Nothing (Just False) Nothing (Just True) + Nothing (Just False) Nothing (Just True) + +u_maybeW32 :: U_MaybeW32 +u_maybeW32 = U_MaybeW32 NothingW32 (JustW32 minBound) + NothingW32 (JustW32 maxBound) + NothingW32 (JustW32 minBound) + NothingW32 (JustW32 maxBound) + +test :: Show a => String -> a -> IO () +test name value = do + putStrLn $ "\n### " ++ name + value' <- evaluate value + print value' + putStrLn ("size: " ++ show (closureSize $ asBox value')) + +main :: IO () +main = do + test "u_ba" u_ba + test "u_e1a" u_e1a + test "u_e1b" u_e1b + test "u_e1c" u_e1c + test "u_e1d" u_e1d + test "u_e2a" u_e2a + -- test "u_e3a" u_e3a + -- test "u_mixed" u_mixed + test "u_maybe" u_maybe + test "u_maybeW32" u_maybeW32 ===================================== testsuite/tests/unboxedsums/UbxSumUnpackedSize.stdout ===================================== @@ -0,0 +1,32 @@ + +### u_ba +U_Bool False True False True False True False True +size: 2 + +### u_e1a +U_E1 E1_1 127 (-128) 127 (-128) 127 (-128) 127 +size: 2 + +### u_e1b +U_E1 E1_254 (-128) 127 (-128) 127 (-128) 127 (-128) +size: 2 + +### u_e1c +U_E1 E1_1 126 127 0 1 2 3 4 +size: 2 + +### u_e1d +U_E1 E1_254 126 127 0 1 2 3 4 +size: 2 + +### u_e2a +U_E2 E2_1 127 (-128) 127 (-128) 127 
(-128) 127 +size: 2 + +### u_maybe +U_Maybe Nothing (Just False) Nothing (Just True) Nothing (Just False) Nothing (Just True) +size: 10 + +### u_maybeW32 +U_MaybeW32 NothingW32 (JustW32 0) NothingW32 (JustW32 4294967295) NothingW32 (JustW32 0) NothingW32 (JustW32 4294967295) +size: 9 ===================================== testsuite/tests/unboxedsums/UbxSumUnpackedSize.stdout-ws-32 ===================================== @@ -0,0 +1,32 @@ + +### u_ba +U_Bool False True False True False True False True +size: 3 + +### u_e1a +U_E1 E1_1 127 (-128) 127 (-128) 127 (-128) 127 +size: 3 + +### u_e1b +U_E1 E1_254 (-128) 127 (-128) 127 (-128) 127 (-128) +size: 3 + +### u_e1c +U_E1 E1_1 126 127 0 1 2 3 4 +size: 3 + +### u_e1d +U_E1 E1_254 126 127 0 1 2 3 4 +size: 3 + +### u_e2a +U_E2 E2_1 127 (-128) 127 (-128) 127 (-128) 127 +size: 3 + +### u_maybe +U_Maybe Nothing (Just False) Nothing (Just True) Nothing (Just False) Nothing (Just True) +size: 11 + +### u_maybeW32 +U_MaybeW32 NothingW32 (JustW32 0) NothingW32 (JustW32 4294967295) NothingW32 (JustW32 0) NothingW32 (JustW32 4294967295) +size: 17 ===================================== testsuite/tests/unboxedsums/all.T ===================================== @@ -62,3 +62,5 @@ test('ManyUbxSums', ['ManyUbxSums', [('ManyUbxSums_Addr.hs','')] , '-v0 -dstg-lint -dcmm-lint']) + +test('UbxSumUnpackedSize', [js_broken(22374)], compile_and_run, ['-O']) ===================================== testsuite/tests/unboxedsums/unboxedsums_unit_tests.hs ===================================== @@ -63,33 +63,33 @@ layout_tests = sequence_ assert_layout "layout1" [ ubxtup [ intTy, intPrimTy ] , ubxtup [ intPrimTy, intTy ] ] - [ WordSlot, PtrLiftedSlot, WordSlot ] + [ Word8Slot, PtrLiftedSlot, WordSlot ] layout2 = assert_layout "layout2" [ ubxtup [ intTy ] , intTy ] - [ WordSlot, PtrLiftedSlot ] + [ Word8Slot, PtrLiftedSlot ] layout3 = assert_layout "layout3" [ ubxtup [ intTy, intPrimTy, intTy, intPrimTy ] , ubxtup [ intPrimTy, intTy, intPrimTy, intTy ] ] - [ 
WordSlot, PtrLiftedSlot, PtrLiftedSlot, WordSlot, WordSlot ] + [ Word8Slot, PtrLiftedSlot, PtrLiftedSlot, WordSlot, WordSlot ] layout4 = assert_layout "layout4" [ ubxtup [ floatPrimTy, floatPrimTy ] , ubxtup [ intPrimTy, intPrimTy ] ] - [ WordSlot, WordSlot, WordSlot, FloatSlot, FloatSlot ] + [ Word8Slot, WordSlot, WordSlot, FloatSlot, FloatSlot ] layout5 = assert_layout "layout5" [ ubxtup [ intPrimTy, intPrimTy ] , ubxtup [ floatPrimTy, floatPrimTy ] ] - [ WordSlot, WordSlot, WordSlot, FloatSlot, FloatSlot ] + [ Word8Slot, WordSlot, WordSlot, FloatSlot, FloatSlot ] enum_layout = assert_layout "enum" (replicate 10 (ubxtup [])) - [ WordSlot ] + [ Word8Slot ] View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/161da2e39ec6c167a67b22f6725861… -- You're receiving this email because of your account on gitlab.haskell.org.
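The test comments in `UbxSumUnpackedSize.hs` (254 and 255 constructors "should get a Word8 tag", 256 constructors "needs a Word16 tag") suggest that the tag slot is the narrowest sub-word slot able to hold the largest tag. The following is a minimal sketch of that rule, not GHC's actual implementation. It assumes tags are numbered from 1, which is consistent with those comments but is not stated in the diff; `TagSlot` and `tagSlotFor` are hypothetical names introduced only for illustration.

```haskell
import Data.Word (Word8, Word16, Word32)

-- | Hypothetical stand-in for the slot constructors named in the diff.
data TagSlot = Word8Slot | Word16Slot | Word32Slot | WordSlot
  deriving (Eq, Show)

-- | Pick the narrowest slot that can hold the tags 1..n of an n-constructor sum.
-- ASSUMPTION: tags start at 1, so the largest value to store is n itself.
tagSlotFor :: Integer -> TagSlot
tagSlotFor n
  | n <= fromIntegral (maxBound :: Word8)  = Word8Slot
  | n <= fromIntegral (maxBound :: Word16) = Word16Slot
  | n <= fromIntegral (maxBound :: Word32) = Word32Slot
  | otherwise                              = WordSlot
```

Under this assumption, `tagSlotFor 255` picks `Word8Slot` and `tagSlotFor 256` picks `Word16Slot`, matching the expectations written next to `E2` and `E3` in the test.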


[Git][ghc/ghc][wip/9.12.3-backports] 7 commits: Prepare 9.12.3
by Zubin (＠wz1000) 11 Sep '25

11 Sep '25

Zubin pushed to branch wip/9.12.3-backports at Glasgow Haskell Compiler / GHC Commits: 82425570 by Zubin Duggal at 2025-09-11T17:28:38+05:30 Prepare 9.12.3 - - - - - 1a2e2500 by Simon Peyton Jones at 2025-09-11T17:28:38+05:30 Take more care in zonkEqTypes on AppTy/AppTy This patch fixes #26256. See Note [zonkEqTypes and the PKTI] in GHC.Tc.Solver.Equality (cherry picked from commit 18036d5205ac648bb245217519fed2fd931a9982) - - - - - c6e5ce5c by Andreas Klebinger at 2025-09-11T17:28:38+05:30 Make unexpected LLVM versions a warning rather than an error. Typically a newer LLVM version *will* work so erroring out if a user uses a newer LLVM version is too aggressive. Fixes #25915 (cherry picked from commit 50842f83f467ff54dd22470559a7af79d2025c03) - - - - - 7ec2f532 by Teo Camarasu at 2025-09-11T17:28:38+05:30 rts: spin if we see a WHITEHOLE in messageBlackHole When a BLACKHOLE gets cancelled in raiseAsync, we indirect to a THUNK. GC can then shortcut this, replacing our BLACKHOLE with a fresh THUNK. This THUNK is not guaranteed to have a valid indirectee field. If at the same time, a message intended for the previous BLACKHOLE is processed and concurrently we BLACKHOLE the THUNK, thus temporarily turning it into a WHITEHOLE, we can get a segfault, since we look at the undefined indirectee field of the THUNK The fix is simple: spin if we see a WHITEHOLE, and it will soon be replaced with a valid BLACKHOLE. Resolves #26205 (cherry picked from commit 4021181ee0860aca2054883a531f3312361cc701) - - - - - b3176a25 by Teo Camarasu at 2025-09-11T17:28:38+05:30 rts: ensure MessageBlackHole.link is always a valid closure We turn a MessageBlackHole into an StgInd in wakeBlockingQueue(). Therefore it's important that the link field, which becomes the indirection field, always points to a valid closure. It's unclear whether it's currently possible for the previous behaviour to lead to a crash, but it's good to be consistent about this invariant nonetheless. 
Co-authored-by: Andreas Klebinger <klebinger.andreas(a)gmx.at> (cherry picked from commit a8b2fbae6bcf20bc2f3fe58803096d2a9c5fc43d) - - - - - 6f94a682 by Reed Mullanix at 2025-09-11T17:28:38+05:30 ghc-internal: Fix naturalAndNot for NB/NS case When the first argument to `naturalAndNot` is larger than a `Word` and the second is `Word`-sized, `naturalAndNot` will truncate the result: ``` >>> naturalAndNot ((2 ^ 65) .|. (2 ^ 3)) (2 ^ 3) 0 ``` In contrast, `naturalAndNot` does not truncate when both arguments are larger than a `Word`, so this appears to be a bug. Luckily, the fix is pretty easy: we just need to call `bigNatAndNotWord#` instead of truncating. Fixes #26230 (cherry picked from commit a766286fe759251eceb304c54ba52841c2a51f86) - - - - - 9fdad140 by Ben Gamari at 2025-09-11T17:28:38+05:30 llvmGen: Fix built-in variable predicate Previously the predicate to identify LLVM builtin global variables was checking for `$llvm` rather than `@llvm` as it should. (cherry picked from commit 6e67fa083a50684e1cfae546e07cab4d4250e871) - - - - - 21 changed files: - compiler/GHC/CmmToLlvm/Base.hs - compiler/GHC/Driver/Errors/Ppr.hs - compiler/GHC/Tc/Solver/Equality.hs - configure.ac - docs/users_guide/9.12.3-notes.rst - libraries/base/changelog.md - libraries/ghc-bignum/changelog.md - libraries/ghc-bignum/src/GHC/Num/Natural.hs - rts/Messages.c - rts/StgMiscClosures.cmm - rts/Updates.h - testsuite/driver/testlib.py - + testsuite/tests/numeric/should_run/T26230.hs - + testsuite/tests/numeric/should_run/T26230.stdout - testsuite/tests/numeric/should_run/all.T - + testsuite/tests/partial-sigs/should_compile/T26256.hs - + testsuite/tests/partial-sigs/should_compile/T26256.stderr - testsuite/tests/partial-sigs/should_compile/all.T - testsuite/tests/polykinds/T14172.stderr - + testsuite/tests/typecheck/should_compile/T26256a.hs - testsuite/tests/typecheck/should_compile/all.T Changes: ===================================== compiler/GHC/CmmToLlvm/Base.hs 
===================================== @@ -526,10 +526,10 @@ generateExternDecls = do modifyEnv $ \env -> env { envAliases = emptyUniqSet } return (concat defss, []) --- | Is a variable one of the special @$llvm@ globals? +-- | Is a variable one of the special @\@llvm@ globals? isBuiltinLlvmVar :: LlvmVar -> Bool isBuiltinLlvmVar (LMGlobalVar lbl _ _ _ _ _) = - "$llvm" `isPrefixOf` unpackFS lbl + "llvm." `isPrefixOf` unpackFS lbl isBuiltinLlvmVar _ = False -- | Here we take a global variable definition, rename it with a ===================================== compiler/GHC/Driver/Errors/Ppr.hs ===================================== @@ -276,7 +276,7 @@ instance Diagnostic DriverMessage where ++ llvmVersionStr supportedLlvmVersionLowerBound ++ " and " ++ llvmVersionStr supportedLlvmVersionUpperBound - ++ ") and reinstall GHC to make -fllvm work") + ++ ") and reinstall GHC to ensure -fllvm works") diagnosticReason = \case DriverUnknownMessage m @@ -347,7 +347,7 @@ instance Diagnostic DriverMessage where DriverInstantiationNodeInDependencyGeneration {} -> ErrorWithoutFlag DriverNoConfiguredLLVMToolchain - -> ErrorWithoutFlag + -> WarningWithoutFlag diagnosticHints = \case DriverUnknownMessage m ===================================== compiler/GHC/Tc/Solver/Equality.hs ===================================== @@ -193,12 +193,8 @@ zonkEqTypes ev eq_rel ty1 ty2 then tycon tc1 tys1 tys2 else bale_out ty1 ty2 - go ty1 ty2 - | Just (ty1a, ty1b) <- tcSplitAppTyNoView_maybe ty1 - , Just (ty2a, ty2b) <- tcSplitAppTyNoView_maybe ty2 - = do { res_a <- go ty1a ty2a - ; res_b <- go ty1b ty2b - ; return $ combine_rev mkAppTy res_b res_a } + -- If you are tempted to add a case for AppTy/AppTy, be careful + -- See Note [zonkEqTypes and the PKTI] go ty1@(LitTy lit1) (LitTy lit2) | lit1 == lit2 @@ -274,6 +270,32 @@ zonkEqTypes ev eq_rel ty1 ty2 combine_rev f (Right tys) (Right ty) = Right (f ty tys) +{- Note [zonkEqTypes and the PKTI] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Because `zonkEqTypes` 
does /partial/ zonking, we need to be very careful +to maintain the Purely Kinded Type Invariant: see GHC.Tc.Gen.HsType +Note [The Purely Kinded Type Invariant (PKTI)]. + +In #26256 we try to solve this equality constraint: + Int :-> Maybe Char ~# k0 Int (m0 Char) +where m0 and k0 are unification variables, and + m0 :: Type -> Type +It happens that m0 was already unified + m0 := (w0 :: kappa) +where kappa is another unification variable that is also already unified: + kappa := Type->Type. +So the original type satisfied the PKTI, but a partially-zonked form + k0 Int (w0 Char) +does not!! (This is a bit reminiscent of Note [mkAppTyM].) + +The solution I have adopted is simply to make `zonkEqTypes` bale out on `AppTy`. +After all, it's only supposed to be a quick hack to see if two types are already +equal; if we bale out we'll just get into the "proper" canonicaliser. + +The only tricky thing about this approach is that it relies on /omitting/ +code -- for the AppTy/AppTy case! Hence this Note +-} + {- ********************************************************************* * * * canonicaliseEquality ===================================== configure.ac ===================================== @@ -22,7 +22,7 @@ AC_INIT([The Glorious Glasgow Haskell Compilation System], [9.12.2], [glasgow-ha AC_CONFIG_MACRO_DIRS([m4]) # Set this to YES for a released version, otherwise NO -: ${RELEASE=YES} +: ${RELEASE=NO} # The primary version (e.g. 7.5, 7.4.1) is set in the AC_INIT line # above. 
If this is not a released version, then we will append the ===================================== docs/users_guide/9.12.3-notes.rst ===================================== @@ -13,6 +13,83 @@ Compiler - Fixed re-exports of ``MkSolo`` and ``MkSolo#`` (:ghc-ticket:`25182`) - Fixed the behavior of ``Language.Haskell.TH.mkName "FUN"`` (:ghc-ticket:`25174`) +- Fixed miscompilation involving ``zonkEqTypes`` on ``AppTy/AppTy`` (:ghc-ticket:`26256`) +- Fixed CprAnal to detect recursive newtypes (:ghc-ticket:`25944`) +- Fixed specialisation of incoherent instances (:ghc-ticket:`25883`) +- Fixed bytecode generation for ``tagToEnum# <LITERAL>`` (:ghc-ticket:`25975`) +- Fixed panic with EmptyCase and RequiredTypeArguments (:ghc-ticket:`25004`) +- Fixed ``tyConStupidTheta`` to handle ``PromotedDataCon`` (:ghc-ticket:`25739`) +- Fixed unused import warnings for duplicate record fields (:ghc-ticket:`24035`) +- Fixed lexing of ``"\^\"`` (:ghc-ticket:`25937`) +- Fixed string gap collapsing (:ghc-ticket:`25784`) +- Fixed lexing of comments in multiline strings (:ghc-ticket:`25609`) +- Made unexpected LLVM versions a warning rather than an error (:ghc-ticket:`25915`) +- Disabled ``-fprof-late-overloaded-calls`` for join points to avoid invalid transformations +- Fixed bugs in ``integerRecipMod`` and ``integerPowMod`` (:ghc-ticket:`26017`) +- Fixed ``naturalAndNot`` for NB/NS case (:ghc-ticket:`26230`) +- Fixed ``ds_ev_typeable`` to use ``mkTrAppChecked`` (:ghc-ticket:`25998`) +- Fixed GHC settings to always unescape escaped spaces (:ghc-ticket:`25204`) +- Fixed issue with HasCallStack constraint caching (:ghc-ticket:`25529`) +- Fixed archive member size writing logic in ``GHC.SysTools.Ar`` (:ghc-ticket:`26120`, :ghc-ticket:`22586`) + +Runtime System +~~~~~~~~~~~~~~ + +- Fixed ``MessageBlackHole.link`` to always be a valid closure +- Fixed handling of WHITEHOLE in ``messageBlackHole`` (:ghc-ticket:`26205`) +- Fixed ``rts_clearMemory`` logic when sanity checks are enabled 
(:ghc-ticket:`26011`) +- Fixed underflow frame lookups in the bytecode interpreter (:ghc-ticket:`25750`) +- Fixed overflows and reentrancy in interpreter statistics calculation (:ghc-ticket:`25756`) +- Fixed INTERP_STATS profiling code (:ghc-ticket:`25695`) +- Removed problematic ``n_free`` variable from nonmovingGC (:ghc-ticket:`26186`) +- Fixed incorrect format specifiers in era profiling +- Improved documentation of SLIDE and PACK bytecode instructions +- Eliminated redundant ``SLIDE x 0`` bytecode instructions +- Fixed compile issues on powerpc64 ELF v1 + +Code Generation +~~~~~~~~~~~~~~~ + +- Fixed LLVM built-in variable predicate (was checking ``$llvm`` instead of ``@llvm``) +- Fixed linkage of built-in arrays for LLVM (:ghc-ticket:`25769`) +- Fixed code generation for SSE vector operations (:ghc-ticket:`25859`) +- Fixed ``bswap64`` code generation on i386 (:ghc-ticket:`25601`) +- Fixed sub-word arithmetic right shift on AArch64 (:ghc-ticket:`26061`) +- Fixed LLVM vector literal emission to include type information +- Fixed LLVM version detection +- Fixed typo in ``padLiveArgs`` that caused segfaults (:ghc-ticket:`25770`, :ghc-ticket:`25773`) +- Fixed constant-folding for Word->Float bitcasts +- Added surface syntax for Word/Float bitcast operations +- Fixed ``MOVD`` format in x86 NCG for ``unpackInt64X2#`` +- Added ``-finter-module-far-jumps`` flag for AArch64 +- Fixed RV64 J instruction handling for non-local jumps (:ghc-ticket:`25738`) +- Reapplied division by constants optimization +- Fixed TNTC to set CmmProc entry_label properly (:ghc-ticket:`25565`) + +Linker +~~~~~~ + +- Improved efficiency of proddable blocks structure (:ghc-ticket:`26009`) +- Fixed Windows DLL loading to avoid redundant ``LoadLibraryEx`` calls (:ghc-ticket:`26009`) +- Fixed incorrect use of ``break`` in nested for loop (:ghc-ticket:`26052`) +- Fixed linker to not fail due to ``RTLD_NOW`` (:ghc-ticket:`25943`) +- Dropped obsolete Windows XP compatibility checks + +GHCi +~~~~ + +- 
Fixed ``mkTopLevEnv`` to use ``loadInterfaceForModule`` instead of ``loadSrcInterface`` (:ghc-ticket:`25951`) + +Template Haskell +~~~~~~~~~~~~~~~~ + +- Added explicit export lists to all remaining template-haskell modules + +Build system +~~~~~~~~~~~~~~~~ + +- Exposed all of Backtraces' internals for ghc-internal (:ghc-ticket:`26049`) +- Fixed cross-compilation configuration override (:ghc-ticket:`26236`) Included libraries ~~~~~~~~~~~~~~~~~~ ===================================== libraries/base/changelog.md ===================================== @@ -1,5 +1,8 @@ # Changelog for [`base` package](http://hackage.haskell.org/package/base) +## 4.21.2.0 *Sept 2024* + * Fix bug where `naturalAndNot` was incorrectly truncating results ([CLC proposal #350](github.com/haskell/core-libraries-committee/issues/350)) + ## 4.21.1.0 *Sept 2024* * Fix incorrect results of `integerPowMod` when the base is 0 and the exponent is negative, and `integerRecipMod` when the modulus is zero ([#26017](https://gitlab.haskell.org/ghc/ghc/-/issues/26017)). 
===================================== libraries/ghc-bignum/changelog.md ===================================== @@ -4,6 +4,7 @@ - Expose backendName - Add `naturalSetBit[#]` (#21173), `naturalClearBit[#]` (#21175), `naturalComplementBit[#]` (#21181) +- Fix bug where `naturalAndNot` was incorrectly truncating results (#26230) ## 1.2 ===================================== libraries/ghc-bignum/src/GHC/Num/Natural.hs ===================================== @@ -488,7 +488,7 @@ naturalAndNot :: Natural -> Natural -> Natural {-# NOINLINE naturalAndNot #-} naturalAndNot (NS n) (NS m) = NS (n `and#` not# m) naturalAndNot (NS n) (NB m) = NS (n `and#` not# (bigNatToWord# m)) -naturalAndNot (NB n) (NS m) = NS (bigNatToWord# n `and#` not# m) +naturalAndNot (NB n) (NS m) = NB (bigNatAndNotWord# n m) naturalAndNot (NB n) (NB m) = naturalFromBigNat# (bigNatAndNot n m) naturalOr :: Natural -> Natural -> Natural ===================================== rts/Messages.c ===================================== @@ -180,13 +180,22 @@ uint32_t messageBlackHole(Capability *cap, MessageBlackHole *msg) bh_info != &stg_CAF_BLACKHOLE_info && bh_info != &__stg_EAGER_BLACKHOLE_info && bh_info != &stg_WHITEHOLE_info) { - // if it is a WHITEHOLE, then a thread is in the process of - // trying to BLACKHOLE it. But we know that it was once a - // BLACKHOLE, so there is at least a valid pointer in the - // payload, so we can carry on. return 0; } + // If we see a WHITEHOLE then we should wait for it to turn into a BLACKHOLE. + // Otherwise we might look at the indirectee and segfault. + // See "Exception handling" in Note [Thunks, blackholes, and indirections] + // We might be looking at a *fresh* THUNK being WHITEHOLE-d so we can't + // guarantee that the indirectee is a valid pointer. 
+#if defined(THREADED_RTS) + if (bh_info == &stg_WHITEHOLE_info) { + while(ACQUIRE_LOAD(&bh->header.info) == &stg_WHITEHOLE_info) { + busy_wait_nop(); + } + } +#endif + // The blackhole must indirect to a TSO, a BLOCKING_QUEUE, an IND, // or a value. StgClosure *p; ===================================== rts/StgMiscClosures.cmm ===================================== @@ -31,6 +31,7 @@ import CLOSURE ENT_VIA_NODE_ctr; import CLOSURE RtsFlags; import CLOSURE stg_BLOCKING_QUEUE_CLEAN_info; import CLOSURE stg_BLOCKING_QUEUE_DIRTY_info; +import CLOSURE stg_END_TSO_QUEUE_closure; import CLOSURE stg_IND_info; import CLOSURE stg_MSG_BLACKHOLE_info; import CLOSURE stg_TSO_info; @@ -574,6 +575,9 @@ retry: MessageBlackHole_tso(msg) = CurrentTSO; MessageBlackHole_bh(msg) = node; + // Ensure that the link field is a valid closure, + // since we might turn this into an indirection in wakeBlockingQueue() + MessageBlackHole_link(msg) = stg_END_TSO_QUEUE_closure; SET_HDR(msg, stg_MSG_BLACKHOLE_info, CCS_SYSTEM); // messageBlackHole has appropriate memory barriers when this object is exposed. // See Note [Heap memory barriers]. ===================================== rts/Updates.h ===================================== @@ -333,6 +333,10 @@ * `AP_STACK` closure recording the aborted execution state. * See `RaiseAsync.c:raiseAsync` for details. * + * This can combine with indirection shortcutting during GC to replace a BLACKHOLE + * with a fresh THUNK. We should be very careful here since the THUNK will have an + * undefined value in the indirectee field. Looking at the indirectee field can then + * lead to a segfault such as #26205. 
* * CAFs * ---- ===================================== testsuite/driver/testlib.py ===================================== @@ -1725,7 +1725,7 @@ async def do_test(name: TestName, dst_makefile = in_testdir('Makefile') if src_makefile.exists(): makefile = src_makefile.read_text(encoding='UTF-8') - makefile = re.sub('TOP=.*', 'TOP=%s' % config.top, makefile, 1) + makefile = re.sub('TOP=.*', 'TOP=%s' % config.top, makefile, count=1) dst_makefile.write_text(makefile, encoding='UTF-8') if opts.pre_cmd: ===================================== testsuite/tests/numeric/should_run/T26230.hs ===================================== @@ -0,0 +1,8 @@ +import Data.Bits +import GHC.Num.Natural + +main = do + print $ naturalAndNot ((2 ^ 4) .|. (2 ^ 3)) (2 ^ 3) + print $ naturalAndNot ((2 ^ 129) .|. (2 ^ 65)) (2 ^ 65) + print $ naturalAndNot ((2 ^ 4) .|. (2 ^ 3)) ((2 ^ 65) .|. (2 ^ 3)) + print $ naturalAndNot ((2 ^ 65) .|. (2 ^ 3)) (2 ^ 3) ===================================== testsuite/tests/numeric/should_run/T26230.stdout ===================================== @@ -0,0 +1,4 @@ +16 +680564733841876926926749214863536422912 +16 +36893488147419103232 ===================================== testsuite/tests/numeric/should_run/all.T ===================================== @@ -87,3 +87,4 @@ test('T24066', normal, compile_and_run, ['']) test('div01', normal, compile_and_run, ['']) test('T24245', normal, compile_and_run, ['']) test('T25653', normal, compile_and_run, ['']) +test('T26230', normal, compile_and_run, ['']) ===================================== testsuite/tests/partial-sigs/should_compile/T26256.hs ===================================== @@ -0,0 +1,23 @@ +{-# LANGUAGE GHC2021 #-} +{-# LANGUAGE TypeFamilies #-} +{-# LANGUAGE PartialTypeSignatures #-} + +module M (go) where + +import Data.Kind + +type Apply :: (Type -> Type) -> Type +data Apply m + +type (:->) :: Type -> Type -> Type +type family (:->) where (:->) = (->) + +f :: forall (k :: Type -> Type -> Type) (m :: Type -> Type). 
+ k Int (m Char) -> k Bool (Apply m) +f = f + +x :: Int :-> Maybe Char +x = x + +go :: Bool -> _ _ +go = f x ===================================== testsuite/tests/partial-sigs/should_compile/T26256.stderr ===================================== @@ -0,0 +1,8 @@ +T26256.hs:22:15: warning: [GHC-88464] [-Wpartial-type-signatures (in -Wdefault)] + • Found type wildcard ‘_’ standing for ‘Apply :: (* -> *) -> *’ + • In the type signature: go :: Bool -> _ _ + +T26256.hs:22:17: warning: [GHC-88464] [-Wpartial-type-signatures (in -Wdefault)] + • Found type wildcard ‘_’ standing for ‘Maybe :: * -> *’ + • In the first argument of ‘_’, namely ‘_’ + In the type signature: go :: Bool -> _ _ ===================================== testsuite/tests/partial-sigs/should_compile/all.T ===================================== @@ -108,3 +108,4 @@ test('T21667', normal, compile, ['']) test('T22065', normal, compile, ['']) test('T16152', normal, compile, ['']) test('T20076', expect_broken(20076), compile, ['']) +test('T26256', normal, compile, ['']) ===================================== testsuite/tests/polykinds/T14172.stderr ===================================== @@ -1,6 +1,6 @@ T14172.hs:7:46: error: [GHC-88464] - • Found type wildcard ‘_’ standing for ‘a'1 :: k0’ - Where: ‘k0’ is an ambiguous type variable + • Found type wildcard ‘_’ standing for ‘a'1 :: k30’ + Where: ‘k30’ is an ambiguous type variable ‘a'1’ is an ambiguous type variable To use the inferred type, enable PartialTypeSignatures • In the first argument of ‘h’, namely ‘_’ ===================================== testsuite/tests/typecheck/should_compile/T26256a.hs ===================================== @@ -0,0 +1,19 @@ +{-# LANGUAGE GHC2021 #-} +{-# LANGUAGE TypeFamilies #-} + +module T26256 (go) where + +import Data.Kind + +class Cat k where (<<<) :: k a b -> k x a -> k x b +instance Cat (->) where (<<<) = (.) +class Pro k p where pro :: k a b s t -> p a b -> p s t +data Hiding o a b s t = forall e. 
Hiding (s -> o e a) +newtype Apply e a = Apply (e a) + +type (:->) :: Type -> Type -> Type +type family (:->) where + (:->) = (->) + +go :: (Pro (Hiding Apply) p) => (s :-> e a) -> p a b -> p s t +go sea = pro (Hiding (Apply <<< sea)) ===================================== testsuite/tests/typecheck/should_compile/all.T ===================================== @@ -935,3 +935,4 @@ test('T24845a', normal, compile, ['']) test('T23501a', normal, compile, ['']) test('T23501b', normal, compile, ['']) test('T25597', normal, compile, ['']) +test('T26256a', normal, compile, ['']) View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/d0629aa6d14710c2fbbd4709d0c54e… -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/d0629aa6d14710c2fbbd4709d0c54e… You're receiving this email because of your account on gitlab.haskell.org.
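
For reference, the four values in T26230.stdout above can be reproduced with a small portable model of the and-not operation. This is only a sketch under stated assumptions: `andNotRef` and `t26230Cases` are hypothetical names introduced here, the model goes through `Integer` because `Natural` has no meaningful `complement`, and it is not the ghc-bignum implementation of `naturalAndNot`.

```haskell
import Data.Bits (complement, (.&.), (.|.))
import Numeric.Natural (Natural)

-- Hypothetical reference model: clear in x every bit that is set in y.
-- The detour through Integer is safe because the AND of a non-negative
-- Integer with any other Integer is non-negative.
andNotRef :: Natural -> Natural -> Natural
andNotRef x y = fromInteger (toInteger x .&. complement (toInteger y))

-- The same four inputs as T26230.hs, covering small and multi-word operands.
t26230Cases :: [Natural]
t26230Cases =
  [ andNotRef ((2 ^ 4) .|. (2 ^ 3)) (2 ^ 3)
  , andNotRef ((2 ^ 129) .|. (2 ^ 65)) (2 ^ 65)
  , andNotRef ((2 ^ 4) .|. (2 ^ 3)) ((2 ^ 65) .|. (2 ^ 3))
  , andNotRef ((2 ^ 65) .|. (2 ^ 3)) (2 ^ 3)
  ]
```

Evaluating `mapM_ print t26230Cases` in GHCi should print the same four lines as T26230.stdout: 16, 2^129, 16 and 2^65.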

[Git][ghc/ghc] Pushed new tag ghc-9.10.3-release
by Zubin (＠wz1000) 11 Sep '25

Zubin pushed new tag ghc-9.10.3-release at Glasgow Haskell Compiler / GHC -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/tree/ghc-9.10.3-release

[Git][ghc/ghc][wip/fix-26065] Correcting LLVM linking of Intel BMI intrinsics pdep{8,16} and pext{8,16}.
by Andreas Klebinger (＠AndreasK) 11 Sep '25

Andreas Klebinger pushed to branch wip/fix-26065 at Glasgow Haskell Compiler / GHC Commits: d97f54f2 by Alex Washburn at 2025-09-11T13:38:05+02:00 Correcting LLVM linking of Intel BMI intrinsics pdep{8,16} and pext{8,16}. This patch fixes #26065. The LLVM interface does not expose bindings to: - llvm.x86.bmi.pdep.8 - llvm.x86.bmi.pdep.16 - llvm.x86.bmi.pext.8 - llvm.x86.bmi.pext.16 So calls are instead made to llvm.x86.bmi.{pdep,pext}.32 in these cases, with pre/post-operation truncation to constrain the logical value range. - - - - - 4 changed files: - compiler/GHC/CmmToLlvm/CodeGen.hs - + testsuite/tests/llvm/should_run/T26065.hs - + testsuite/tests/llvm/should_run/T26065.stdout - testsuite/tests/llvm/should_run/all.T Changes: ===================================== compiler/GHC/CmmToLlvm/CodeGen.hs ===================================== @@ -240,12 +240,25 @@ genCall (PrimTarget op@(MO_BRev w)) [dst] args = genCallSimpleCast w op dst args genCall (PrimTarget op@(MO_BSwap w)) [dst] args = genCallSimpleCast w op dst args -genCall (PrimTarget op@(MO_Pdep w)) [dst] args = - genCallSimpleCast w op dst args -genCall (PrimTarget op@(MO_Pext w)) [dst] args = - genCallSimpleCast w op dst args genCall (PrimTarget op@(MO_PopCnt w)) [dst] args = genCallSimpleCast w op dst args +{- Note [LLVM PDep/PExt intrinsics] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Since x86 PDep/PExt instructions only exist for 32/64 bit widths +we use the 32bit variant to compute the 8/16bit primops. +To do so we extend/truncate the argument/result around the +call. 
+-} +genCall (PrimTarget op@(MO_Pdep w)) [dst] args = do + cfg <- getConfig + if llvmCgBmiVersion cfg >= Just BMI2 + then genCallMinimumTruncationCast W32 w op dst args + else genCallSimpleCast w op dst args +genCall (PrimTarget op@(MO_Pext w)) [dst] args = do + cfg <- getConfig + if llvmCgBmiVersion cfg >= Just BMI2 + then genCallMinimumTruncationCast W32 w op dst args + else genCallSimpleCast w op dst args genCall (PrimTarget (MO_AtomicRMW width amop)) [dst] [addr, n] = runStmtsDecls $ do addrVar <- exprToVarW addr @@ -641,8 +654,15 @@ genCallExtract _ _ _ _ = -- from i32 to i8 explicitly as LLVM is strict about types. genCallSimpleCast :: Width -> CallishMachOp -> CmmFormal -> [CmmActual] -> LlvmM StmtData -genCallSimpleCast specW op dst args = do - let width = widthToLlvmInt specW +genCallSimpleCast w = genCallMinimumTruncationCast w w + +-- Given the minimum machine bit-width to use and the logical bit-width of the +-- value range, perform a type-cast truncation and extension before and after the +-- specified operation, respectively. 
+genCallMinimumTruncationCast :: Width -> Width -> CallishMachOp -> CmmFormal + -> [CmmActual] -> LlvmM StmtData +genCallMinimumTruncationCast minW specW op dst args = do + let width = widthToLlvmInt $ max minW specW argsW = const width <$> args dstType = cmmToLlvmType $ localRegType dst signage = cmmPrimOpRetValSignage op @@ -945,9 +965,10 @@ cmmPrimOpFunctions mop = do W256 -> fsLit "llvm.cttz.i256" W512 -> fsLit "llvm.cttz.i512" MO_Pdep w + -- See Note [LLVM PDep/PExt intrinsics] | isBmi2Enabled -> case w of - W8 -> fsLit "llvm.x86.bmi.pdep.8" - W16 -> fsLit "llvm.x86.bmi.pdep.16" + W8 -> fsLit "llvm.x86.bmi.pdep.32" + W16 -> fsLit "llvm.x86.bmi.pdep.32" W32 -> fsLit "llvm.x86.bmi.pdep.32" W64 -> fsLit "llvm.x86.bmi.pdep.64" W128 -> fsLit "llvm.x86.bmi.pdep.128" @@ -963,8 +984,9 @@ cmmPrimOpFunctions mop = do W512 -> fsLit "hs_pdep512" MO_Pext w | isBmi2Enabled -> case w of - W8 -> fsLit "llvm.x86.bmi.pext.8" - W16 -> fsLit "llvm.x86.bmi.pext.16" + -- See Note [LLVM PDep/PExt intrinsics] + W8 -> fsLit "llvm.x86.bmi.pext.32" + W16 -> fsLit "llvm.x86.bmi.pext.32" W32 -> fsLit "llvm.x86.bmi.pext.32" W64 -> fsLit "llvm.x86.bmi.pext.64" W128 -> fsLit "llvm.x86.bmi.pext.128" ===================================== testsuite/tests/llvm/should_run/T26065.hs ===================================== @@ -0,0 +1,68 @@ +{-# LANGUAGE MagicHash #-} +{-# LANGUAGE ScopedTypeVariables #-} +import Data.Char (toUpper) +import GHC.Exts +import GHC.Word +import Numeric (showHex) + +pdep8 :: Word8 -> Word8 -> Word8 +pdep8 (W8# a) (W8# b) = W8# (wordToWord8# (pdep8# (word8ToWord# a) (word8ToWord# b))) +{-# NOINLINE pdep8 #-} + +pdep16 :: Word16 -> Word16 -> Word16 +pdep16 (W16# a) (W16# b) = W16# (wordToWord16# (pdep16# (word16ToWord# a) (word16ToWord# b))) +{-# NOINLINE pdep16 #-} + +pdep32 :: Word32 -> Word32 -> Word32 +pdep32 (W32# a) (W32# b) = W32# (wordToWord32# (pdep32# (word32ToWord# a) (word32ToWord# b))) +{-# NOINLINE pdep32 #-} + +pdep64 :: Word64 -> Word64 -> Word64 +pdep64 
(W64# a) (W64# b) = W64# (pdep64# a b) +{-# NOINLINE pdep64 #-} + +pext8 :: Word8 -> Word8 -> Word8 +pext8 (W8# a) (W8# b) = W8# (wordToWord8# (pext8# (word8ToWord# a) (word8ToWord# b))) +{-# NOINLINE pext8 #-} + +pext16 :: Word16 -> Word16 -> Word16 +pext16 (W16# a) (W16# b) = W16# (wordToWord16# (pext16# (word16ToWord# a) (word16ToWord# b))) +{-# NOINLINE pext16 #-} + +pext32 :: Word32 -> Word32 -> Word32 +pext32 (W32# a) (W32# b) = W32# (wordToWord32# (pext32# (word32ToWord# a) (word32ToWord# b))) +{-# NOINLINE pext32 #-} + +pext64 :: Word64 -> Word64 -> Word64 +pext64 (W64# a) (W64# b) = W64# (pext64# a b) +{-# NOINLINE pext64 #-} + +valueSource :: Integral i => i +valueSource = fromInteger 0xA7F7A7F7A7F7A7F7 + +valueMask :: Integral i => i +valueMask = fromInteger 0x5555555555555555 + +printIntrinsicCall :: forall i. Integral i => String -> (i -> i -> i) -> IO () +printIntrinsicCall label f = + let op1 = valueSource + op2 = valueMask + pad s = + let hex :: Integral a => a -> String + hex = flip showHex "" + str = toUpper <$> hex s + len = length $ hex (maxBound :: Word64) + n = length str + in "0x" <> replicate (len - n) '0' <> str + in putStrLn $ unwords [ label, pad op1, pad op2, "=", pad (f op1 op2) ] + +main :: IO () +main = do + printIntrinsicCall "pdep8 " pdep8 + printIntrinsicCall "pdep16" pdep16 + printIntrinsicCall "pdep32" pdep32 + printIntrinsicCall "pdep64" pdep64 + printIntrinsicCall "pext8 " pext8 + printIntrinsicCall "pext16" pext16 + printIntrinsicCall "pext32" pext32 + printIntrinsicCall "pext64" pext64 ===================================== testsuite/tests/llvm/should_run/T26065.stdout ===================================== @@ -0,0 +1,8 @@ +pdep8 0x00000000000000F7 0x0000000000000055 = 0x0000000000000015 +pdep16 0x000000000000A7F7 0x0000000000005555 = 0x0000000000005515 +pdep32 0x00000000A7F7A7F7 0x0000000055555555 = 0x0000000044155515 +pdep64 0xA7F7A7F7A7F7A7F7 0x5555555555555555 = 0x4415551544155515 +pext8 0x00000000000000F7 
0x0000000000000055 = 0x000000000000000F +pext16 0x000000000000A7F7 0x0000000000005555 = 0x000000000000003F +pext32 0x00000000A7F7A7F7 0x0000000055555555 = 0x0000000000003F3F +pext64 0xA7F7A7F7A7F7A7F7 0x5555555555555555 = 0x000000003F3F3F3F ===================================== testsuite/tests/llvm/should_run/all.T ===================================== @@ -18,3 +18,8 @@ test('T22033', [normal, normalise_errmsg_fun(ignore_llvm_and_vortex)], compile_a test('T25730', [req_c, unless(arch('x86_64'), skip), normalise_errmsg_fun(ignore_llvm_and_vortex)], compile_and_run, ['T25730C.c']) # T25730C.c contains Intel instrinsics, so only run this test on x86 test('T20645', [normal, normalise_errmsg_fun(ignore_llvm_and_vortex), when(have_llvm(), extra_ways(["optllvm"]))], compile_and_run, ['']) +# T26065.c tests LLVM linking of Intel instrinsics, so only run this test on x86 +test('T26065', [normal, normalise_errmsg_fun(ignore_llvm_and_vortex), when(have_llvm(), extra_ways(["optllvm"])), + unless((arch('x86_64') or arch('i386')) and have_cpu_feature('bmi2'),skip)], + compile_and_run, ['-mbmi2']) + View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/d97f54f22d60acb8bb869f589676f69… -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/d97f54f22d60acb8bb869f589676f69… You're receiving this email because of your account on gitlab.haskell.org.
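
The widening strategy described in Note [LLVM PDep/PExt intrinsics] above (run the 8- and 16-bit primops through the 32-bit operation, extending the arguments and truncating the result) can be modelled in plain Haskell. This is a minimal sketch, not the code generator's logic: `pdep32Ref`, `pext32Ref` and `viaW32` are hypothetical names, and the bit-by-bit loops stand in for the hardware instruction.

```haskell
import Data.Bits (setBit, testBit)
import Data.Word (Word16, Word32, Word8)

-- Hypothetical software model of 32-bit parallel deposit: the low bits of
-- src are scattered, in order, to the positions where mask has a 1 bit.
pdep32Ref :: Word32 -> Word32 -> Word32
pdep32Ref src mask = go 0 0 0
  where
    go maskBit srcBit acc
      | maskBit >= (32 :: Int) = acc
      | testBit mask maskBit =
          go (maskBit + 1) (srcBit + 1)
             (if testBit src srcBit then setBit acc maskBit else acc)
      | otherwise = go (maskBit + 1) (srcBit :: Int) acc

-- Hypothetical software model of 32-bit parallel extract: the bits of src
-- selected by mask are gathered into the low bits of the result.
pext32Ref :: Word32 -> Word32 -> Word32
pext32Ref src mask = go 0 0 0
  where
    go maskBit dstBit acc
      | maskBit >= (32 :: Int) = acc
      | testBit mask maskBit =
          go (maskBit + 1) (dstBit + 1)
             (if testBit src maskBit then setBit acc dstBit else acc)
      | otherwise = go (maskBit + 1) (dstBit :: Int) acc

-- Zero-extend both operands to 32 bits, apply the 32-bit operation, then
-- truncate the result back to the narrow type.
viaW32 :: Integral a => (Word32 -> Word32 -> Word32) -> a -> a -> a
viaW32 op x y = fromIntegral (op (fromIntegral x) (fromIntegral y))

pdep8, pext8 :: Word8 -> Word8 -> Word8
pdep8 = viaW32 pdep32Ref
pext8 = viaW32 pext32Ref

pdep16, pext16 :: Word16 -> Word16 -> Word16
pdep16 = viaW32 pdep32Ref
pext16 = viaW32 pext32Ref
```

With the inputs from T26065.stdout, `pdep8 0xF7 0x55` evaluates to `0x15` and `pext16 0xA7F7 0x5555` to `0x3F`, matching the expected test output. Zero extension is what makes the narrow results correct here: the upper bits of the widened mask are zero, so no deposit or extract can touch bits outside the narrow range.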

[Git][ghc/ghc][wip/fix-26065] 59 commits: Revert "STM: don't create a transaction in the rhs of catchRetry# (#26028)"
by Andreas Klebinger (＠AndreasK) 11 Sep '25

Andreas Klebinger pushed to branch wip/fix-26065 at Glasgow Haskell Compiler / GHC Commits: 5b5d9d47 by Ben Gamari at 2025-08-25T14:29:35-04:00 Revert "STM: don't create a transaction in the rhs of catchRetry# (#26028)" This reverts commit 0a5836891ca29836a24c306d2a364c2e4b5377fd - - - - - 10f06163 by Cheng Shao at 2025-08-25T14:30:16-04:00 wasm: ensure setKeepCAFs() is called in ghci This patch is a critical bugfix for #26106, see comment and linked issue for details. - - - - - bedc1004 by Cheng Shao at 2025-08-26T09:31:18-04:00 compiler: use zero cost coerce in hoopl setElems/mapToList This patch is a follow-up of !14680 and changes setElems/mapToList in GHC/Cmm/Dataflow/Label to use coerce instead of mapping mkHooplLabel over the keys. - - - - - 13250d97 by Ryan Scott at 2025-08-26T09:31:59-04:00 Reject infix promoted data constructors without DataKinds In the rename, make sure to apply the same `DataKinds` checks for both `HsTyVar` (for prefix promoted data constructors) and `HsOpTy` (for infix promoted data constructors) alike. Fixes #26318. - - - - - 37655c46 by Teo Camarasu at 2025-08-26T15:24:51-04:00 tests: disable T22859 under LLVM This test was failing under the LLVM backend since the allocations differ from the NCG. Resolves #26282 - - - - - 2cbba9d6 by Teo Camarasu at 2025-08-26T15:25:33-04:00 base-exports: update version numbers As the version of the compiler has been bumped, a lot of the embedded version numbers will need to be updated if we ever run this test with `--test-accept` so let's just update them now, and keep future diffs clean. - - - - - f9f2ffcf by Alexandre Esteves at 2025-08-27T07:19:14-04:00 Import new name for 'utimbuf' on windows to fix #26337 Fixes an `-Wincompatible-pointer-types` instance that turns into an error on recent toolchains and surfaced as such on nixpkgs when doing linux->ucrt cross. 
This long-standing warning has been present at least since 9.4: ``` C:\GitLabRunner\builds\0\1709189\tmp\ghc16652_0\ghc_4.c:26:115: error: warning: incompatible pointer types passing 'struct utimbuf *' to parameter of type 'struct _utimbuf *' [-Wincompatible-pointer-types] | 26 | HsInt32 ghczuwrapperZC9ZCbaseZCSystemziPosixziInternalsZCzuutime(char* a1, struct utimbuf* a2) {return _utime(a1, a2);} | ^ HsInt32 ghczuwrapperZC9ZCbaseZCSystemziPosixziInternalsZCzuutime(char* a1, struct utimbuf* a2) {return _utime(a1, a2);} ^~ C:\GitLabRunner\builds\0\1709189\_build\stage0\lib\..\..\mingw\x86_64-w64-mingw32\include\sys\utime.h:109:72: error: note: passing argument to parameter '_Utimbuf' here | 109 | __CRT_INLINE int __cdecl _utime(const char *_Filename,struct _utimbuf *_Utimbuf) { | ^ __CRT_INLINE int __cdecl _utime(const char *_Filename,struct _utimbuf *_Utimbuf) { ``` - - - - - ae89f000 by Hassan Al-Awwadi at 2025-08-27T07:19:56-04:00 Adds the function addDependentDirectory to Q, resolving issue #26148. This function adds a new directory to the list of things a module depends upon. That means that when the contents of the directory change, the recompilation checker will notice this and the module will be recompiled. Documentation has also been added for addDependentFunction and addDependentDirectory in the user guide. - - - - - 00478944 by Simon Peyton Jones at 2025-08-27T16:48:30+01:00 Comments only - - - - - a7884589 by Simon Peyton Jones at 2025-08-28T11:08:23+01:00 Type-family occurs check in unification The occurs check in `GHC.Core.Unify.uVarOrFam` was inadequate in dealing with type families. Better now. See Note [The occurs check in the Core unifier]. 
As I did this I realised that the whole apartness thing is trickier than I thought: see the new Note [Shortcomings of the apartness test] - - - - - 8adfc222 by sheaf at 2025-08-28T19:47:17-04:00 Fix orientation in HsWrapper composition (<.>) This commit fixes the order in which WpCast HsWrappers are composed, fixing a bug introduced in commit 56b32c5a2d5d7cad89a12f4d74dc940e086069d1. Fixes #26350 - - - - - eb2ab1e2 by Oleg Grenrus at 2025-08-29T11:00:53-04:00 Generalise thNameToGhcName by adding HasHscEnv There were multiple single monad-specific `getHscEnv` across the codebase. HasHscEnv is modelled on HasDynFlags. My first idea was to simply add thNameToGhcNameHsc and thNameToGhcNameTc, but those would have been exactly the same as thNameToGhcName already. Also add a usage example to thNameToGhcName and mention that it's the recommended way of looking up names in GHC plugins - - - - - 2d575a7f by fendor at 2025-08-29T11:01:36-04:00 configure: Bump minimal bootstrap GHC version to 9.10 - - - - - 716274a5 by Simon Peyton Jones at 2025-08-29T17:27:12-04:00 Fix deep subsumption again This commit fixed #26255: commit 56b32c5a2d5d7cad89a12f4d74dc940e086069d1 Author: sheaf <sam.derbyshire(a)gmail.com> Date: Mon Aug 11 15:50:47 2025 +0200 Improve deep subsumption This commit improves the DeepSubsumption sub-typing implementation in GHC.Tc.Utils.Unify.tc_sub_type_deep by being less eager to fall back to unification. But alas it still wasn't quite right for view patterns: #26331 This MR does a generalisation to fix it. A bit of a sledgehammer to crack a nut, but nice. * Add a field `ir_inst :: InferInstFlag` to `InferResult`, where ``` data InferInstFlag = IIF_Sigma | IIF_ShallowRho | IIF_DeepRho ``` * The flag says exactly how much `fillInferResult` should instantiate before filling the hole. * We can also use this to replace the previous very ad-hoc `tcInferSigma` that was used to implement GHCi's `:type` command. 
- - - - - 27206c5e by sheaf at 2025-08-29T17:28:14-04:00 Back-compat for TH SpecialiseP data-con of Pragma This commit improves the backwards-compatibility story for the SpecialiseP constructor of the Template Haskell 'Pragma' datatype. Instead of keeping the constructor but deprecating it, this commit makes it into a bundled pattern synonym of the Pragma datatype. We no longer deprecate it; it's useful for handling old-form specialise pragmas. - - - - - 26dbcf61 by fendor at 2025-08-30T05:10:08-04:00 Move stack decoding logic from ghc-heap to ghc-internal The stack decoding logic in `ghc-heap` is more sophisticated than the one currently employed in `CloneStack`. We want to use the stack decoding implementation from `ghc-heap` in `base`. We cannot simply depend on `ghc-heap` in `base` due to bootstrapping issues. Thus, we move the code that is necessary to implement stack decoding to `ghc-internal`. This is the right location, as we don't want to add a new API to `base`. Moving the stack decoding logic and re-exposing it in ghc-heap is insufficient, though, as we have a dependency cycle between. * ghc-heap depends on stage1:ghc-internal * stage0:ghc depends on stage0:ghc-heap To fix this, we remove ghc-heap from the set of `stage0` dependencies. This is not entirely straight-forward, as a couple of boot dependencies, such as `ghci`, depend on `ghc-heap`. Luckily, the boot compiler of GHC is now >=9.10, so we can migrate `ghci` to use `ghc-internal` instead of `ghc-heap`, which already exports the relevant modules. However, we cannot 100% remove ghc's dependency on `ghc-heap`, since when we compile `stage0:ghc`, `stage1:ghc-internal` is not yet available. Thus, when we compile with the boot-compiler, we still depend on an older version of `ghc-heap`, and only use the modules from `ghc-internal`, if the `ghc-internal` version is recent enough. 
------------------------- Metric Increase: T24602_perf_size T25046_perf_size_gzip T25046_perf_size_unicode T25046_perf_size_unicode_gzip size_hello_artifact size_hello_artifact_gzip size_hello_unicode size_hello_unicode_gzip ------------------------- These metric increases are unfortunate, they are most likely caused by the larger (literally in terms of lines of code) stack decoder implementation that is now linked into hello-world binaries. On Linux, it is almost a 10% increase, which is considerable. - - - - - bd80bb70 by fendor at 2025-08-30T05:10:08-04:00 Implement `decode` in terms of `decodeStackWithIpe` Uses the more efficient stack decoder implementation. - - - - - 24441165 by fendor at 2025-08-30T05:10:08-04:00 Remove stg_decodeStackzh - - - - - fb9cc882 by Simon Peyton Jones at 2025-08-30T05:10:51-04:00 Fix a long standing bug in the coercion optimiser We were mis-optimising ForAllCo, leading to #26345 Part of the problem was the tricky tower of abstractions leading to the dreadful GHC.Core.TyCo.Subst.substForAllCoTyVarBndrUsing This function was serving two masters: regular substitution, but also coercion optimisation. So tricky was it that it did so wrong. In this MR I locate all the fancy footwork for coercion optimisation in GHC.Core.Coercion.Opt, where it belongs. That leaves substitution free to be much simpler. - - - - - 6c78de2d by Sylvain Henry at 2025-09-01T08:46:19-04:00 Driver: substitute virtual Prim module in --make mode too When we build ghc-internal with --make (e.g. 
with cabal-install), we need to be careful to substitute the virtual interface file for GHC.Internal.Prim: - after code generation (we generate code for an empty module, so we get an empty interface) - when we try to reload its .hi file - - - - - 26e0db16 by fendor at 2025-09-01T08:47:01-04:00 Expose Stack Annotation frames in IPE backtraces by default When decoding the Haskell-native call stack and displaying the IPE information for the stack frames, we print the `StackAnnotation` of the `AnnFrame` by default. This means, when an exception is thrown, any intermediate stack annotations will be displayed in the `IPE Backtrace`. Example backtrace: ``` Exception: ghc-internal:GHC.Internal.Exception.ErrorCall: Oh no! IPE backtrace: annotateCallStackIO, called at app/Main.hs:48:10 in backtrace-0.1.0.0-inplace-server:Main annotateCallStackIO, called at app/Main.hs:46:13 in backtrace-0.1.0.0-inplace-server:Main Main.handler (app/Main.hs:(46,1)-(49,30)) Main.liftIO (src/Servant/Server/Internal/Handler.hs:30:36-42) Servant.Server.Internal.Delayed.runHandler' (src/Servant/Server/Internal/Handler.hs:27:31-41) Control.Monad.Trans.Resource.runResourceT (./Control/Monad/Trans/Resource.hs:(192,14)-(197,18)) Network.Wai.Handler.Warp.HTTP1.processRequest (./Network/Wai/Handler/Warp/HTTP1.hs:195:20-22) Network.Wai.Handler.Warp.HTTP1.processRequest (./Network/Wai/Handler/Warp/HTTP1.hs:(195,5)-(203,31)) Network.Wai.Handler.Warp.HTTP1.http1server.loop (./Network/Wai/Handler/Warp/HTTP1.hs:(141,9)-(157,42)) HasCallStack backtrace: error, called at app/Main.hs:48:32 in backtrace-0.1.0.0-inplace-server:Main ``` The first two entries have been added by `annotateCallStackIO`, defined in `annotateCallStackIO`. - - - - - a1567efd by Sylvain Henry at 2025-09-01T23:01:35-04:00 RTS: rely less on Hadrian for flag setting (#25843) Hadrian used to pass -Dfoo command-line flags directly to build the rts. We can replace most of these flags with CPP based on cabal flags. 
It makes building boot libraries with cabal-install simpler (cf #25843). - - - - - ca5b0283 by Sergey Vinokurov at 2025-09-01T23:02:23-04:00 Remove unnecessary irrefutable patterns from Bifunctor instances for tuples Implementation of https://github.com/haskell/core-libraries-committee/issues/339 Metric Decrease: mhu-perf - - - - - 2da84b7a by sheaf at 2025-09-01T23:03:23-04:00 Only use active rules when simplifying rule RHSs When we are simplifying the RHS of a rule, we make sure to only apply rewrites from rules that are active throughout the original rule's range of active phases. For example, if a rule is always active, we only fire rules that are themselves always active when simplifying the RHS. Ditto for inline activations. This is achieved by setting the simplifier phase to a range of phases, using the new SimplPhaseRange constructor. Then: 1. When simplifying the RHS of a rule, or of a stable unfolding, we set the simplifier phase to a range of phases, computed from the activation of the RULE/unfolding activation, using the function 'phaseFromActivation'. The details are explained in Note [What is active in the RHS of a RULE?] in GHC.Core.Opt.Simplify.Utils. 2. The activation check for other rules and inlinings is then: does the activation of the other rule/inlining cover the whole phase range set in sm_phase? This continues to use the 'isActive' function, which now accounts for phase ranges. On the way, this commit also moves the exact-print SourceText annotation from the Activation datatype to the ActivationAnn type. This keeps the main Activation datatype free of any extra cruft. Fixes #26323 - - - - - 79816cc4 by Rodrigo Mesquita at 2025-09-02T12:19:59-04:00 cleanup: Move dehydrateCgBreakInfo to Stg2Bc This no longer has anything to do with Core. 
- - - - - 53da94ff by Rodrigo Mesquita at 2025-09-02T12:19:59-04:00 rts/Disassembler: Fix spacing of BRK_FUN - - - - - 08c0cf85 by Rodrigo Mesquita at 2025-09-02T12:19:59-04:00 debugger: Fix bciPtr in Step-out We need to use `BCO_NEXT` to move bciPtr to ix=1, because ix=0 points to the instruction itself! I do not understand how this didn't crash before. - - - - - e7e021fa by Rodrigo Mesquita at 2025-09-02T12:19:59-04:00 debugger: Allow BRK_FUNs to head case continuation BCOs When we start executing a BCO, we may want to yield to the scheduler: this may be triggered by a heap/stack check, context switch, or a breakpoint. To yield, we need to put the stack in a state such that when execution is resumed we are back to where we yielded from. Previously, a BRK_FUN could only head a function BCO because we only knew how to construct a valid stack for yielding from one -- simply add `apply_interp_info` + the BCO to resume executing. This is valid because the stack at the start of run_BCO is headed by that BCO's arguments. However, in case continuation BCOs (as per Note [Case continuation BCOs]), we couldn't easily reconstruct a valid stack that could be resumed because we dropped too soon the stack frames regarding the value returned (stg_ret) and received (stg_ctoi) by that continuation. This is especially tricky because of the variable type and size return frames (e.g. pointer ret_p/ctoi_R1p vs a tuple ret_t/ctoi_t2). The trick to being able to yield from a BRK_FUN at the start of a case cont BCO is to stop removing the ret frame headers eagerly and instead keep them until the BCO starts executing. The new layout at the start of a case cont. BCO is described by the new Note [Stack layout when entering run_BCO]. Now, we keep the ret_* and ctoi_* frames when entering run_BCO. A BRK_FUN is then executed if found, and the stack is yielded as-is with the preserved ret and ctoi frames. 
Then, a case cont BCO's instructions always SLIDE off the headers of the ret and ctoi frames, in StgToByteCode.doCase, turning a stack like | .... | +---------------+ | fv2 | +---------------+ | fv1 | +---------------+ | BCO | +---------------+ | stg_ctoi_ret_ | +---------------+ | retval | +---------------+ | stg_ret_..... | +---------------+ into | .... | +---------------+ | fv2 | +---------------+ | fv1 | +---------------+ | retval | +---------------+ for the remainder of the BCO. Moreover, this more uniform approach of keeping the ret and ctoi frames means we need less ad-hoc logic concerning the variable size of ret_tuple vs ret_p/np frames in the code generator and interpreter: Always keep the return to cont. stack intact at the start of run_BCO, and the statically generated instructions will take care of adjusting it. Unlocks BRK_FUNs at the start of case cont. BCOs which will enable a better user-facing step-out (#26042) which is free of the bugs the current BRK_ALTS implementation suffers from (namely, using BRK_FUN rather than BRK_ALTS in a case cont. means we'll never accidentally end up in a breakpoint "deeper" than the continuation, because we stop at the case cont itself rather than on the first breakpoint we evaluate after it). - - - - - ade3c1e6 by Rodrigo Mesquita at 2025-09-02T12:19:59-04:00 BRK_FUN with InternalBreakLocs for code-generation time breakpoints At the start of a case continuation BCO, place a BRK_FUN. This BRK_FUN uses the new "internal breakpoint location" -- allowing us to come up with a valid source location for this breakpoint that is not associated with a source-level tick. For case continuation BCOs, we use the last tick seen before it as the source location. The reasoning is described in Note [Debugger: Stepout internal break locs]. 
Note how T26042c, which was broken because it displayed the incorrect behavior of the previous step out when we'd end up at a deeper level than the one from which we initiated step-out, is now fixed. As of this commit, BRK_ALTS is now dead code and is thus dropped. Note [Debugger: Stepout internal break locs] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Step-out tells the interpreter to run until the current function returns to where it was called from, and stop there. This is achieved by enabling the BRK_FUN found on the first RET_BCO frame on the stack (See [Note Debugger: Step-out]). Case continuation BCOs (which select an alternative branch) must therefore be headed by a BRK_FUN. An example: f x = case g x of <--- end up here 1 -> ... 2 -> ... g y = ... <--- step out from here - `g` will return a value to the case continuation BCO in `f` - The case continuation BCO will receive the value returned from g - Match on it and push the alternative continuation for that branch - And then enter that alternative. If we step-out of `g`, the first RET_BCO on the stack is the case continuation of `f` -- execution should stop at its start, before selecting an alternative. (One might ask, "why not enable the breakpoint in the alternative instead?", because the alternative continuation is only pushed to the stack *after* it is selected by the case cont. BCO) However, the case cont. BCO is not associated with any source-level tick, it is merely the glue code which selects alternatives which do have source level ticks. Therefore, we have to come up at code generation time with a breakpoint location ('InternalBreakLoc') to display to the user when it is stopped there. Our solution is to use the last tick seen just before reaching the case continuation. 
This is robust because a case continuation will thus always have a relevant breakpoint location: - The source location will be the last source-relevant expression executed before the continuation is pushed - So the source location will point to the thing you've just stepped out of - Doing :step-local from there will put you on the selected alternative (which at the source level may also be the e.g. next line in a do-block) Examples, using angle brackets (<<...>>) to denote the breakpoint span: f x = case <<g x>> {- step in here -} of 1 -> ... 2 -> ...> g y = <<...>> <--- step out from here ... f x = <<case g x of <--- end up here, whole case highlighted 1 -> ... 2 -> ...>> doing :step-local ... f x = case g x of 1 -> <<...>> <--- stop in the alternative 2 -> ... A second example based on T26042d2, where the source is a do-block IO action, optimised to a chain of `case expressions`. main = do putStrLn "hello1" <<f>> <--- step-in here putStrLn "hello3" putStrLn "hello4" f = do <<putStrLn "hello2.1">> <--- step-out from here putStrLn "hello2.2" ... main = do putStrLn "hello1" <<f>> <--- end up here again, the previously executed expression putStrLn "hello3" putStrLn "hello4" doing step/step-local ... main = do putStrLn "hello1" f <<putStrLn "hello3">> <--- straight to the next line putStrLn "hello4" Finishes #26042 - - - - - c66910c0 by Rodrigo Mesquita at 2025-09-02T12:19:59-04:00 debugger: Re-use the last BreakpointId whole in step-out Previously, to come up with a location to stop at for `:stepout`, we would store the location of the last BreakpointId surrounding the continuation, as described by Note [Debugger: Stepout internal break locs]. However, re-using just the location from the last source breakpoint isn't sufficient to provide the necessary information in the break location. Specifically, it wouldn't bind any variables at that location. Really, there is no reason not to re-use the last breakpoint wholesale, and re-use all the information we had there. 
Step-out should behave just as if we had stopped at the call, but s.t. continuing will not re-execute the call. This commit updates the CgBreakInfo to always store a BreakpointId, be it the original one or the one we're emulating (for step-out). It makes variable bindings on :stepout work - - - - - e4abed7b by sheaf at 2025-09-02T12:20:40-04:00 Revert accidental changes to hie.yaml - - - - - 003b715b by meooow25 at 2025-09-02T23:48:51+02:00 Adjust the strictness of Data.List.iterate' * Don't force the next element in advance when generating a (:). * Force the first element to WHNF like every other element. Now every element in the output list is forced to WHNF when the (:) containing it is forced. CLC proposal: https://github.com/haskell/core-libraries-committee/issues/335 - - - - - b2f6aad0 by Simon Hengel at 2025-09-03T04:36:10-04:00 Refactoring: More consistently use logOutput, logInfo, fatalErrorMsg - - - - - 60a16db7 by Rodrigo Mesquita at 2025-09-03T10:55:50+01:00 bytecode: Don't PUSH_L 0; SLIDE 1 1 While looking through bytecode I noticed a quite common unfortunate pattern: ... PUSH_L 0 SLIDE 1 1 We do this often by generically constructing a tail call from a function atom that may be somewhere arbitrary on the stack. However, for the special case that the function can be found directly on top of the stack, as part of the arguments, it's plain redundant to push then slide it. In this commit we add a small optimisation to the generation of tailcalls in bytecode. Simply: lookahead for the function in the stack. If it is the first thing on the stack and it is part of the arguments which would be dropped as we entered the tail call, then don't push then slide it. 
In a simple example (T26042b), this already produced a drastic improvement in generated code (left is old, right is with this patch): ```diff 3c3 < 2025-07-29 10:14:02.081277 UTC --- > 2025-07-29 10:50:36.560949 UTC 160,161c160 < PUSH_L 0 < SLIDE 1 2 --- > SLIDE 1 1 164,165d162 < PUSH_L 0 < SLIDE 1 1 175,176c172 < PUSH_L 0 < SLIDE 1 2 --- > SLIDE 1 1 179,180d174 < PUSH_L 0 < SLIDE 1 1 206,207d199 < PUSH_L 0 < SLIDE 1 1 210,211d201 < PUSH_L 0 < SLIDE 1 1 214,215d203 < PUSH_L 0 < SLIDE 1 1 218,219d205 < PUSH_L 0 < SLIDE 1 1 222,223d207 < PUSH_L 0 < SLIDE 1 1 ... 600,601c566 < PUSH_L 0 < SLIDE 1 2 --- > SLIDE 1 1 604,605d568 < PUSH_L 0 < SLIDE 1 1 632,633d594 < PUSH_L 0 < SLIDE 1 1 636,637d596 < PUSH_L 0 < SLIDE 1 1 640,641d598 < PUSH_L 0 < SLIDE 1 1 644,645d600 < PUSH_L 0 < SLIDE 1 1 648,649d602 < PUSH_L 0 < SLIDE 1 1 652,653d604 < PUSH_L 0 < SLIDE 1 1 656,657d606 < PUSH_L 0 < SLIDE 1 1 660,661d608 < PUSH_L 0 < SLIDE 1 1 664,665d610 < PUSH_L 0 < SLIDE 1 1 ``` I also compiled lib:Cabal to bytecode and counted the number of bytecode lines with `find dist-newstyle -name "*.dump-BCOs" -exec wc {} +`: with unoptimized core: 1190689 lines (before) - 1172891 lines (now) = 17798 less redundant instructions (-1.5% lines) with optimized core: 1924818 lines (before) - 1864836 lines (now) = 59982 less redundant instructions (-3.1% lines) - - - - - 8b2c72c0 by L0neGamer at 2025-09-04T06:32:03-04:00 Add Control.Monad.thenM and Control.Applicative.thenA - - - - - 39e1b7cb by Teo Camarasu at 2025-09-04T06:32:46-04:00 ghc-internal: invert dependency of GHC.Internal.TH.Syntax on Data.Data This means that Data.Data no longer blocks building TH.Syntax, which allows greater parallelism in our builds. We move the Data.Data.Data instances to Data.Data. Quasi depends on Data.Data for one of its methods, so, we split the Quasi/Q, etc definition out of GHC.Internal.TH.Syntax into its own module. This has the added benefit of splitting up this quite large module. 
Previously TH.Syntax was a bottleneck when compiling ghc-internal. Now it is less of a bottle-neck and is also slightly quicker to compile (since it no longer contains these instances) at the cost of making Data.Data slightly more expensive to compile. TH.Lift which depends on TH.Syntax can also compile quicker and no longer blocks ghc-internal finishing to compile. Resolves #26217 ------------------------- Metric Decrease: MultiLayerModulesTH_OneShot T13253 T21839c T24471 Metric Increase: T12227 ------------------------- - - - - - bdf82fd2 by Teo Camarasu at 2025-09-04T06:32:46-04:00 compiler: delete unused names in Builtins.Names.TH returnQ and bindQ are no longer used in the compiler. There was also a very old comment that referred to them that I have modernized - - - - - 41a448e5 by Ben Gamari at 2025-09-04T19:21:43-04:00 hadrian: Pass lib & include directories to ghc `Setup configure` - - - - - 46bb9a79 by Ben Gamari at 2025-09-04T19:21:44-04:00 rts/IPE: Fix compilation when zstd is enabled This was broken by the refactoring undertaken in c80dd91c0bf6ac034f0c592f16c548b9408a8481. Closes #26312. - - - - - 138a6e34 by sheaf at 2025-09-04T19:22:46-04:00 Make mkCast assertion a bit clearer This commit changes the assertion message that gets printed when one calls mkCast with a coercion whose kind does not match the type of the inner expression. I always found the assertion message a bit confusing, as it didn't clearly state what exactly was the error. - - - - - 9d626be1 by sheaf at 2025-09-04T19:22:46-04:00 Simplifier/rules: fix mistakes in Notes & comments - - - - - 94b62aa7 by Simon Peyton Jones at 2025-09-08T03:37:14-04:00 Refactor ForAllCo This is a pure refactor, addressing #26389. It arranges that the kind coercion in a ForAllCo is a MCoercion, rather than a plain Coercion, thus removing redundancy in the common case. See (FC8) in Note [ForAllCo] It's a nice cleanup. 
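As a rough illustration of why a "maybe reflexive" coercion removes redundancy, here is a toy model. It is not GHC's representation: `Co`, `MCo`, `mkMCo` and `ForAll` are hypothetical stand-ins, loosely modelled on the idea that an MCoercion is either a marker for reflexivity or a wrapped coercion.

```haskell
-- Toy stand-in for coercions; only the shape matters for this sketch.
data Co
  = Refl String   -- ^ reflexive coercion at the named type
  | Sym Co
  | Trans Co Co
  deriving (Eq, Show)

-- | "Maybe a coercion": MRefl records reflexivity without storing a witness.
data MCo = MRefl | MCo Co
  deriving (Eq, Show)

-- | Smart constructor: a syntactically reflexive coercion collapses to MRefl.
mkMCo :: Co -> MCo
mkMCo (Refl _) = MRefl
mkMCo co       = MCo co

-- | Toy quantified coercion whose kind coercion is usually reflexive.
data ForAll = ForAll
  { faBinder :: String
  , faKindCo :: MCo   -- ^ MRefl in the common case, so nothing extra is stored
  , faBody   :: Co
  } deriving (Eq, Show)

main :: IO ()
main = do
  let common = ForAll "a" (mkMCo (Refl "Type")) (Refl "a")
      rare   = ForAll "a" (mkMCo (Sym (Refl "k"))) (Refl "a")
  print (faKindCo common)
  print (faKindCo rare)
```

In this model the common case stores only the `MRefl` tag, while the rare non-reflexive case still carries its full coercion.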
- - - - - 624afa4a by sheaf at 2025-09-08T03:38:05-04:00 Use tcMkScaledFunTys in matchExpectedFunTys We should use tcMkScaledFunTys rather than mkScaledFunTys in GHC.Tc.Utils.Unify.matchExpectedFunTys, as the latter crashes when the kind of the result type is a bare metavariable. We know the result is always Type-like, so we don't need scaledFunTys to try to rediscover that from the kind. Fixes #26277 - - - - - 0975d2b6 by sheaf at 2025-09-08T03:38:54-04:00 Revert "Remove hptAllFamInstances usage during upsweep" This reverts commit 3bf6720eff5e86e673568e756161e6d6150eb440. - - - - - 0cf34176 by soulomoon at 2025-09-08T03:38:54-04:00 Family consistency checks: add test for #26154 This commit adds the test T26154, to make sure that GHC doesn't crash when performing type family consistency checks. This test case was extracted from Agda. Fixes #26154 - - - - - ba210d98 by Simon Peyton Jones at 2025-09-08T16:26:36+01:00 Report solid equality errors before custom errors This MR fixes #26255 by * Reporting solid equality errors like Int ~ Bool before "custom type errors". See comments in `report1` in `reportWanteds` * Suppressing errors that arise from superclasses of Wanteds. See (SCE1) in Note [Suppressing confusing errors] More details in #26255. - - - - - b6249140 by Simon Peyton Jones at 2025-09-10T10:42:38-04:00 Fix a scoping error in Specialise This small patch fixes #26329, which triggered a scoping error. Test is in T21391, with -fpolymorphic-specialisation enabled - - - - - 45305ab8 by sheaf at 2025-09-10T10:43:29-04:00 Make rationalTo{Float,Double} inline in phase 0 We hold off on inlining these until phase 0 to allow constant-folding rules to fire. However, once we get to phase 0, we should inline them, e.g. to expose unboxing opportunities. See CLC proposal #356. 
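The phase-control idea in the last commit above can be sketched with a small, self-contained example. The function `scale` and the rule `scale/scale` are hypothetical and unrelated to `rationalToFloat` or `rationalToDouble`; the sketch only shows how an `INLINE [0]` pragma keeps a function out of the inliner until phase 0, so that a rewrite rule mentioning it has a chance to fire first (rules only fire when optimisation is enabled).

```haskell
-- Hypothetical example of delaying inlining so that a rewrite rule can fire.

-- | Multiply by a constant factor.
scale :: Int -> Int -> Int
scale k x = k * x
-- Do not inline before phase 0; from phase 0 onwards, inline eagerly.
{-# INLINE [0] scale #-}

-- The rule needs to see calls to `scale` intact, which is why inlining
-- is held off during the earlier simplifier phases.
{-# RULES
"scale/scale" forall a b x. scale a (scale b x) = scale (a * b) x
  #-}

main :: IO ()
main =
  -- The result is 42 whether or not the rule fires: 2 * (3 * 7) == (2 * 3) * 7.
  print (scale 2 (scale 3 7))
```

Once phase 0 is reached the function is inlined like any other, which is what exposes further optimisations such as unboxing in the real commit.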
- - - - - 0959d4bc by Andreas Klebinger at 2025-09-10T10:44:12-04:00 Add regression test for #26056 - - - - - dc79593d by sheaf at 2025-09-10T10:45:01-04:00 Deep subsumption: unify mults without tcEqMult As seen in #26332, we may well end up with a non-reflexive multiplicity coercion when doing deep subsumption. We should do the same thing that we do without deep subsumption: unify the multiplicities normally, without requiring that the coercion is reflexive (which is what 'tcEqMult' was doing). Fixes #26332 - - - - - 4bfe2269 by sheaf at 2025-09-10T10:45:50-04:00 lint-codes: fixup MSYS drive letter on Windows This change ensures that System.Directory.listDirectory doesn't trip up on an MSYS-style path like '/c/Foo' when trying to list all testsuite stdout/stderr files as required for testing coverage of GHC diagnostic codes in the testsuite. Fixes #25178 - - - - - 56540775 by Ben Gamari at 2025-09-10T10:46:32-04:00 gitlab-ci: Disable split sections on FreeBSD Due to #26303. - - - - - 1537784b by Moritz Angermann at 2025-09-10T10:47:13-04:00 Improve mach-o relocation information This change adds more information about the symbol and addresses we try to relocate in the linker. This significantly helps when debugging relocation issues reported by users. - - - - - 4e67855b by Moritz Angermann at 2025-09-10T10:47:54-04:00 test.mk expect GhcLeadingUnderscore, not LeadingUnderscore (in line with the other Ghc prefixed variables.) - - - - - c1cdd265 by Moritz Angermann at 2025-09-10T10:48:35-04:00 testsuite: Fix broken exec_signals_child.c There is no signal 0. The signal mask is 1-32. - - - - - 99ac335c by Moritz Angermann at 2025-09-10T10:49:15-04:00 testsuite: clarify Windows/Darwin locale rationale for skipping T6037 T2507 T8959a - - - - - 0e8fa77a by Moritz Angermann at 2025-09-10T10:49:56-04:00 Skip broken tests on macOS (due to leading underscore not handled properly in the expected output.)
- - - - - 28570c59 by Zubin Duggal at 2025-09-10T10:50:37-04:00 docs(sphinx): fix links to reverse flags when using the :ghc-flag:`-fno-<flag>` syntax This solution is rather hacky and I suspect there is a better way to do this but I don't know enough about Sphinx to do better. Fixes #26352 - - - - - b5d734bd by Alex Washburn at 2025-09-11T11:29:45+00:00 Correcting LLVM linking of Intel BMI intrinsics pdep{8,16} and pext{8,16}. This patch fixes #26045. The LLVM interface does not expose bindings to: - llvm.x86.bmi.pdep.8 - llvm.x86.bmi.pdep.16 - llvm.x86.bmi.pext.8 - llvm.x86.bmi.pext.16 So calls are instead made to llvm.x86.bmi.{pdep,pext}.32 in these cases, with pre/post-operation truncation to constrain the logical value range. - - - - - 291 changed files: - .gitlab-ci.yml - .gitlab/generate-ci/gen_ci.hs - .gitlab/jobs.yaml - compiler/GHC/Builtin/Names/TH.hs - compiler/GHC/ByteCode/Asm.hs - compiler/GHC/ByteCode/Breakpoints.hs - compiler/GHC/ByteCode/Instr.hs - compiler/GHC/ByteCode/Types.hs - compiler/GHC/Cmm/Dataflow/Label.hs - compiler/GHC/CmmToLlvm/CodeGen.hs - compiler/GHC/Core/Coercion.hs - compiler/GHC/Core/Coercion.hs-boot - compiler/GHC/Core/Coercion/Opt.hs - compiler/GHC/Core/Lint.hs - compiler/GHC/Core/Opt/Arity.hs - compiler/GHC/Core/Opt/Monad.hs - compiler/GHC/Core/Opt/Pipeline/Types.hs - compiler/GHC/Core/Opt/Simplify.hs - compiler/GHC/Core/Opt/Simplify/Env.hs - compiler/GHC/Core/Opt/Simplify/Inline.hs - compiler/GHC/Core/Opt/Simplify/Iteration.hs - compiler/GHC/Core/Opt/Simplify/Utils.hs - compiler/GHC/Core/Opt/SpecConstr.hs - compiler/GHC/Core/Opt/Specialise.hs - compiler/GHC/Core/Opt/WorkWrap.hs - compiler/GHC/Core/Reduction.hs - compiler/GHC/Core/Rules.hs - compiler/GHC/Core/TyCo/Compare.hs - compiler/GHC/Core/TyCo/FVs.hs - compiler/GHC/Core/TyCo/Rep.hs - compiler/GHC/Core/TyCo/Subst.hs - compiler/GHC/Core/TyCo/Tidy.hs - compiler/GHC/Core/Type.hs - compiler/GHC/Core/Unify.hs - compiler/GHC/Core/Utils.hs - compiler/GHC/CoreToIface.hs - 
compiler/GHC/Data/IOEnv.hs - compiler/GHC/Driver/CodeOutput.hs - compiler/GHC/Driver/Config/Core/Lint.hs - compiler/GHC/Driver/Config/Core/Opt/Simplify.hs - compiler/GHC/Driver/Downsweep.hs - compiler/GHC/Driver/Env.hs - compiler/GHC/Driver/Env/Types.hs - compiler/GHC/Driver/Main.hs - compiler/GHC/Driver/Make.hs - compiler/GHC/Driver/Pipeline.hs - compiler/GHC/Driver/Pipeline/Execute.hs - compiler/GHC/Hs/Binds.hs - compiler/GHC/Hs/Expr.hs - compiler/GHC/HsToCore/Pmc/Solver.hs - compiler/GHC/HsToCore/Quote.hs - compiler/GHC/HsToCore/Usage.hs - compiler/GHC/Iface/Load.hs - compiler/GHC/Iface/Make.hs - compiler/GHC/Iface/Recomp.hs - compiler/GHC/Iface/Recomp/Types.hs - compiler/GHC/Iface/Rename.hs - compiler/GHC/Iface/Syntax.hs - compiler/GHC/Iface/Type.hs - compiler/GHC/IfaceToCore.hs - compiler/GHC/Linker/Loader.hs - compiler/GHC/Parser.y - compiler/GHC/Plugins.hs - compiler/GHC/Rename/HsType.hs - compiler/GHC/Rename/Splice.hs - compiler/GHC/Runtime/Debugger/Breakpoints.hs - compiler/GHC/Runtime/Eval.hs - compiler/GHC/Runtime/Heap/Inspect.hs - compiler/GHC/Stg/Lint.hs - compiler/GHC/StgToByteCode.hs - compiler/GHC/Tc/Deriv/Generics.hs - compiler/GHC/Tc/Errors.hs - compiler/GHC/Tc/Gen/App.hs - compiler/GHC/Tc/Gen/Bind.hs - compiler/GHC/Tc/Gen/Expr.hs - compiler/GHC/Tc/Gen/Expr.hs-boot - compiler/GHC/Tc/Gen/Head.hs - compiler/GHC/Tc/Gen/HsType.hs - compiler/GHC/Tc/Gen/Match.hs - compiler/GHC/Tc/Gen/Pat.hs - compiler/GHC/Tc/Gen/Splice.hs - compiler/GHC/Tc/Gen/Splice.hs-boot - compiler/GHC/Tc/Instance/Family.hs - compiler/GHC/Tc/Module.hs - compiler/GHC/Tc/TyCl/Utils.hs - compiler/GHC/Tc/Types.hs - compiler/GHC/Tc/Types/Constraint.hs - compiler/GHC/Tc/Types/Evidence.hs - compiler/GHC/Tc/Types/TH.hs - compiler/GHC/Tc/Utils/Monad.hs - compiler/GHC/Tc/Utils/TcMType.hs - compiler/GHC/Tc/Utils/TcType.hs - compiler/GHC/Tc/Utils/Unify.hs - compiler/GHC/ThToHs.hs - compiler/GHC/Types/Basic.hs - compiler/GHC/Types/Id/Make.hs - compiler/GHC/Unit/Finder.hs - 
compiler/GHC/Unit/Finder/Types.hs - compiler/GHC/Unit/Home/Graph.hs - compiler/GHC/Unit/Home/PackageTable.hs - compiler/GHC/Unit/Module/Deps.hs - compiler/GHC/Utils/Binary.hs - compiler/ghc.cabal.in - configure.ac - docs/users_guide/9.16.1-notes.rst - docs/users_guide/flags.py - docs/users_guide/separate_compilation.rst - ghc/GHCi/UI.hs - hadrian/src/Oracles/TestSettings.hs - hadrian/src/Rules/ToolArgs.hs - hadrian/src/Settings/Default.hs - hadrian/src/Settings/Packages.hs - libraries/base/changelog.md - libraries/base/src/Control/Applicative.hs - libraries/base/src/Control/Monad.hs - libraries/base/src/Data/Array/Byte.hs - libraries/base/src/Data/Bifunctor.hs - libraries/base/src/Data/Fixed.hs - libraries/base/src/GHC/Stack/CloneStack.hs - + libraries/ghc-boot-th/GHC/Boot/TH/Monad.hs - libraries/ghc-boot-th/GHC/Boot/TH/Ppr.hs - libraries/ghc-boot-th/ghc-boot-th.cabal.in - libraries/ghc-experimental/src/GHC/Stack/Annotation/Experimental.hs - libraries/ghc-heap/GHC/Exts/Heap/ClosureTypes.hs - libraries/ghc-heap/GHC/Exts/Heap/Closures.hs - + libraries/ghc-heap/GHC/Exts/Heap/Constants.hs - + libraries/ghc-heap/GHC/Exts/Heap/InfoTable.hs - + libraries/ghc-heap/GHC/Exts/Heap/InfoTable/Types.hs - + libraries/ghc-heap/GHC/Exts/Heap/InfoTableProf.hs - libraries/ghc-heap/GHC/Exts/Heap/ProfInfo/Types.hs - + libraries/ghc-heap/GHC/Exts/Stack/Constants.hs - libraries/ghc-heap/GHC/Exts/Stack/Decode.hs - libraries/ghc-heap/ghc-heap.cabal.in - − libraries/ghc-heap/tests/stack-annotation/ann_frame004.stdout - libraries/ghc-heap/cbits/HeapPrim.cmm → libraries/ghc-internal/cbits/HeapPrim.cmm - libraries/ghc-heap/cbits/Stack.cmm → libraries/ghc-internal/cbits/Stack.cmm - libraries/ghc-internal/cbits/StackCloningDecoding.cmm - libraries/ghc-heap/cbits/Stack_c.c → libraries/ghc-internal/cbits/Stack_c.c - libraries/ghc-internal/ghc-internal.cabal.in - libraries/ghc-internal/jsbits/base.js - libraries/ghc-internal/src/GHC/Internal/Base.hs - 
libraries/ghc-internal/src/GHC/Internal/Control/Monad.hs - libraries/ghc-internal/src/GHC/Internal/Data/Data.hs - libraries/ghc-internal/src/GHC/Internal/Exception/Backtrace.hs - libraries/ghc-internal/src/GHC/Internal/Float.hs - + libraries/ghc-internal/src/GHC/Internal/Heap/Closures.hs - libraries/ghc-heap/GHC/Exts/Heap/Constants.hsc → libraries/ghc-internal/src/GHC/Internal/Heap/Constants.hsc - libraries/ghc-heap/GHC/Exts/Heap/InfoTable.hsc → libraries/ghc-internal/src/GHC/Internal/Heap/InfoTable.hsc - libraries/ghc-heap/GHC/Exts/Heap/InfoTable/Types.hsc → libraries/ghc-internal/src/GHC/Internal/Heap/InfoTable/Types.hsc - libraries/ghc-heap/GHC/Exts/Heap/InfoTableProf.hsc → libraries/ghc-internal/src/GHC/Internal/Heap/InfoTableProf.hsc - + libraries/ghc-internal/src/GHC/Internal/Heap/ProfInfo/Types.hs - libraries/ghc-internal/src/GHC/Internal/List.hs - + libraries/ghc-internal/src/GHC/Internal/Stack/Annotation.hs - libraries/ghc-internal/src/GHC/Internal/Stack/CloneStack.hs - libraries/ghc-heap/GHC/Exts/Stack/Constants.hsc → libraries/ghc-internal/src/GHC/Internal/Stack/Constants.hsc - + libraries/ghc-internal/src/GHC/Internal/Stack/Decode.hs - libraries/ghc-internal/src/GHC/Internal/System/Posix/Internals.hs - libraries/ghc-internal/src/GHC/Internal/TH/Lib.hs - libraries/ghc-internal/src/GHC/Internal/TH/Lift.hs - + libraries/ghc-internal/src/GHC/Internal/TH/Monad.hs - libraries/ghc-internal/src/GHC/Internal/TH/Quote.hs - libraries/ghc-internal/src/GHC/Internal/TH/Syntax.hs - libraries/ghc-heap/tests/stack-annotation/Makefile → libraries/ghc-internal/tests/stack-annotation/Makefile - libraries/ghc-heap/tests/stack-annotation/TestUtils.hs → libraries/ghc-internal/tests/stack-annotation/TestUtils.hs - libraries/ghc-heap/tests/stack-annotation/all.T → libraries/ghc-internal/tests/stack-annotation/all.T - libraries/ghc-heap/tests/stack-annotation/ann_frame001.hs → libraries/ghc-internal/tests/stack-annotation/ann_frame001.hs - 
libraries/ghc-heap/tests/stack-annotation/ann_frame001.stdout → libraries/ghc-internal/tests/stack-annotation/ann_frame001.stdout - libraries/ghc-heap/tests/stack-annotation/ann_frame002.hs → libraries/ghc-internal/tests/stack-annotation/ann_frame002.hs - libraries/ghc-heap/tests/stack-annotation/ann_frame002.stdout → libraries/ghc-internal/tests/stack-annotation/ann_frame002.stdout - libraries/ghc-heap/tests/stack-annotation/ann_frame003.hs → libraries/ghc-internal/tests/stack-annotation/ann_frame003.hs - libraries/ghc-heap/tests/stack-annotation/ann_frame003.stdout → libraries/ghc-internal/tests/stack-annotation/ann_frame003.stdout - libraries/ghc-heap/tests/stack-annotation/ann_frame004.hs → libraries/ghc-internal/tests/stack-annotation/ann_frame004.hs - + libraries/ghc-internal/tests/stack-annotation/ann_frame004.stdout - libraries/ghci/GHCi/Message.hs - libraries/ghci/GHCi/Run.hs - libraries/ghci/GHCi/TH.hs - libraries/ghci/ghci.cabal.in - libraries/template-haskell/Language/Haskell/TH/Lib.hs - libraries/template-haskell/Language/Haskell/TH/Quote.hs - libraries/template-haskell/Language/Haskell/TH/Syntax.hs - libraries/template-haskell/changelog.md - linters/lint-codes/LintCodes/Coverage.hs - rts/CloneStack.c - rts/CloneStack.h - rts/Disassembler.c - rts/IPE.c - rts/Interpreter.c - rts/PrimOps.cmm - rts/Profiling.c - rts/RaiseAsync.c - rts/RtsMessages.c - rts/RtsSymbols.c - rts/RtsUtils.c - rts/STM.c - rts/Trace.c - rts/include/rts/Bytecodes.h - rts/linker/MachO.c - testsuite/.gitignore - testsuite/ghc-config/ghc-config.hs - testsuite/tests/count-deps/CountDepsAst.stdout - testsuite/tests/count-deps/CountDepsParser.stdout - testsuite/tests/deriving/should_compile/T14682.stderr - testsuite/tests/deriving/should_compile/drv-empty-data.stderr - testsuite/tests/driver/all.T - + testsuite/tests/driver/make-prim/GHC/Internal/Prim.hs - + testsuite/tests/driver/make-prim/Makefile - + testsuite/tests/driver/make-prim/Test.hs - + 
testsuite/tests/driver/make-prim/Test2.hs - + testsuite/tests/driver/make-prim/all.T - testsuite/tests/ghci.debugger/scripts/T26042b.script - testsuite/tests/ghci.debugger/scripts/T26042b.stdout - testsuite/tests/ghci.debugger/scripts/T26042c.script - testsuite/tests/ghci.debugger/scripts/T26042c.stdout - + testsuite/tests/ghci.debugger/scripts/T26042d2.hs - + testsuite/tests/ghci.debugger/scripts/T26042d2.script - + testsuite/tests/ghci.debugger/scripts/T26042d2.stdout - testsuite/tests/ghci.debugger/scripts/T26042e.stdout - testsuite/tests/ghci.debugger/scripts/T26042f.script - testsuite/tests/ghci.debugger/scripts/T26042f1.stdout - testsuite/tests/ghci.debugger/scripts/T26042f2.stdout - testsuite/tests/ghci.debugger/scripts/T26042g.stdout - testsuite/tests/ghci.debugger/scripts/all.T - testsuite/tests/interface-stability/base-exports.stdout - testsuite/tests/interface-stability/base-exports.stdout-javascript-unknown-ghcjs - testsuite/tests/interface-stability/base-exports.stdout-mingw32 - testsuite/tests/interface-stability/base-exports.stdout-ws-32 - testsuite/tests/interface-stability/ghc-experimental-exports.stdout - testsuite/tests/interface-stability/ghc-experimental-exports.stdout-mingw32 - testsuite/tests/interface-stability/template-haskell-exports.stdout - − testsuite/tests/lib/stm/T26028.hs - − testsuite/tests/lib/stm/T26028.stdout - − testsuite/tests/lib/stm/all.T - + testsuite/tests/linear/should_compile/T26332.hs - testsuite/tests/linear/should_compile/all.T - + testsuite/tests/llvm/should_run/T26065.hs - + testsuite/tests/llvm/should_run/T26065.stdout - testsuite/tests/llvm/should_run/all.T - + testsuite/tests/patsyn/should_compile/T26331.hs - + testsuite/tests/patsyn/should_compile/T26331a.hs - testsuite/tests/patsyn/should_compile/all.T - testsuite/tests/perf/compiler/T4007.stdout - testsuite/tests/plugins/plugins10.stdout - + testsuite/tests/profiling/should_compile/T26056.hs - testsuite/tests/profiling/should_compile/all.T - 
testsuite/tests/profiling/should_run/callstack001.stdout - testsuite/tests/rts/all.T - testsuite/tests/rts/exec_signals_child.c - testsuite/tests/rts/linker/T11223/all.T - testsuite/tests/simplCore/should_compile/OpaqueNoCastWW.stderr - testsuite/tests/simplCore/should_compile/T15056.stderr - testsuite/tests/simplCore/should_compile/T15445.stderr - + testsuite/tests/simplCore/should_compile/T26323b.hs - testsuite/tests/simplCore/should_compile/all.T - + testsuite/tests/simplCore/should_run/T26323.hs - + testsuite/tests/simplCore/should_run/T26323.stdout - testsuite/tests/simplCore/should_run/all.T - testsuite/tests/splice-imports/SI29.stderr - testsuite/tests/th/Makefile - testsuite/tests/th/T11452.stderr - testsuite/tests/th/T15321.stderr - testsuite/tests/th/T7276.stderr - + testsuite/tests/th/TH_Depends_Dir.hs - + testsuite/tests/th/TH_Depends_Dir.stdout - + testsuite/tests/th/TH_Depends_Dir_External.hs - testsuite/tests/th/TH_NestedSplicesFail3.stderr - testsuite/tests/th/TH_NestedSplicesFail4.stderr - testsuite/tests/th/all.T - + testsuite/tests/typecheck/should_compile/T26154.hs - + testsuite/tests/typecheck/should_compile/T26154_A.hs - + testsuite/tests/typecheck/should_compile/T26154_B.hs - + testsuite/tests/typecheck/should_compile/T26154_B.hs-boot - + testsuite/tests/typecheck/should_compile/T26154_Other.hs - + testsuite/tests/typecheck/should_compile/T26277.hs - + testsuite/tests/typecheck/should_compile/T26345.hs - + testsuite/tests/typecheck/should_compile/T26346.hs - + testsuite/tests/typecheck/should_compile/T26350.hs - + testsuite/tests/typecheck/should_compile/T26358.hs - testsuite/tests/typecheck/should_compile/all.T - testsuite/tests/typecheck/should_fail/T18851.hs - + testsuite/tests/typecheck/should_fail/T26255a.hs - + testsuite/tests/typecheck/should_fail/T26255a.stderr - + testsuite/tests/typecheck/should_fail/T26255b.hs - + testsuite/tests/typecheck/should_fail/T26255b.stderr - + testsuite/tests/typecheck/should_fail/T26255c.hs - + 
testsuite/tests/typecheck/should_fail/T26255c.stderr - + testsuite/tests/typecheck/should_fail/T26318.hs - + testsuite/tests/typecheck/should_fail/T26318.stderr - testsuite/tests/typecheck/should_fail/UnliftedNewtypesFamilyKindFail2.stderr - testsuite/tests/typecheck/should_fail/all.T - utils/check-exact/ExactPrint.hs - utils/jsffi/dyld.mjs The diff was not included because it is too large. View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/1db5d2a82682d4b6306bc5a3a1c8f9…
