Simon Peyton Jones pushed to branch wip/T26346 at Glasgow Haskell Compiler / GHC Commits: b9d19411 by Simon Peyton Jones at 2025-08-27T12:35:53+01:00 Responding to Sam - - - - - 3 changed files: - compiler/GHC/Core/Unify.hs - + testsuite/tests/typecheck/should_compile/T26358.hs - testsuite/tests/typecheck/should_compile/all.T Changes: ===================================== compiler/GHC/Core/Unify.hs ===================================== @@ -431,24 +431,7 @@ Wrinkles (ATF12) There is a horrid exception for the injectivity check. See (UR1) in in Note [Specification of unification]. -(ATF13) Consider unifying - [F a, F Int, Int] ~ [Bool, Char, a] - Working left to right you might think we would build the mapping - F a :-> Bool - F Int :-> Char - Now we discover that `a` unifies with `Int`. So really these two lists are Apart - because F Int can't be both Bool and Char. - - But that is very tricky! Perhaps whenever we unify a type variable we should - run it over the domain and (maybe range) of the type-family mapping too? - Sigh. - - For we make no such attempt. The um_fam_env has only pre-substituted types. - Fortunately, while this may make use say MaybeApart when we could say SurelyApart, - it has no effect on the correctness of unification: if we return Unifiable, it - really is Unifiable. - -(ATF14) We have to be careful about the occurs check. +(ATF13) We have to be careful about the occurs check. See Note [The occurs check in the Core unifier] SIDE NOTE. The paper "Closed type families with overlapping equations" @@ -474,6 +457,49 @@ and all is lost. But with the current algorithm we have that a a ~ (Var A) (Var B) is SurelyApart, so the first equation definitely doesn't match and we can try the second, which does. END OF SIDE NOTE. + +Note [Shortcomings of the apartness test] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Note [Apartness and type families] is very clever. + +But it still has shortcomings (#26358). Consider unifying + [F a, F Int, Int] ~ [Bool, Char, a] +Working left to right you might think we would build the mapping + F a :-> Bool + F Int :-> Char +Now we discover that `a` unifies with `Int`. So really these two lists are Apart +because F Int can't be both Bool and Char. + +Just the same applies when adding a type-family binding to um_fam_env: + [F (G Float), F Int, G Float] ~ [Bool, Char, Iont] +Again these are Apart, because (G Float = Int), +and (F Int) can't be both Bool and Char + +But achieving this is very tricky! Perhaps whenever we unify a type variable, +or a type family, we should run it over the domain and (maybe range) of the +type-family mapping too? Sigh. + +For now we make no such attempt. +* The um_fam_env has only /un-substituted/ types. +* We look up on ly /un-substituted/ types in um_fam_env + +This may make us say MaybeApart when we could say SurelyApart, but it has no +effect on the correctness of unification: if we return Unifiable, it really is +Unifiable. + +This is all quite subtle. suppose we have: + um_tv_env: c :-> b + um_fam_env F b :-> a +and we are trying to add a :-> F c. We will call lookupFamEnv on (F, [c]), which will +fail because b and c are not equal. So we go ahead and add a :-> F c as a new tyvar eq, +getting: + um_tv_env: a :-> F c, c :-> b + um_fam_env F b :-> a + +Does that loop, like this: + a --> F c --> F b --> a? +No, becuase we do not substitute (F c) to (F b) and then look up in um_fam_env; +we look up only un-substituted types. -} {- ********************************************************************* @@ -1810,6 +1836,7 @@ uVarOrFam env ty1 ty2 kco ----------------------------- -- LHS is a type variable -- The sequence of tests is very similar to go_tv + go :: SwapFlag -> UMState -> CanEqLHS -> InType -> OutCoercion -> UM () go swapped substs lhs@(TyVarLHS tv1) ty2 kco | Just ty1' <- lookupVarEnv (um_tv_env substs) tv1' = -- We already have a substitution for tv1 @@ -1863,7 +1890,7 @@ uVarOrFam env ty1 ty2 kco tv1' = umRnOccL env tv1 ty2_fvs = tyCoVarsOfType ty2 rhs = ty2 `mkCastTy` mkSymCo kco - tv1_is_bindable | not (tv1' `elemVarSet` um_foralls env) + tv1_is_bindable | not (tv1' `elemVarSet` foralld_tvs) -- tv1' is not forall-bound, but tv1 can still differ -- from tv1; see Note [Cloning the template binders] -- in GHC.Core.Rules. So give tv1' to um_bind_tv_fun. @@ -1872,7 +1899,8 @@ uVarOrFam env ty1 ty2 kco | otherwise = False - occurs_check = um_unif env && uOccursCheck substs lhs rhs + foralld_tvs = um_foralls env + occurs_check = um_unif env && uOccursCheck substs foralld_tvs lhs rhs -- Occurs check, only when unifying -- see Note [Infinitary substitutions] -- Make sure you include `kco` in rhs #14846 @@ -1906,9 +1934,11 @@ uVarOrFam env ty1 ty2 kco -- Now check if we can bind the (F tys) to the RHS -- This can happen even when matching: see (ATF7) | BindMe <- um_bind_fam_fun env tc1 tys1 rhs - = if uOccursCheck substs lhs rhs + = if uOccursCheck substs emptyVarSet lhs rhs then maybeApart MARInfinite - else do { extendFamEnv tc1 tys1 rhs -- We don't substitue tys1; see (ATF13) + else do { extendFamEnv tc1 tys1 rhs + -- We don't substitute tys1 before extending + -- See Note [Shortcomings of the apartness test] ; maybeApart MARTypeFamily } -- Swap in case of (F a b) ~ (G c d e) @@ -1929,6 +1959,7 @@ uVarOrFam env ty1 ty2 kco ----------------------------- -- go_fam_fam: LHS and RHS are both saturated type-family applications, -- for the same type-family F + -- Precondition: um_foralls is empty go_fam_fam substs tc tys1 tys2 kco -- Decompose (F tys1 ~ F tys2): (ATF9) -- Use injectivity information of F: (ATF10) @@ -1950,11 +1981,11 @@ uVarOrFam env ty1 ty2 kco | not (um_unif env) -- Not when matching (ATF11-1) = return () | BindMe <- um_bind_fam_fun env tc tys1 rhs1 - = unless (uOccursCheck substs (TyFamLHS tc tys1) rhs1) $ + = unless (uOccursCheck substs emptyVarSet (TyFamLHS tc tys1) rhs1) $ extendFamEnv tc tys1 rhs1 -- At this point um_unif=True, so we can unify either way | BindMe <- um_bind_fam_fun env tc tys2 rhs2 - = unless (uOccursCheck substs (TyFamLHS tc tys2) rhs2) $ + = unless (uOccursCheck substs emptyVarSet (TyFamLHS tc tys2) rhs2) $ extendFamEnv tc tys2 rhs2 | otherwise = return () @@ -1963,12 +1994,15 @@ uVarOrFam env ty1 ty2 kco rhs2 = mkTyConApp tc tys1 `mkCastTy` kco -uOccursCheck :: UMState -> CanEqLHS -> Type -> Bool --- See Note [The occurs check in the Core unifier] and (ATF14) -uOccursCheck (UMState { um_tv_env = tv_env, um_fam_env = fam_env }) lhs ty - = go emptyVarSet ty +uOccursCheck :: UMState + -> TyVarSet -- Bound by enclosing foralls; see (OCU1) + -> CanEqLHS -> Type -- Can we unify (lhs := ty)? + -> Bool +-- See Note [The occurs check in the Core unifier] and (ATF13) +uOccursCheck (UMState { um_tv_env = tv_env, um_fam_env = fam_env }) bvs lhs ty + = go bvs ty where - go :: TyCoVarSet -- Bound by enclosing foralls + go :: TyCoVarSet -- Bound by enclosing foralls; see (OCU1) -> Type -> Bool go bvs ty | Just ty' <- coreView ty = go bvs ty' go bvs (TyVarTy tv) | Just ty' <- lookupVarEnv tv_env tv @@ -1993,8 +2027,11 @@ uOccursCheck (UMState { um_tv_env = tv_env, um_fam_env = fam_env }) lhs ty go _ (CoercionTy _co) = False -- ToDo: should we worry about `co`? go_tc bvs tc tys - | isTypeFamilyTyCon tc + | isEmptyVarSet bvs -- Never look up in um_fam_env under a forall (ATF3) + , isTypeFamilyTyCon tc , Just ty' <- lookupFamEnv fam_env tc (take arity tys) + -- NB: we look up /un-substituted/ types; + -- See Note [Shortcomings of the apartness test] = go bvs ty' || any (go bvs) (drop arity tys) | TyFamLHS tc' tys' <- lhs @@ -2022,6 +2059,11 @@ could lead to a loop. That is, could there by a type `s` such that It's vital that we do both at once: we might have (1) already and add (2); or we might have (2) already and add (1). +(OCU1) We keep track of the forall-bound variables because the um_fam_env is inactive + under a forall; indeed it is /unsound/ to consult it becuase we have have a binding + (F a :-> Int), and then unify (forall a. ...(F a)...) with something. We don't + want to map that (F a) to Int! + Note [Unifying coercion-foralls] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Suppose we try to unify (forall cv. t1) ~ (forall cv. t2). ===================================== testsuite/tests/typecheck/should_compile/T26358.hs ===================================== @@ -0,0 +1,48 @@ +{-# LANGUAGE TypeFamilies #-} + +module T26358 where +import Data.Kind +import Data.Proxy + +{- Two failing tests, described in GHC.Core.Unify + Note [Shortcomings of the apartness test] + +Explanation for TF2 +* We try to reduce + (TF2 (F (G Float)) (F Int) (G Float)) +* We can only do so if those arguments are apart from the first + equation of TF2, namely (Bool,Char,Int). +* So we try to unify + [F (G Float), F Int, G Float] ~ [Bool, Char, Int] +* They really are apart, but we can't quite spot that yet; + hence #26358 + +TF1 is similar. +-} + + +type TF1 :: Type -> Type -> Type -> Type +type family TF1 a b c where + TF1 Bool Char a = Word + TF1 a b c = (a,b,c) + +type F :: Type -> Type +type family F a where + +foo :: Proxy a + -> Proxy (TF1 (F a) (F Int) Int) + -> Proxy (F a, F Int, Int) +foo _ px = px + +type TF2 :: Type -> Type -> Type -> Type +type family TF2 a b c where + TF2 Bool Char Int = Word + TF2 a b c = (a,b,c) + +type G :: Type -> Type +type family G a where + +bar :: Proxy (TF2 (F (G Float)) (F Int) (G Float)) + -> Proxy (F (G Float), F Int, G Float) +bar px = px + ===================================== testsuite/tests/typecheck/should_compile/all.T ===================================== @@ -946,3 +946,4 @@ test('T14010', normal, compile, ['']) test('T26256a', normal, compile, ['']) test('T25992a', normal, compile, ['']) test('T26346', normal, compile, ['']) +test('T26358', expect_broken(26358), compile, ['']) View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/b9d194118242f03283813c7b82343992... -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/b9d194118242f03283813c7b82343992... You're receiving this email because of your account on gitlab.haskell.org.