Simon Peyton Jones pushed to branch wip/T26346 at Glasgow Haskell Compiler / GHC

Commits:

3 changed files:

Changes:

  • compiler/GHC/Core/Unify.hs
    ... ... @@ -431,24 +431,7 @@ Wrinkles
    431 431
     (ATF12) There is a horrid exception for the injectivity check. See (UR1) in
    
    432 432
       in Note [Specification of unification].
    
    433 433
     
    
    434
    -(ATF13) Consider unifying
    
    435
    -    [F a, F Int, Int]  ~  [Bool, Char, a]
    
    436
    -  Working left to right you might think we would build the mapping
    
    437
    -    F a   :-> Bool
    
    438
    -    F Int :-> Char
    
    439
    -  Now we discover that `a` unifies with `Int`. So really these two lists are Apart
    
    440
    -  because F Int can't be both Bool and Char.
    
    441
    -
    
    442
    -  But that is very tricky! Perhaps whenever we unify a type variable we should
    
    443
    -  run it over the domain and (maybe range) of the type-family mapping too?
    
    444
    -  Sigh.
    
    445
    -
    
    446
    -  For we make no such attempt.  The um_fam_env has only pre-substituted types.
    
    447
    -  Fortunately, while this may make use say MaybeApart when we could say SurelyApart,
    
    448
    -  it has no effect on the correctness of unification: if we return Unifiable, it
    
    449
    -  really is Unifiable.
    
    450
    -
    
    451
    -(ATF14) We have to be careful about the occurs check.
    
    434
    +(ATF13) We have to be careful about the occurs check.
    
    452 435
       See Note [The occurs check in the Core unifier]
    
    453 436
     
    
    454 437
     SIDE NOTE.  The paper "Closed type families with overlapping equations"
    
    ... ... @@ -474,6 +457,49 @@ and all is lost. But with the current algorithm we have that
    474 457
         a a   ~    (Var A) (Var B)
    
    475 458
     is SurelyApart, so the first equation definitely doesn't match and we can try the
    
    476 459
     second, which does.  END OF SIDE NOTE.
    
    460
    +
    
    461
    +Note [Shortcomings of the apartness test]
    
    462
    +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    
    463
    +Note [Apartness and type families] is very clever.
    
    464
    +
    
    465
    +But it still has shortcomings (#26358).  Consider unifying
    
    466
    +    [F a, F Int, Int]  ~  [Bool, Char, a]
    
    467
    +Working left to right you might think we would build the mapping
    
    468
    +  F a   :-> Bool
    
    469
    +  F Int :-> Char
    
    470
    +Now we discover that `a` unifies with `Int`. So really these two lists are Apart
    
    471
    +because F Int can't be both Bool and Char.
    
    472
    +
    
    473
    +Just the same applies when adding a type-family binding to um_fam_env:
    
    474
    +  [F (G Float), F Int, G Float] ~ [Bool, Char, Iont]
    
    475
    +Again these are Apart, because (G Float = Int),
    
    476
    +and (F Int) can't be both Bool and Char
    
    477
    +
    
    478
    +But achieving this is very tricky! Perhaps whenever we unify a type variable,
    
    479
    +or a type family, we should run it over the domain and (maybe range) of the
    
    480
    +type-family mapping too?  Sigh.
    
    481
    +
    
    482
    +For now we make no such attempt.
    
    483
    +* The um_fam_env has only /un-substituted/ types.
    
    484
    +* We look up on ly /un-substituted/ types in um_fam_env
    
    485
    +
    
    486
    +This may make us say MaybeApart when we could say SurelyApart, but it has no
    
    487
    +effect on the correctness of unification: if we return Unifiable, it really is
    
    488
    +Unifiable.
    
    489
    +
    
    490
    +This is all quite subtle. suppose we have:
    
    491
    +    um_tv_env:   c :-> b
    
    492
    +    um_fam_env   F b :-> a
    
    493
    +and we are trying to add a :-> F c. We will call lookupFamEnv on (F, [c]), which will
    
    494
    +fail because b and c are not equal. So we go ahead and add a :-> F c as a new tyvar eq,
    
    495
    +getting:
    
    496
    +    um_tv_env:   a :-> F c, c :-> b
    
    497
    +    um_fam_env   F b :-> a
    
    498
    +
    
    499
    +Does that loop, like this:
    
    500
    +   a --> F c --> F b --> a?
    
    501
    +No, becuase we do not substitute (F c) to (F b) and then look up in um_fam_env;
    
    502
    +we look up only un-substituted types.
    
    477 503
     -}
    
    478 504
     
    
    479 505
     {- *********************************************************************
    
    ... ... @@ -1810,6 +1836,7 @@ uVarOrFam env ty1 ty2 kco
    1810 1836
         -----------------------------
    
    1811 1837
         -- LHS is a type variable
    
    1812 1838
         -- The sequence of tests is very similar to go_tv
    
    1839
    +    go :: SwapFlag -> UMState -> CanEqLHS -> InType -> OutCoercion -> UM ()
    
    1813 1840
         go swapped substs lhs@(TyVarLHS tv1) ty2 kco
    
    1814 1841
           | Just ty1' <- lookupVarEnv (um_tv_env substs) tv1'
    
    1815 1842
           = -- We already have a substitution for tv1
    
    ... ... @@ -1863,7 +1890,7 @@ uVarOrFam env ty1 ty2 kco
    1863 1890
             tv1'            = umRnOccL env tv1
    
    1864 1891
             ty2_fvs         = tyCoVarsOfType ty2
    
    1865 1892
             rhs             = ty2 `mkCastTy` mkSymCo kco
    
    1866
    -        tv1_is_bindable | not (tv1' `elemVarSet` um_foralls env)
    
    1893
    +        tv1_is_bindable | not (tv1' `elemVarSet` foralld_tvs)
    
    1867 1894
                               -- tv1' is not forall-bound, but tv1 can still differ
    
    1868 1895
                               -- from tv1; see Note [Cloning the template binders]
    
    1869 1896
                               -- in GHC.Core.Rules.  So give tv1' to um_bind_tv_fun.
    
    ... ... @@ -1872,7 +1899,8 @@ uVarOrFam env ty1 ty2 kco
    1872 1899
                             | otherwise
    
    1873 1900
                             = False
    
    1874 1901
     
    
    1875
    -        occurs_check = um_unif env && uOccursCheck substs lhs rhs
    
    1902
    +        foralld_tvs  = um_foralls env
    
    1903
    +        occurs_check = um_unif env && uOccursCheck substs foralld_tvs lhs rhs
    
    1876 1904
               -- Occurs check, only when unifying
    
    1877 1905
               -- see Note [Infinitary substitutions]
    
    1878 1906
               -- Make sure you include `kco` in rhs #14846
    
    ... ... @@ -1906,9 +1934,11 @@ uVarOrFam env ty1 ty2 kco
    1906 1934
           -- Now check if we can bind the (F tys) to the RHS
    
    1907 1935
           -- This can happen even when matching: see (ATF7)
    
    1908 1936
           | BindMe <- um_bind_fam_fun env tc1 tys1 rhs
    
    1909
    -      = if uOccursCheck substs lhs rhs
    
    1937
    +      = if uOccursCheck substs emptyVarSet lhs rhs
    
    1910 1938
             then maybeApart MARInfinite
    
    1911
    -        else do { extendFamEnv tc1 tys1 rhs  -- We don't substitue tys1; see (ATF13)
    
    1939
    +        else do { extendFamEnv tc1 tys1 rhs
    
    1940
    +                     -- We don't substitute tys1 before extending
    
    1941
    +                     -- See Note [Shortcomings of the apartness test]
    
    1912 1942
                     ; maybeApart MARTypeFamily }
    
    1913 1943
     
    
    1914 1944
           -- Swap in case of (F a b) ~ (G c d e)
    
    ... ... @@ -1929,6 +1959,7 @@ uVarOrFam env ty1 ty2 kco
    1929 1959
         -----------------------------
    
    1930 1960
         -- go_fam_fam: LHS and RHS are both saturated type-family applications,
    
    1931 1961
         --             for the same type-family F
    
    1962
    +    -- Precondition: um_foralls is empty
    
    1932 1963
         go_fam_fam substs tc tys1 tys2 kco
    
    1933 1964
            -- Decompose (F tys1 ~ F tys2): (ATF9)
    
    1934 1965
            -- Use injectivity information of F: (ATF10)
    
    ... ... @@ -1950,11 +1981,11 @@ uVarOrFam env ty1 ty2 kco
    1950 1981
              | not (um_unif env)  -- Not when matching (ATF11-1)
    
    1951 1982
              = return ()
    
    1952 1983
              | BindMe <- um_bind_fam_fun env tc tys1 rhs1
    
    1953
    -         = unless (uOccursCheck substs (TyFamLHS tc tys1) rhs1) $
    
    1984
    +         = unless (uOccursCheck substs emptyVarSet (TyFamLHS tc tys1) rhs1) $
    
    1954 1985
                extendFamEnv tc tys1 rhs1
    
    1955 1986
              -- At this point um_unif=True, so we can unify either way
    
    1956 1987
              | BindMe <- um_bind_fam_fun env tc tys2 rhs2
    
    1957
    -         = unless (uOccursCheck substs (TyFamLHS tc tys2) rhs2) $
    
    1988
    +         = unless (uOccursCheck substs emptyVarSet (TyFamLHS tc tys2) rhs2) $
    
    1958 1989
                extendFamEnv tc tys2 rhs2
    
    1959 1990
              | otherwise
    
    1960 1991
              = return ()
    
    ... ... @@ -1963,12 +1994,15 @@ uVarOrFam env ty1 ty2 kco
    1963 1994
            rhs2 = mkTyConApp tc tys1 `mkCastTy` kco
    
    1964 1995
     
    
    1965 1996
     
    
    1966
    -uOccursCheck :: UMState -> CanEqLHS -> Type -> Bool
    
    1967
    --- See Note [The occurs check in the Core unifier] and (ATF14)
    
    1968
    -uOccursCheck (UMState { um_tv_env = tv_env, um_fam_env = fam_env }) lhs ty
    
    1969
    -  = go emptyVarSet ty
    
    1997
    +uOccursCheck :: UMState
    
    1998
    +             -> TyVarSet -- Bound by enclosing foralls; see (OCU1)
    
    1999
    +             -> CanEqLHS -> Type   -- Can we unify (lhs := ty)?
    
    2000
    +             -> Bool
    
    2001
    +-- See Note [The occurs check in the Core unifier] and (ATF13)
    
    2002
    +uOccursCheck (UMState { um_tv_env = tv_env, um_fam_env = fam_env }) bvs lhs ty
    
    2003
    +  = go bvs ty
    
    1970 2004
       where
    
    1971
    -    go :: TyCoVarSet   -- Bound by enclosing foralls
    
    2005
    +    go :: TyCoVarSet   -- Bound by enclosing foralls; see (OCU1)
    
    1972 2006
            -> Type -> Bool
    
    1973 2007
         go bvs ty | Just ty' <- coreView ty = go bvs ty'
    
    1974 2008
         go bvs (TyVarTy tv) | Just ty' <- lookupVarEnv tv_env tv
    
    ... ... @@ -1993,8 +2027,11 @@ uOccursCheck (UMState { um_tv_env = tv_env, um_fam_env = fam_env }) lhs ty
    1993 2027
         go _   (CoercionTy _co) = False      -- ToDo: should we worry about `co`?
    
    1994 2028
     
    
    1995 2029
         go_tc bvs tc tys
    
    1996
    -      | isTypeFamilyTyCon tc
    
    2030
    +      | isEmptyVarSet bvs   -- Never look up in um_fam_env under a forall (ATF3)
    
    2031
    +      , isTypeFamilyTyCon tc
    
    1997 2032
           , Just ty' <- lookupFamEnv fam_env tc (take arity tys)
    
    2033
    +             -- NB: we look up /un-substituted/ types;
    
    2034
    +             -- See Note [Shortcomings of the apartness test]
    
    1998 2035
           = go bvs ty' || any (go bvs) (drop arity tys)
    
    1999 2036
     
    
    2000 2037
           | TyFamLHS tc' tys' <- lhs
    
    ... ... @@ -2022,6 +2059,11 @@ could lead to a loop. That is, could there by a type `s` such that
    2022 2059
     It's vital that we do both at once: we might have (1) already and add (2);
    
    2023 2060
     or we might have (2) already and add (1).
    
    2024 2061
     
    
    2062
    +(OCU1) We keep track of the forall-bound variables because the um_fam_env is inactive
    
    2063
    +  under a forall; indeed it is /unsound/ to consult it becuase we have have a binding
    
    2064
    +  (F a :-> Int), and then unify (forall a. ...(F a)...) with something.  We don't
    
    2065
    +  want to map that (F a) to Int!
    
    2066
    +
    
    2025 2067
     Note [Unifying coercion-foralls]
    
    2026 2068
     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    
    2027 2069
     Suppose we try to unify (forall cv. t1) ~ (forall cv. t2).
    

  • testsuite/tests/typecheck/should_compile/T26358.hs
    1
    +{-# LANGUAGE TypeFamilies #-}
    
    2
    +
    
    3
    +module T26358 where
    
    4
    +import Data.Kind
    
    5
    +import Data.Proxy
    
    6
    +
    
    7
    +{- Two failing tests, described in GHC.Core.Unify
    
    8
    +     Note [Shortcomings of the apartness test]
    
    9
    +
    
    10
    +Explanation for TF2
    
    11
    +* We try to reduce
    
    12
    +    (TF2 (F (G Float)) (F Int) (G Float))
    
    13
    +* We can only do so if those arguments are apart from the first
    
    14
    +  equation of TF2, namely (Bool,Char,Int).
    
    15
    +* So we try to unify
    
    16
    +    [F (G Float), F Int, G Float] ~ [Bool, Char, Int]
    
    17
    +* They really are apart, but we can't quite spot that yet;
    
    18
    +  hence #26358
    
    19
    +
    
    20
    +TF1 is similar.
    
    21
    +-}
    
    22
    +
    
    23
    +
    
    24
    +type TF1 :: Type -> Type -> Type -> Type
    
    25
    +type family TF1 a b c where
    
    26
    +  TF1 Bool Char a = Word
    
    27
    +  TF1 a    b    c = (a,b,c)
    
    28
    +
    
    29
    +type F :: Type -> Type
    
    30
    +type family F a where
    
    31
    +
    
    32
    +foo :: Proxy a
    
    33
    +    -> Proxy (TF1 (F a) (F Int) Int)
    
    34
    +    -> Proxy (F a, F Int, Int)
    
    35
    +foo _ px = px
    
    36
    +
    
    37
    +type TF2 :: Type -> Type -> Type -> Type
    
    38
    +type family TF2 a b c where
    
    39
    +  TF2 Bool Char Int = Word
    
    40
    +  TF2 a    b    c   = (a,b,c)
    
    41
    +
    
    42
    +type G :: Type -> Type
    
    43
    +type family G a where
    
    44
    +
    
    45
    +bar :: Proxy (TF2 (F (G Float)) (F Int) (G Float))
    
    46
    +    -> Proxy (F (G Float), F Int, G Float)
    
    47
    +bar px = px
    
    48
    +

  • testsuite/tests/typecheck/should_compile/all.T
    ... ... @@ -946,3 +946,4 @@ test('T14010', normal, compile, [''])
    946 946
     test('T26256a', normal, compile, [''])
    
    947 947
     test('T25992a', normal, compile, [''])
    
    948 948
     test('T26346', normal, compile, [''])
    
    949
    +test('T26358', expect_broken(26358), compile, [''])