[Git][ghc/ghc][wip/T23162-spj] More refactoring

11 Sep 2025

Simon Peyton Jones pushed to branch wip/T23162-spj at Glasgow Haskell Compiler / GHC


Commits:
dc8360ef by Simon Peyton Jones at 2025-09-11T13:59:30+01:00
More refactoring

- - - - -


10 changed files:

- compiler/GHC/Core/TyCon.hs
- compiler/GHC/Tc/Instance/FunDeps.hs
- compiler/GHC/Tc/Solver/Default.hs
- compiler/GHC/Tc/Solver/Dict.hs
- compiler/GHC/Tc/Solver/Equality.hs
- compiler/GHC/Tc/Solver/FunDeps.hs
- compiler/GHC/Tc/Solver/Monad.hs
- compiler/GHC/Tc/Solver/Solve.hs
- compiler/GHC/Tc/Utils/Monad.hs
- compiler/GHC/Tc/Utils/Unify.hs


Changes:

=====================================
compiler/GHC/Core/TyCon.hs
=====================================
@@ -1282,6 +1282,7 @@ isNoParent _                   = False
 data Injectivity
   = NotInjective
   | Injective [Bool]   -- 1-1 with tyConTyVars (incl kind vars)
+                       -- INVARIANT: not all False
   deriving( Eq )
 
 -- | Information pertaining to the expansion of a type synonym (@type@)


=====================================
compiler/GHC/Tc/Instance/FunDeps.hs
=====================================
@@ -94,7 +94,7 @@ an equality for the RHS.
 
 Wrinkles:
 
-(1) meta_tvs: sometimes the instance mentions variables in the RHS that
+(IMP1) fd_qtvs: sometimes the instance mentions variables in the RHS that
     are not bound in the LHS.  For example
 
      class C a b | a -> b
@@ -109,7 +109,7 @@ Wrinkles:
     Note that the fd_qtvs can be free in the /first/ component of the Pair,
     but not in the second (which comes from the [W] constraint).
 
-(2) Multi-range fundeps. When these meta_tvs are involved, there is a subtle
+(IMP2) Multi-range fundeps. When these meta_tvs are involved, there is a subtle
     difference between the fundep (a -> b c) and the two fundeps (a->b, a->c).
     Consider
        class D a b c | a -> b c
@@ -125,15 +125,15 @@ Wrinkles:
        FDEqn { fd_qtvs = [x2], fd_eqs = [ Maybe x2 ~ ty ] }
     with two FDEqns, generating two separate unification variables.
 
-(3) improveFromInstEnv doesn't return any equations that already hold.
-    Reason: then we know if any actual improvement has happened, in
-    which case we need to iterate the solver
+(IMP3) improveFromInstEnv doesn't return any equations that already hold.
+    Reason: just an optimisation; the caller does the same thing, but
+    with a bit more ceremony.
 -}
 
 data FunDepEqn
   = FDEqn { fd_qtvs :: [TyVar]   -- Instantiate these type and kind vars
                                  --   to fresh unification vars,
-                                 -- Non-empty only for FunDepEqns arising from instance decls
+                                 -- See (IMP2) in Note [Improving against instances]
 
           , fd_eqs   :: [TypeEqn]  -- Make these pairs of types equal
                                    -- Invariant: In each (Pair ty1 ty2), the fd_qtvs may be
@@ -193,7 +193,8 @@ zipAndComputeFDEqs :: (Type -> Type -> Bool) -- Discard this FDEq if true
 -- Create a list of (Type,Type) pairs from two lists of types,
 -- making sure that the types are not already equal
 zipAndComputeFDEqs discard (ty1:tys1) (ty2:tys2)
- | discard ty1 ty2 = zipAndComputeFDEqs discard tys1 tys2
+ | discard ty1 ty2 = -- See (IMP3) in Note [Improving against instances]
+                     zipAndComputeFDEqs discard tys1 tys2
  | otherwise       = Pair ty1 ty2 : zipAndComputeFDEqs discard tys1 tys2
 zipAndComputeFDEqs _ _ _ = []
 


=====================================
compiler/GHC/Tc/Solver/Default.hs
=====================================
@@ -543,10 +543,9 @@ defaultEquality ct
             -- This handles cases such as @IO alpha[tau] ~R# IO Int@
             -- by defaulting @alpha := Int@, which is useful in practice
             -- (see Note [Defaulting representational equalities]).
-           ; (co, new_eqs, _unifs) <-
-                wrapUnifierX (ctEvidence ct) Nominal $ \uenv ->
-                -- NB: nominal equality!
-                uType uenv z_ty1 z_ty2
+           ; (co, new_eqs) <- wrapUnifier (ctEvidence ct) Nominal $ \uenv ->
+                              -- NB: nominal equality!
+                              uType uenv z_ty1 z_ty2
 
             -- Only accept this solution if no new equalities are produced
             -- by the unifier.


=====================================
compiler/GHC/Tc/Solver/Dict.hs
=====================================
@@ -473,8 +473,8 @@ solveEqualityDict ev cls tys
     do { let (role, t1, t2) = matchEqualityInst cls tys
          -- Unify t1~t2, putting anything that can't be solved
          -- immediately into the work list
-       ; (co, _, _) <- wrapUnifierTcS ev role $ \uenv ->
-                       uType uenv t1 t2
+       ; co <- wrapUnifierAndEmit ev role $ \uenv ->
+               uType uenv t1 t2
          -- Set  d :: (t1~t2) = Eq# co
        ; setWantedEvTerm dest EvCanonical $
          evDictApp cls tys [Coercion co]


=====================================
compiler/GHC/Tc/Solver/Equality.hs
=====================================
@@ -544,7 +544,7 @@ can_eq_nc_forall ev eq_rel s1 s2
       -- Generate the constraints that live in the body of the implication
       -- See (SF5) in Note [Solving forall equalities]
       ; (lvl, (all_co, wanteds)) <- pushLevelNoWorkList (ppr skol_info)   $
-                                    unifyForAllBody ev (eqRelRole eq_rel) $ \uenv ->
+                                    wrapUnifier ev (eqRelRole eq_rel) $ \uenv ->
                                     go uenv skol_tvs init_subst2 bndrs1 bndrs2
 
       -- Solve the implication right away, using `trySolveImplication`
@@ -634,9 +634,9 @@ There are lots of wrinkles of course:
 
 (SF5) Rather than manually gather the constraints needed in the body of the
    implication, we use `uType`.  That way we can solve some of them on the fly,
-   especially Refl ones.  We use the `unifyForAllBody` wrapper for `uType`,
+   especially Refl ones.  We use the `wrapUnifier` wrapper for `uType`,
    because we want to /gather/ the equality constraint (to put in the implication)
-   rather than /emit/ them into the monad, as `wrapUnifierTcS` does.
+   rather than /emit/ them into the monad, as `wrapUnifierAndEmit` does.
 
 (SF6) We solve the implication on the spot, using `trySolveImplication`.  In
    the past we instead generated an `Implication` to be solved later.  Nice in
@@ -808,7 +808,7 @@ can_eq_app ev s1 t1 s2 t2
   = do { traceTcS "can_eq_app" (vcat [ text "s1:" <+> ppr s1, text "t1:" <+> ppr t1
                                      , text "s2:" <+> ppr s2, text "t2:" <+> ppr t2
                                      , text "vis:" <+> ppr (isNextArgVisible s1) ])
-       ; (co,_,_) <- wrapUnifierTcS ev Nominal $ \uenv ->
+       ; co <- wrapUnifierAndEmit ev Nominal $ \uenv ->
             -- Unify arguments t1/t2 before function s1/s2, because
             -- the former have smaller kinds, and hence simpler error messages
             -- c.f. GHC.Tc.Utils.Unify.uType (go_app)
@@ -966,7 +966,7 @@ then we will just decompose s1~s2, and it might be better to
 do so on the spot.  An important special case is where s1=s2,
 and we get just Refl.
 
-So canDecomposableTyConAppOK uses wrapUnifierTcS etc to short-cut
+So canDecomposableTyConAppOK uses wrapUnifierAndEmit etc to short-cut
 that work.  See also Note [Work-list ordering].
 
 Note [Decomposing TyConApp equalities]
@@ -1090,7 +1090,7 @@ up in the complexities of canEqLHSHetero.  To do this:
 * `uType` keeps the bag of emitted constraints in the same
   left-to-right order.  See the use of `snocBag` in `uType_defer`.
 
-* `wrapUnifierTcS` adds the bag of deferred constraints from
+* `wrapUnifierAndEmit` adds the bag of deferred constraints from
   `do_unifications` to the work-list using `extendWorkListChildEqs`.
 
 * `extendWorkListChildEqs` and `selectWorkItem` together arrange that the
@@ -1394,7 +1394,7 @@ canDecomposableTyConAppOK ev eq_rel tc (ty1,tys1) (ty2,tys2)
              -- new_locs and tc_roles are both infinite, so we are
              -- guaranteed that cos has the same length as tys1 and tys2
              -- See Note [Fast path when decomposing TyConApps]
-             -> do { (co, _, _) <- wrapUnifierTcS ev role $ \uenv ->
+             -> do { co <- wrapUnifierAndEmit ev role $ \uenv ->
                         do { cos <- zipWith4M (u_arg uenv) new_locs tc_roles tys1 tys2
                                     -- zipWith4M: see Note [Work-list ordering]
                            ; return (mkTyConAppCo role tc cos) }
@@ -1449,7 +1449,7 @@ canDecomposableFunTy ev eq_rel af f1@(ty1,m1,a1,r1) f2@(ty2,m2,a2,r2)
                   (ppr ev $$ ppr eq_rel $$ ppr f1 $$ ppr f2)
        ; case ev of
            CtWanted (WantedCt { ctev_dest = dest })
-             -> do { (co, _, _) <- wrapUnifierTcS ev Nominal $ \ uenv ->
+             -> do { co <- wrapUnifierAndEmit ev Nominal $ \ uenv ->
                         do { let mult_env = uenv `updUEnvLoc` toInvisibleLoc
                                                  `setUEnvRole` funRole role SelMult
                            ; mult <- uType mult_env m1 m2
@@ -1694,12 +1694,18 @@ canEqCanLHSHetero ev eq_rel swapped lhs1 ps_xi1 ki1 xi2 ps_xi2 ki2
               ; finish emptyRewriterSet (givenCtEvCoercion kind_ev) }
 
       CtWanted {}
-         -> do { (kind_co, cts, unifs) <- wrapUnifierTcS ev Nominal $ \uenv ->
-                                          let uenv' = updUEnvLoc uenv (mkKindEqLoc xi1 xi2)
-                                          in unSwap swapped (uType uenv') ki1 ki2
+         -> do { (unifs, (kind_co, cts)) <- reportUnifications $
+                                            wrapUnifier ev Nominal $ \uenv ->
+                                            let uenv' = updUEnvLoc uenv (mkKindEqLoc xi1 xi2)
+                                            in unSwap swapped (uType uenv') ki1 ki2
                       -- mkKindEqLoc: any new constraints, arising from the kind
                       -- unification, say they thay come from unifying xi1~xi2
-               ; if not (null unifs)
+
+               -- Emit any unsolved kind equalities
+               ; unless (isEmptyBag cts) $
+                 updWorkListTcS (extendWorkListChildEqs ev cts)
+
+               ; if unifs
                  then -- Unifications happened, so start again to do the zonking
                       -- Otherwise we might put something in the inert set that isn't inert
                       startAgainWith (mkNonCanonical ev)
@@ -2037,9 +2043,6 @@ canEqCanLHSFinish_try_unification ev eq_rel swapped lhs rhs
          ; setEvBindIfWanted new_ev EvCanonical $
            evCoercion (mkNomReflCo final_rhs)
 
-         -- Kick out any constraints that can now be rewritten
-         ; recordUnification tv
-
          ; return (Stop new_ev (text "Solved by unification")) }
 
 ---------------------------
@@ -2405,7 +2408,7 @@ FamAppBreaker.
 Why TauTvs? See [Why TauTvs] below.
 
 Critically, we emit the two new constraints (the last two above)
-directly instead of calling wrapUnifierTcS. (Otherwise, we'd end up
+directly instead of calling wrapUnifier. (Otherwise, we'd end up
 unifying cbv1 and cbv2 immediately, achieving nothing.)  Next, we
 unify alpha := cbv1 -> cbv2, having eliminated the occurs check. This
 unification happens immediately following a successful call to


=====================================
compiler/GHC/Tc/Solver/FunDeps.hs
=====================================
@@ -20,7 +20,6 @@ import GHC.Tc.Utils.Unify( UnifyEnv(..) )
 import GHC.Tc.Utils.Monad    as TcM
 import GHC.Tc.Types.Evidence
 import GHC.Tc.Types.Constraint
-import GHC.Tc.Types.CtLoc
 
 import GHC.Core.FamInstEnv
 import GHC.Core.Coercion
@@ -39,27 +38,57 @@ import GHC.Utils.Misc( filterOut )
 import GHC.Data.Pair
 
 
-{- *********************************************************************
-*                                                                      *
-*          Functional dependencies for dictionaries
-*                                                                      *
-************************************************************************
+{- Note [Overview of fundeps]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Here is our plan for dealing with functional dependencies
 
-When we spot an equality arising from a functional dependency,
-we now use that equality (a "wanted") to rewrite the work-item
-constraint right away.  This avoids two dangers
+* When we have failed to solve a Wanted constraint, do this
+  1. Generate any fundep-equalities [FunDepEqn] from that constraint.
+  2. Try to solve that [FunDepEqn]
+  3. If any unifications happened, send the constraint back to the
+     start of the pipeline
 
- Danger 1: If we send the original constraint on down the pipeline
-           it may react with an instance declaration, and in delicate
-           situations (when a Given overlaps with an instance) that
-           may produce new insoluble goals: see #4952
+* Step (1) How we generate those [FunDepEqn] varies:
+       - tryDictFunDeps: for class constraints (C t1 .. tn)
+         we look at top-level instances and inert Givens
+       - tryEqFunDeps: for type-family equalities (F t1 .. tn ~ ty)
+         we look at top-level family instances
+                    and inert Given family equalities
 
- Danger 2: If we don't rewrite the constraint, it may re-react
-           with the same thing later, and produce the same equality
-           again --> termination worries.
+* Step (2). We use `solveFunDeps` to solve the [FunDepEqn] in a nested
+  solver.  Key property:
+
+      The ONLY effect of `solveFunDeps` is possibly to perform unifications:
 
-To achieve this required some refactoring of GHC.Tc.Instance.FunDeps (nicer
-now!).
+      - It entirely discards any unsolved fundep equalities.
+
+      - Ite entirely discards any evidence arising from solving fundep equalities
+
+* Step (3) if we did any unifications in Step (2), we start again with the
+  current unsolved Wanted.  It might now be soluble!
+
+* For Given constraints, things are different:
+    - tryDictFunDeps: we do nothing
+    - tryEqFunDeps: for type-family equalities, we can produce new
+        actual evidence for built-in type families.  E.g.
+             [G] co : 3 ~ x + 1
+        We can produce new evidence
+             [G] co' : x ~ 2
+  So we generate and emit fresh Givens.  See
+     `improveGivenTopFunEqs` and `improveGivenLocalFunEqs`
+  No unification is involved here, just emitting new Givens.
+
+(FD1) Consequences for error messages.
+  Because we discard any unsolved FunDepEqns, we get better error messages.
+  Consider   class C a b | a -> b
+             instance C Int Bool
+  and [W] C Int Char
+  We'll get an insoluble fundep-equality  (Char ~ Bool), but it's very
+  unhelpful to report it.  Much better just to say
+     No instance for C Int Bool
+
+  Similarly if had [W] C Int S, [W] C Int T, it is not helpful to
+  complain about insoluble (S ~ T).
 
 Note [FunDep and implicit parameter reactions]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -107,141 +136,65 @@ Then it is solvable, but its very hard to detect this on the spot.
 It's exactly the same with implicit parameters, except that the
 "aggressive" approach would be much easier to implement.
 
-Note [Fundeps with instances, and equality orientation]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-This Note describes a delicate interaction that constrains the orientation of
-equalities. This one is about fundeps, but the /exact/ same thing arises for
-type-family injectivity constraints: see Note [Improvement orientation].
-
-doTopFunDepImprovement compares the constraint with all the instance
-declarations, to see if we can produce any equalities. E.g
-   class C2 a b | a -> b
-   instance C Int Bool
-Then the constraint (C Int ty) generates the equality [W] ty ~ Bool.
-
-There is a nasty corner in #19415 which led to the typechecker looping:
-   class C s t b | s -> t
-   instance ... => C (T kx x) (T ky y) Int
-   T :: forall k. k -> Type
-
-   work_item: dwrk :: C (T @ka (a::ka)) (T @kb0 (b0::kb0)) Char
-      where kb0, b0 are unification vars
-
-   ==> {doTopFunDepImprovement: compare work_item with instance,
-        generate /fresh/ unification variables kfresh0, yfresh0,
-        emit a new Wanted, and add dwrk to inert set}
-
-   Suppose we emit this new Wanted from the fundep:
-       [W] T kb0 (b0::kb0) ~ T kfresh0 (yfresh0::kfresh0)
-
-   ==> {solve that equality kb0 := kfresh0, b0 := yfresh0}
-   Now kick out dwrk, since it mentions kb0
-   But now we are back to the start!  Loop!
-
-NB1: This example relies on an instance that does not satisfy the
-     coverage condition (although it may satisfy the weak coverage
-     condition), and hence whose fundeps generate fresh unification
-     variables.  Not satisfying the coverage condition is known to
-     lead to termination trouble, but in this case it's plain silly.
-
-NB2: In this example, the third parameter to C ensures that the
-     instance doesn't actually match the Wanted, so we can't use it to
-     solve the Wanted
-
-We solve the problem by (#21703):
+Note [Partial functional dependencies]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Consider this (#12522):
+  type family F x = t | t -> x
+  type instance F (a, Int) = (Int, G a)
+where G is injective; and wanted constraints
+  [W] F (alpha, beta) ~ (Int, <some type>)
 
-    carefully orienting the new Wanted so that all the
-    freshly-generated unification variables are on the LHS.
+The injectivity will give rise to fundep equalities
+  [W] gamma1 ~ alpha
+  [W] Int ~ beta
 
-    Thus we call unifyWanteds on
-       T kfresh0 (yfresh0::kfresh0) ~ T kb0 (b0::kb0)
-    and /NOT/
-       T kb0 (b0::kb0) ~ T kfresh0 (yfresh0::kfresh0)
+The fresh unification variable `gamma1` comes from the fact that we can only do
+"partial improvement" here; see Section 5.2 of "Injective type families for
+Haskell" (HS'15).
 
-Now we'll unify kfresh0:=kb0, yfresh0:=b0, and all is well.  The general idea
-is that we want to preferentially eliminate those freshly-generated
-unification variables, rather than unifying older variables, which causes
-kick-out etc.
+Now it is crucial that, when solving,
+  we unify    gamma1 := alpha    (YES)
+  and not     alpha := gamma1    (NO)
 
-Keeping younger variables on the left also gives very minor improvement in
-the compiler performance by having less kick-outs and allocations (-0.1% on
-average).  Indeed Historical Note [Eliminate younger unification variables]
-in GHC.Tc.Utils.Unify describes an earlier attempt to do so systematically,
-apparently now in abeyance.
+Why?  Because if we do (YES) we'll think we have made some progress
+(some unification has happened), and hence go round again; but actually all we
+have done is to replace `alpha` with `gamma1`.
 
-But this is is a delicate solution. We must take care to /preserve/
-orientation during solving. Wrinkles:
+These "fresh unification variables" in fundep-equalities are ubituitous.
+For example
+      class C a b | a -> b
+      instance .. => C Int [x]
+If we see
+      [W] C Int alpha
+we'll generate a fundep-equality   [W] alpha ~ [beta1]
+where `beta1` is one of those "fresh unification variables
 
-(W1) We start with
-       [W] T kfresh0 (yfresh0::kfresh0) ~ T kb0 (b0::kb0)
-     Decompose to
-       [W] kfresh0 ~ kb0
-       [W] (yfresh0::kfresh0) ~ (b0::kb0)
-     Preserve orientation when decomposing!!
+This problem shows up in several guises; see (at the bottom)
+  * Historical Note [Improvement orientation]
+  * Historical Note [Fundeps with instances, and equality orientation]
 
-(W2) Suppose we happen to tackle the second Wanted from (W1)
-     first. Then in canEqCanLHSHetero we emit a /kind/ equality, as
-     well as a now-homogeneous type equality
-       [W] kco : kfresh0 ~ kb0
-       [W] (yfresh0::kfresh0) ~ (b0::kb0) |> (sym kco)
-     Preserve orientation in canEqCanLHSHetero!!  (Failing to
-     preserve orientation here was the immediate cause of #21703.)
+The solution is super-simple:
 
-(W3) There is a potential interaction with the swapping done by
-     GHC.Tc.Utils.Unify.swapOverTyVars.  We think it's fine, but it's
-     a slight worry.  See especially Note [TyVar/TyVar orientation] in
-     that module.
+  * A fundep-equality is described by `FunDepEqn`, whose `fd_qtvs` field explicitly
+    lists the "fresh variables"
 
-The trouble is that "preserving orientation" is a rather global invariant,
-and sometimes we definitely do want to swap (e.g. Int ~ alpha), so we don't
-even have a precise statement of what the invariant is.  The advantage
-of the preserve-orientation plan is that it is extremely cheap to implement,
-and apparently works beautifully.
+  * Function `instantiateFunDepEqn` instantiates a `FunDepEqn`, and CRUCIALLY
+    gives the new unification variables a level one deeper than the current
+    level.
 
---- Alternative plan (1) ---
-Rather than have an ill-defined invariant, another possiblity is to
-elminate those fresh unification variables at birth, when generating
-the new fundep-inspired equalities.
+  * Now, given `alpha ~ beta`, all the unification machinery guarantees, to
+    unify the variable with the deeper level.  See GHC.Tc.Utils.Unify
+    Note [Deeper level on the left].  That ensures that the fresh `gamma1`
+    will be eliminated in favour of `alpha`. Hooray.
 
-The key idea is to call `instFlexiX` in `emitFunDepWanteds` on only those
-type variables that are guaranteed to give us some progress. This means we
-have to locally (without calling emitWanteds) identify the type variables
-that do not give us any progress.  In the above example, we _know_ that
-emitting the two wanteds `kco` and `co` is fruitless.
+  * Better still, we solve the [FunDepEqn] with
+      solveFunDeps :: CtEvidence -> [FunDepEqn] -> TcS Bool
+    It uses `reportUnifications` to see if any unification happened at this
+    level or outside -- that is, it does NOT report unifications to the fresh
+    unification variables.  So `solveFunDeps` returns True only if it
+    unifies a variable /other than/ the fresh ones.  Bingo.
 
-  Q: How do we identify such no-ops?
-
-  1. Generate a matching substitution from LHS to RHS
-        ɸ = [kb0 :-> k0, b0 :->  y0]
-  2. Call `instFlexiX` on only those type variables that do not appear in the domain of ɸ
-        ɸ' = instFlexiX ɸ (tvs - domain ɸ)
-  3. Apply ɸ' on LHS and then call emitWanteds
-        unifyWanteds ... (subst ɸ' LHS) RHS
-
-Why will this work?  The matching substitution ɸ will be a best effort
-substitution that gives us all the easy solutions. It can be generated with
-modified version of `Core/Unify.unify_tys` where we run it in a matching mode
-and never generate `SurelyApart` and always return a `MaybeApart Subst`
-instead.
-
-The same alternative plan would work for type-family injectivity constraints:
-see Note [Improvement orientation] in GHC.Tc.Solver.Equality.
---- End of Alternative plan (1) ---
-
---- Alternative plan (2) ---
-We could have a new flavour of TcTyVar (like `TauTv`, `TyVarTv` etc; see GHC.Tc.Utils.TcType.MetaInfo)
-for the fresh unification variables introduced by functional dependencies.  Say `FunDepTv`.  Then in
-GHC.Tc.Utils.Unify.swapOverTyVars we could arrange to keep a `FunDepTv` on the left if possible.
-Looks possible, but it's one more complication.
---- End of Alternative plan (2) ---
-
-
---- Historical note: Failed Alternative Plan (3) ---
-Previously we used a flag `cc_fundeps` in `CDictCan`. It would flip to False
-once we used a fun dep to hint the solver to break and to stop emitting more
-wanteds.  This solution was not complete, and caused a failures while trying
-to solve for transitive functional dependencies (test case: T21703)

--- End of Historical note: Failed Alternative Plan (3) --
+Another victory for levels numbers!
 
 Note [Do fundeps last]
 ~~~~~~~~~~~~~~~~~~~~~~
@@ -260,7 +213,7 @@ Consider T4254b:
   If we interact that Wanted with /both/ the top-level instance, /and/ the
   local Given, we'll get
       beta ~ Int   and     beta ~ b
-  respectively.  That would generate (b~Bool), which would fai.  I think
+  respectively.  That would generate (b~Bool), which would fail.  I think
   it doesn't matter which of the two we pick, but historically we have
   picked the local-fundeps first.
 
@@ -273,7 +226,6 @@ Consider T4254b:
 
   (DFL2) is achieved by trying fundeps only on /unsolved/ Wanteds.
 
-
 Note [Weird fundeps]
 ~~~~~~~~~~~~~~~~~~~~
 Consider   class Het a b | a -> b where
@@ -296,6 +248,13 @@ as the fundeps.
 #7875 is a case in point.
 -}
 
+
+{- *********************************************************************
+*                                                                      *
+*          Functional dependencies for dictionaries
+*                                                                      *
+********************************************************************* -}
+
 tryDictFunDeps :: DictCt -> SolverStage ()
 -- (tryDictFunDeps inst_envs cts)
 --   * Generate the fundeps from interacting the
@@ -334,6 +293,7 @@ tryDictFunDepsLocal dict_ct@(DictCt { di_cls = cls, di_ev = work_ev })
          text "imp =" <+> ppr imp $$ text "eqns = " <+> ppr eqns
 
        ; if imp then startAgainWith (CDictCan dict_ct)
+                     -- See (DFL1) of Note [Do fundeps last]
                 else continueWith () }
   where
     work_pred     = ctEvPred work_ev
@@ -436,88 +396,6 @@ and Given/instance fundeps entirely.
     Functional dependencies for type families
 *                                                                    *
 **********************************************************************
-
-Note [Reverse order of fundep equations]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-Consider this scenario (from dependent/should_fail/T13135_simple):
-
-  type Sig :: Type -> Type
-  data Sig a = SigFun a (Sig a)
-
-  type SmartFun :: forall (t :: Type). Sig t -> Type
-  type family SmartFun sig = r | r -> sig where
-    SmartFun @Type (SigFun @Type a sig) = a -> SmartFun @Type sig
-
-  [W] SmartFun @kappa sigma ~ (Int -> Bool)
-
-The injectivity of SmartFun allows us to produce two new equalities:
-
-  [W] w1 :: Type ~ kappa
-  [W] w2 :: SigFun @Type Int beta ~ sigma
-
-for some fresh (beta :: SigType). The second Wanted here is actually
-heterogeneous: the LHS has type Sig Type while the RHS has type Sig kappa.
-Of course, if we solve the first wanted first, the second becomes homogeneous.
-
-When looking for injectivity-inspired equalities, we work left-to-right,
-producing the two equalities in the order written above. However, these
-equalities are then passed into wrapUnifierTcS, which will fail, adding these
-to the work list. However, crucially, the work list operates like a *stack*.
-So, because we add w1 and then w2, we process w2 first. This is silly: solving
-w1 would unlock w2. So we make sure to add equalities to the work
-list in left-to-right order, which requires a few key calls to 'reverse'.
-
-This treatment is also used for class-based functional dependencies, although
-we do not have a program yet known to exhibit a loop there. It just seems
-like the right thing to do.
-
-When this was originally conceived, it was necessary to avoid a loop in T13135.
-That loop is now avoided by continuing with the kind equality (not the type
-equality) in canEqCanLHSHetero (see Note [Equalities with heterogeneous kinds]).
-However, the idea of working left-to-right still seems worthwhile, and so the calls
-to 'reverse' remain.
-
-Note [Improvement orientation]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-See also Note [Fundeps with instances, and equality orientation], which describes
-the Exact Same Problem, with the same solution, but for functional dependencies.
-
-A very delicate point is the orientation of equalities
-arising from injectivity improvement (#12522).  Suppose we have
-  type family F x = t | t -> x
-  type instance F (a, Int) = (Int, G a)
-where G is injective; and wanted constraints
-
-  [W] F (alpha, beta) ~ (Int, <some type>)
-
-The injectivity will give rise to constraints
-
-  [W] gamma1 ~ alpha
-  [W] Int ~ beta
-
-The fresh unification variable gamma1 comes from the fact that we
-can only do "partial improvement" here; see Section 5.2 of
-"Injective type families for Haskell" (HS'15).
-
-Now, it's very important to orient the equations this way round,
-so that the fresh unification variable will be eliminated in
-favour of alpha.  If we instead had
-   [W] alpha ~ gamma1
-then we would unify alpha := gamma1; and kick out the wanted
-constraint.  But when we substitute it back in, it'd look like
-   [W] F (gamma1, beta) ~ fuv
-and exactly the same thing would happen again!  Infinite loop.
-
---->  ToDo: all this fragility has gone away!   Fix the Note! <---
-
-This all seems fragile, and it might seem more robust to avoid
-introducing gamma1 in the first place, in the case where the
-actual argument (alpha, beta) partly matches the improvement
-template.  But that's a bit tricky, esp when we remember that the
-kinds much match too; so it's easier to let the normal machinery
-handle it.  Instead we are careful to orient the new
-equality with the template on the left.  Delicate, but it works.
-
 -}
 
 --------------------
@@ -562,27 +440,18 @@ improveWantedTopFunEqs :: TyCon -> [TcType] -> CtEvidence -> Xi -> TcS Bool
 -- TyCon is definitely a type family
 -- Work-item is a Wanted
 improveWantedTopFunEqs fam_tc args ev rhs_ty
-  = do { eqns <- improve_wanted_top_fun_eqs fam_tc args rhs_ty
+  = do { fd_eqns <- improve_wanted_top_fun_eqs fam_tc args rhs_ty
        ; traceTcS "improveTopFunEqs" (vcat [ text "lhs:" <+> ppr fam_tc <+> ppr args
                                            , text "rhs:" <+> ppr rhs_ty
-                                           , text "eqns:" <+> ppr eqns ])
-       ; unifyFunDeps ev Nominal $ \uenv ->
-         uPairsTcM (bump_depth uenv) (reverse eqns) }
-         -- Missing that `reverse` causes T13135 and T13135_simple to loop.
-         -- See Note [Reverse order of fundep equations]
-               -- ToDo: is this still a problem?
+                                           , text "eqns:" <+> ppr fd_eqns ])
+       ; solveFunDeps ev fd_eqns }
 
-  where
-    bump_depth env = env { u_loc = bumpCtLocDepth (u_loc env) }
-        -- ToDo: this location is wrong; it should be FunDepOrigin2
-        -- See #14778
-
-improve_wanted_top_fun_eqs :: TyCon -> [TcType] -> Xi
-                           -> TcS [TypeEqn]
+improve_wanted_top_fun_eqs :: TyCon -> [TcType] -> Xi -> TcS [FunDepEqn]
 -- TyCon is definitely a type family
 improve_wanted_top_fun_eqs fam_tc lhs_tys rhs_ty
   | Just ops <- isBuiltInSynFamTyCon_maybe fam_tc
-  = return (map snd $ tryInteractTopFam ops fam_tc lhs_tys rhs_ty)
+  = return [FDEqn { fd_qtvs = []
+                  , fd_eqs = map snd $ tryInteractTopFam ops fam_tc lhs_tys rhs_ty }]
 
   -- ToDo: use ideas in #23162 for closed type families; injectivity only for open
 
@@ -593,16 +462,20 @@ improve_wanted_top_fun_eqs fam_tc lhs_tys rhs_ty
        ; top_eqns <- improve_injective_wanted_top fam_envs inj_args fam_tc lhs_tys rhs_ty
        ; let local_eqns = improve_injective_wanted_famfam  inj_args fam_tc lhs_tys rhs_ty
        ; traceTcS "improve_wanted_top_fun_eqs" $
-         vcat [ ppr fam_tc, text "local_eqns" <+> ppr local_eqns, text "top_eqns" <+> ppr top_eqns ]
-  -- xxx ToDo: this does both local and top => bug?
+         vcat [ ppr fam_tc
+              , text "local_eqns" <+> ppr local_eqns
+              , text "top_eqns" <+> ppr top_eqns ]
+              -- xxx ToDo: this does both local and top => bug?
        ; return (local_eqns ++ top_eqns) }
 
   | otherwise  -- No injectivity
   = return []
 
-improve_injective_wanted_top :: FamInstEnvs -> [Bool] -> TyCon -> [TcType] -> Xi -> TcS [TypeEqn]
+improve_injective_wanted_top :: FamInstEnvs -> [Bool] -> TyCon
+                             -> [TcType] -> Xi -> TcS [FunDepEqn]
 -- Interact with top-level instance declarations
 -- See Section 5.2 in the Injective Type Families paper
+-- The injectivity flags [Bool] will not all be False, but nothing goes wrong if they are
 improve_injective_wanted_top fam_envs inj_args fam_tc lhs_tys rhs_ty
   = concatMapM do_one branches
   where
@@ -617,7 +490,7 @@ improve_injective_wanted_top fam_envs inj_args fam_tc lhs_tys rhs_ty
              | otherwise
              = []
 
-    do_one :: CoAxBranch -> TcS [TypeEqn]
+    do_one :: CoAxBranch -> TcS [FunDepEqn]
     do_one branch@(CoAxBranch { cab_tvs = branch_tvs, cab_lhs = branch_lhs_tys, cab_rhs = branch_rhs })
       | let in_scope1 = in_scope `extendInScopeSetList` branch_tvs
       , Just subst <- tcUnifyTyForInjectivity False in_scope1 branch_rhs rhs_ty
@@ -638,9 +511,10 @@ improve_injective_wanted_top fam_envs inj_args fam_tc lhs_tys rhs_ty
                   , text "rhs_ty" <+> ppr rhs_ty
                   , text "subst" <+> ppr subst
                   , text "subst1" <+> ppr subst1 ]
-           ; if apartnessCheck (substTys subst1 branch_lhs_tys) branch
-             then do { traceTcS "improv_inj_top1" (ppr branch_lhs_tys)
-                     ; return (mkInjectivityEqns inj_args (map (substTy subst1) branch_lhs_tys) lhs_tys) }
+           ; let branch_lhs_tys' = substTys subst1 branch_lhs_tys
+           ; if apartnessCheck branch_lhs_tys' branch
+             then do { traceTcS "improv_inj_top1" (ppr branch_lhs_tys')
+                     ; return [mkInjectivityFDEqn inj_args branch_lhs_tys' lhs_tys] }
                   -- NB: The fresh unification variables (from unsubstTvs) are on the left
                   --     See Note [Improvement orientation]
              else do { traceTcS "improve_inj_top2" empty; return []  } }
@@ -651,20 +525,25 @@ improve_injective_wanted_top fam_envs inj_args fam_tc lhs_tys rhs_ty
     in_scope = mkInScopeSet (tyCoVarsOfType rhs_ty)
 
 
-improve_injective_wanted_famfam :: [Bool] -> TyCon -> [TcType] -> Xi -> [TypeEqn]
+improve_injective_wanted_famfam :: [Bool] -> TyCon -> [TcType] -> Xi -> [FunDepEqn]
 -- Interact with itself, specifically  F s1 s2 ~ F t1 t2
+-- The injectivity flags [Bool] will not all be False, but nothing goes wrong if they are
 improve_injective_wanted_famfam inj_args fam_tc lhs_tys rhs_ty
   | Just (tc, rhs_tys) <- tcSplitTyConApp_maybe rhs_ty
   , tc == fam_tc
-  = mkInjectivityEqns inj_args lhs_tys rhs_tys
+  = [mkInjectivityFDEqn inj_args lhs_tys rhs_tys]
   | otherwise
   = []
 
-mkInjectivityEqns :: [Bool] -> [TcType] -> [TcType] -> [TypeEqn]
+mkInjectivityFDEqn :: [Bool] -> [TcType] -> [TcType] -> FunDepEqn
 -- When F s1 s2 s3 ~ F t1 t2 t3, and F has injectivity info [True,False,True]
--- return the equations [Pair s1 t1, Pair s3 t3]
-mkInjectivityEqns inj_args lhs_args rhs_args
-  = [ Pair lhs_arg rhs_arg | (True, lhs_arg, rhs_arg) <- zip3 inj_args lhs_args rhs_args ]
+--   return the FDEqn { fd_eqs = [Pair s1 t1, Pair s3 t3] }
+-- The injectivity flags [Bool] will not all be False, but nothing goes wrong if they are
+mkInjectivityFDEqn inj_args lhs_args rhs_args
+  = FDEqn { fd_qtvs = [], fd_eqs = eqs }
+  where
+    eqs = [ Pair lhs_arg rhs_arg
+          | (True, lhs_arg, rhs_arg) <- zip3 inj_args lhs_args rhs_args ]
 
 ---------------------------------------------
 improveLocalFunEqs :: TyCon -> [TcType] -> EqCt   -- F args ~ rhs
@@ -765,30 +644,23 @@ improveWantedLocalFunEqs funeqs_for_tc fam_tc args work_ev rhs
       = []
 
     --------------------
-    do_one_built_in ops rhs (EqCt { eq_lhs = TyFamLHS _ iargs, eq_rhs = irhs, eq_ev = inert_ev })
+    do_one_built_in ops rhs (EqCt { eq_lhs = TyFamLHS _ iargs, eq_rhs = irhs })
       | irhs `tcEqType` rhs
-      = mk_fd_eqns inert_ev (map snd $ tryInteractInertFam ops fam_tc args iargs)
+      = [FDEqn { fd_qtvs = [], fd_eqs = map snd $ tryInteractInertFam ops fam_tc args iargs }]
       | otherwise
       = []
     do_one_built_in _ _ _ = pprPanic "interactFunEq 1" (ppr fam_tc) -- TyVarLHS
 
     --------------------
     -- See Note [Type inference for type families with injectivity]
-    do_one_injective inj_args rhs (EqCt { eq_lhs = TyFamLHS _ inert_args
-                                        , eq_rhs = irhs, eq_ev = inert_ev })
+    do_one_injective inj_args rhs (EqCt { eq_lhs = TyFamLHS _ inert_args, eq_rhs = irhs })
       | rhs `tcEqType` irhs
-      = mk_fd_eqns inert_ev $ mkInjectivityEqns inj_args args inert_args
+      = [mkInjectivityFDEqn inj_args args inert_args]
       | otherwise
       = []
 
     do_one_injective _ _ _ = pprPanic "interactFunEq 2" (ppr fam_tc)  -- TyVarLHS
 
-    --------------------
-    -- ToDO: fix me
-    mk_fd_eqns :: CtEvidence -> [TypeEqn] -> [FunDepEqn]
-    mk_fd_eqns _inert_ev eqns
-      | null eqns  = []
-      | otherwise  = [ FDEqn { fd_qtvs = [], fd_eqs = eqns } ]
 
 {- Note [Type inference for type families with injectivity]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -870,13 +742,11 @@ just an optimization so we don't lose anything in terms of completeness of
 solving.
 -}
 
-{-
-************************************************************************
+{- *********************************************************************
 *                                                                      *
               Emitting equalities arising from fundeps
 *                                                                      *
-************************************************************************
--}
+********************************************************************* -}
 
 solveFunDeps :: CtEvidence  -- The work item
              -> [FunDepEqn]
@@ -885,16 +755,18 @@ solveFunDeps :: CtEvidence  -- The work item
 -- By "solve" we mean: (only) do unifications.  We do not generate evidence, and
 -- other than unifications there should be no effects whatsoever
 --
--- Return True if some unifications happened
--- See Note [FunDep and implicit parameter reactions]
+-- The returned Bool is True if some unifications happened
+--
+-- See Note [Overview of fundeps]
 solveFunDeps work_ev fd_eqns
   | null fd_eqns
-  = return False -- common case noop
+  = return False -- Common case no-op
 
   | otherwise
   = do { (unif_happened, _res)
-             <- nestFunDepsTcS $
-                do { (_, eqs) <- unifyForAllBody work_ev Nominal do_fundeps
+             <- reportUnifications $
+                nestFunDepsTcS     $
+                do { (_, eqs) <- wrapUnifier work_ev Nominal do_fundeps
                    ; solveSimpleWanteds eqs }
     -- ToDo: why solveSimpleWanteds?  Answer
     --     (a) don't rely on eager unifier
@@ -920,6 +792,7 @@ instantiateFunDepEqn (FDEqn { fd_qtvs = tvs, fd_eqs = eqs })
   where
     rev_eqs = reverse eqs
        -- (reverse eqs): See Note [Reverse order of fundep equations]
+       -- ToDo: is this still a problem?
 
     subst_pair subst (Pair ty1 ty2)
        = Pair (substTyUnchecked subst' ty1) ty2
@@ -934,3 +807,257 @@ instantiateFunDepEqn (FDEqn { fd_qtvs = tvs, fd_eqs = eqs })
                   -- though ty1 will never (currently) be a poytype, so this
                   -- InScopeSet will never be looked at.
 
+
+{- Note [Reverse order of fundep equations]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Consider this scenario (from dependent/should_fail/T13135_simple):
+
+  type Sig :: Type -> Type
+  data Sig a = SigFun a (Sig a)
+
+  type SmartFun :: forall (t :: Type). Sig t -> Type
+  type family SmartFun sig = r | r -> sig where
+    SmartFun @Type (SigFun @Type a sig) = a -> SmartFun @Type sig
+
+  [W] SmartFun @kappa sigma ~ (Int -> Bool)
+
+The injectivity of SmartFun allows us to produce two new equalities:
+
+  [W] w1 :: Type ~ kappa
+  [W] w2 :: SigFun @Type Int beta ~ sigma
+
+for some fresh (beta :: SigType). The second Wanted here is actually
+heterogeneous: the LHS has type Sig Type while the RHS has type Sig kappa.
+Of course, if we solve the first wanted first, the second becomes homogeneous.
+
+When looking for injectivity-inspired equalities, we work left-to-right,
+producing the two equalities in the order written above. However, these
+equalities are then passed into wrapUnifierAndEmit, which will fail, adding these
+to the work list. However, the work list operates like a *stack*.
+So, because we add w1 and then w2, we process w2 first. This is silly: solving
+w1 would unlock w2. So we make sure to add equalities to the work
+list in left-to-right order, which requires a few key calls to 'reverse'.
+
+When this was originally conceived, it was necessary to avoid a loop in T13135.
+That loop is now avoided by continuing with the kind equality (not the type
+equality) in canEqCanLHSHetero (see Note [Equalities with heterogeneous kinds]).
+However, the idea of working left-to-right still seems worthwhile, and so the calls
+to 'reverse' remain.
+
+This treatment is also used for class-based functional dependencies, although
+we do not have a program yet known to exhibit a loop there. It just seems
+like the right thing to do.
+
+In general, I believe this is (now, anyway) just an optimisation, not required
+to avoid loops.
+-}
+
+{- *********************************************************************
+*                                                                      *
+                 Historical notes
+
+     Here are a bunch of Notes that are rendered obselete by
+          Note [Partial functional dependencies]
+
+*                                                                      *
+********************************************************************* -}
+
+{-
+Historical Note [Improvement orientation]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+See also Note [Fundeps with instances, and equality orientation], which describes
+the Exact Same Problem, with the same solution, but for functional dependencies.
+
+A very delicate point is the orientation of equalities
+arising from injectivity improvement (#12522).  Suppose we have
+  type family F x = t | t -> x
+  type instance F (a, Int) = (Int, G a)
+where G is injective; and wanted constraints
+
+  [W] F (alpha, beta) ~ (Int, <some type>)
+
+The injectivity will give rise to constraints
+
+  [W] gamma1 ~ alpha
+  [W] Int ~ beta
+
+The fresh unification variable gamma1 comes from the fact that we
+can only do "partial improvement" here; see Section 5.2 of
+"Injective type families for Haskell" (HS'15).
+
+Now, it's very important to orient the equations this way round,
+so that the fresh unification variable will be eliminated in
+favour of alpha.  If we instead had
+   [W] alpha ~ gamma1
+then we would unify alpha := gamma1; and kick out the wanted
+constraint.  But when we substitute it back in, it'd look like
+   [W] F (gamma1, beta) ~ fuv
+and exactly the same thing would happen again!  Infinite loop.
+
+--->  ToDo: all this fragility has gone away!   Fix the Note! <---
+
+This all seems fragile, and it might seem more robust to avoid
+introducing gamma1 in the first place, in the case where the
+actual argument (alpha, beta) partly matches the improvement
+template.  But that's a bit tricky, esp when we remember that the
+kinds much match too; so it's easier to let the normal machinery
+handle it.  Instead we are careful to orient the new
+equality with the template on the left.  Delicate, but it works.
+
+Historical Note [Fundeps with instances, and equality orientation]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+This Note describes a delicate interaction that constrains the orientation of
+equalities. This one is about fundeps, but the /exact/ same thing arises for
+type-family injectivity constraints: see Note [Improvement orientation].
+
+doTopFunDepImprovement compares the constraint with all the instance
+declarations, to see if we can produce any equalities. E.g
+   class C2 a b | a -> b
+   instance C Int Bool
+Then the constraint (C Int ty) generates the equality [W] ty ~ Bool.
+
+There is a nasty corner in #19415 which led to the typechecker looping:
+   class C s t b | s -> t
+   instance ... => C (T kx x) (T ky y) Int
+   T :: forall k. k -> Type
+
+   work_item: dwrk :: C (T @ka (a::ka)) (T @kb0 (b0::kb0)) Char
+      where kb0, b0 are unification vars
+
+   ==> {doTopFunDepImprovement: compare work_item with instance,
+        generate /fresh/ unification variables kfresh0, yfresh0,
+        emit a new Wanted, and add dwrk to inert set}
+
+   Suppose we emit this new Wanted from the fundep:
+       [W] T kb0 (b0::kb0) ~ T kfresh0 (yfresh0::kfresh0)
+
+   ==> {solve that equality kb0 := kfresh0, b0 := yfresh0}
+   Now kick out dwrk, since it mentions kb0
+   But now we are back to the start!  Loop!
+
+NB1: This example relies on an instance that does not satisfy the
+     coverage condition (although it may satisfy the weak coverage
+     condition), and hence whose fundeps generate fresh unification
+     variables.  Not satisfying the coverage condition is known to
+     lead to termination trouble, but in this case it's plain silly.
+
+NB2: In this example, the third parameter to C ensures that the
+     instance doesn't actually match the Wanted, so we can't use it to
+     solve the Wanted
+
+We solve the problem by (#21703):
+
+    carefully orienting the new Wanted so that all the
+    freshly-generated unification variables are on the LHS.
+
+    Thus we call unifyWanteds on
+       T kfresh0 (yfresh0::kfresh0) ~ T kb0 (b0::kb0)
+    and /NOT/
+       T kb0 (b0::kb0) ~ T kfresh0 (yfresh0::kfresh0)
+
+Now we'll unify kfresh0:=kb0, yfresh0:=b0, and all is well.  The general idea
+is that we want to preferentially eliminate those freshly-generated
+unification variables, rather than unifying older variables, which causes
+kick-out etc.
+
+Keeping younger variables on the left also gives very minor improvement in
+the compiler performance by having less kick-outs and allocations (-0.1% on
+average).  Indeed Historical Note [Eliminate younger unification variables]
+in GHC.Tc.Utils.Unify describes an earlier attempt to do so systematically,
+apparently now in abeyance.
+
+But this is is a delicate solution. We must take care to /preserve/
+orientation during solving. Wrinkles:
+
+(W1) We start with
+       [W] T kfresh0 (yfresh0::kfresh0) ~ T kb0 (b0::kb0)
+     Decompose to
+       [W] kfresh0 ~ kb0
+       [W] (yfresh0::kfresh0) ~ (b0::kb0)
+     Preserve orientation when decomposing!!
+
+(W2) Suppose we happen to tackle the second Wanted from (W1)
+     first. Then in canEqCanLHSHetero we emit a /kind/ equality, as
+     well as a now-homogeneous type equality
+       [W] kco : kfresh0 ~ kb0
+       [W] (yfresh0::kfresh0) ~ (b0::kb0) |> (sym kco)
+     Preserve orientation in canEqCanLHSHetero!!  (Failing to
+     preserve orientation here was the immediate cause of #21703.)
+
+(W3) There is a potential interaction with the swapping done by
+     GHC.Tc.Utils.Unify.swapOverTyVars.  We think it's fine, but it's
+     a slight worry.  See especially Note [TyVar/TyVar orientation] in
+     that module.
+
+The trouble is that "preserving orientation" is a rather global invariant,
+and sometimes we definitely do want to swap (e.g. Int ~ alpha), so we don't
+even have a precise statement of what the invariant is.  The advantage
+of the preserve-orientation plan is that it is extremely cheap to implement,
+and apparently works beautifully.
+
+--- Alternative plan (1) ---
+Rather than have an ill-defined invariant, another possiblity is to
+elminate those fresh unification variables at birth, when generating
+the new fundep-inspired equalities.
+
+The key idea is to call `instFlexiX` in `emitFunDepWanteds` on only those
+type variables that are guaranteed to give us some progress. This means we
+have to locally (without calling emitWanteds) identify the type variables
+that do not give us any progress.  In the above example, we _know_ that
+emitting the two wanteds `kco` and `co` is fruitless.
+
+  Q: How do we identify such no-ops?
+
+  1. Generate a matching substitution from LHS to RHS
+        ɸ = [kb0 :-> k0, b0 :->  y0]
+  2. Call `instFlexiX` on only those type variables that do not appear in the domain of ɸ
+        ɸ' = instFlexiX ɸ (tvs - domain ɸ)
+  3. Apply ɸ' on LHS and then call emitWanteds
+        unifyWanteds ... (subst ɸ' LHS) RHS
+
+Why will this work?  The matching substitution ɸ will be a best effort
+substitution that gives us all the easy solutions. It can be generated with
+modified version of `Core/Unify.unify_tys` where we run it in a matching mode
+and never generate `SurelyApart` and always return a `MaybeApart Subst`
+instead.
+
+The same alternative plan would work for type-family injectivity constraints:
+see Note [Improvement orientation] in GHC.Tc.Solver.Equality.
+--- End of Alternative plan (1) ---
+
+--- Alternative plan (2) ---
+We could have a new flavour of TcTyVar (like `TauTv`, `TyVarTv` etc; see GHC.Tc.Utils.TcType.MetaInfo)
+for the fresh unification variables introduced by functional dependencies.  Say `FunDepTv`.  Then in
+GHC.Tc.Utils.Unify.swapOverTyVars we could arrange to keep a `FunDepTv` on the left if possible.
+Looks possible, but it's one more complication.
+--- End of Alternative plan (2) ---
+
+
+--- Historical note: Failed Alternative Plan (3) ---
+Previously we used a flag `cc_fundeps` in `CDictCan`. It would flip to False
+once we used a fun dep to hint the solver to break and to stop emitting more
+wanteds.  This solution was not complete, and caused a failures while trying
+to solve for transitive functional dependencies (test case: T21703)
+-- End of Historical note: Failed Alternative Plan (3) --
+
+
+Historical Note
+~~~~~~~~~~~~~~~
+This Note (anonymous, but related to dict-solving) is rendered obselete by
+ - Danger 1: solved by Note [Instance and Given overlap]
+ - Danger 2: solved by fundeps being idempotent
+
+When we spot an equality arising from a functional dependency,
+we now use that equality (a "wanted") to rewrite the work-item
+constraint right away.  This avoids two dangers
+
+ Danger 1: If we send the original constraint on down the pipeline
+           it may react with an instance declaration, and in delicate
+           situations (when a Given overlaps with an instance) that
+           may produce new insoluble goals: see #4952
+
+ Danger 2: If we don't rewrite the constraint, it may re-react
+           with the same thing later, and produce the same equality
+           again --> termination worries.
+
+-}


=====================================
compiler/GHC/Tc/Solver/Monad.hs
=====================================
@@ -81,7 +81,7 @@ module GHC.Tc.Solver.Monad (
     lookupInertDict,
 
     -- The Model
-    recordUnification, recordUnifications, kickOutRewritable,
+    recordUnification, kickOutRewritable,
 
     -- Inert Safe Haskell safe-overlap failures
     insertSafeOverlapFailureTcS,
@@ -102,7 +102,7 @@ module GHC.Tc.Solver.Monad (
     instDFunType,
 
     -- Unification
-    wrapUnifierX, wrapUnifierTcS, unifyFunDeps, uPairsTcM, unifyForAllBody,
+    wrapUnifier, wrapUnifierAndEmit, uPairsTcM,
 
     -- MetaTyVars
     newFlexiTcSTy, instFlexiX, instFlexiXTcM,
@@ -908,21 +908,19 @@ data TcSEnv
   = TcSEnv {
       tcs_ev_binds    :: EvBindsVar,
 
-      tcs_unif_lvl  :: IORef (Maybe TcLevel),
-         -- The Unification Level Flag
-         -- Outermost level at which we have unified a meta tyvar
-         -- Starts at Nothing, then (Just i), then (Just j) where j TcS (Bool, a)
+nestFunDepsTcS :: TcS a -> TcS a
 nestFunDepsTcS (TcS thing_inside)
-  = reportUnifications $
-    TcS $ \ env@(TcSEnv { tcs_inerts = inerts_var }) ->
+  = TcS $ \ env@(TcSEnv { tcs_inerts = inerts_var }) ->
     TcM.pushTcLevelM_  $
          -- pushTcLevelTcM: increase the level so that unification variables
          -- allocated by the fundep-creation itself don't count as useful unifications
@@ -1220,6 +1217,10 @@ nestFunDepsTcS (TcS thing_inside)
        ; TcM.traceTc "nestFunDepsTcS {" empty
        ; res <- thing_inside nest_env
        ; TcM.traceTc "nestFunDepsTcS }" empty
+
+       -- Unlike nestTcS, do /not/ do `updateInertsWith`; we are going to
+       -- abandon everything about this sub-computation except its unifications
+
        ; return res }
 
 nestTcS :: TcS a -> TcS a
@@ -1733,72 +1734,22 @@ pushLevelNoWorkList _ (TcS thing_inside)
 *                                                                      *
 ********************************************************************* -}
 
-{- Note [The Unification Level Flag]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-Consider a deep tree of implication constraints
-   forall[1] a.                              -- Outer-implic
-      C alpha[1]                               -- Simple
-      forall[2] c. ....(C alpha[1])....        -- Implic-1
-      forall[2] b. ....(alpha[1] ~ Int)....    -- Implic-2
-
-The (C alpha) is insoluble until we know alpha.  We solve alpha
-by unifying alpha:=Int somewhere deep inside Implic-2. But then we
-must try to solve the Outer-implic all over again. This time we can
-solve (C alpha) both in Outer-implic, and nested inside Implic-1.
-
-When should we iterate solving a level-n implication?
-Answer: if any unification of a tyvar at level n takes place
-        in the ic_implics of that implication.
-
-* What if a unification takes place at level n-1? Then don't iterate
-  level n, because we'll iterate level n-1, and that will in turn iterate
-  level n.
-
-* What if a unification takes place at level n, in the ic_simples of
-  level n?  No need to track this, because the kick-out mechanism deals
-  with it.  (We can't drop kick-out in favour of iteration, because kick-out
-  works for skolem-equalities, not just unifications.)
-
-So the monad-global Unification Level Flag, kept in tcs_unif_lvl keeps
-track of
-  - Whether any unifications at all have taken place (Nothing => no unifications)
-  - If so, what is the outermost level that has seen a unification (Just lvl)
-
-The iteration is done in the simplify_loop/maybe_simplify_again loop in GHC.Tc.Solver.
-
-It is helpful not to iterate unless there is a chance of progress.  #8474 is
-an example:
-
-  * There's a deeply-nested chain of implication constraints.
-       ?x:alpha => ?y1:beta1 => ... ?yn:betan => [W] ?x:Int
-
-  * From the innermost one we get a [W] alpha[1] ~ Int,
-    so we can unify.
-
-  * It's better not to iterate the inner implications, but go all the
-    way out to level 1 before iterating -- because iterating level 1
-    will iterate the inner levels anyway.
-
-(In the olden days when we "floated" thse Derived constraints, this was
-much, much more important -- we got exponential behaviour, as each iteration
-produced the same Derived constraint.)
--}
-
-
 unifyTyVar :: TcTyVar -> TcType -> TcS ()
 -- Unify a meta-tyvar with a type
 -- We should never unify the same variable twice!
+-- C.f. GHC.Tc.Utils.Unify.unifyTyVar
 unifyTyVar tv ty
   = assertPpr (isMetaTyVar tv) (ppr tv) $
     do { liftZonkTcS (TcM.writeMetaTyVar tv ty)  -- Produces a trace message
-       ; recordUnification tv }
+       ; uni_ref <- getWhatUnifications
+       ; wrapTcS $ recordUnification uni_ref tv }
 
 reportUnifications :: TcS a -> TcS (Bool, a)
--- Record whether any unifications are done by thing_inside
+-- Record whether any useful unifications are done by thing_inside
 -- Remember to propagate the information to the enclosing context
 reportUnifications (TcS thing_inside)
   = TcS $ \ env@(TcSEnv { tcs_unif_lvl = outer_ul_var }) ->
-    do { inner_ul_var <- TcM.newTcRef Nothing
+    do { inner_ul_var <- TcM.newTcRef NoUnificationsYet
 
        ; res <- thing_inside (env { tcs_unif_lvl = inner_ul_var })
 
@@ -1806,25 +1757,19 @@ reportUnifications (TcS thing_inside)
        ; mb_inner_lvl <- TcM.readTcRef inner_ul_var
 
        ; case mb_inner_lvl of
-           Just unif_lvl
+           UnificationsDone unif_lvl
              | ambient_lvl `deeperThanOrSame` unif_lvl
              -> -- Some useful unifications took place
-                do { mb_outer_lvl <- TcM.readTcRef outer_ul_var
-                   ; TcM.traceTc "reportUnifications" $
-                     vcat [ text "ambient =" <+> ppr ambient_lvl
-                          , text "unif_lvl =" <+> ppr unif_lvl
-                          , text "mb_outer =" <+> ppr mb_outer_lvl ]
-                   ; case mb_outer_lvl of
-                       Just outer_unif_lvl | unif_lvl `deeperThanOrSame` outer_unif_lvl
-                         -> -- No need to update: outer_unif_lvl is already shallower
-                            return ()
-                       _ -> -- Update the outer level
-                            TcM.writeTcRef outer_ul_var (Just unif_lvl)
+                do { recordUnificationLevel outer_ul_var unif_lvl
                    ; return (True, res) }
 
            _  -> -- No useful unifications
                  return (False, res) }
 
+getWhatUnifications :: TcS (TcRef WhatUnifications)
+getWhatUnifications
+  = TcS $ \env -> return (tcs_unif_lvl env)
+
 traceUnificationFlag :: String -> TcS ()
 traceUnificationFlag str
   = TcS $ \env ->
@@ -1837,7 +1782,8 @@ traceUnificationFlag str
 
 getUnificationFlag :: TcS Bool
 -- We are at ambient level i
--- If the unification flag = Just i, reset it to Nothing and return True
+-- If the unification flag = UnificationsDone i,
+--    reset it to NoUnificationsYet, and return True
 -- Otherwise leave it unchanged and return False
 getUnificationFlag
   = TcS $ \env ->
@@ -1848,39 +1794,13 @@ getUnificationFlag
          vcat [ text "ambient:" <+> ppr ambient_lvl
               , text "unif_lvl:" <+> ppr mb_lvl ]
        ; case mb_lvl of
-           Nothing       -> return False
-           Just unif_lvl | ambient_lvl `strictlyDeeperThan` unif_lvl
-                         -> return False
-                         | otherwise
-                         -> do { TcM.writeTcRef ref Nothing
-                               ; return True } }
-
-recordUnification :: TcTyVar -> TcS ()
-recordUnification tv = setUnificationFlagTo (tcTyVarLevel tv)
-
-recordUnifications :: [TcTyVar] -> TcS ()
-recordUnifications tvs
-  = case tvs of
-      [] -> return ()
-      (tv:tvs) -> do { traceTcS "recordUnifications" (ppr min_tv_lvl $$ ppr tvs)
-                     ; setUnificationFlagTo min_tv_lvl }
-        where
-          min_tv_lvl = foldr (minTcLevel . tcTyVarLevel) (tcTyVarLevel tv) tvs
-
-setUnificationFlagTo :: TcLevel -> TcS ()
--- (setUnificationFlag i) sets the unification level to (Just i)
--- unless it already is (Just j) where j <= i
-setUnificationFlagTo lvl
-  = TcS $ \env ->
-    do { let ref = tcs_unif_lvl env
-       ; mb_lvl <- TcM.readTcRef ref
-       ; case mb_lvl of
-           Just unif_lvl | lvl `deeperThanOrSame` unif_lvl
-                         -> do { TcM.traceTc "set-uni-flag skip" $
-                                 vcat [ text "lvl" <+> ppr lvl, text "unif_lvl" <+> ppr unif_lvl ]
-                               ; return () }
-           _ -> do { TcM.traceTc "set-uni-flag" (ppr lvl)
-                   ; TcM.writeTcRef ref (Just lvl) } }
+           NoUnificationsYet -> return False
+           UnificationsDone unif_lvl
+             | ambient_lvl `strictlyDeeperThan` unif_lvl
+             -> return False
+             | otherwise
+             -> do { TcM.writeTcRef ref NoUnificationsYet
+                   ; return True } }
 
 
 {- *********************************************************************
@@ -2182,77 +2102,30 @@ solverDepthError loc ty
 *                                                                      *
 ************************************************************************
 
-Note [wrapUnifierTcS]
-~~~~~~~~~~~~~~~~~~~
+Note [wrapUnifier]
+~~~~~~~~~~~~~~~~~~
 When decomposing equalities we often create new wanted constraints for
 (s ~ t).  But what if s=t?  Then it'd be faster to return Refl right away.
 
 Rather than making an equality test (which traverses the structure of the type,
-perhaps fruitlessly), we call uType (via wrapUnifierTcS) to traverse the common
+perhaps fruitlessly), we call uType (via wrapUnifier) to traverse the common
 structure, and bales out when it finds a difference by creating a new deferred
 Wanted constraint.  But where it succeeds in finding common structure, it just
 builds a coercion to reflect it.
 
 This is all much faster than creating a new constraint, putting it in the
 work list, picking it out, canonicalising it, etc etc.
-
-Note [unifyFunDeps]
-~~~~~~~~~~~~~~~~~~~
-The Bool returned by `unifyFunDeps` is True if we have unified a variable
-that occurs in the constraint we are trying to solve; it is not in the
-inert set so `wrapUnifierTcS` won't kick it out.  Instead we want to send it
-back to the start of the pipeline.  Hence the Bool.
-
-It's vital that we don't return (not (null unified)) because the fundeps
-may create fresh variables; unifying them (alone) should not make us send
-the constraint back to the start, or we'll get an infinite loop.  See
-Note [Fundeps with instances, and equality orientation] in GHC.Tc.Solver.Dict
-and Note [Improvement orientation] in GHC.Tc.Solver.Equality.
 -}
 
 uPairsTcM :: UnifyEnv -> [TypeEqn] -> TcM ()
 uPairsTcM uenv eqns = mapM_ (\(Pair ty1 ty2) -> uType uenv ty1 ty2) eqns
 
-unifyFunDeps :: CtEvidence -> Role
-             -> (UnifyEnv -> TcM ())
-             -> TcS Bool
-unifyFunDeps ev role do_unifications
-  = do { (_, _, unified) <- wrapUnifierTcS ev role do_unifications
-       ; return (any (`elemVarSet` fvs) unified) }
-         -- See Note [unifyFunDeps]
-  where
-    fvs = tyCoVarsOfType (ctEvPred ev)
-
-unifyForAllBody :: CtEvidence -> Role -> (UnifyEnv -> TcM a)
-                -> TcS (a, Cts)
--- We /return/ the equality constraints we generate,
--- rather than emitting them into the monad.
--- See See (SF5) in Note [Solving forall equalities] in GHC.Tc.Solver.Equality
-unifyForAllBody ev role unify_body
-  = do { (res, cts, unified) <- wrapUnifierX ev role unify_body
-
-       -- Record the unificaions we have done
-       ; recordUnifications unified
-
-       ; return (res, cts) }
-
-wrapUnifierTcS :: CtEvidence -> Role
-               -> (UnifyEnv -> TcM a)  -- Some calls to uType
-               -> TcS (a, Bag Ct, [TcTyVar])
--- Invokes the do_unifications argument, with a suitable UnifyEnv.
--- Emit deferred equalities and kick-out from the inert set as a
--- result of any unifications.
--- Very good short-cut when the two types are equal, or nearly so
--- See Note [wrapUnifierTcS]
---
--- The [TcTyVar] is the list of unification variables that were
--- unified the process; the (Bag Ct) are the deferred constraints.
-
-wrapUnifierTcS ev role do_unifications
-  = do { (res, cts, unified) <- wrapUnifierX ev role do_unifications
-
-       -- Record the unificaions we have done
-       ; recordUnifications unified
+wrapUnifierAndEmit :: CtEvidence -> Role
+                   -> (UnifyEnv -> TcM a)  -- Some calls to uType
+                   -> TcS a
+-- Like wrapUnifier, but emits any unsolved equalities into the work-list
+wrapUnifierAndEmit ev role do_unifications
+  = do { (res, cts) <- wrapUnifier ev role do_unifications
 
        -- Emit the deferred constraints
        -- See Note [Work-list ordering] in GHC.Tc.Solved.Equality
@@ -2263,31 +2136,40 @@ wrapUnifierTcS ev role do_unifications
        ; unless (isEmptyBag cts) $
          updWorkListTcS (extendWorkListChildEqs ev cts)
 
-       ; return (res, cts, unified) }
+       ; return res }
 
-wrapUnifierX :: CtEvidence -> Role
+wrapUnifier :: CtEvidence -> Role
              -> (UnifyEnv -> TcM a)  -- Some calls to uType
-             -> TcS (a, Bag Ct, [TcTyVar])
-wrapUnifierX ev role do_unifications
+             -> TcS (a, Bag Ct)
+-- Invokes the do_unifications argument, with a suitable UnifyEnv.
+-- Very good short-cut when the two types are equal, or nearly so
+--    See Note [wrapUnifier]
+-- The (Bag Ct) are the deferred constraints; we emit them but
+-- also return them
+wrapUnifier ev role do_unifications
   = do { given_eq_lvl <- getInnermostGivenEqLevel
+       ; what_uni_ref <- getWhatUnifications
+
        ; wrapTcS $
-         do { defer_ref   <- TcM.newTcRef emptyBag
-            ; unified_ref <- TcM.newTcRef []
+         do { defer_ref    <- TcM.newTcRef emptyBag
             ; let env = UE { u_role         = role
                            , u_given_eq_lvl = given_eq_lvl
                            , u_rewriters    = ctEvRewriters ev
                            , u_loc          = ctEvLoc ev
                            , u_defer        = defer_ref
-                           , u_unified      = Just unified_ref}
+                           , u_what         = Just what_uni_ref }
               -- u_rewriters: the rewriter set and location from
               -- the parent constraint `ev` are inherited in any
               -- new constraints spat out by the unifier
+              --
+              -- u_what: likewise inherit the WhatUnifications flag,
+              --         so that unifications done here are visible
+              --         to the caller
 
             ; res <- do_unifications env
 
             ; cts     <- TcM.readTcRef defer_ref
-            ; unified <- TcM.readTcRef unified_ref
-            ; return (res, cts, unified) } }
+            ; return (res, cts) } }
 
 
 {-


=====================================
compiler/GHC/Tc/Solver/Solve.hs
=====================================
@@ -132,9 +132,10 @@ simplify_loop n limit definitely_redo_implications
                         ; return (wc { wc_simple = simples1
                                      , wc_impl   = implics1 }) }
 
+       -- See Note [When to iterate: unifications]
        ; unif_happened <- getUnificationFlag
        ; csTraceTcS $ text "unif_happened" <+> ppr unif_happened
-         -- Note [The Unification Level Flag] in GHC.Tc.Solver.Monad
+
        ; maybe_simplify_again (n+1) limit unif_happened wc2 }
 
 data NextAction
@@ -225,10 +226,59 @@ any new unifications, and iterate the implications only if so.
 
 "RAE": Add comment here about fundeps also using this mechanism. And probably
 update name of Note.
--}
 
-{- Note [Expanding Recursive Superclasses and ExpansionFuel]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Note [When to iterate the solver: unifications]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Consider a deep tree of implication constraints
+   forall[1] a.                              -- Outer-implic
+      C alpha[1]                               -- Simple
+      forall[2] c. ....(C alpha[1])....        -- Implic-1
+      forall[2] b. ....(alpha[1] ~ Int)....    -- Implic-2
+
+The (C alpha) is insoluble until we know alpha.  We solve alpha
+by unifying alpha:=Int somewhere deep inside Implic-2. But then we
+must try to solve the Outer-implic all over again. This time we can
+solve (C alpha) both in Outer-implic, and nested inside Implic-1.
+
+When should we iterate solving a level-n implication?
+Answer: if any unification of a tyvar at level n takes place
+        in the ic_implics of that implication.
+
+* What if a unification takes place at level n-1? Then don't iterate
+  level n, because we'll iterate level n-1, and that will in turn iterate
+  level n.
+
+* What if a unification takes place at level n, in the ic_simples of
+  level n?  No need to track this, because the kick-out mechanism deals
+  with it.  (We can't drop kick-out in favour of iteration, because kick-out
+  works for skolem-equalities, not just unifications.)
+
+So the monad-global `WhatUnifications` flag, kept in `tcs_unif_lvl` keeps
+track of whether any unifications at all have taken place, and if so, what
+is the outermost level that has seen a unification. Seee GHC.Tc.Utils.Unify
+Note [WhatUnifications].
+
+The iteration is done in the simplify_loop/maybe_simplify_again loop.
+
+It is helpful not to iterate unless there is a chance of progress.  #8474 is
+an example:
+
+  * There's a deeply-nested chain of implication constraints.
+       ?x:alpha => ?y1:beta1 => ... ?yn:betan => [W] ?x:Int
+
+  * From the innermost one we get a [W] alpha[1] ~ Int,
+    so we can unify.
+
+  * It's better not to iterate the inner implications, but go all the
+    way out to level 1 before iterating -- because iterating level 1
+    will iterate the inner levels anyway.
+
+(In the olden days when we "floated" these Derived constraints, this was
+much, much more important -- we got exponential behaviour, as each iteration
+produced the same Derived constraint.)
+
+Note [Expanding Recursive Superclasses and ExpansionFuel]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 Consider the class declaration (T21909)
 
     class C [a] => C a where


=====================================
compiler/GHC/Tc/Utils/Monad.hs
=====================================
@@ -1907,6 +1907,9 @@ emitSimple ct
 
 emitSimples :: Cts -> TcM ()
 emitSimples cts
+  | null cts
+  = return ()
+  | otherwise
   = do { lie_var <- getConstraintVar ;
          updTcRef lie_var (`addSimples` cts) }
 


=====================================
compiler/GHC/Tc/Utils/Unify.hs
=====================================
@@ -30,14 +30,15 @@ module GHC.Tc.Utils.Unify (
   dsInstantiate,
 
   -- Various unifications
-  unifyType, unifyKind, unifyInvisibleType,
+  uType, unifyType, unifyKind, unifyInvisibleType,
   unifyExprType, unifyTypeAndEmit, promoteTcType,
   swapOverTyVars, touchabilityTest, checkTopShape, lhsPriority,
-  UnifyEnv(..), updUEnvLoc, setUEnvRole,
-  uType,
   mightEqualLater,
   makeTypeConcrete,
 
+  UnifyEnv(..), updUEnvLoc, setUEnvRole,
+  WhatUnifications(..), recordUnification, recordUnificationLevel,
+
   --------------------------------
   -- Holes
   matchExpectedListTy,
@@ -2296,15 +2297,75 @@ unifyTypeAndEmit t_or_k orig ty1 ty2
        ; let env = UE { u_loc = loc, u_role = Nominal
                       , u_given_eq_lvl = cur_lvl
                       , u_rewriters = emptyRewriterSet  -- ToDo: check this
-                      , u_defer = ref, u_unified = Nothing }
+                      , u_defer = ref, u_what = Nothing }
 
        -- The hard work happens here
        ; co <- uType env ty1 ty2
 
+       -- Emit any deferred constraints
        ; cts <- readTcRef ref
-       ; unless (null cts) (emitSimples cts)
+       ; emitSimples cts
+
        ; return co }
 
+
+{- *********************************************************************
+*                                                                      *
+                 WhatUnifications
+*                                                                      *
+**********************************************************************-}
+
+data WhatUnifications
+  = NoUnificationsYet
+  | UnificationsDone TcLevel
+
+{- Note [WhatUnifications]
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+We record, in mutable variable carried by the monad, the `WhatUnifications` flag.
+
+* In the eager unifier (this module) it is held the
+    u_what :: Maybe (TcRef WhatUnificatons)
+  field of `UnifyEnv`
+
+* In TcS monad, it is held in the
+    tcs_unif_lvl :: IORef WhatUnifications
+  field of `TcSEnv`.
+
+In all cases the idea is this:
+
+    ---------------------------------------
+    `WhatUnifications` records the level of the
+     outermost meta-tyvar that we have unified
+    ----------------------------------------
+
+It starts life as `NoUnificationsYet`.  Then when we unify a tyvar at level j,
+we set the flag to `UnificationsDone j`, unless the flag is /already/ set to
+`UnificationsDone i` where i<=j.
+
+Why do all this?
+ * See Note [When to iterate the solver: unifications] in GHC.Tc.Solver.Solve
+-}
+
+recordUnification :: TcRef WhatUnifications -> TcTyVar -> TcM ()
+recordUnification what_ref tv = recordUnificationLevel what_ref (tcTyVarLevel tv)
+
+recordUnificationLevel :: TcRef WhatUnifications -> TcLevel -> TcM ()
+recordUnificationLevel what_ref tv_lvl
+  = do { what <- readTcRef what_ref
+       ; case what of
+           UnificationsDone unif_lvl
+             | tv_lvl `deeperThanOrSame` unif_lvl
+             -> do { traceTc "set-uni-flag: no-op" $
+                     vcat [ text "lvl" <+> ppr tv_lvl, text "unif_lvl" <+> ppr unif_lvl ]
+                   ; return () }
+           _ -> do { traceTc "set-uni-flag" (ppr tv_lvl)
+                   ; writeTcRef what_ref (UnificationsDone tv_lvl) } }
+
+
+instance Outputable WhatUnifications where
+   ppr NoUnificationsYet      = text "NoUniYet"
+   ppr (UnificationsDone lvl) = text "UniDone" <> braces (ppr lvl)
+
 {-
 %************************************************************************
 %*                                                                      *
@@ -2320,7 +2381,7 @@ The eager unifier, `uType`, is called by
     via the wrappers `unifyType`, `unifyKind` etc
 
   * The constraint solver (e.g. in GHC.Tc.Solver.Equality),
-    via `GHC.Tc.Solver.Monad.wrapUnifierTcS`.
+    via `GHC.Tc.Solver.Monad.wrapUnifie`.
 
 `uType` runs in the TcM monad, but it carries a UnifyEnv that tells it
 what to do when unifying a variable or deferring a constraint. Specifically,
@@ -2355,7 +2416,7 @@ data UnifyEnv
 
          -- Which variables are unified;
          -- if Nothing, we don't care
-       , u_unified :: Maybe (TcRef [TcTyVar])
+       , u_what :: Maybe (TcRef WhatUnifications)
     }
 
 setUEnvRole :: UnifyEnv -> Role -> UnifyEnv
@@ -2752,10 +2813,7 @@ uUnfilledVar2 env@(UE { u_defer = def_eq_ref, u_given_eq_lvl = given_eq_lvl })
            -- Only proceed if the kinds match
            -- NB: tv1 should still be unfilled, despite the kind unification
            --     because tv1 is not free in ty2' (or, hence, in its kind)
-         then do { liftZonkM $ writeMetaTyVar tv1 ty2
-                 ; case u_unified env of
-                     Nothing -> return ()
-                     Just uref -> updTcRef uref (tv1 :)
+         then do { unifyTyVar env tv1 ty2
                  ; return (mkNomReflCo ty2) }  -- Unification is always Nominal
 
          else -- The kinds don't match yet, so defer instead.
@@ -2770,6 +2828,14 @@ uUnfilledVar2 env@(UE { u_defer = def_eq_ref, u_given_eq_lvl = given_eq_lvl })
     ty1 = mkTyVarTy tv1
     defer = unSwap swapped (uType_defer env) ty1 ty2
 
+unifyTyVar :: UnifyEnv -> TcTyVar -> TcType -> TcM ()
+-- Actually do the unification, and record it in WhatUnifications
+unifyTyVar (UE { u_what = mb_what_unifications }) tv ty
+  = do { liftZonkM $ writeMetaTyVar tv ty
+       ; case mb_what_unifications of
+           Nothing -> return ()
+           Just wu -> recordUnification wu tv }
+
 swapOverTyVars :: Bool -> TcTyVar -> TcTyVar -> Bool
 swapOverTyVars is_given tv1 tv2
   -- See Note [Unification variables on the left]
@@ -3011,8 +3077,14 @@ The most important thing is that we want to put tyvars with
 the deepest level on the left.  The reason to do so differs for
 Wanteds and Givens, but either way, deepest wins!  Simple.
 
-* Wanteds.  Putting the deepest variable on the left maximise the
+* Wanteds.  Putting the deepest variable on the left maximises the
   chances that it's a touchable meta-tyvar which can be solved.
+  It also /crucial/ for skolem escape.  Consider
+      [W] alpha[7] ~ beta[8]
+      [W] beta[8] ~ a[8]       -- `a` is a skolem
+  If we unify alpha[7]:=beta[8], we will then happily unify
+  beta[8]:=a[8].  But that's wrong because now alpha[7]
+  is unified with an inner skolem a[8].  Disaster.
 
 * Givens. Suppose we have something like
      forall a[2]. b[1] ~ a[2] => beta[1] ~ a[2]



View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/dc8360efa255fb74ed2b90567657409d...

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/dc8360efa255fb74ed2b90567657409d...
You're receiving this email because of your account on gitlab.haskell.org.


    

[Git][ghc/ghc][wip/T23162-spj] More refactoring

Simon Peyton Jones (＠simonpj)