GitLab

Simon Peyton Jones pushed to branch wip/T26115 at Glasgow Haskell Compiler / GHC

Commits:

0d1363c0
by Simon Peyton Jones at 2025-06-28T23:00:40+01:00
```
Wibbles
```
66114346
by Simon Peyton Jones at 2025-06-28T23:37:33+01:00
```
Wibble
```

6 changed files:

compiler/GHC/Core/Opt/Specialise.hs
compiler/GHC/HsToCore/Binds.hs
compiler/GHC/Tc/Gen/Sig.hs
compiler/GHC/Tc/Solver/Dict.hs
compiler/GHC/Tc/Solver/Monad.hs
compiler/GHC/Tc/Solver/Solve.hs

Changes:

@@ -3375,7 +3375,7 @@ beats_or_same (CI { ci_key = args1 }) (CI { ci_key = args2 })
      go_arg (SpecDict {})  (SpecDict {})  = True
      go_arg UnspecType     UnspecType     = True
      go_arg UnspecArg      UnspecArg      = True
 -    go_arg _               _             = False
 +    go_arg _              _              = False
  ----------------------
  splitDictBinds :: FloatedDictBinds -> IdSet -> (FloatedDictBinds, OrdList DictBind, IdSet)
@@ -3491,7 +3491,6 @@ newSpecIdSM old_name new_ty details info
          ; return (assert (not (isCoVarType new_ty)) $
                    mkLocalVar details new_name ManyTy new_ty info) }
+-
  {-
                  Old (but interesting) stuff about unboxed bindings
                  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

compiler/GHC/HsToCore/Binds.hs

@@ -796,216 +796,6 @@ Note [Desugaring new-form SPECIALISE pragmas]
  which is desugared in this module by `dsSpec`.  For the context see
  Note [Handling new-form SPECIALISE pragmas] in GHC.Tc.Gen.Sig
 -Suppose we have f :: forall p q. (Ord p, Eq q) => p -> q -> q, and a pragma
+-
 -  {-# SPECIALISE forall x. f @[a] @[Int] x [3,4] #-}
+-
 -In `dsSpec` the `SpecPragE` will look something like this:
+-
 -  SpecPragE { spe_fn_id = f
 -            , spe_bndrs = @a (d:Ord a) (x:[a])
 -            , spe_call  = let d2:Ord [a] = $dfOrdList d
 -                              d3:Eq [Int] = $dfEqList $dfEqInt
 -                          in f @[a] @[Int] d2 d3 x [3,4] }
 -We want to get
+-
 -    RULE  forall a (d2:Ord a) (d3:Eq [Int]) (x:[a]).
 -             f @[a] @[Int] d2 d3 x [3,4] = $sf d2 x
+-
 -    $sf :: forall a. Ord [a] => a -> Int
 -    $sf = /\a. \d2 x.
 -             let d3 = $dfEqList $dfEqInt
 -             in <f-rhs> @[a] @[Int] d2 d3 x [3,4]
+-
 -Notice that
+-
 -(SP1) If the expression in the SPECIALISE pragma had a type signature, such as
 -     SPECIALISE f :: Eq b => Int -> b -> b
 -  then the desugared expression may have type abstractions and applications
 -  "in the way", like this:
 -     (/\b. (\d:Eq b). let d1 = $dfOrdInt in f @Int @b d1 d) @b (d2:Eq b)
 -  The lambdas come from the type signature, which is then re-instantiated,
 -  hence the applications of those lambdas.
+-
 -  We use the simple optimiser to simplify this to
 -     let { d = d2; d1 = $dfOrdInt } in f @Int @b d1 d
+-
 -  Important: do no inlining in this "simple optimiser" phase:
 -  use `simpleOptExprNoInline`. E.g. we don't want to turn it into
 -     f @Int @b $dfOrdInt d2
 -  because the latter is harder to match.
+-
 -  Similarly if we have
 -     let { d1=d; d2=d } in f d1 d2
 -  we don't want to inline d1/d2 to get this
 -     f d d
+-
 -(SP2) $sf does not simply quantify over (d:Ord a). Instead, to figure out what
 -  it should quantify over, and to include the 'd3' binding in the body of $sf,
 -  we use the function `prepareSpecLHS`. It takes the simplified LHS `core_call`,
 -  and uses the dictionary bindings to figure out the RULE LHS and RHS.
+-
 -  This is described in Note [prepareSpecLHS].
+-
 -Note [prepareSpecLHS]
 -~~~~~~~~~~~~~~~~~~~~~
 -To compute the appropriate RULE LHS and RHS for a new-form specialise pragma,
 -as described in Note [Desugaring new-form SPECIALISE pragmas], we use a function
 -called prepareSpecLHS.
 -It takes as input a list of (type and evidence) binders, and a Core expression.
 -For example, suppose its input is of the following form:
+-
 -  spe_bndrs = @a (d:Ord a)
 -  spe_call =
 -    let
 -      -- call these bindings the call_binds
 -      d1 :: Num Int
 -      d1 = $dfNumInt
 -      d2 :: Ord [a]
 -      d2 = $dfOrdList d
 -      d3 :: Eq a
 -      d3 = $p1Ord d3
 -      d4 :: Ord (F a)
 -      d4 = d |> co1
 -      d5 :: Ord (G a)
 -      d5 = d4 |> co2
 -    in
 -      f @[a] @Int d1 d2 d3 d5
+-
 -The goal of prepareSpecLHS is then to generate the following two things:
+-
 -  - A specialisation, of the form:
+-
 -      $sf <spec_args> =
 -        let <spec_binds>
 -        in <f-rhs> @[a] @Int d1 d2 d3 d5
+-
 -  - A rule, of the form:
+-
 -      RULE forall a d1 d2 d3 d5. f @[a] @Int d1 d2 d3 d5 =
 -        let <rule_binds>
 -        in $sf <spec_args>
+-
 -That is, we must compute 'spec_args', 'rule_binds' and 'spec_binds'. A first
 -approach might be:
+-
 -  - take spec_args = spe_bndrs,
 -  - spec_binds = call_binds.
+-
 -If we did so, the RULE would look like:
+-
 -  RULE forall a d1 d2 d3 d5. f @[a] @Int d1 d2 d3 d5 =
 -    let d = <???>
 -    in $sf @a d
+-
 -The problem is: how do we recover 'd' from 'd1', 'd2', 'd3', 'd5'? Essentially,
 -we need to run call_binds in reverse. In this example, we had:
+-
 -  d1 :: Num Int
 -  d1 = $dfNumInt
 -  d2 :: Ord [a]
 -  d2 = $dfOrdList d
 -  d3 :: Eq a
 -  d3 = $p1Ord d3
 -  d4 :: Ord (F a)
 -  d4 = d |> co1
 -  d5 :: Ord (G a)
 -  d5 = d4 |> co2
+-
 -Let's try to recover (d: Ord a) from 'd1', 'd2', 'd4', 'd5':
+-
 -  - d1 is a constant binding, so it doesn't help us.
 -  - d2 uses a top-level instance, which we can't run in reverse; we can't
 -    obtain Ord a from Ord [a].
 -  - d3 uses a superclass selector which prevents us from making progress.
 -  - d5 is defined using d4, and both involve a cast.
 -    In theory we could define d = d5 |> sym (co1 ; co2), but this gets
 -    pretty complicated.
+-
 -This demonstrates the following:
+-
 -  1. The bindings can't necessarily be "run in reverse".
 -  2. Even if the bindings theoretically can be "run in reverse", it is not
 -     straightforward to do so.
+-
 -Now, we could strive to make the let-bindings reversible. We already do this
 -to some extent for quantified constraints, as explained in
 -Note [Fully solving constraints for specialisation] in GHC.Tc.Gen.Sig,
 -using the TcSSpecPrag solver mode described in Note [TcSSpecPrag] in GHC.Tc.Solver.Monad.
 -However, given (2), we don't bother ensuring that e.g. we don't use top-level
 -class instances like in d2 above. Instead, we handle these bindings in
 -prepareSpecLHS as follows:
+-
 -  (a) Go through the bindings in order.
+-
 -    (1) Bindings like
 -          d1 = $dfNumInt
 -        depend only on constants and move to the specialised function body.
 -        That is crucial -- it makes those specialised methods available in the
 -        specialised body. These are the `spec_const_binds`.
+-
 -        Note that these binds /can/ depend on locally-quantified /type/ variables.
 -        For example, if we have
 -          instance Monad (ST s) where ...
 -        then the dictionary for (Monad (ST s)) is effectively a constant dictionary.
 -        This is important to get specialisation for such types. Example in test T8331.
+-
 -    (2) Other bindings, like
 -          d2:Ord [a] = $dfOrdList d
 -          d3 = d
 -        depend on a locally-quantifed evidence variable `d`.
 -        Surprisingly, /we want to drop these bindings entirely!/
 -        This is because, as explained above, it is too difficult to run these
 -        in reverse. Instead, we drop them, and re-compute which dictionaries
 -        we will quantify over.
+-
 -    (3) Finally, inside those dictionary bindings we should find the call of the
 -        function itself
 -            f @[a] @[Int] d2 d3 x [3,4]
 -        'prepareSpecLHS' takes the call apart and returns its arguments.
+-
 -  (b) Now, (a)(2) means that the RULE does not quantify over 'd' any more; it
 -      quantifies over 'd1' 'd2' 'd3' 'd5'. So we recompute the `rule_bndrs`
 -      from scratch.
+-
 -      Moreover, the specialised function also no longer quantifies over 'd',
 -      it quantifies over 'd2' 'd3' 'd5'. This set of binders is computed by
 -      taking the RULE binders and subtracting off the binders from
 -      the `spec_const_binds`.
+-
 -[Shortcoming] This whole approach does have one downside, compared to running
 -the let-bindings in reverse: it doesn't allow us to common-up dictionaries.
 -Consider for example:
+-
 -  g :: forall a b. ( Eq a, Ord b ) => a -> b -> Bool
 -  {-# SPECIALISE g :: forall c. Ord c => c -> c -> Bool #-}
+-
 -The input to prepareSpecLHS will be (more or less):
+-
 -  spe_bndrs: @c (d:Ord c)
 -  spe_call =
 -    let
 -      d1 :: Eq c
 -      d1 = $p1Ord d
 -      d2 :: Ord c
 -      d2 = d
 -    in
 -      g @c @c d1 d2
+-
 -The approach described in (2) will thus lead us to generate:
+-
 -  RULE g @c @c d1 d2 = $sg @c d1 d2
 -  $sg @c d1 d2 = <g-rhs> @c @c d1 d2
+-
 -when we would rather avoid passing both dictionaries, and instead generate:
+-
 -  RULE g @c @c d1 d2 = let { d = d2 } in $sg @c d
 -  $sg @c d = let { d1 = $p1Ord d; d2 = d } in <g-rhs> @c @c d1 d2
+-
 -For now, we accept this infelicity.
+-
 -Note [Desugaring new-form SPECIALISE pragmas] -- Take 2
 -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  Suppose we have
    f :: forall a b c d. (Ord a, Ord b, Eq c, Ix d) => ...
    f = rhs
@@ -1027,11 +817,11 @@ We /could/ generate
     RULE  f d1 d2 d3 d4 e1..en = $sf d1 d2 d3 d4
     $sf d1 d2 d3 d4 = <rhs> d1 d2 d3 d4
 -But that would do no specialisation! What we want is this:
 +But that would do no specialisation at all! What we want is this:
     RULE  f d1 _d2 _d3 d4 e1..en = $sf d1 d4
     $sf d1 d4 =  let d7 = d1   -- Renaming
 -                    dx1 = d7  -- Renaming
 -                    d6 = dx1
 +                    dx1 = d1  -- Renaming
 +                    d6 = d1   -- Renaming
                      d2 = $fOrdList d6
                      d3 = $fEqList $fEqInt
                  in rhs d1 d2 d3 d4
@@ -1045,46 +835,61 @@ Notice that:
      to line things up
  The transformation goes in these steps
++
  (S1) decomposeCall: decomopose `the_call` into
 -     - `rev_binds`: the enclosing let-bindings (actually reversed)
 +     - `binds`: the enclosing let-bindings
       - `rule_lhs_args`: the arguments of the call itself
 -    We carefully arrange that the dictionary arguments of the actual
 -    call, `rule_lhs_args` are all distinct dictionary variables,
 -    not expressions. How? We use `simpleOptExprNoInline` to avoid
 -    inlining the let-bindings.
++
 +  If the expression in the SPECIALISE pragma had a type signature, such as
 +     SPECIALISE g :: Eq b => Int -> b -> b
 +  then the desugared expression may have type abstractions and applications
 +  "in the way", like this:
 +     (/\b. (\d:Eq b). let d1 = $dfOrdInt in f @Int @b d1 d) @b (d2:Eq b)
 +  The lambdas come from the type signature, which is then re-instantiated,
 +  hence the applications of those lambdas.
++
 +  so `decomposeCall` uses the simple optimiser to simplify this to
 +     let { d = d2; d1 = $dfOrdInt } in f @Int @b d1 d
++
 +  Wrinkle (S1a): do no inlining in this "simple optimiser" phase:
 +  use `simpleOptExprNoInline`. E.g. we don't want to turn it into
 +     f @Int @b $dfOrdInt d2
 +  because the latter is harder to match. Similarly if we have
 +     let { d1=d; d2=d } in f d1 d2
 +  we don't want to inline d1/d2 to get this
 +     f d d
++
 +  TL;DR: as a result the dictionary arguments of the actual call,
 +    `rule_lhs_args` are all distinct dictionary variables, not
 +    expressions.
  (S2) Compute `rule_bndrs`: the free vars of `rule_lhs_args`, which
     will be the forall'd template variables of the RULE.  In the example,
         rule_bndrs = d1,d2,d3,d4
+-
 -(S3) grabSpecBinds: transform `rev_binds` into `spec_binds`: the
 -   bindings we will wrap around the call in the RHS of `$sf`
+-
 -(S4) Find `spec_bndrs`, the subset of `rule_bndrs` that we actually
 -   need to pass to `$sf`, simply by filtering out those that are
 -   bound by `spec_binds`.  In the example
 -      spec_bndrs = d1,d4
+-
+-
 -     Working inner
 -* Grab any bindings we can that will "shadow" the forall'd
 -  rule-bndrs, giving specialised bindings for them.
 -  * We keep a set of known_bndrs starting with {d1,..,dn}
 -  * We keep a binding iff no free var is
 -      (a) in orig_bndrs (i.e. not totally free)
 -      (b) not in known_bndrs
 -  * If we keep it, add its binder to known_bndrs; if not, don't
+-
 -To maximise what we can "grab", start by extracting /renamings/ of the
 -forall'd rule_bndrs, and bringing them to the top.  A renaming is
 -    rule_bndr = d
 -If we see this:
 -  * Bring d=rule_bndr to the top
 -  * Add d to the set of variables to look for on the right.
 -    e.g.    rule_bndrs = d1, d2
 -            Bindings   { d7=d9; d1=d7 }
 -        Bring to the top  { d7=d1; d9=d7 }
+-
 +   These variables will get values from a successful RULE match.
++
 +(S3) `getRenamings`: starting from the rule_bndrs, make bindings for
 +   all other variables that are equal to them.  In the example, we
 +   make renaming-bindings for d7, dx1, d6.
++
 +   NB1: we don't actually have to remove the original bindings;
 +        it's harmless to leave them
 +   NB2: We also reverse bindings like  d1 = d2 |> co, to get
 +           d2 = d1 |> sym co
 +        It's easy and may help.
++
 +(S4) `pickSpecBinds`: pick the bindings we want to keep in the
 +   specialised function.  We start from `known_vars`, the variables we
 +   know, namely the `rule_bndrs` and the binders from (S3), which are
 +   all equal to one of the `rule_bndrs`.
++
 +   Then we keep a binding if the free vars of its RHS are all known.
 +   In our example, `d2` and `d3` are both picked, but `d4` is not.
 +   The non-picked ones won't end up being specialised.
++
 +(S5) Finally, work out which of the `rule_bndrs` we must pass on to
 +   specialised function. We just filter out ones bound by a renaming
 +   or a picked binding.
  -}
  ------------------------
@@ -1155,7 +960,9 @@ dsSpec_help :: Name -> Id -> CoreExpr              -- Function to specialise
              -> InlinePragma -> [Var] -> CoreExpr
              -> DsM (Maybe (OrdList (Id,CoreExpr), CoreRule))
  dsSpec_help poly_nm poly_id poly_rhs spec_inl orig_bndrs ds_call
 -  = do { mb_call_info <- decomposeCall poly_id ds_call
 +  = do { -- Decompose the call
 +         -- Step (S1) of Note [Desugaring new-form SPECIALISE pragmas]
 +         mb_call_info <- decomposeCall poly_id ds_call
         ; case mb_call_info of {
              Nothing -> return Nothing ;
              Just (binds, rule_lhs_args) ->
@@ -1167,12 +974,17 @@ dsSpec_help poly_nm poly_id poly_rhs spec_inl orig_bndrs ds_call
               is_local :: Var -> Bool
               is_local v = v `elemVarSet` locals
 +             -- Find `rule_bndrs`: (S2) of Note [Desugaring new-form SPECIALISE pragmas]
               rule_bndrs = scopedSort (exprsSomeFreeVarsList is_local rule_lhs_args)
 +             -- getRenamings: (S3) of  Note [Desugaring new-form SPECIALISE pragmas]
               rn_binds     = getRenamings orig_bndrs binds rule_bndrs
++
 +             -- pickSpecBinds: (S4) of Note [Desugaring new-form SPECIALISE pragmas]
               known_vars   = mkVarSet rule_bndrs `extendVarSetList` bindersOfBinds rn_binds
               picked_binds = pickSpecBinds is_local known_vars binds
 +             -- Fins `spec_bndrs`: (S5) of Note [Desugaring new-form SPECIALISE pragmas]
               -- Make spec_bndrs, the variables to pass to the specialised
               -- function, by filtering out the rule_bndrs that aren't needed
               spec_binds_bndr_set = mkVarSet (bindersOfBinds picked_binds)
@@ -1262,12 +1074,12 @@ dsSpec_help poly_nm poly_id poly_rhs spec_inl orig_bndrs ds_call
  decomposeCall :: Id -> CoreExpr
                -> DsM (Maybe ([CoreBind], [CoreExpr] ))
  -- Decompose the call into (let <binds> in f <args>)
 +-- See (S1) in Note [Desugaring new-form SPECIALISE pragmas]
  decomposeCall poly_id ds_call
 -  = do { -- Simplify the (desugared) call; see wrinkle (SP1)
 -         -- in Note [Desugaring new-form SPECIALISE pragmas]
 -       ; dflags  <- getDynFlags
 +  = do { dflags  <- getDynFlags
         ; let simpl_opts = initSimpleOpts dflags
               core_call  = simpleOptExprNoInline simpl_opts ds_call
 +               -- simpleOpeExprNoInlint: see Wrinkle (S1a)!
         ; case go [] core_call of {
             Nothing -> do { diagnosticDs (DsRuleLhsTooComplicated ds_call core_call)

compiler/GHC/Tc/Gen/Sig.hs

@@ -759,9 +759,7 @@ This is done in three parts.
      (1) Typecheck the expression, capturing its constraints
 -    (2) Solve these constraints, but in special TcSSpecPrag mode which ensures
 -        each original Wanted is either fully solved or left untouched.
 -        See Note [Fully solving constraints for specialisation].
 +    (2) Solve these constraints
      (3) Compute the constraints to quantify over, using `getRuleQuantCts` on
          the unsolved constraints returned by (2).
@@ -797,68 +795,6 @@ This is done in three parts.
          of the form:
             forall @a @b d1 d2 d3. f d1 d2 d3 = $sf d1 d2 d3
 -Note [Fully solving constraints for specialisation]
 -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 -As far as specialisation is concerned, it is actively harmful to simplify
 -constraints without /fully/ solving them.
+-
 -Example:
+-
 -  f :: ∀ a t. (Eq a, ∀x. Eq x => Eq (t x)). t a -> Char
 -  {-# SPECIALISE f @Int #-}
+-
 -  Typechecking 'f' will result in [W] Eq Int, [W] ∀x. Eq x => Eq (t x).
 -  We absolutely MUST leave the quantified constraint alone, because we want to
 -  quantify over it. If we were to try to simplify it, we would emit an
 -  implication and would thereafter never be able to quantify over the original
 -  quantified constraint.
+-
 -  However, we still need to simplify quantified constraints that can be
 -  /fully solved/ from instances, otherwise we would never be able to
 -  specialise them away. Example: {-# SPECIALISE f @a @[] #-}.
+-
 -The conclusion is this:
+-
 -  when solving the constraints that arise from a specialise pragma, following
 -  the recipe described in Note [Handling new-form SPECIALISE pragmas], all
 -  Wanted quantified constraints should either be:
 -    - fully solved (no free evidence variables), or
 -    - left untouched.
+-
 -To achieve this, we run the solver in a special "all-or-nothing" solving mode,
 -described in Note [TcSSpecPrag] in GHC.Tc.Solver.Monad.
+-
 -Note that a similar problem arises in other situations in which the solver takes
 -an irreversible step, such as using a top-level class instance. This is currently
 -less important, as the desugarer can handle these cases. To explain, consider:
+-
 -    g :: ∀ a. Eq a => a -> Bool
 -    {-# SPECIALISE g @[e] #-}
+-
 -  Typechecking 'g' will result in [W] Eq [e]. Were we to simplify this to
 -  [W] Eq e, we would have difficulty generating a RULE for the specialisation:
+-
 -    $sg :: Eq e => [e] -> Bool
+-
 -    RULE ∀ e (d :: Eq [e]). g @[e] d = $sg @e (??? :: Eq e)
 -      -- Can't fill in ??? because we can't run instances in reverse.
+-
 -    RULE ∀ e (d :: Eq e). g @[e] ($fEqList @e d) = $sg @e d
 -      -- Bad RULE matching template: matches on the structure of a dictionary
+-
 -  Moreover, there is no real benefit to any of this, because the specialiser
 -  can't do anything useful from the knowledge that a dictionary for 'Eq [e]' is
 -  constructed from a dictionary for 'Eq e' using the 'Eq' instance for lists.
+-
 -Here, it would make sense to also use the "solve completely" mechanism in the
 -typechecker to avoid producing evidence terms that we can't "run in reverse".
 -However, the current implementation tackles this issue in the desugarer, as is
 -explained in Note [prepareSpecLHS] in GHC.HsToCore.Binds.
 -So, for the time being at least, in TcSSpecPrag mode, we don't attempt to "fully solve"
 -constraints when we use a top-level instance. This might change in the future,
 -were we to decide to attempt to address [Shortcoming] in Note [prepareSpecLHS]
 -in GHC.HsToCore.Binds.
+-
  Note [Handling old-form SPECIALISE pragmas]
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  NB: this code path is deprecated, and is scheduled to be removed in GHC 9.18, as per #25440.
@@ -1039,7 +975,7 @@ tcSpecPrag poly_id (SpecSigE nm rule_bndrs spec_e inl)
              <- tcRuleBndrs skol_info rule_bndrs $
                 tcInferRho spec_e
 -         -- (2) Solve the resulting wanteds in TcSSpecPrag mode.
 +         -- (2) Solve the resulting wanteds
         ; ev_binds_var <- newTcEvBinds
         ; spec_e_wanted <- setTcLevel rhs_tclvl $
                            runTcSWithEvBinds ev_binds_var  $

compiler/GHC/Tc/Solver/Dict.hs

@@ -462,30 +462,20 @@ from the instance that we have in scope:
          case x of { GHC.Types.I# x1 -> GHC.Types.I# (GHC.Prim.+# x1 1#) }
  ** NB: It is important to emphasize that all this is purely an optimization:
 -** exactly the same programs should typecheck with or without this
 -** procedure.
+-
 -Solving fully
 -~~~~~~~~~~~~~
 -There is a reason why the solver does not simply try to solve such
 -constraints with top-level instances. If the solver finds a relevant
 -instance declaration in scope, that instance may require a context
 -that can't be solved for. A good example of this is:
+-
 -    f :: Ord [a] => ...
 -    f x = ..Need Eq [a]...
+-
 -If we have instance `Eq a => Eq [a]` in scope and we tried to use it, we would
 -be left with the obligation to solve the constraint Eq a, which we cannot. So we
 -must be conservative in our attempt to use an instance declaration to solve the
 -[W] constraint we're interested in.
+-
 -Our rule is that we try to solve all of the instance's subgoals
 -recursively all at once. Precisely: We only attempt to solve
 -constraints of the form `C1, ... Cm => C t1 ... t n`, where all the Ci
 -are themselves class constraints of the form `C1', ... Cm' => C' t1'
 -... tn'` and we only succeed if the entire tree of constraints is
 -solvable from instances.
 +** exactly the same programs should typecheck with or without this procedure.
++
 +Consider
 +       f :: Ord [a] => ...
 +       f x = ..Need Eq [a]...
 +We could use the Eq [a] superclass of the Ord [a], or we could use the top-level
 +instance `Eq a => Eq [a]`.   But if we did the latter we'd be stuck with an
 +insoluble constraint (Eq a).
++
 +So the ShortCutSolving rule is this:
 +   If we could solve a constraint from a local Given,
 +   try first to /completely/ solve the constraint using only top-level instances.
 +   - If that succeeds, use it
 +   - If not, use the local Given
  An example that succeeds:
@@ -511,7 +501,8 @@ An example that fails:
      f :: C [a] b => b -> Bool
      f x = m x == []
 -Which, because solving `Eq [a]` demands `Eq a` which we cannot solve, produces:
 +Which, because solving `Eq [a]` demands `Eq a` which we cannot solve. so short-cut
 +solving fails and we use the superclass of C:
      f :: forall a b. C [a] b => b -> Bool
      f = \ (@ a) (@ b) ($dC :: C [a] b) (eta :: b) ->
@@ -521,23 +512,49 @@ Which, because solving `Eq [a]` demands `Eq a` which we cannot solve, produces:
            (m @ [a] @ b $dC eta)
            (GHC.Types.[] @ a)
 -Note [Shortcut solving: type families]
 -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 -Suppose we have (#13943)
 -  class Take (n :: Nat) where ...
 -  instance {-# OVERLAPPING #-}                    Take 0 where ..
 -  instance {-# OVERLAPPABLE #-} (Take (n - 1)) => Take n where ..
+-
 -And we have [W] Take 3.  That only matches one instance so we get
 -[W] Take (3-1).  Really we should now rewrite to reduce the (3-1) to 2, and
 -so on -- but that is reproducing yet more of the solver.  Sigh.  For now,
 -we just give up (remember all this is just an optimisation).
+-
 -But we must not just naively try to lookup (Take (3-1)) in the
 -InstEnv, or it'll (wrongly) appear not to match (Take 0) and get a
 -unique match on the (Take n) instance.  That leads immediately to an
 -infinite loop.  Hence the check that 'preds' have no type families
 -(isTyFamFree).
 +The moving parts are relatively simple:
++
 +* To attempt to solve the constraint completely, we just recursively
 +  call the constraint solver. See the use of `tryTcS` in
 +  `tcShortCutSolver`.
++
 +* When this attempted recursive solving, we set a special mode
 +  `TcSShortCut`, which signals that we are trying to solve using only
 +  top-level instances.  We switch on `TcSShortCut` mode in
 +  `tryShortCutSolver`.
++
 +* When in TcSShortCut mode, we behave specially in a few places:
 +  - `tryInertDicts`, where we would otherwise look for a Given to solve our Wantee
 +  - `noMatchableGivenDicts`, which also consults the Givens
 +  - `matchLocalInst`, which would otherwise consult Given quantified constraints
 +  - `GHC.Tc.Solver.Instance.Class.matchInstEnv`: when short-cut solving, don't
 +    pick overlappable top-level instances
++
++
 +Some wrinkles:
++
 +(SCS1) Note [Shortcut solving: incoherence]
++
 +(SCS2) The short-cut solver just uses the solver recursively, so we get its
 +  full power:
++
 +    * We need to be able to handle recursive super classes. The
 +      solved_dicts state  ensures that we remember what we have already
 +      tried to solve to avoid looping.
++
 +    * As #15164 showed, it can be important to exploit sharing between
 +      goals. E.g. To solve G we may need G1 and G2. To solve G1 we may need H;
 +      and to solve G2 we may need H. If we don't spot this sharing we may
 +      solve H twice; and if this pattern repeats we may get exponentially bad
 +      behaviour.
++
 +    * Suppose we have (#13943)
 +          class Take (n :: Nat) where ...
 +          instance {-# OVERLAPPING #-}                    Take 0 where ..
 +          instance {-# OVERLAPPABLE #-} (Take (n - 1)) => Take n where ..
++
 +      And we have [W] Take 3.  That only matches one instance so we get
 +      [W] Take (3-1).  Then we should reduce the (3-1) to 2, and continue.
  Note [Shortcut solving: incoherence]
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -603,37 +620,6 @@ The output of `main` if we avoid the optimization under the effect of
  IncoherentInstances is `1`. If we were to do the optimization, the output of
  `main` would be `2`.
 -Note [Shortcut try_solve_from_instance]
 -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 -The workhorse of the short-cut solver is
 -    try_solve_from_instance :: (EvBindMap, DictMap CtEvidence)
 -                            -> CtEvidence       -- Solve this
 -                            -> MaybeT TcS (EvBindMap, DictMap CtEvidence)
 -Note that:
+-
 -* The CtEvidence is the goal to be solved
+-
 -* The MaybeT manages early failure if we find a subgoal that
 -  cannot be solved from instances.
+-
 -* The (EvBindMap, DictMap CtEvidence) is an accumulating purely-functional
 -  state that allows try_solve_from_instance to augment the evidence
 -  bindings and inert_solved_dicts as it goes.
+-
 -  If it succeeds, we commit all these bindings and solved dicts to the
 -  main TcS InertSet.  If not, we abandon it all entirely.
+-
 -Passing along the solved_dicts important for two reasons:
+-
 -* We need to be able to handle recursive super classes. The
 -  solved_dicts state  ensures that we remember what we have already
 -  tried to solve to avoid looping.
+-
 -* As #15164 showed, it can be important to exploit sharing between
 -  goals. E.g. To solve G we may need G1 and G2. To solve G1 we may need H;
 -  and to solve G2 we may need H. If we don't spot this sharing we may
 -  solve H twice; and if this pattern repeats we may get exponentially bad
 -  behaviour.
  Note [No Given/Given fundeps]
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

compiler/GHC/Tc/Solver/Monad.hs

@@ -942,20 +942,16 @@ The constraint solver can operate in different modes:
  * TcSVanilla: Normal constraint solving mode. This is the default.
  * TcSPMCheck: Used by the pattern match overlap checker.
 -      Like TcSVanilla, but the idea is that the returned InertSet will
 -      later be resumed, so we do not want to restore type-equality cycles
 -      See also Note [Type equality cycles] in GHC.Tc.Solver.Equality
 +  Like TcSVanilla, but the idea is that the returned InertSet will
 +  later be resumed, so we do not want to restore type-equality cycles
 +  See also Note [Type equality cycles] in GHC.Tc.Solver.Equality
  * TcSEarlyAbort: Abort (fail in the monad) as soon as we come across an
    insoluble constraint. This is used to fail-fast when checking for hole-fits.
    See Note [Speeding up valid hole-fits].
 -* TcSSpecPrag: Solve constraints fully or not at all. This is described in
 -  Note [TcSSpecPrag].
+-
 -  This mode is currently used in one place only: when solving constraints
 -  arising from specialise pragmas.
 -  See Note [Fully solving constraints for specialisation] in GHC.Tc.Gen.Sig.
 +* TcSShortCut: Solve constraints fully or not at all. This is described in
 +  Note [Shortcut solving] in GHC.Tc.Solver.Dict
  -}
  data TcSEnv
@@ -1126,54 +1122,6 @@ runTcSEarlyAbort tcs
    = do { ev_binds_var <- TcM.newTcEvBinds
         ; runTcSWithEvBinds' TcSEarlyAbort ev_binds_var tcs }
 --- | Run the 'TcS' monad in 'TcSSpecPrag' mode, which either fully solves
 --- individual Wanted quantified constraints or leaves them alone.
 ---
+-
 -{- Note [TcSSpecPrag]
 -~~~~~~~~~~~~~~~~~~~~~
 -The TcSSpecPrag mode is a specialized constraint solving mode that guarantees
 -that Wanted quantified constraints are either:
 -  - Fully solved with no free evidence variables, or
 -  - Left completely untouched (no simplification at all)
+-
 -Examples:
+-
 -  * [W] forall x. Eq x => Eq (f x).
+-
 -    In TcSSpecPrag mode, we **do not** process this quantified constraint by
 -    creating a new implication constraint; we leave it alone instead.
+-
 -  * [W] Eq (Maybe Int).
+-
 -    This constraint is solved fully, using two top-level Eq instances.
+-
 -  * [W] forall x. Eq x => Eq [x].
+-
 -    This constraint is solved fully as well, using the Eq instance for lists.
+-
 -This functionality is crucially used by the specialiser, for which taking an
 -irreversible constraint solving steps is actively harmful, as described in
 -Note [Fully solving constraints for specialisation] in GHC.Tc.Gen.Sig.
+-
 -Note that currently we **do not** refrain from using top-level instances,
 -even though we also can't run them in reverse; this isn't a problem for the
 -specialiser (which is currently the sole consumer of this functionality).
+-
 -The implementation is as follows: in TcSSpecPrag mode, when we are about to
 -solve a Wanted quantified constraint by emitting an implication, we call the
 -special function `solveCompletelyIfRequired`. This function recursively calls
 -the solver but in TcSVanilla mode (i.e. full-blown solving, with no restrictions).
 -If this recursive call manages to solve all the remaining constraints fully,
 -then we proceed with that outcome (i.e. we continue with that inert set, etc).
 -Otherwise, we discard everything that happened in the recursive call, and
 -continue with the original quantified constraint unchanged.
+-
 -In the future, we could consider re-using this functionality for the short-cut
 -solver (see Note [Shortcut solving] in GHC.Tc.Solver.Dict), but we would have to
 -be wary of the performance implications.
 --}
+-
  -- | This can deal only with equality constraints.
  runTcSEqualities :: TcS a -> TcM a
  runTcSEqualities thing_inside
@@ -1282,20 +1230,21 @@ setTcLevelTcS lvl (TcS thing_inside)
  nestImplicTcS :: EvBindsVar
                -> TcLevel -> TcS a
                -> TcS a
 -nestImplicTcS ref inner_tclvl (TcS thing_inside)
 +nestImplicTcS ev_binds_var inner_tclvl (TcS thing_inside)
    = TcS $ \ env@(TcSEnv { tcs_inerts = old_inert_var }) ->
      do { inerts <- TcM.readTcRef old_inert_var
         -- Initialise the inert_cans from the inert_givens of the parent
         -- so that the child is not polluted with the parent's inert Wanteds
 +       -- See Note [trySolveImplication] in GHC.Tc.Solver.Solve
 +       -- All other InertSet fields are inherited
         ; let nest_inert = inerts { inert_cycle_breakers = pushCycleBreakerVarStack
                                                              (inert_cycle_breakers inerts)
                                   , inert_cans = (inert_givens inerts)
                                                     { inert_given_eqs = False } }
 -                 -- All other InertSet fields are inherited
         ; new_inert_var <- TcM.newTcRef nest_inert
         ; new_wl_var    <- TcM.newTcRef emptyWorkList
 -       ; let nest_env = env { tcs_ev_binds = ref
 +       ; let nest_env = env { tcs_ev_binds = ev_binds_var
                              , tcs_inerts   = new_inert_var
                              , tcs_worklist = new_wl_var }
         ; res <- TcM.setTcLevel inner_tclvl $
@@ -1306,7 +1255,7 @@ nestImplicTcS ref inner_tclvl (TcS thing_inside)
  #if defined(DEBUG)
         -- Perform a check that the thing_inside did not cause cycles
 -       ; ev_binds <- TcM.getTcEvBindsMap ref
 +       ; ev_binds <- TcM.getTcEvBindsMap ev_binds_var
         ; checkForCyclicBinds ev_binds
  #endif
         ; return res }

compiler/GHC/Tc/Solver/Solve.hs

@@ -260,6 +260,11 @@ more meaningful error message (see T19627)
  This also applies for quantified constraints; see `-fqcs-fuel` compiler flag and `QCI.qci_pend_sc` field.
  -}
 +{- ********************************************************************************
 +*                                                                                 *
 +*                    Solving implication constraints                              *
 +*                                                                                 *
 +******************************************************************************** -}
  solveNestedImplications :: Bag Implication
                          -> TcS (Bag Implication)
@@ -280,7 +285,22 @@ solveNestedImplications implics
         ; return unsolved_implics }
 +{- Note [trySolveImplication]
 +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 +`trySolveImplication` may be invoked while solving simple wanteds, notably from
 +`solveWantedForAll`.  It returns a Bool to say if solving succeeded or failed.
++
 +It used `nestImplicTcS` to build a nested scope.  One subtle point is that
 +`nestImplicTcS` uses the `inert_givens` (not the `inert_cans`) of the current
 +inert set to initialse the `InertSet` of the nested scope.  It super-important not
 +to pollute the sub-solving problem with the unsolved Wanteds of the current scope.
++
 +Whenever we do `solveSimpleGivens`, we snapshot the `inert_cans` into `inert_givens`.
 +(At that moment there should be no Wanteds.)
 +-}
++
  trySolveImplication :: Implication -> TcS Bool
 +-- See Note [trySolveImplication]
  trySolveImplication (Implic { ic_tclvl  = tclvl
                              , ic_binds  = ev_binds_var
                              , ic_given  = given_ids
@@ -977,6 +997,7 @@ solveSimpleGivens givens
         -- Capture the Givens in the inert_givens of the inert set
         -- for use by subsequent calls of nestImplicTcS
 +       -- See Note [trySolveImplication]
         ; updInertSet (\is -> is { inert_givens = inert_cans is })
         ; cans <- getInertCans
@@ -1368,6 +1389,8 @@ solveWantedForAll qci tvs theta body_pred
                               , unitBag (mkNonCanonical $ CtWanted wanted_ev)) }
         ; traceTcS "solveForAll {" (ppr skol_tvs $$ ppr given_ev_vars $$ ppr wanteds $$ ppr w_id)
++
 +       -- Try to solve the constraint completely
         ; ev_binds_var <- TcS.newTcEvBinds
         ; solved <- trySolveImplication $
                     (implicationPrototype loc_env)
@@ -1381,9 +1404,13 @@ solveWantedForAll qci tvs theta body_pred
         ; traceTcS "solveForAll }" (ppr solved)
         ; evbs <- TcS.getTcEvBindsMap ev_binds_var
         ; if not solved
 -         then do { addInertForAll qci
 +         then do { -- Not completely solved; abandon that attempt and add the
 +                   -- original constraint to the inert set
 +                   addInertForAll qci
                   ; stopWith (CtWanted wtd) "Wanted forall-constraint:unsolved" }
 -         else do { setWantedEvTerm dest EvCanonical $
++
 +         else do { -- Completely solved; build an evidence terms
 +                   setWantedEvTerm dest EvCanonical $
                     EvFun { et_tvs = skol_tvs, et_given = given_ev_vars
                           , et_binds = evBindMapBinds evbs, et_body = w_id }
                   ; stopWith (CtWanted wtd) "Wanted forall-constraint:solved" } }
@@ -1404,36 +1431,68 @@ solveWantedForAll qci tvs theta body_pred
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  Solving a wanted forall (quantified) constraint
    [W] df :: forall a b. (Eq a, Ord b) => C x a b
 -is delightfully easy.   Just build an implication constraint
 +is delightfully easy in principle.   Just build an implication constraint
      forall ab. (g1::Eq a, g2::Ord b) => [W] d :: C x a
  and discharge df thus:
      df = /\ab. \g1 g2. let <binds> in d
  where <binds> is filled in by solving the implication constraint.
  All the machinery is to hand; there is little to do.
 -We can take a more straightforward parth when there is a matching Given, e.g.
 -  [W] dg :: forall c d. (Eq c, Ord d) => C x c d
 -In this case, it's better to directly solve the Wanted from the Given, instead
 -of building an implication. This is more than a simple optimisation; see
 -Note [Solving Wanted QCs from Given QCs].
 +There are some tricky corners though:
++
 +(WFA1) We can take a more straightforward parth when there is a matching Given, e.g.
 +          [W] dg :: forall c d. (Eq c, Ord d) => C x c d
 +    In this case, it's better to directly solve the Wanted from the Given, instead
 +    of building an implication. This is more than a simple optimisation; see
 +    Note [Solving Wanted QCs from Given QCs].
++
 +(WFA2) Termination: see #19690.  We want to maintain the invariant (QC-INV):
++
 +    (QC-INV) Every quantified constraint returns a non-bottom dictionary
++
 +  just as every top-level instance declaration guarantees to return a non-bottom
 +  dictionary.  But as #19690 shows, it is possible to get a bottom dicionary
 +  by superclass selection if we aren't careful.  The situation is very similar
 +  to that described in Note [Recursive superclasses] in GHC.Tc.TyCl.Instance;
 +  and we use the same solution:
++
 +  * Give the Givens a CtOrigin of (GivenOrigin (InstSkol IsQC head_size))
 +  * Give the Wanted a CtOrigin of (ScOrigin IsQC NakedSc)
++
 +  Both of these things are done in solveForAll.  Now the mechanism described
 +  in Note [Solving superclass constraints] in GHC.Tc.TyCl.Instance takes over.
++
 +(WFA3) We do not actually emit an implication to solve later.  Rather we
 +    try to solve it completely immediately using `trySolveImplication`
 +    - If successful, we can build evidence
 +    - If unsuccessful, we abandon the attempt and add the unsolved
 +      forall-constraint to the inert set.
 +    Several reasons for this "solve immediately" approach
 -The tricky point is about termination: see #19690.  We want to maintain
 -the invariant (QC-INV):
 +    - It saves quite a bit of plumbing, tracking the emitted implications for
 +      later solving; and the evidence would have to contain as-yet-incomplte
 +      bindings which complicates tracking of unused Givens.
 -  (QC-INV) Every quantified constraint returns a non-bottom dictionary
 +    - We get better error messages, about failing to solve, say
 +             (forall a. a->a) ~ (forall b. b->Int)
 -just as every top-level instance declaration guarantees to return a non-bottom
 -dictionary.  But as #19690 shows, it is possible to get a bottom dicionary
 -by superclass selection if we aren't careful.  The situation is very similar
 -to that described in Note [Recursive superclasses] in GHC.Tc.TyCl.Instance;
 -and we use the same solution:
 +    - Consider
 +        f :: forall f a. (Ix a, forall x. Eq x => Eq (f x)) => a -> f a
 +        {-# SPECIALISE f :: forall f. (forall x. Eq x => Eq (f x)) => Int -> f Int #-}
 +      This SPECIALISE is treated like an expression with a type signature, so
 +      we instantiate the constraints, simplify them and re-generalise.  From the
 +      instantaiation we get  [W] d :: (forall x. Eq a => Eq (f x))
 +      and we want to generalise over that.  We do not want to attempt to solve it
 +      and them get stuck, and emit an error message.  If we can't solve it, better
 +      to leave it alone
 -* Give the Givens a CtOrigin of (GivenOrigin (InstSkol IsQC head_size))
 -* Give the Wanted a CtOrigin of (ScOrigin IsQC NakedSc)
 +      We still need to simplify quantified constraints that can be
 +      /fully solved/ from instances, otherwise we would never be able to
 +      specialise them away. Example: {-# SPECIALISE f @[] @a #-}.
 -Both of these things are done in solveForAll.  Now the mechanism described
 -in Note [Solving superclass constraints] in GHC.Tc.TyCl.Instance takes over.
 +    You might worry about the wasted work, but it is seldom repeated (because the
 +    constraint solver seldom iterates much).
  Note [Solving a Given forall-constraint]
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~