[GHC] #14060: TH-reified types can suffer from kind signature oversaturation

#14060: TH-reified types can suffer from kind signature oversaturation -------------------------------------+------------------------------------- Reporter: RyanGlScott | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Template | Version: 8.2.1 Haskell | Keywords: | Operating System: Unknown/Multiple Architecture: | Type of failure: None/Unknown Unknown/Multiple | Test Case: | Blocked By: Blocking: | Related Tickets: Differential Rev(s): | Wiki Page: -------------------------------------+------------------------------------- I ran into this issue when attempting to fix #11785 in `TcSplice`. Here's a simple Template Haskell program: {{{#!hs {-# LANGUAGE DataKinds #-} {-# LANGUAGE TemplateHaskell #-} module Bug where import Data.Proxy import Language.Haskell.TH newtype Foo = Foo (Proxy '[False, True, False]) $(return []) main :: IO () main = putStrLn $(reify ''Foo >>= stringE . pprint) }}} The output of this program is unfortunately noisy: {{{ $ runghc Bug.hs newtype Bug.Foo = Bug.Foo (Data.Proxy.Proxy ((':) 'GHC.Types.False ((':) 'GHC.Types.True ((':) 'GHC.Types.False ('[] :: [GHC.Types.Bool]) :: [GHC.Types.Bool]) :: [GHC.Types.Bool]) :: [GHC.Types.Bool])) }}} Somehow, we've ended up with four copies of a `[GHC.Types.Bool]` kind annotation! It definitely feels like Template Haskell could be more conservative here in choosing which types get explicit kind annotations - at the most, I'd only expect `'[]` to get one. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14060 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14060: TH-reified types can suffer from kind signature oversaturation -------------------------------------+------------------------------------- Reporter: RyanGlScott | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Template Haskell | Version: 8.2.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by RyanGlScott): [http://git.haskell.org/ghc.git/blob/7089dc2f12f9616771fc1de143e9b974157405d8... Here] is why Template Haskell is choosing to attach a kind signature to every occurrence of `'[]` and `'(:)`: {{{#!hs {- Note [Kind annotations on TyConApps] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ A poly-kinded tycon sometimes needs a kind annotation to be unambiguous. For example: type family F a :: k type instance F Int = (Proxy :: * -> *) type instance F Bool = (Proxy :: (* -> *) -> *) It's hard to figure out where these annotations should appear, so we do this: Suppose the tycon is applied to n arguments. We strip off the first n arguments of the tycon's kind. If there are any variables left in the result kind, we put on a kind annotation. But we must be slightly careful: it's possible that the tycon's kind will have fewer than n arguments, in the case that the concrete application instantiates a result kind variable with an arrow kind. So, if we run out of arguments, we conservatively put on a kind annotation anyway. This should be a rare case, indeed. Here is an example: data T1 :: k1 -> k2 -> * data T2 :: k1 -> k2 -> * type family G (a :: k) :: k type instance G T1 = T2 type instance F Char = (G T1 Bool :: (* -> *) -> *) -- F from above Here G's kind is (forall k. k -> k), and the desugared RHS of that last instance of F is (G (* -> (* -> *) -> *) (T1 * (* -> *)) Bool). According to the algorithm above, there are 3 arguments to G so we should peel off 3 arguments in G's kind. But G's kind has only two arguments. This is the rare special case, and we conservatively choose to put the annotation in. See #8953 and test th/T8953. -} }}} To summarize, `'[]` gets a kind annotation because its kind is `[k]`, which has a kind variable. Fair enough. But what about `'(:) 'False ('[] :: [Bool])`? The kind of `'(:)` is `forall k. k -> [k] -> [k]`. It is applied to two arguments, so we drop the first two arguments to `'(:)`'s kind, leaving us with `[k]`. That has a kind variable, so TH chooses to attach a kind signature to it. But this doesn't feel right, does it? After all, `'(:)`'s first arguments have monomorphic kinds, which makes the kind of `'(:) 'False ('[] :: [Bool])` (`[Bool]`) monomorphic as well. Therefore, I propose this additional stipulation: if the tycon is applied to any required arguments, then apply their kinds to the tycon's kind before doing further analysis. That way, in the `'(:) 'False ('[] :: [Bool])` example, we'd start with `'(:)`'s kind being `Bool -> [Bool] -> [Bool]`, drop the first two arguments and be left with `[Bool]`, which is monomorphic (and thus does not warrant a kind annotation). On the other hand, `[]`'s argument `Bool` is not required, so we end up with a kind of `[k]`, which warrants a kind annotation as expected. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14060#comment:1 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14060: TH-reified types can suffer from kind signature oversaturation -------------------------------------+------------------------------------- Reporter: RyanGlScott | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Template Haskell | Version: 8.2.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by goldfire): I'm not sure what "if the tycon is applied to any required arguments, then apply their kinds to the tycon's kind before doing further analysis" means. I suppose it could mean that you try to solve for the kind variables by doing unification and seeing if unification fixes the kind variables. Then apply the unifying substitution to the remainder of the kind. But that's quite a bit of work! Define "remainder of the tycon's kind" to mean the portion of the kind left over after all components of the kind corresponding to provided type arguments are removed. Do no instantiation of kind variables here. The current rule is: A. Include a kind annotation when the remainder of the tycon's kind mentions a kind variable. What if we change it to: B. Include a kind annotation when the remainder of the tycon's kind mentions a kind variable ''and'' that kind variable is not mentioned in any arguments that have been dropped (when finding the remainder). (But don't count occurrences in the arguments to non-injective type families.) Why is this new rule good? Because we're assuming that any reified type will have a fixed, known kind. When reifying a type application, we reify all its type arguments as well. Thus, all the type arguments will have a fixed, known kind. And thus any kind variables in the tycon's kind mentioned in that argument's expected kind will also be fixed. The exception here is for type families, which aren't injective. Happily, this keeps the annotation on `'[]` but drops it on `'(:)`. I will restate (B) to be a bit more implementable: * We have a tyconapp `T ty1 ... tyn`. The tycon `T` has kind `ki0`. * If `T` is oversaturated, no annotation is necessary. Otherwise, we can assume `ki0` has the form `forall k1 ... km. s1 -> ... -> sn -> p`. * Let `K` be the set of kind variables (that is, a subset of `{k1, ..., km}`) that occur in "injective positions" in `s1 ... sn`. An injective position in a type `ty` is either `ty` itself, an injective position within an injective argument to a tycon, an injective position in a function, or an injective position in the argument to a type variable. * Include an annotation only when `p` (the remainder of the tycon's kind) contains a kind variable ''not'' in the set `K`. What do we think? Might this work? I do think this is considerably easier than invoking unification. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14060#comment:2 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14060: TH-reified types can suffer from kind signature oversaturation -------------------------------------+------------------------------------- Reporter: RyanGlScott | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Template Haskell | Version: 8.2.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by RyanGlScott): After re-reading, I'm not sure myself what "if the tycon is applied to any required arguments, then apply their kinds to the tycon's kind before doing further analysis" means. But you've laid out a much better plan of attach than I ever could. Thanks!
If `T` is oversaturated, no annotation is necessary. Otherwise, we can assume `ki0` has the form `forall k1 ... km. s1 -> ... -> sn -> p`.
An injective position in a type `ty` is either `ty` itself, an injective
What happens if we have a dependent quantifier like `forall km -> s1 -> ... -> sn -> p`? Would `km` be included in `K`? My hunch is "yes". position within an injective argument to a tycon, an injective position in a function, or an injective position in the argument to a type variable. The "`ty` itself" bit confuses me. Did you mean "a `ty` itself is in an injective position if `ty` is a tyvar"? I'm guessing you didn't intend for it to be interpreted as "a `ty` itself is always in an injective position", since that would mean that all the variables in `InjTyFam a ... z` would count, where `InjTyFam` is an injective type family.
I do think this is considerably easier than invoking unification.
Certainly. But I know close to nothing about how GHC's unifier works, so this is a pretty low bar for me personally :) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14060#comment:3 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14060: TH-reified types can suffer from kind signature oversaturation -------------------------------------+------------------------------------- Reporter: RyanGlScott | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Template Haskell | Version: 8.2.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by goldfire): Yes, a dependent quantifier would be in `K`. Yes about your initial understanding about "`ty` itself": I was saying that the tyvar `a` is in an injective position within the type `a`. Actually, GHC's pure unifier is easy enough to invoke, but I think that route is likely to be harder to reason about than my algorithm above. Yes, it's a bit long to state, but I think just a few lines of Haskell code will implement it nicely. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14060#comment:4 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14060: TH-reified types can suffer from kind signature oversaturation -------------------------------------+------------------------------------- Reporter: RyanGlScott | Owner: (none) Type: bug | Status: patch Priority: normal | Milestone: Component: Template Haskell | Version: 8.2.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D3807 Wiki Page: | -------------------------------------+------------------------------------- Changes (by RyanGlScott): * status: new => patch * differential: => Phab:D3807 -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14060#comment:5 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14060: TH-reified types can suffer from kind signature oversaturation
-------------------------------------+-------------------------------------
Reporter: RyanGlScott | Owner: (none)
Type: bug | Status: patch
Priority: normal | Milestone:
Component: Template Haskell | Version: 8.2.1
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture:
| Unknown/Multiple
Type of failure: None/Unknown | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s): Phab:D3807
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by Ben Gamari

#14060: TH-reified types can suffer from kind signature oversaturation -------------------------------------+------------------------------------- Reporter: RyanGlScott | Owner: (none) Type: bug | Status: closed Priority: normal | Milestone: 8.4.1 Component: Template Haskell | Version: 8.2.1 Resolution: fixed | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D3807 Wiki Page: | -------------------------------------+------------------------------------- Changes (by bgamari): * status: patch => closed * resolution: => fixed * milestone: => 8.4.1 -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14060#comment:7 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler
participants (1)
-
GHC