[GHC] #13848: Unexpected order of variable quantification with GADT constructor

#13848: Unexpected order of variable quantification with GADT constructor -------------------------------------+------------------------------------- Reporter: RyanGlScott | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.0.1 (Type checker) | Keywords: | Operating System: Unknown/Multiple TypeApplications | Architecture: | Type of failure: None/Unknown Unknown/Multiple | Test Case: | Blocked By: Blocking: | Related Tickets: Differential Rev(s): | Wiki Page: -------------------------------------+------------------------------------- Here's some code: {{{#!hs {-# LANGUAGE DataKinds #-} {-# LANGUAGE ExistentialQuantification #-} {-# LANGUAGE GADTs #-} {-# LANGUAGE KindSignatures #-} {-# LANGUAGE ScopedTypeVariables #-} module Foo where data N = Z | S N data Vec (n :: N) a where VNil :: forall a. Vec Z a VCons :: forall n a. a -> Vec n a -> Vec (S n) a }}} I want to use `TypeApplications` on `VCons`. I tried doing so in GHCi: {{{ GHCi, version 8.2.0.20170523: http://www.haskell.org/ghc/ :? for help Loaded GHCi configuration from /home/rgscott/.ghci [1 of 1] Compiling Foo ( Foo.hs, interpreted ) Ok, modules loaded: Foo. λ> :set -XTypeApplications -XDataKinds λ> :t VCons @Z @Int 1 VNil <interactive>:1:8: error: • Expected a type, but ‘'Z’ has kind ‘N’ • In the type ‘Z’ In the expression: VCons @Z @Int 1 VNil <interactive>:1:11: error: • Expected kind ‘N’, but ‘Int’ has kind ‘*’ • In the type ‘Int’ In the expression: VCons @Z @Int 1 VNil <interactive>:1:17: error: • Couldn't match type ‘'Z’ with ‘Int’ Expected type: Vec Int 'Z Actual type: Vec 'Z 'Z • In the fourth argument of ‘VCons’, namely ‘VNil’ In the expression: VCons @Z @Int 1 VNil }}} Huh? That's strange, I would have expected the first type application to be of kind `N`, and the second to be of kind `*`. But GHC disagrees! {{{ λ> :set -fprint-explicit-foralls λ> :type +v VCons VCons :: forall a (n :: N). a -> Vec n a -> Vec ('S n) a }}} That's downright unintuitive to me, since I explicitly specified the order in which the quantified variables should appear in the type signature for `VCons`. Similarly, if you leave out the `forall`s: {{{#!hs data Vec (n :: N) a where VNil :: Vec Z a VCons :: a -> Vec n a -> Vec (S n) a }}} Then `:type +v` will also report the same quantified variable order for `VCons`. This is perhaps less surprising, since the `n` and `a` in `data Vec (n :: N) a` don't scope over the constructors, so GHC must infer the topological order in which the variables appear in each constructor. But I would certainly not expect GHC to do this if I manually specify the order with `forall`. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13848 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13848: Unexpected order of variable quantification with GADT constructor -------------------------------------+------------------------------------- Reporter: RyanGlScott | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Compiler (Type | Version: 8.0.1 checker) | Keywords: Resolution: | TypeApplications Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by RyanGlScott): Above, I implied that GHC was inferring the topological order for `a` and `n` in the type signature for `VCons` even with an explicit `forall`. I just realized there's more to it after reading `Note [mkGADTVars]`. [http://git.haskell.org/ghc.git/blob/5c93df90a96494229b60bbed0971a4b08c0326a6... The relevant excerpt]: {{{ Note [mkGADTVars] ~~~~~~~~~~~~~~~~~ Running example: data T (k1 :: *) (k2 :: *) (a :: k2) (b :: k2) where MkT :: T x1 * (Proxy (y :: x1), z) z We need the rejigged type to be MkT :: forall (x1 :: *) (k2 :: *) (a :: k2) (b :: k2). forall (y :: x1) (z :: *). (k2 ~ *, a ~ (Proxy x1 y, z), b ~ z) => T x1 k2 a b You might naively expect that z should become a universal tyvar, not an existential. (After all, x1 becomes a universal tyvar.) The problem is that the universal tyvars must have exactly the same kinds as the tyConTyVars. z has kind * while b has kind k2. So we need an existential tyvar and a heterogeneous equality constraint. }}} So what's //really// going on is that GHC is putting all of the universally quantified variables (`a`) before the existentially quantified ones (`n`). Still, I still think that GHC ought to adhere to what the user wrote if they bother to write an explicit `forall`, because otherwise you have to trace out the variable order that GHC happens to infer. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13848#comment:1 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13848: Unexpected order of variable quantification with GADT constructor -------------------------------------+------------------------------------- Reporter: RyanGlScott | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Compiler (Type | Version: 8.0.1 checker) | Keywords: Resolution: | TypeApplications Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by RyanGlScott): The question remains: how could we preserve the order in which the user wrote the `forall`d variables? The assumption that they should be partitioned into separate universal and existential groups is baked pretty deeply into the GHC AST, as `DataCon` has separate `dcUnivTyVars` and `dcExTyVars` fields, and never the twain shall meet. One possible solution is to introduce a `dcOrigTyVars :: Maybe [TyVar]` field, which is `Just` the `forall`d variables (in whatever order the user writes) or `Nothing` otherwise. GHC could then use this information when computing `dataConUserType`, which I believe GHC uses for determining the order in which `TypeApplications` happen. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13848#comment:2 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13848: Unexpected order of variable quantification with GADT constructor -------------------------------------+------------------------------------- Reporter: RyanGlScott | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Compiler (Type | Version: 8.0.1 checker) | Keywords: Resolution: | TypeApplications Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): Hmm. Fortunately we already have just the hook we need. Every interesting data constructor (including every GADT data con) has a wrapper; it is built by `MkId.mkDataConRep`. Look at `wrapper_reqd` to see which data cons have wrappers. The wrapper mediates between the nice user view of the data constructor and the internal "representation data constructor" that GHC uses internally. So for example {{{ data T a where MkT :: forall p q. p -> q -> T (p,q) }}} The representation data con has this type {{{ MkT :: forall r. forall p q. (r ~ (p,q)) => p -> q -> T r }}} But the wrapper is defined like this {{{ $WMkT :: forall p q. p -> q -> T (p,q) $WMkT = /\p q. \(x::p) (y::q). MkT @(p,q) @p @q <(p,q)> x y }}} The `<(p,q)>` is a coercion argument (refl) witnessing `(p,q)~(p,q)` Now currently `dataConUserType` (which claims to show the user-written type of the data con) is thus (in `DataCon`): {{{ dataConUserType (MkData { dcUnivTyVars = univ_tvs, dcExTyVars = ex_tvs, dcEqSpec = eq_spec, dcOtherTheta = theta, dcOrigArgTys = arg_tys, dcOrigResTy = res_ty }) = mkForAllTys (filterEqSpec eq_spec univ_tvs) $ mkForAllTys ex_tvs $ mkFunTys theta $ mkFunTys arg_tys $ res_ty }}} This `filterEqSpec` business is fishy! But now we understand about wrappers, for data constructors that have wrappers we could insead use the type of the wrapper, thus: {{{ dataConUserTYpe (MkData { dcRep = rep , dcRepType = rep_ty } = case rep of DCR { dcr_wrap_id = wrap_id } -> idType wrap_id NoDataConRep -> rep_ty -- If there is no wrapper, the "rep-type" is the same -- as the "user type" }}} Whizzo. Now, to return to the ticket, * We should ensure that the wrapper type reflects exactly the type the user wrote including type variable order * That in turn may force to make a wrapper when we don't right now, just to swizzle round the type variables Make sense? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13848#comment:3 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13848: Unexpected order of variable quantification with GADT constructor -------------------------------------+------------------------------------- Reporter: RyanGlScott | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Compiler (Type | Version: 8.0.1 checker) | Keywords: Resolution: | TypeApplications Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by RyanGlScott): I don't think it's going to be that simple. If you look at the implementation of `mkDataConRep`, you'll notice that it figures out the type of the `wrap_id` like this: {{{#!hs wrap_ty = dataConUserType data_con }}} So obviously we don't want to change `dataConUserType`'s implementation to use `idType wrap_id`, since that would be a circular definition.
We should ensure that the wrapper type reflects exactly the type the user wrote including type variable order
Sure, but that's easier said than done. The only info you have in `mkDataConId` to determine the wrapper type is the `DataCon` itself, and at that point, the type variables have already been carved up into the universal and existential ones, with no way to recover the original order. So I don't see any way to make this work without having a separate `dcOrigTyVars` fields, as proposed in comment:2. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13848#comment:4 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13848: Unexpected order of variable quantification with GADT constructor -------------------------------------+------------------------------------- Reporter: RyanGlScott | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Compiler (Type | Version: 8.0.1 checker) | Keywords: Resolution: | TypeApplications Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): Well it would take a bit more plumbing, I agree. You'd need to take the `qtkvs` returned by `tcGadtSigType` in `TcTyClsDecls.tcConDecl` and pass it on to `buildDataCon`, which can in turn give them to `mkDataConRep` to use instead of calling `dataConUserType`. I'm not certain that's all, but I think so. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13848#comment:5 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13848: Unexpected order of variable quantification with GADT constructor -------------------------------------+------------------------------------- Reporter: RyanGlScott | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Compiler (Type | Version: 8.0.1 checker) | Keywords: Resolution: | TypeApplications Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by RyanGlScott): OK, I think I'm starting to see the picture more clearly. However, some of the details are still hazy to me. In particular, `tcConDecl` is far from the only place that uses `buildDataCon` to construct a `DataCon`. However, as far as I can tell, `tcConDecl` is the //only// place that knows exactly what order the user wrote the constructor's type variables in. This is important because we might write a GADT to an interface file and read it back in [http://git.haskell.org/ghc.git/blob/78c80c250021ccb7a84afaabebe0d69f9b9372ee... tyConToIfaceDecl], which also uses `buildDataCon`. However, we've lost the order of the type variables used in the wrapper type by that point, since `IfaceData` doesn't store the wrapper type! This suggests to me that we need to beef up `IfaceData` to accommodate this change. Should we be storing the wrapper type as a field of `IfaceData` as well? Or am I barking up the wrong tree? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13848#comment:6 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13848: Unexpected order of variable quantification with GADT constructor -------------------------------------+------------------------------------- Reporter: RyanGlScott | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Compiler (Type | Version: 8.0.1 checker) | Keywords: Resolution: | TypeApplications Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): Yes you are right. So yes, we have to put more info in the `IfaceConDecl`. Putting in the wrapper type doesn't feel right -- duplicates too much. Better to put in the original tyvars. And in that case we may as well record them in the `DataCon` too, as you suggested earlier. Then we wouldn't need to pass them to `mkDataConRep` after all, because they'll be in the `DataCon`. One fiddly thing I'm not sure about: in the declaration of `IfaceConDecl` we see {{{ data IfaceConDecl = IfCon { ifConName :: IfaceTopBndr, -- Constructor name ifConWrapper :: Bool, -- True <=> has a wrapper ifConInfix :: Bool, -- True <=> declared infix -- The universal type variables are precisely those -- of the type constructor of this data constructor -- This is *easy* to guarantee when creating the IfCon -- but it's not so easy for the original TyCon/DataCon -- So this guarantee holds for IfaceConDecl, but *not* for DataCon }}} Reasoning is explained in `MkIface` where we convert a `DataCon` to a `ConDecl` {{{ -- Tidy the univ_tvs of the data constructor to be identical -- to the tyConTyVars of the type constructor. This means -- (a) we don't need to redundantly put them into the interface file -- (b) when pretty-printing an Iface data declaration in H98-style syntax, -- we know that the type variables will line up -- The latter (b) is important because we pretty-print type constructors -- by converting to IfaceSyn and pretty-printing that con_env1 = (fst tc_env1, mkVarEnv (zipEqual "ifaceConDecl" univ_tvs tc_tyvars)) -- A bit grimy, perhaps, but it's simple! }}} Meddling with `IfaceConDecl` is not a big deal... it's just a serialisation format, and changes are very localised. So feel free to suggest what to do. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13848#comment:7 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13848: Unexpected order of variable quantification with GADT constructor -------------------------------------+------------------------------------- Reporter: RyanGlScott | Owner: (none) Type: bug | Status: closed Priority: normal | Milestone: Component: Compiler (Type | Version: 8.0.1 checker) | Keywords: Resolution: duplicate | TypeApplications Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: #11721 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by RyanGlScott): * status: new => closed * resolution: => duplicate * related: => #11721 Comment: Oh dear, I just realized this is a duplicate of #11721. Closing in favor of that ticket. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13848#comment:8 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13848: Unexpected order of variable quantification with GADT constructor
-------------------------------------+-------------------------------------
Reporter: RyanGlScott | Owner: (none)
Type: bug | Status: closed
Priority: normal | Milestone:
Component: Compiler (Type | Version: 8.0.1
checker) | Keywords:
Resolution: duplicate | TypeApplications
Operating System: Unknown/Multiple | Architecture:
| Unknown/Multiple
Type of failure: None/Unknown | Test Case:
Blocked By: | Blocking:
Related Tickets: #11721 | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by Ben Gamari
participants (1)
-
GHC