How does topological sorting of kind variables really work?

I'm doing some work in GHC that requires explicitly applying kinds to the kind variables of a data type. I noticed that the order in which the implicit kind variables are sorted is not as predictable as I originally thought. To illustrate what I mean, here is a GHCi session I had recently (with GHC HEAD): $ /opt/ghc/head/bin/ghci -XTypeInType GHCi, version 8.1.20160412: http://www.haskell.org/ghc/ :? for help Loaded GHCi configuration from /home/ryanglscott/.ghci > data T (a :: j -> k -> l) (b :: (x, y, z)) = MkT > :t MkT MkT :: forall x y z l k j (a :: j -> k -> l) (b :: (x, y, z)). T a b Much to my surprise, the order of the kind variables wasn't [j, k, l, x, y, z], but rather [x, y, z, l, k, j]! I know that type variables are sorted with a stable topological sort [1], but there must be some other subtlety that I'm missing. How does GHC come up with the order [x, y, z, l, k, j]? Ryan S. ----- [1] http://git.haskell.org/ghc.git/blob/d81cdc227cd487659995ddea577214314c9b4b97...

I don't think this has to do with topological sorting so much as inferring the order of undeclared (but written) kind variables. You could, with -XTypeInType, always make the order explicit:
data T :: forall j k l x y z. (j -> k -> l) -> (x, y, z) -> * where MkT :: T a b
The order of these variables is derived from the extractHsTyRdrTyVars function and friends in RnTypes. I'm sure you could look closely at those functions and see where the behavior is coming from. It's very likely an improvement can be found here.
Richard
On Apr 12, 2016, at 7:47 PM, Ryan Scott
I'm doing some work in GHC that requires explicitly applying kinds to the kind variables of a data type. I noticed that the order in which the implicit kind variables are sorted is not as predictable as I originally thought. To illustrate what I mean, here is a GHCi session I had recently (with GHC HEAD):
$ /opt/ghc/head/bin/ghci -XTypeInType GHCi, version 8.1.20160412: http://www.haskell.org/ghc/ :? for help Loaded GHCi configuration from /home/ryanglscott/.ghci
data T (a :: j -> k -> l) (b :: (x, y, z)) = MkT :t MkT MkT :: forall x y z l k j (a :: j -> k -> l) (b :: (x, y, z)). T a b
Much to my surprise, the order of the kind variables wasn't [j, k, l, x, y, z], but rather [x, y, z, l, k, j]! I know that type variables are sorted with a stable topological sort [1], but there must be some other subtlety that I'm missing. How does GHC come up with the order [x, y, z, l, k, j]?
Ryan S. ----- [1] http://git.haskell.org/ghc.git/blob/d81cdc227cd487659995ddea577214314c9b4b97... _______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

Aha! For some reason, it never clicked that you could use forall
within a datatype's kind signature like that. That technique would, at
the very least, allow me to guarantee the order in which the kind
variables appear. Thanks!
It's a bit of a shame that type inference doesn't give you a reliable
order, but I suppose if you rely on -XTypeApplications working a
certain way, you're taking a risk by NOT using a forall.
Ryan S.
On Tue, Apr 12, 2016 at 10:30 PM, Richard Eisenberg
I don't think this has to do with topological sorting so much as inferring the order of undeclared (but written) kind variables. You could, with -XTypeInType, always make the order explicit:
data T :: forall j k l x y z. (j -> k -> l) -> (x, y, z) -> * where MkT :: T a b
The order of these variables is derived from the extractHsTyRdrTyVars function and friends in RnTypes. I'm sure you could look closely at those functions and see where the behavior is coming from. It's very likely an improvement can be found here.
Richard
On Apr 12, 2016, at 7:47 PM, Ryan Scott
wrote: I'm doing some work in GHC that requires explicitly applying kinds to the kind variables of a data type. I noticed that the order in which the implicit kind variables are sorted is not as predictable as I originally thought. To illustrate what I mean, here is a GHCi session I had recently (with GHC HEAD):
$ /opt/ghc/head/bin/ghci -XTypeInType GHCi, version 8.1.20160412: http://www.haskell.org/ghc/ :? for help Loaded GHCi configuration from /home/ryanglscott/.ghci
data T (a :: j -> k -> l) (b :: (x, y, z)) = MkT :t MkT MkT :: forall x y z l k j (a :: j -> k -> l) (b :: (x, y, z)). T a b
Much to my surprise, the order of the kind variables wasn't [j, k, l, x, y, z], but rather [x, y, z, l, k, j]! I know that type variables are sorted with a stable topological sort [1], but there must be some other subtlety that I'm missing. How does GHC come up with the order [x, y, z, l, k, j]?
Ryan S. ----- [1] http://git.haskell.org/ghc.git/blob/d81cdc227cd487659995ddea577214314c9b4b97... _______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

On Apr 12, 2016, at 10:44 PM, Ryan Scott
It's a bit of a shame that type inference doesn't give you a reliable order, but I suppose if you rely on -XTypeApplications working a certain way, you're taking a risk by NOT using a forall.
Inference is meant to give you a reliable ordering. If it's not, that's a bug!

Well, I'd be hesitant to call it "reliable":
$ /opt/ghc/8.0.1/bin/ghci -XRankNTypes -XTypeInType
-fprint-explicit-foralls -fprint-explicit-kinds
GHCi, version 8.0.0.20160411: http://www.haskell.org/ghc/ :? for help
Loaded GHCi configuration from /home/ryanglscott/.ghci
> import Data.Proxy
> let wat :: forall (a :: j -> k -> l) (b :: (x, y, z)). Proxy a
-> Proxy b; wat = undefined
> :t wat
wat
:: forall {j} {k} {l} {x} {y} {z} {a :: j -> k -> l} {b :: (x, y,
z)}.
Proxy (j -> k -> l) a -> Proxy (x, y, z) b
> data Wat (a :: j -> k -> l) (b :: (x, y, z)) = Wat (Proxy a) (Proxy b)
> :t Wat
Wat
:: forall {x} {y} {z} {l} {k} {j} {a :: j -> k -> l} {b :: (x, y,
z)}.
Proxy (j -> k -> l) a -> Proxy (x, y, z) b -> Wat x y z l k j a b
In the example above, a plain old function gives me the ordering I'd
expect, but a data constructor does not. Now, maybe there's something
tricky involving data that makes this different - I'm not qualified
enough to say for sure. But it has made me a lot more paranoid about
checking the order of the variables with :t before I use
-XTypeApplications, since you never know what you're going to get
sometimes.
Ryan S.
On Tue, Apr 12, 2016 at 10:45 PM, Richard Eisenberg
On Apr 12, 2016, at 10:44 PM, Ryan Scott
wrote: It's a bit of a shame that type inference doesn't give you a reliable order, but I suppose if you rely on -XTypeApplications working a certain way, you're taking a risk by NOT using a forall.
Inference is meant to give you a reliable ordering. If it's not, that's a bug!

| > It's a bit of a shame that type inference doesn't give you a reliable | > order, but I suppose if you rely on -XTypeApplications working a | > certain way, you're taking a risk by NOT using a forall. | | Inference is meant to give you a reliable ordering. If it's not, that's a | bug! I thought that inferred foralls are Invisible binders (see Note [TyBinders and VisibilityFlags]); and you can't use visible type application for Invisible binders. I thought that was because it's hard to specify the order of inferred type variable, just as Ryan says. So /is/ inference meant to give a reliable ordering? If so why? Also I don't understand why Ryan needs a reliable ordering. So you can see I'm a bit confused here Simon

I thought that inferred foralls are Invisible binders (see Note [TyBinders and VisibilityFlags]); and you can't use visible type application for Invisible binders.
In this particular case, j, k, l, x, y, and z aren't invisible binders, but rather specified ones. And you certainly can use visible type application on specified binders; this is important for my use case.
Also I don't understand why Ryan needs a reliable ordering.
I guess I should elaborate on what exactly my use case is. I'm trying to fix Trac #10604 (i.e., make Generic1 poly-kinded). As part of this effort, I need to generalize the (:.:) type [2] from GHC.Generics like so: newtype (:.:) (f :: k2 -> *) (g :: k1 -> k2) (p :: k1) = Comp1 (f (g a)) I also need to change a part of TcGenGenerics where I apply the (:.:) type constructor to some other types [3]. Previously, (:.:) was monokinded, so I only needed to apply (:.:) to some type arguments. But now that (:.:) is poly-kinded, I need to also apply it to two kind arguments! Now for the interesting part: in which order do you give the kind arguments for (:.:)? I originally thought the order in which you'd do it would be [k2,k1] (since that's the order they appear in the definition from left to right), but in reality, it's actually [k1,k2] due to the reasons mentioned earlier in this thread. This is quite bewildering, especially since in GHC 7.10.3, it was the former order, but in GHC 8.0 it's the latter order. Who's to say that the order won't change again in a later GHC, and then deriving Generic1 will silently break as a result? So at this point, I see my options as follows: 1. Change the definition of (:.:) to use the GADT syntax that Richard proposed. This would give a stable ordering for the kind variables, but I'm loathe to make such a drastic change to the surface syntax to fix a silly problem like this. 2. Investigate a fix to the implicitly quantified ordering issue. I'm not sure where the issue could lie, and just from looking at extract_lty [4], it's not clear at all what causes the current ordering. 3. Live with the issue, and just change the code in TcGenGenerics if the kind variable ordering ever changes. This is a bummer, but looks like the simplest option. Ryan S. ----- [1] https://ghc.haskell.org/trac/ghc/ticket/10604 [2] http://hackage.haskell.org/package/base-4.8.2.0/docs/GHC-Generics.html#t::.: [3] http://git.haskell.org/ghc.git/blob/d81cdc227cd487659995ddea577214314c9b4b97... [4] http://git.haskell.org/ghc.git/blob/d81cdc227cd487659995ddea577214314c9b4b97...

I can't say for sure, but this could be related to #4012.
The order that things appear in lists often depends on the way that uniques
get allocated. I know for sure that inferred type signatures are
nondeterministic right now, see [1] for an attempt to fix it.
The easiest way to check if it affects you is to reverse the order of
allocated uniques by passing `-dunique-increment=-1` on the command line.
If you rely on orders lining up I'd also recommend testing with it. See
#11148, #11362, #11361 for examples where it failed with reversed uniques.
[1] https://phabricator.haskell.org/D1924
2016-04-13 15:12 GMT+01:00 Ryan Scott
I thought that inferred foralls are Invisible binders (see Note [TyBinders and VisibilityFlags]); and you can't use visible type application for Invisible binders.
In this particular case, j, k, l, x, y, and z aren't invisible binders, but rather specified ones. And you certainly can use visible type application on specified binders; this is important for my use case.
Also I don't understand why Ryan needs a reliable ordering.
I guess I should elaborate on what exactly my use case is. I'm trying to fix Trac #10604 (i.e., make Generic1 poly-kinded). As part of this effort, I need to generalize the (:.:) type [2] from GHC.Generics like so:
newtype (:.:) (f :: k2 -> *) (g :: k1 -> k2) (p :: k1) = Comp1 (f (g a))
I also need to change a part of TcGenGenerics where I apply the (:.:) type constructor to some other types [3]. Previously, (:.:) was monokinded, so I only needed to apply (:.:) to some type arguments. But now that (:.:) is poly-kinded, I need to also apply it to two kind arguments!
Now for the interesting part: in which order do you give the kind arguments for (:.:)? I originally thought the order in which you'd do it would be [k2,k1] (since that's the order they appear in the definition from left to right), but in reality, it's actually [k1,k2] due to the reasons mentioned earlier in this thread. This is quite bewildering, especially since in GHC 7.10.3, it was the former order, but in GHC 8.0 it's the latter order. Who's to say that the order won't change again in a later GHC, and then deriving Generic1 will silently break as a result?
So at this point, I see my options as follows:
1. Change the definition of (:.:) to use the GADT syntax that Richard proposed. This would give a stable ordering for the kind variables, but I'm loathe to make such a drastic change to the surface syntax to fix a silly problem like this. 2. Investigate a fix to the implicitly quantified ordering issue. I'm not sure where the issue could lie, and just from looking at extract_lty [4], it's not clear at all what causes the current ordering. 3. Live with the issue, and just change the code in TcGenGenerics if the kind variable ordering ever changes. This is a bummer, but looks like the simplest option.
Ryan S. ----- [1] https://ghc.haskell.org/trac/ghc/ticket/10604 [2] http://hackage.haskell.org/package/base-4.8.2.0/docs/GHC-Generics.html#t:: .: [3] http://git.haskell.org/ghc.git/blob/d81cdc227cd487659995ddea577214314c9b4b97... [4] http://git.haskell.org/ghc.git/blob/d81cdc227cd487659995ddea577214314c9b4b97... _______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

Ah, I hadn't considered what effect nondeterminism might have on this. Here's a quick experiment: With normal uniques: $ /opt/ghc/8.0.1/bin/ghci -XTypeInType -fprint-explicit-kinds -fprint-explicit-foralls GHCi, version 8.0.0.20160411: http://www.haskell.org/ghc/ :? for help Loaded GHCi configuration from /home/ryanglscott/.ghci λ> data T (a :: j -> k -> l) (b :: (x, y, z)) = MkT λ> :info T type role T nominal nominal nominal nominal nominal nominal phantom phantom data T x y z l k j (a :: j -> k -> l) (b :: (x, y, z)) = MkT -- Defined at <interactive>:1:1 λ> :type MkT MkT :: forall {x} {y} {z} {l} {k} {j} {a :: j -> k -> l} {b :: (x, y, z)}. T x y z l k j a b With reversed uniques: $ /opt/ghc/8.0.1/bin/ghci -XTypeInType -fprint-explicit-kinds -fprint-explicit-foralls -dunique-increment=-1 -dinitial-unique=16777000 GHCi, version 8.0.0.20160411: http://www.haskell.org/ghc/ :? for help Loaded GHCi configuration from /home/ryanglscott/.ghci λ> data T (a :: j -> k -> l) (b :: (x, y, z)) = MkT λ> :info T type role T nominal nominal nominal nominal nominal nominal phantom phantom data T x y z l k j (a :: j -> k -> l) (b :: (x, y, z)) = MkT -- Defined at <interactive>:1:1 λ> :type MkT MkT :: forall {j} {k} {l} {z} {y} {x} {b :: (x, y, z)} {a :: j -> k -> l}. T x y z l k j a b So if :info is to be believed, uniques don't (appear to) have an effect on the order in which the kind variables are bound in the type constructor, but they do have an effect with :type (which isn't surprising, since :type re-generalizes the type signature, and generalization is precisely what you found to be non-deterministic). Ryan S.

On Apr 13, 2016, at 5:17 AM, Simon Peyton Jones
I thought that inferred foralls are Invisible binders (see Note [TyBinders and VisibilityFlags]); and you can't use visible type application for Invisible binders.
No. See that Note. A binder is Invisible only when it is completely unmentioned in the source. The implicitly-bound kind variables Ryan uses are meant to be Specified.
I thought that was because it's hard to specify the order of inferred type variable, just as Ryan says.
Once the user has written something, then we have a left-to-right ordering to guide us.
So /is/ inference meant to give a reliable ordering? If so why?
We had decided that a signature `f :: a -> a` meant that `a` is specified. It seemed silly if `data P (a :: k)` did not mean that `k` is specified, too. Another confounding detail here: the `:type` command now fully instantiates a type and then regeneralizes. So any output of `:type` can't be relied upon to get the variable ordering correct. For simple functions, you can use `:info`. But for data constructors, for example, GHC currently provides no way to see which variables are specified, and what order they appear in. See #11376. It is my intent that any variables mentioned by the user appear in a predictable order in types. That does not seem to be the case here, so this is a bug. Richard
participants (4)
-
Bartosz Nitka
-
Richard Eisenberg
-
Ryan Scott
-
Simon Peyton Jones