
Thanks, Alexis, for the detailed explanation about how "type variables" in Haskell are doing a sort of "triple duty". I think my unconscious understanding of type variables with distinct names was 1. they can be chosen independently 2. they don't unify 1 is not true, not least because of MultiParamTypeClasses. I can't choose `a` and `b` completely indepently in the following type signature foo :: C a b => a -> b I can only choose them independently to the extent that a `C` instance exists. Moreover, type equality constraints mean that 2 is not true! The `a` and `b` in the following do indeed unify[1] bar :: a ~ b => a -> b The simple world that Hindly-Milner-Damas created for us becomes more complex when extensions that affect the type system come into play. I'm reassured that type variable bindings in patterns in case alternatives can now bind monomorphic types, that is case () of (() :: t) -> () is now permitted. The previous behaviour seemed inconsistent. On the other hand I'm still somewhat concerned that the meaning of the syntax can vary depending on whether the type variable is in scope or not. That is, I'm not comfortable that the same syntax is used in baz1 to assert that an expression *has a type* as is used in baz2 to *create a fresh name* for a type that an expression has. baz1 :: forall a. a -> () baz1 a = case () of (() :: a) -> () baz2 :: forall a. a -> () baz2 a = case () of (() :: b) -> () "let" does not suffer this inconsistency. The latter is simply not supported. I realise that some syntax must offer the ability to come up with a fresh name, I'm just not convinced that this overloading is the way to do it. To me it feels like "let x = exp" meaning "if `x` is not in scope then bind the value of `exp` to the variable `x`. If `x` is in scope, then check that it is equal to `exp` and if not then crash.". Anyway, I'm sure that opens a whole new can of worms, so thanks for your clarifying remarks to help get my understanding to here! Tom [1] Perhaps I'm conflating an implementation ("unification") with a specification (some semantic notion of what "type equality" means) but I hope my intention here is clear enough. On Mon, Aug 17, 2020 at 03:52:22PM -0500, Alexis King wrote:
I think this behavior is obscured by the way Haskell uses the term “type variable” to refer to three different things:
Variables that refer to types and are bound within the type namespace.
Skolem type constants introduced by a `forall`.
Solver/unification variables, aka “metavariables”, which aren’t really types at all but holes to be filled in with types.
These are all subtly different, but they are so frequently treated as if they were the same that I suspect listing them out is not actually enough for many people to understand the distinctions! So let me elaborate.
Suppose we view `forall` like a type-level lambda, so `forall a. a -> a` is analogous to this term:
\a -> a :-> a
Clearly, there is a difference between the variable `a` in the above expression and the value we eventually substitute for it when we apply the lambda. When we instantiate the function by doing, say, `id @Int`, we substitute `Int` for `a` to get `Int -> Int`. So `a` itself isn’t a type, it’s a name that refers to a type—a type variable.
Under this interpretation, there’s nothing wrong with your example. In
add (x :: a) (y :: b) = x + y
the variable `a` names `x`’s type, and the variable `b` names `y`’s type, and there’s nothing at all wrong with having two different names that refer to the same thing. It would be like writing this term:
\t -> let { a = t; b = t } in a :-> b
So why does it seem confusing? Well, because if we write
add :: Num t => a -> b -> t
we get a type error, not two aliases for t. This is because GHC will implicitly quantify all the variables in the type signature to get
add :: forall t a b. Num t => a -> b -> t
so `a` and `b` are now distinct, “lambda-bound” variables, not aliases for `t`.
This implicit quantification means we don’t often think about the type variables we write in type signatures as being like term variables, which are just names that reference other things. Rather, we think of type variables as being like type constants, and in fact GHC encourages this interpretation in its error reporting.
To explain, consider what happens when GHC checks `add` against the type we wrote above. Somehow, it needs to figure out the types of `x` and `y`, but it can’t just say
x :: a y :: b
because those variables were bound by the `forall` (they are “lambda-bound”), and if we move them into some other context where they appear free, they no longer have any meaning. GHC has to somehow pick actual types for x and y, not type variables. A naïve idea would be to invent two new solver variables, aka “holes to be filled in with inferred types”, yielding something like
a1, b1 fresh x :: a1 y :: b1
but this doesn’t work. Imagine we did that while typechecking this function:
bad :: a -> a bad _ = True
If we invented a new solver variable to use in place of `a` while typechecking `bad`, GHC would just say “oh, I can fill in the hole with Bool” and carry on, which would be quite bad! The whole point of parametricity is that the function must work with any type.
So GHC actually does something else. Behind the scenes, it creates two new type definitions, distinct from all other types in the program:
data FakeTypeA data FakeTypeB
Now it typechecks the function using these fake types substituted for `a` and `b`:
x :: FakeTypeA y :: FakeTypeB
Now when GHC goes to typecheck `x + y`, it will raise a type error, since FakeTypeA and FakeTypeB are obviously not the same type. This enforces parametricity.
Of course, it would be very confusing to the user to see names like “FakeTypeA” in their type errors. So GHC pulls some sleight of hand, and it prints the new types it generates as just “a” and “b”, the same as the forall-bound variables it created them for. These don’t look like type constants, because in source Haskell, all type constants start with UpperCase. But behind the scenes, they really are type constants, not type variables. GHC calls these fake type constants “rigid variables” or “skolem variables,” but don’t be fooled: they’re not variables at all, they’re constants.
GHC tells this white lie to keep programmers from having to think about all this behind the scenes business. When GHC says “could not match ‘a’ with ‘b’,” it’s natural for programmers to think “ah, these are the ‘a’ and ‘b’ I wrote in my type signature,” and they learn to treat “type variables” and “rigid type constants” as one and the same. They don’t think about how this is a meaningfully different definition of “variable” than what they’re used to at the term level. This works out okay because the only way to bind type variables in source Haskell 98/2010 is universal quantification.
But now ScopedTypeVariables and ExistentialQuantification enter the picture, and there’s a problem: we suddenly need to be able to bind type variables in a place that isn’t `forall`. Now we have type variable binding sites that look just like type variable uses, but they aren’t implicitly-quantified the way they are in standalone type signatures, and confusion ensues.
This confusion actually goes much deeper than you might expect. Because Haskell historically hasn’t syntactically distinguished type variable uses from binding sites, we have new ambiguous situations like this:
\(x :: Maybe a) -> ...
Is this a type annotation on x, which states that x should have type `Maybe a`, where `a` is a type variable already in scope? Or is it a pattern binding that matches against x’s type, which asserts that it has the shape `Maybe t` and binds a new variable `a` to the type `t`? And it gets even worse! Suppose we have this:
f (Foo @a x) = ...
Is `@a` a visible type application of `Foo` to the in-scope type `a`, instantiating the (universally-quantified) `Foo` constructor at a particular type? Or is it a binder for an existential variable “stored inside” `Foo`? Both are reasonable (and useful) interpretations.
A GHC proposal that presents a design that sorts all these ambiguities out would be most welcome. To my knowledge, no such proposal currently exists, but I could be wrong—I know this issue has been discussed on some proposals (such as visible type applications in patterns), so maybe someone has come up with something I’m not aware of.
Alexis
On Aug 16, 2020, at 07:17, Tom Ellis
wrote: I have just noticed that with ScopedTypeVariables one can write the following bizarre definition
-- Inferred signature: -- add :: Num a => a -> a -> a add (x :: a) (y :: b) = x + y
The reason for this behaviour is that
in all patterns other than pattern bindings, a pattern type signature may mention a type variable that is not in scope; in this case, the signature brings that type variable into scope.
https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/glasgow_exts...
and the justification is that
Bringing type variables into scope is particularly important for existential data constructors
But this leads to a rather puzzling user experience. Was it really not possible to design this extension in a way that would allow bringing new type variables into scope locally but not allow them to unify with other type variables from outer scopes?
To be more concrete, what I would like to see is a design where `k` is allowed because the type `a` (bound within a pattern) does not need to unify with anything but `k2` is not allowed because `a` is forbidden to unify with `b`. `k3` would also be forbidden (the type variables called `a` are actually distinct) as would `k4` (the type variable for the inferred signature would be fresh and distinct from `a`).
I believe this design might lead to much less puzzling behaviour. Would it be possible (within a new extension, of course) or have I overlooked something?
Tom
data T = forall a. MkT [a]
k :: T -> T k (MkT [t::a]) = MkT t3 where (t3::[a]) = [t,t,t]
data T2 a = MkT2 [a]
k2 :: T2 b -> T2 b k2 (MkT2 [t::a]) = MkT2 t3 where (t3::[a]) = [t,t,t]
k3 :: T2 a -> T2 a k3 (MkT2 [t::a]) = MkT2 t3 where (t3::[a]) = [t,t,t]
k4 (MkT2 [t::a]) = MkT2 t3 where (t3::[a]) = [t,t,t]
_______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post.