
Warning for Andrew: this post explains a new-to-you typed lambda calculus and a significant part of the innards of Hindley-Milner typing in order to answer your questions. Expect to bang your head a little! On Tue, 27 May 2008, Andrew Coppin wrote:
- A function starts out with a polymorphic type, such as 'x -> x'.
Which is the same as "forall x. x -> x".
- Each time the function is called, all the type variables have to be locked down to real types such as 'Int' or 'Either (Maybe (IO Int)) String' or something.
Yep, in fact in the explicitly-typed calculus commonly used as a model for rank-n polymorphism (System F) there are explicit type lambdas and type applications that do exactly this. Using /\ for a type lambda and [term type] for type applications, id might be written thus: id = /\t.(\x::t->x) and called thus: [id Int] 1
- By the above, any function passed to a high-order function has to have all its type variables locked down, yielding a completely monomorphic function.
Yep.
- The "forall" hack somehow convinces the type checker to not lock down some of the type variables. In this way, the type variables relating to a function can remain unlocked until we reach each of the call sites. This allows the variables to be locked to different types at each site [exactly as they would be if the function were top-level rather than an argument].
Not a hack. If there is a hack, it's in /not/ including a forall for rank-1 types. Foralls correspond to type lambdas in the same way that ->s correspond to ordinary lambdas.
- Why are top-level variables and function arguments treated differently by the type system?
The big difference is between lambdas and binding groups (as in let, top-level etc). With the binding group, any constraints on a value's type will be found within its scope - so the type can be 'generalised' there, putting a forall around it. The same isn't true of lambdas, and so their parameters can only be polymorphic given an appropriate type annotation. For extra confusion, type variables are themselves monomorphic[1].
- Why do I have to manually tell the type checker something that should be self-evident?
It isn't - the general case is in fact undecidable.
- Why do all type variables need to be locked down at each call site in the first place?
Find an alternative model for parametric polymorphism that works! Note that 'not locking down' is equivalent to locking down to parameters you took from elsewhere. As such, if you stick to rank-1 types you never actually notice the difference - whereas it makes the type system an awful lot easier to understand.
- Why is this virtually never a problem in real Haskell programs?
I wouldn't go so far as to say virtually never, I've run into it a fair few times - but normally until you're trying to do pretty generalised stuff it's just not necessary.
- Is there ever a situation where the unhacked type system behaviour is actually desirably?
There are plenty of situations where a rank-1 type is the correct type. If you give id a rank-2 type, it's more specific - just as if you typed it Int->Int.
- Why can't the type system automatically detect where polymorphism is required?
Because there's often more than one possible type for a term, without a 'most general' type.
- Does the existence of these anomolies indicate that Haskell's entire type system is fundamentally flawed and needs to be radically altered - or completely redesigned?
About as much as the undecidability of the halting problem and the incompleteness theory suggest our understanding of computation and logic is fundamentally flawed - which is to say, not at all.
The key problem seems to be that the GHC manual assumes that anybody reading it will know what "universal quantification" actually means. Unfortunately, neither I nor any other human being I'm aware of has the slightest clue what that means.
Guess nobody on this list's human, huh? It's really not the GHC manual's job to teach you the language extensions from scratch any more than it should teach Haskell 98 from scratch. I guess the real question is where the relevant community documentation is, something I have to admit to not having needed to check.
existential quantification = "this is true for SOMETHING" universal quantification = "this is true for EVERYTHING"
The type "forall x. x -> x" is "for all types x, x -> x". As in, "for all types x, (\y -> y) :: x -> x". [1] Actually this is no longer quite true since GHC now supports impredicative polymorphism in which type variables can be instantiated with polymorphic types. But please ignore that for now? -- flippa@flippac.org 'In Ankh-Morpork even the shit have a street to itself... Truly this is a land of opportunity.' - Detritus, Men at Arms