Q: haskell type inference algorithm paper

Hi, all I'm new to haskell, I'm learning haskell for its type system, and I found haskell type inference is excellent, I'm interested in its type inference algorithm, could you give me some paper or material reference that describe the haskell type inference algorithm? I want to port it to typed clojure.

The current version of GHC does type inference by (an advanced version of)
the OutsideIn(X) algorithm.
You can check the original paper at
http://research.microsoft.com/en-us/um/people/simonpj/papers/constraints/jfp...
The initial algorithm was then enhanced with datatype promotion and closed
type families, which you can learn about in "Giving Haskell a Promotion"
and "Closed Type Families with Overlapping Equations".
2014-02-25 14:38 GMT+01:00 Di Xu
Hi, all
I'm new to haskell, I'm learning haskell for its type system, and I found haskell type inference is excellent, I'm interested in its type inference algorithm, could you give me some paper or material reference that describe the haskell type inference algorithm? I want to port it to typed clojure.
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

To refine Alejandro's answer a bit: The OutsideIn(X) paper is really the one to read. It's long, but I think it would have to be! The later papers he mentions describe enhancements to the internal, typed language (called variously System FC or GHC Core -- they're more or less the same). With these enhancements came some changes to type inference, but not major ones. The "Promotion" paper describes a bit about polymorphic kind inference in datatype definitions, but the details aren't really there. The "Closed Type Families" paper does not really address type inference, as the closed type families feature (somewhat surprisingly) doesn't interact much with type inference at all.
It would be great to see this work be adopted into another language. Good luck!
Richard
On Feb 25, 2014, at 8:46 AM, Alejandro Serrano Mena
The current version of GHC does type inference by (an advanced version of) the OutsideIn(X) algorithm. You can check the original paper at http://research.microsoft.com/en-us/um/people/simonpj/papers/constraints/jfp... The initial algorithm was then enhanced with datatype promotion and closed type families, which you can learn about in "Giving Haskell a Promotion" and "Closed Type Families with Overlapping Equations".
2014-02-25 14:38 GMT+01:00 Di Xu
: Hi, all I'm new to haskell, I'm learning haskell for its type system, and I found haskell type inference is excellent, I'm interested in its type inference algorithm, could you give me some paper or material reference that describe the haskell type inference algorithm? I want to port it to typed clojure.
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

On Tue, Feb 25, 2014 at 8:38 PM, Di Xu
I'm interested in its type inference algorithm, could you give me some paper or material reference that describe the haskell type inference algorithm? I want to port it to typed clojure.
Haskell type inference is still a developing research area. Suffice to say, not even SPJ knows all about what it _should be_. Here's a reasonable plan: 1) Start with so-called "Unification" [1]. Try manually inferring the types of some Haskell fragments to get a sense of how it applies. The connection should emerge as clear as day. 2) Next, study Damas-Hindley-Milner. Gain a sense of what motivates 'principal types', why they're useful, as opposed to a situation where they are impossible to infer. 3) Finally, there's type classes. I don't know if typed clojure has them or not, but this is really what separates Haskell from, say, Lazy ML (LML). Be warned, this is where it gets gnarly. [1] http://en.wikipedia.org/wiki/Unification_%28computer_science%29 -- Kim-Ee

Thanks a lot for these paper reference and detailed explanation. I have a lot to read now ;)
Here's a reasonable plan:
1) Start with so-called "Unification" [1]. Try manually inferring the types of some Haskell fragments to get a sense of how it applies. The connection should emerge as clear as day.
2) Next, study Damas-Hindley-Milner. Gain a sense of what motivates 'principal types', why they're useful, as opposed to a situation where they are impossible to infer.
3) Finally, there's type classes. I don't know if typed clojure has them or not, but this is really what separates Haskell from, say, Lazy ML (LML). Be warned, this is where it gets gnarly.
[1] http://en.wikipedia.org/wiki/Unification_%28computer_science%29
Wow, thanks for this such a detailed plan, I'm also new to type system, it seems that all the later type inference algorithm is base on Damas-Hindley-Milner, I must start from here. The feature that typed clojure misses is some ability to do fully automatic type inference, here are some example: map is a polymorphic function with type `(a -> b) -> [a] -> [b]`, and so does identity function of type `a -> a`, here are invocation: `map identity [1,2,3]`, haskell could successfully inference its type, but typed clojure is unable to do this fully automatically, we had to instantiate either map or identity manually. The author of typed clojure given me this paperhttp://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.39.7265, and to quote from paper: When a polymorphic function is given an anonymous function as argument, we have a situation where either one of these techniques can be used to infer some type annotations, but not both at the same time. For example, in the term map (fun x -> x) [1,2,3], we cannot infer both the type arguments to map and the annotation on the function parameter x for the following reason. We take the simple approach that all the the arguments' types must be determined before calculation of any missing type arguments. This means that the type of the anonymous function must be synthesized with the concrete type of its parameter not available from the context. This synthesis immediately fails, since the parameter is not annotated with its type. So, I'd like to know how haskell solved this problem? Thanks

Let's try explicitly unifying the types in your map example. We'll assume that map already has the following explicit type: data List a = Nil | Cons a (List a) type Function a b = a -> b map :: Function (Function a b) (Function (List a) (List b)) I've defined new "List" and "Function" types since the "[]" and "->" may cause confusion. The "a" and "b" are placeholders; they give us *constraints* by telling us which types must be the same, but they don't tell us what those types actually are. These placeholders can be re-used in different type declarations, but I'll use different letters to prevent any confusion. Now let's look at the arguments given to map in your example. First the identity function, written like this in Haskell, with some unknown type: (\x -> x) :: ? The simple placeholder "?" is my way of saying 'we don't know yet'. In fact, we already know that the value is a function, so we can write that down: (\x -> x) :: Function ? ? Now we can calculate the output type by *assuming* that the input type holds. In other words, to calculate the type of the output value we need to assume that we've been given an input value. Here we will assume that we've been given a value of some type "c": x :: c Now we can calculate the type of the output expression: x :: ? Since we assumed that x :: c, we can immediately write down the type of the output, since it's the same: x :: c Hence we've found that the output type happens to be the same as the input type: (\x -> x) :: Function c c Now let's look at the other argument: (Cons 1 (Cons 2 (Cons 3 Nil))) :: ? This is [1,2,3] written using my List type. We know that Cons gives us a List, so we can write that down: (Cons 1 (Cons 2 (Cons 3 Nil))) :: List ? Now we need to know the types of "1", "2" and "3". Haskell treats numbers in a special way, using the "Num" typeclass, but that will just complicate things, so I'll pretend that they're Ints: (Cons 1 (Cons 2 (Cons 3 Nil))) :: List Int Notice that this type has no placeholders, it is 'concrete'. Now let's look at the full example: map (\x -> x) (Cons 1 (Cons 2 (Cons 3 Nil))) We can only apply a function to arguments of the correct type, which means that the following equations must be satisfied: Function a b = Function c c List a = List Int We can see that the 'top level' constraints are OK: we have a Function type equal to a Function type and a List type equal to a List type. Next we need to check their arguments (known as "type arguments"): a = c b = c a = Int This looks OK, since we don't have any nonsense like Int = Bool or anything. However, we still haven't worked out what the types actually are. It looks like this is where typed clojure gives up:
We take the simple approach that all the the arguments' types must be determined before calculation of any missing type arguments.
In Haskell we can go a bit further by rearranging these equations using substitution. We can see that the first and third equations both use "a", hence we can substitute one for the other: Int = c This uses "c" and so does the middle equation, so we can do another substitution: b = Int Now we have concrete types for all of our placeholders, specifically: a = Int b = Int c = Int Hence we can substitute these back into the type of our map expression: Function (Function Int Int) (Function (List Int) (List Int)) Or, with the conventional notation for types: (Int -> Int) -> [Int] -> [Int] The extra step of solving simultaneous equations to get the type arguments is the reason Haskell can infer this type and typed clojure cannot. Regards, Chris
participants (5)
-
Alejandro Serrano Mena
-
Chris Warburton
-
Di Xu
-
Kim-Ee Yeoh
-
Richard Eisenberg