
On Fri, Nov 19, 2010 at 1:05 PM, Andrew Coppin
So is Either what is meant by a "sum type"? Similarly, (X, Y) [...] is this a "product type"?
Yes, and yes, although more generally speaking data Test = A Int Bool | B Test | C [Test] is a recursive ADT composed of sums (A ...| B... | C...) and products ((Int, Bool), Test, [Test])
Notionally (->) is just another type constructor, so functions aren't fundamentally different to any other types - at least, as far as the type system goes.
Sort of, but I think your discussion later gets into exactly why it *is* fundamentally different.
Where *the hell* do GADTs fit in here? Well, they're usually used with phantom types, so I guess we need to figure out where phantom types fit in.
Well, I find it's better to think of GADTs as types that have extra elements holding proofs about their contents which you can unpack. For example: data Expr a where Prim :: a -> Expr a Lam :: (Expr x -> Expr y) -> Expr (x -> y) App :: Expr (x -> a) -> Expr x -> Expr a (The type variables are arbitrary; I chose these specific odd ones for a reason which will shortly become clear) This can be reconstructed as an ADT with existentials and constraints: data Expr a = Prim a | forall x y. (a ~ (x -> y)) => Lam (Expr x -> Expr y) | forall x. App (Expr x -> a) (Expr x) Conceptually, here is what these types look like in memory; [x] is a box in memory holding a pointer to an object of type x Expr a = tag_Prim, [a] or tag_Lam, type x, type y, proof that a = (x -> y), [Expr x -> Expr y] or tag_App, type x, [Expr x -> a], [Expr x] Now, because we have type safety the compiler can actually erase all the types and equality proofs (although in the case of typeclass constraints it keeps the typeclass dictionary pointer around so that it can be used to call functions in the class).
To the type checker, Foo Int and Foo Bool are two totally seperate types. In the phantom type case, the set of possible values for both types are actually identical. So really we just have two names for the same set. The same thing goes for a type alias (the "type" keyword).
Not exactly; in the phantom type case, the two sets ARE disjoint in a way; there are no objects that are both Foo Int and Foo Bool (although there may be objects of type forall a. Foo a -- more on that later). Whereas the type keyword really creates two names for the same set.
So what would happen if some crazy person decided to make the kind system more like the type system? Or make the type system more like the value system? Do we end up with another layer to our cake? Is it possible to generalise it to an infinite number of layers? Or to a circular hierachy? Is any of this /useful/ in any way?
Absolutely! In fact, dependent type systems do exactly this; values can be members of types and vice versa. For example, in a dependently typed language, you can write something like data Vector :: * -> Int -> * where Nil :: forall a::*. Vector a 0 Cons :: forall a::*, n::Int. a -> Vector a n -> Vector a (n+1) append :: forall a::*, m::Int, n::Int. Vector a m -> Vector a n -> Vector a (m+n) append a 0 n Nil v = v append a m n (Cons a as) v = Cons a (append as v) However, -> is special here, because it works at both the type level and the value level! That is, (Vector Int :: Int -> *) can be applied to the value 5 to create (Vector Int 5), the set of vectors with 5 elements of type Int. And in fact, -> is just a generalization of 'forall' that can't refer to the object on it's right hand side: append :: ( forall a :: *. forall m :: Int. forall n :: Int. forall v1 :: Vector a m. forall v2 :: Vector a n. Vector a (m+n) ) Since we don't refer to v1 or v2, we can abbreviate the 'forall' with ->: append :: ( forall a :: *. forall m :: Int. forall n :: Int. Vector a m -> Vector a n -> Vector a (m+n) ) Now, however, you can run arbitrary code at the type level; just create an object with the type Vector Int (fib 5) and the compiler needs to calculate (fib 5) to make sure the length is correct! In compiler circles it's generally considered a bad thing when the compiler doesn't terminate, (one reason: how do I know if the bug is in my code or the compiler?) so these languages tend to force your code (or at least the code that can run at the type level) to terminate. There's another benefit to this: if it's possible that a type-level expression might not terminate, you lose type erasure and have to do the calculation at runtime. Otherwise you can create unsound expressions: cast :: forall a::*, b::*. (a = b) -> a -> b cast a b p x = x -- note that p is never evaluated, we just require the proof in scope! bottom :: forall a. a bottom a = bottom a -- if we evaluate this, we infinite loop unsound :: forall a::*, b::*. a -> b unsound a b x = cast a b (bottom (a=b)) x print (unsound Int String 0) -- BOOM! Headshot to your type-safety. So, these languages tend to be very heavily on the 'not turing complete' side of the fence, because they have restrictions on nonterminating programs. But they still manage to be useful! There's one nit remaining, though: that pesky *. What's its type? Sadly, * :: * does not work, it leads to a paradox. But you can make a hierarchy: * :: *1, *1 :: *2, *2 :: *3, etc. Some existing compilers do this, some even going so far as to automatically lift your types into the right 'universe' (*-level), and then complaining if there is a cycle that can't be resolved (x :: *n, y :: *m, with n > m and m > n). And sometimes they even provide a switch (like Haskell's UndecidableInstances) that lets you collapse the *'s and not care. You just have to promise not to write Russell's Paradox at the type level. :) -- ryan