Developing Programs and Proofs Spontaneously using GADT

I am curious about the possibility of developing Haskell programs spontaneously with proofs about their properties, and having the type checker verify the proofs for us, in the way one would in a dependently typed language. In the exercise below, I tried to redo part of the merge-sort example in Altenkirch, McBride, and McKinna's introduction to Epigram [1]: deal the input list into a binary tree, and fold the tree with the function that merges two sorted lists into one. The property I am going to show is merely that the length of the input list is preserved. Given that dependent types and GADTs are such popular topics, I believe the same must have been done before, and there may be better ways to do it. If so, please give me some comments or references. Any comments are welcome.
{-# OPTIONS_GHC -fglasgow-exts #-}
To begin with, we define the usual type-level representation of natural numbers and lists indexed by their lengths.
data Z   = Z   deriving Show
data S a = S a deriving Show
data List a n where
  Nil  :: List a Z
  Cons :: a -> List a n -> List a (S n)
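As a small illustration (this example is an addition, not part of the original post), the list [1,2] carries the length index S (S Z) in its type:

demoList :: List Int (S (S Z))
demoList = Cons 1 (Cons 2 Nil)

A further Cons, such as Cons 0 demoList, is then forced to have type List Int (S (S (S Z))).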
1. Append

To warm up, let us see the familiar "append" example. Unfortunately, unlike Omega, Haskell does not provide type functions. I am not sure which is the best way to emulate type functions. One possibility is to introduce the following GADT:
data Plus m n k where        -- m + n = k
  PlusZ :: Plus Z n n
  PlusS :: Plus m n k -> Plus (S m) n (S k)
such that Plus m n k represents a proof that m + n = k. Not having type functions, one of the possible ways to do append is to have the function, given two lists of lengths m and n, return a list of length k and a proof that m + n = k. Thus, the type of append would be:

append :: List a m -> List a n -> exists k. (List a k, Plus m n k)

In Haskell, the existential quantifier is mimicked by forall. We define:
data DepSum a p = forall i . DepSum (a i) (p i)
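For example (again an illustration added here, not from the original), a one-element list can be packaged with a proof that 1 + 0 = 1:

demoSum :: DepSum (List Int) (Plus (S Z) Z)
demoSum = DepSum (Cons 1 Nil) (PlusS PlusZ)

The index i = S Z is hidden by the existential; only the predicates List Int and Plus (S Z) Z remain visible in the type.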
The term "dependent sum" is borrowed from the Omega tutorial of Sheard, Hook, and Linger [2] (why is it called a sum, not a product?). The function append can thus be defined as:
append :: List a m -> List a n -> DepSum (List a) (Plus m n)
append Nil ys = DepSum ys PlusZ
append (Cons x xs) ys =
  case append xs ys of
    DepSum zs p -> DepSum (Cons x zs) (PlusS p)
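As a quick check (an added example; demoAppend is not in the original code), appending a two-element list to a one-element list produces a DepSum whose proof component witnesses 2 + 1 = 3:

demoAppend :: DepSum (List Int) (Plus (S (S Z)) (S Z))
demoAppend = append (Cons 1 (Cons 2 Nil)) (Cons 3 Nil)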
Another possibility is to provide append with a proof that m + n = k. The type and definition of append would be:

< append :: Plus m n k -> List a m -> List a n -> List a k
< append PlusZ Nil ys = ys
< append (PlusS pf) (Cons x xs) ys = Cons x (append pf xs ys)

I thought the second append would be more difficult to use: to append two lists, I have to provide a proof about their lengths! It turns out that this append actually composes more easily with other parts of the program. We will come to this later.

2. Some Lemmas

Here are some lemmas, represented as functions on terms. The function incAssocL, for example, converts a proof of m + (1+n) = k to a proof of (1+m) + n = k.
incAssocL :: Plus m (S n) k -> Plus (S m) n k
incAssocL PlusZ     = PlusS PlusZ
incAssocL (PlusS p) = PlusS (incAssocL p)

incAssocR :: Plus (S m) n k -> Plus m (S n) k
incAssocR (PlusS p) = plusMono p

plusMono :: Plus m n k -> Plus m (S n) (S k)
plusMono PlusZ     = PlusZ
plusMono (PlusS p) = PlusS (plusMono p)
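To see the lemmas at work on a concrete proof (an added example, not from the original post), incAssocL turns the proof of 1 + (1+0) = 2 below into a proof of (1+1) + 0 = 2:

-- PlusS PlusZ, read at type Plus (S Z) (S Z) (S (S Z)), proves 1 + (1+0) = 2
demoAssoc :: Plus (S (S Z)) Z (S (S Z))   -- (1+1) + 0 = 2
demoAssoc = incAssocL (PlusS PlusZ)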
As a larger example, the following function revcat performs list reversal using an accumulating parameter. The invariant we maintain is m + n = k. To prove that the invariant holds, we have to use incAssocL.
revcat :: List a m -> List a n -> DepSum (List a) (Plus m n)
revcat Nil ys = DepSum ys PlusZ
revcat (Cons x xs) ys =
  case revcat xs (Cons x ys) of
    DepSum zs p -> DepSum zs (incAssocL p)
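Reversal itself then follows by starting the accumulator at Nil (a small added sketch; note that the proof component records n + 0 = k rather than simplifying k to n):

reverseD :: List a n -> DepSum (List a) (Plus n Z)
reverseD xs = revcat xs Nil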
3. Merge

Apart from the proof manipulations, the function merge is not very different from what one would expect:
merge :: Ord a => List a m -> List a n -> DepSum (List a) (Plus m n)
merge Nil ys = DepSum ys PlusZ
merge (Cons x xs) Nil = append (Cons x xs) Nil
merge (Cons x xs) (Cons y ys)
  | x <= y    = case merge xs (Cons y ys) of
                  DepSum zs p -> DepSum (Cons x zs) (PlusS p)
  | otherwise = case merge (Cons x xs) ys of
                  DepSum zs p -> DepSum (Cons y zs) (plusMono p)
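As an added usage example, pattern matching on the DepSum recovers both the merged list and the proof about its length. With the Show instances from the appendix, the following should print something like "1:2:3:4:[], proof: pS (pS pZ)":

demoMerge :: String
demoMerge =
  case merge (Cons 1 (Cons 3 Nil)) (Cons (2 :: Int) (Cons 4 Nil)) of
    DepSum zs p -> show zs ++ ", proof: " ++ show p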
The lemma plusMono is used in the last clause of merge to convert a proof of m + n = k to a proof of m + (1+n) = 1+k.

4. Sized Trees

We also index binary trees by their sizes:
data Tree a n where
  Nul :: Tree a Z
  Tip :: a -> Tree a (S Z)
  Bin :: Tree a n1 -> Tree a n ->
         (Plus p n n1, Plus n1 n k) -> Tree a k
The two trees given to the constructor Bin have sizes n1 and n, respectively. The resulting tree, of size k, comes with a proof that n1 + n = k. Furthermore, we want to maintain the invariant that n1 either equals n or exceeds it by exactly one. This is represented by the proof Plus p n n1; in the definition of insertT later, p is either PlusZ or PlusS PlusZ.

5. Lists to Trees

The function insertT inserts an element into a tree:
insertT :: a -> Tree a n -> Tree a (S n)
insertT x Nul     = Tip x
insertT x (Tip y) = Bin (Tip x) (Tip y) (PlusZ, PlusS PlusZ)
insertT x (Bin t u (PlusZ, p)) =
  Bin (insertT x t) u (PlusS PlusZ, PlusS p)
insertT x (Bin t u (PlusS PlusZ, p)) =
  Bin t (insertT x u) (PlusZ, PlusS (incAssocR p))
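For instance (another added example), inserting three elements one by one yields a tree whose type records that it contains three elements:

demoTree :: Tree Char (S (S (S Z)))
demoTree = insertT 'c' (insertT 'b' (insertT 'a' Nul))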
Note that whenever we construct a tree using Bin, the first proof, corresponding to the difference in size of the two subtrees, is either PlusZ or PlusS PlusZ. The counterpart of foldr on indexed lists is defined by:
foldrd :: (forall k . a -> b k -> b (S k)) -> b Z -> List a n -> b n
foldrd f e Nil         = e
foldrd f e (Cons x xs) = f x (foldrd f e xs)
The result is also an indexed type (b n). The function deal :: List a n -> Tree a n, building a tree out of a list, can be defined as a fold:
deal :: List a n -> Tree a n
deal = foldrd insertT Nul
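The same fold expresses other length-preserving functions as well; for instance (an added sketch, not in the original post), a map over indexed lists:

mapd :: (a -> b) -> List a n -> List b n
mapd f = foldrd (\x ys -> Cons (f x) ys) Nil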
6. Trees to Lists, and Merge Sort

The next step is to fold over the tree with the function merge. The first two clauses are simple:
mergeT :: Ord a => Tree a n -> List a n
mergeT Nul     = Nil
mergeT (Tip x) = Cons x Nil
For the third clause, one would wish that we could write something as simple as:

mergeT (Bin t u (_,p1)) =
  case merge (mergeT t) (mergeT u) of
    DepSum xs p -> xs

However, this does not type check. Assume that t has size n1, and u has size n. The DepSum returned by merge consists of a list of size i and a proof p of type Plus n1 n i, for some i. The proof p1, on the other hand, is of type Plus n1 n k for some k. Haskell does not know that Plus m n is actually a function and cannot conclude that i = k. To state the equality explicitly, we assume that there is a function plusFn which, given a proof of m + n = i and a proof of m + n = k, yields a function converting an i in any context to a k. That is:

plusFn :: Plus m n i -> Plus m n k -> (forall f . f i -> f k)

The last clause of mergeT can then be written as:
mergeT (Bin t u (_,p1)) =
  case merge (mergeT t) (mergeT u) of
    DepSum xs p -> plusFn p p1 xs
How do I define plusFn? I would like to employ the techniques related to equality types [3,4,5], but currently I have not yet figured out how. I've merely produced a version of plusFn specialised to List a:
plusFn :: Plus m n h -> Plus m n k -> List a h -> List a k
plusFn PlusZ PlusZ xs = xs
plusFn (PlusS p1) (PlusS p2) (Cons x xs) = Cons x (plusFn p1 p2 xs)
Needless to say this is not satisfactory. Now that we have both deal and mergeT, merge sort is simply their composition:
msort :: Ord a => List a n -> List a n
msort = mergeT . deal
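As an added example, the type alone now guarantees that sorting a three-element list returns three elements:

demoSort :: List Int (S (S (S Z)))
demoSort = msort (Cons 3 (Cons 1 (Cons 2 Nil)))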
The function mergeT can be defined using a fold on trees as well. Such a fold might look like this:
foldTd :: (forall m n k . Plus m n k -> b m -> b n -> b k)
          -> (a -> b (S Z)) -> b Z -> Tree a n -> b n
foldTd f g e Nul     = e
foldTd f g e (Tip x) = g x
foldTd f g e (Bin t u (_,p)) =
  f p (foldTd f g e t) (foldTd f g e u)
mergeT :: Ord a => Tree a n -> List a n
mergeT = foldTd merge' (\x -> Cons x Nil) Nil
  where merge' p1 xs ys =
          case merge xs ys of
            DepSum zs p -> plusFn p p1 zs

I am not sure whether this is a "reasonable" type for foldTd.

7. Passing in the Proof as an Argument

Previously I thought the second definition of append would be more difficult to use, because we would have to construct a proof about the lengths before calling append. In the context above, however, it may actually be more appropriate to use this style of definition. An alternative definition of merge taking a proof as an argument can be defined by:

< merge :: Ord a => Plus m n k -> List a m -> List a n -> List a k
< merge PlusZ Nil ys = ys
< merge pf (Cons x xs) Nil = append pf (Cons x xs) Nil
< merge (PlusS p) (Cons x xs) (Cons y ys)
<   | x <= y    = Cons x (merge p xs (Cons y ys))
<   | otherwise = Cons y (merge (incAssocL p) (Cons x xs) ys)

A definition of mergeT using this definition of merge follows immediately, because we can simply use the proof that comes with the tree:

< mergeT :: Ord a => Tree a n -> List a n
< mergeT Nul = Nil
< mergeT (Tip x) = Cons x Nil
< mergeT (Bin t u (_,p1)) =
<   merge p1 (mergeT t) (mergeT u)

I don't know which approach can be called more "natural".

References

[1] Thorsten Altenkirch, Conor McBride, and James McKinna. Why Dependent Types Matter. http://www.e-pig.org/downloads/ydtm.pdf
[2] Tim Sheard, James Hook, and Nathan Linger. GADTs + Extensible Kinds = Dependent Programming. http://web.cecs.pdx.edu/~sheard/papers/GADT+ExtKinds.ps
[3] James Cheney and Ralf Hinze. A Lightweight Implementation of Generics and Dynamics.
[4] Stephanie Weirich. Type-Safe Cast (Functional Pearl). ICFP 2000.
[5] Arthur I. Baars and S. Doaitse Swierstra. Typing Dynamic Typing. ACM SIGPLAN Notices 37(9), pp. 157-166, September 2002.

Appendix. Auxiliary Functions
instance Show a => Show (List a n) where
  showsPrec _ Nil         = ("[]" ++)
  showsPrec _ (Cons x xs) = shows x . (':':) . shows xs

instance Show (Plus m n k) where
  showsPrec _ PlusZ      = ("pZ" ++)
  showsPrec p (PlusS pf) =
    showParen (p >= 10) (("pS " ++) . showsPrec 10 pf)

instance Show a => Show (Tree a n) where
  showsPrec _ Nul     = ("Nul" ++)
  showsPrec p (Tip x) = showParen (p >= 10) (("Tip " ++) . shows x)
  showsPrec p (Bin t u pf) =
    showParen (p >= 10)
      (("Bin " ++) . showsPrec 10 t . (' ':)
                   . showsPrec 10 u . (' ':) . showsPrec 10 pf)

On 8/4/07, Shin-Cheng Mu wrote:
Unfortunately, unlike Omega, Haskell does not provide type functions.
Something similar is coming: http://haskell.org/haskellwiki/GHC/Indexed_types#Instance_declarations_2
Haskell does not know that Plus m n is actually a function and cannot conclude that i=k.
To explicitly state the equality, we assume that there is a function plusFn which, given a proof of m + n = i and a proof of m + n = k, yields a function converting an i in any context to a k. That is:
plusFn :: Plus m n i -> Plus m n k -> (forall f . f i -> f k)
[snip]
How do I define plusFn? I would like to employ the techniques related to equality types [3,4,5], but currently I have not yet figured out how.
plusFn :: Plus m n h -> Plus m n k -> f h -> f k
plusFn PlusZ PlusZ x = x
plusFn (PlusS p1) (PlusS p2) x =
  case plusFn p1 p2 Equal of
    Equal -> x

data Equal a b where
  Equal :: Equal a a

Jim
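(The following is a sketch added to this write-up, not part of Jim's message: with the general plusFn above replacing the version specialised to List a, the mergeT from the original post can be written exactly as first wished, with f instantiated at List a.)

-- Assumes the definitions from the original post, with Jim's plusFn
-- in place of the List-specialised one.
mergeT :: Ord a => Tree a n -> List a n
mergeT Nul     = Nil
mergeT (Tip x) = Cons x Nil
mergeT (Bin t u (_, p1)) =
  case merge (mergeT t) (mergeT u) of
    DepSum xs p -> plusFn p p1 xs   -- plusFn used at f = List a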

Shin-Cheng Mu wrote:
I am curious about the possibility of developing Haskell programs spontaneously with proofs about their properties and have the type checker verify the proofs for us, in a way one would do in a dependently typed language.
In the exercise below, I tried to redo part of the merge-sort example in Altenkirch, McBride, and McKinna's introduction to Epigram [1]: deal the input list into a binary tree, and fold the tree by the function merging two sorted lists into one. The property I am going to show is merely that the length of the input list is preserved.
Cool! :)
Given that dependent types and GADTs are such popular topics, I believe the same must have been done before, and there may be better ways to do it. If so, please give me some comments or references. Any comments are welcomed.
{-# OPTIONS_GHC -fglasgow-exts #-}
To begin with, we define the usual type-level representation of natural numbers and lists indexed by their lengths.
data Z = Z deriving Show
data S a = S a deriving Show

data List a n where
  Nil :: List a Z
  Cons :: a -> List a n -> List a (S n)
1. Append
To warm up, let us see the familiar "append" example. Unfortunately, unlike Omega, Haskell does not provide type functions. I am not sure which is the best way to emulate type functions. One possibility is to introduce the following GADT:
data Plus m n k where        -- m + n = k
  PlusZ :: Plus Z n n
  PlusS :: Plus m n k -> Plus (S m) n (S k)
such that Plus m n k represents a proof that m + n = k.
Wouldn't type families (~ associated type synonyms) do exactly that once they become available?

type family Plus m n :: *
type instance Plus Z n     = n
type instance Plus (S m) n = S (Plus m n)

append :: (Plus m n ~ k) => List a m -> List a n -> List a k
append Nil ys         = ys
append (Cons x xs) ys = Cons x (append xs ys)

But I'd guess that there are some constraints on the type family instance declarations to keep things decidable. Viewed with the dictionary translation for type classes in mind, this is probably exactly the alternative type of append you propose:

append :: Plus m n k -> List a m -> List a n -> List a k
However, this does not type check. Assume that t has size n1, and u has size n. The DepSum returned by merge consists of a list of size i, and a proof p of type Plus m n i, for some i. The proof p1, on the other hand, is of type P m n k for some k. Haskell does not know that Plus m n is actually a function and cannot conclude that i=k.
To explicitly state the equality, we assume that there is a function plusFn which, given a proof of m + n = i and a proof of m + n = k, yields a function converting an i in any context to a k. That is:
plusFn :: Plus m n i -> Plus m n k -> (forall f . f i -> f k)
How do I define plusFn? I would like to employ the techniques related to equality types [3,4,5], but currently I have not yet figured out how. I've merely produced a version of plusFn specialised to List a:
plusFn :: Plus m n h -> Plus m n k -> List a h -> List a k
plusFn PlusZ PlusZ xs = xs
plusFn (PlusS p1) (PlusS p2) (Cons x xs) = Cons x (plusFn p1 p2 xs)
Needless to say this is not satisfactory.
I remember that the

newtype Equal a b = Proof (forall f . f a -> f b)

type equality has been used to define/implement GADTs:

Ralf Hinze. Fun with phantom types.
http://www.informatik.uni-bonn.de/~ralf/publications/With.pdf

so a general plusFn ought to be possible. I think that the following induction should work (untested!):

equalZ :: Equal n n
equalS :: Equal m n -> Equal (S m) (S n)

plusFn :: Plus m n i -> Plus m n k -> Equal i k
plusFn PlusZ PlusZ         = equalZ
plusFn (PlusS x) (PlusS y) = equalS (plusFn x y)

with the "trivial" equality proofs for natural numbers

equalZ = Proof id

newtype Succ f a = InSucc { outSucc :: f (S a) }

equalS (Proof eq) = Proof (outSucc . eq . InSucc)

The newtype is just for making the type checker recognize that f (S a) is indeed of the form g a for some type constructor g.

Regards,
apfelmus
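(A small follow-up sketch, added here rather than taken from the thread: to use the Leibniz-style proof back in mergeT, one still has to apply it; unwrapping the Proof at f = List a gives the required coercion. The name castWith is an assumption of this sketch, and it presumes the Equal newtype and the general plusFn above.)

-- Apply a Leibniz equality proof to convert between the two indices.
castWith :: Equal i k -> f i -> f k
castWith (Proof eq) = eq

-- The problematic clause of mergeT then becomes:
--   mergeT (Bin t u (_, p1)) =
--     case merge (mergeT t) (mergeT u) of
--       DepSum xs p -> castWith (plusFn p p1) xs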

On Sat, 2007-08-04 at 23:25 +0800, Shin-Cheng Mu wrote:
[...]
In Haskell, the existential quantifier is mimicked by forall. We define:
data DepSum a p = forall i . DepSum (a i) (p i)
The term "dependent sum" is borrowed from the Omega tutorial of Sheard, Hook, and Linger [2] (why is it called a sum, not a product?).
There is already a dependent product. One compelling reason for this naming comes from category theory and how they relate to each other (the dependent sum is a left adjoint, the dependent product is a right adjoint [not directly to each other]). They are related in the same way products and sums are related, or universal quantification and existential quantification are. So, looking at the product case (A,B) = AxB (and () = 1) and the sum case Either A B = A+B, we can see how they generalize to be the above.

Let's say we have a pair of the same type. We can clearly represent it as AxA, but another alternative is 1+1 -> A. We can write the isomorphism in Haskell:

pToF :: (a,a) -> (Either () () -> a)
pToF (x,_) (Left ())  = x
pToF (_,y) (Right ()) = y

fToP :: (Either () () -> a) -> (a,a)
fToP f = (f (Left ()), f (Right ()))

Similarly for sums, A+A can be written (1+1)xA. Again, the Haskell:

sToP :: Either a a -> (Either () (), a)
sToP (Left x)  = (Left (), x)
sToP (Right x) = (Right (), x)

pToS :: (Either () (), a) -> Either a a
pToS (Left (), x)  = Left x
pToS (Right (), x) = Right x

Incidentally, this is how sums are usually encoded, i.e. a pair of a tag and a value. (Also incidentally, these are almost exactly the isomorphisms defining adjunctions related to the ones that define product and sum, i.e. the above definitions immediately fall out of the category theory.)

So how to represent AxB or A+B with this encoding? You can't. However, in a dependently typed language we can write (in some dependently typed pseudo-Haskell) [for concreteness, let's set A=Int, B=String]:

f :: Either () () -> Type
f (Left ())  = Int
f (Right ()) = String

Now we use the same definitions as above, only with the types:

pToF :: (Int,String) -> (s :: Either () () -> f s)
fToP :: (s :: Either () () -> f s) -> (Int,String)
sToP :: Either Int String -> s :: Either () (). f s
pToS :: (s :: Either () (). f s) -> Either Int String

This leads to the dependent product being a function space when there is no dependency, and similarly the dependent sum being a (normal) product.

Another way of understanding the terminology is this: A "dependent product" type (usually written \Pi x:A.B) can be thought of as a big product of the types (B[a1/x], ..., B[aN/x]) for the inhabitants a1...aN of the type A. To introduce a dependent product, you have to provide each component of the tuple. To eliminate a dependent product, you can project out one component.

A "dependent sum" type (usually written \Sigma x:A.B) can be thought of as a big sum of the types B[a1/x] + ... + B[aN/x], where again a1 .. aN are all the inhabitants of A. To introduce a dependent sum, you pick a particular branch of the sum (by choosing an a_i) and inject a value of that type (i.e., B[a_i/x]). To eliminate a dependent sum, you can see what branch a value is in (i.e., recover the a_i) and you can see what the associated data is.

This terminology takes a little getting used to because of how we usually present these intro and elim forms syntactically. Consider dependent products. For finite products (e.g., (Int, String)) we introduce a product just by writing down an inhabitant of each component type. However, the dependent product needs one component for each inhabitant of the type A, so we can't in general list them all explicitly. Thus, the usual intro form for a dependent product is to give an inhabitant of B for a generic inhabitant x of A (i.e., a variable x:A). So the intro form is a function. On the other hand, to eliminate a finite product we usually say "I'd like the 2nd component, please". But for a dependent product, you have to say "I'd like the a_i^th component", so the elim is application.

Sums are similar. For a finite sum, we usually say "this data is in the second branch", but for a dependent sum, we have to say "this data is in the a_i^th branch". So the intro form is a pair of a tag a_i and a value of type B[a_i/x]. On the other hand, to eliminate a finite sum, we usually do a case with one branch for each of the tags, where in each branch we get the tagged value. To eliminate a dependent sum, we cover all those cases at once by getting (1) a generic tag x:A and (2) a value of the type associated with that tag. But this is exactly a split (x,y) = e1 in e2 pattern-matching style elimination form for a pair.

So, dependent products are represented by functions, and dependent sums are represented by pairs. It's confusing at first, but it's established terminology. I tend to use "dependent function" and "dependent pair" myself, since sometimes people use "dependent product" to mean \Sigma rather than \Pi (understandable given the intro and elim for \Sigma!), so I sometimes find it ambiguous.

Does that make sense?

-Dan
participants (5): apfelmus, Dan Licata, Derek Elkins, Jim Apple, Shin-Cheng Mu