Re: [Haskell-cafe] Induction (help!)

After you grok induction over the naturals, you can start to think about structural induction, which is usually what we use in programming. They are related, and understanding one will help you understand the other (structural induction actually made more sense to me when I was learning, because I started as a programmer and then became a mathematician, so I thought in terms of data structures). Paul: I was hoping that understanding the classic mathematical concept would help me appreciate the structural computer science) variation better. I don't know what it is about induction that I'm not seeing. It's so frustrating! Deduction in spite of the complexity in some parts makes perfect sense. This, however, is a different beast! So let's say you have a tree, and we want to count the number of leaves in the tree. data Tree = Leaf Int | Branch Tree Tree countLeaves :: Tree -> Int countLeaves (Leaf _) = 1 countLeaves (Branch l r) = countLeaves l + countLeaves r We want to prove that countLeaves is correct. So P(x) means "countLeaves x == the number of leaves in x". Paul: By 'correct' presumably you mean sound. First we prove P(Leaf i), since leaves are the trees that have no subtrees. This is analogous to proving P(0) over the naturals. Paul: I'd presume 'proof' here means applying the function to one leaf to see if it returns 1. If I'm not mistaken this is establishing the base case. countLeaves (Leaf i) = 1, by definition of countLeaves. "Leaf i" has exactly one leaf, obviously. So countLeaves (Leaf i) is correct. Now to prove P(Branch l r), we get to assume that P holds for all of its subtrees, namely we get to assume P(l) and P(r). Paul: How did you arrive at this decision? Why can we assume that P holds fr all its subtrees? You can think of this as constructing an algorithm to prove P for any tree, and we have "already proved" it for l and r in our algorithm. Paul: Is this the function definition for countLeaves? This is analogous to proving P(n) => P(n+1) in the case of naturals. Paul:I thought P(n) was the induction hypothesis and P(n+1) was the proof that the formula/property holds for the subsequent element in the sequence if P(n) is true. I don't see how countLeaves l and countLeaves r are analogous to P(n) and P(n+1). So: Assume P(l) and P(r). P(l) means "countLeaves l == the number of leaves in l" P(r) means "countLeaves r == the number of leaves in r" countLeaves (Branch l r) = countLeaves l + countLeaves r, by definition. And that is the number of leaves in "Branch l r", the sum of the number of leaves in its two subtress. Therefore P(Branch l r). Now that we have those two cases, we are done; P holds for any Tree whatsoever. In general you will have to do one such proof for each case of your data structure in order to prove a property for the whole thing. At each case you get to assume the property holds for all substructures. Generally not so many steps are written down. In fact in this example, most people who do this kind of thing a lot would write "straightforward induction", and that would be the whole proof :-) The analogy between structural induction and induction over the naturals is very strong; in fact induction over the naturals is just induction over this data structure: data Nat = Zero | Succ Nat Hope this helps. Luke

Am Dienstag, 6. Mai 2008 23:34 schrieb PR Stanley:
After you grok induction over the naturals, you can start to think about structural induction, which is usually what we use in programming. They are related, and understanding one will help you understand the other (structural induction actually made more sense to me when I was learning, because I started as a programmer and then became a mathematician, so I thought in terms of data structures). Paul: I was hoping that understanding the classic mathematical concept would help me appreciate the structural computer science) variation better.
Understanding either will help understanding the other, they're the same idea in different guise.
I don't know what it is about induction that I'm not seeing. It's so frustrating! Deduction in spite of the complexity in some parts makes perfect sense. This, however, is a different beast!
So let's say you have a tree, and we want to count the number of leaves in the tree.
data Tree = Leaf Int | Branch Tree Tree
countLeaves :: Tree -> Int countLeaves (Leaf _) = 1 countLeaves (Branch l r) = countLeaves l + countLeaves r
We want to prove that countLeaves is correct. So P(x) means "countLeaves x == the number of leaves in x". Paul: By 'correct' presumably you mean sound.
First we prove P(Leaf i), since leaves are the trees that have no subtrees. This is analogous to proving P(0) over the naturals. Paul: I'd presume 'proof' here means applying the function to one leaf to see if it returns 1. If I'm not mistaken this is establishing the base case.
Yes, right.
countLeaves (Leaf i) = 1, by definition of countLeaves. "Leaf i" has exactly one leaf, obviously. So countLeaves (Leaf i) is correct.
Now to prove P(Branch l r), we get to assume that P holds for all of its subtrees, namely we get to assume P(l) and P(r). Paul: How did you arrive at this decision? Why can we assume that P holds fr all its subtrees?
It's the induction hypothesis. We could paraphrase the induction step as "If P holds for all cases of lesser complexity than x, then P also holds for x". In this example, the subtrees have lesser complexity than the entire tree (you could define complexity as depth). The induction step says "assuming P(l) and P(r), we can deduce P(Branch l r)", or, as a formula, forall l, r. ([P(l) and P(r)] ==> P(Branch l r)).
You can think of this as constructing an algorithm to prove P for any tree, and we have "already proved" it for l and r in our algorithm. Paul: Is this the function definition for countLeaves?
This is analogous to proving P(n) => P(n+1) in the case of naturals. Paul:I thought P(n) was the induction hypothesis and P(n+1) was the proof that the formula/property holds for the subsequent element in the sequence if P(n) is true. I don't see how countLeaves l and countLeaves r are analogous to P(n) and P(n+1).
(countLeaves l == number of leaves in l) and (countLeaves r == number of leaves in r) together are analogous to P(n). Proving forall l, r. ([P(l) and P(r)] ==> P(Branch l r)) is analogous to proving forall n. [P(n) ==> P(n+1)]. Let me phrase mathematical induction differently. To prove forall natural numbers n. P(n) , it is sufficient to prove 1) P(0) and 2) forall n. [P(n) implies P(n+1)] . And structural induction for lists: To prove forall (finite) lists l . P(l) , you prove 1L) P([ ]) -- the base case is the empty list 2L) forall x, l. [P(l) ==> P(x:l)] and for trees: to prove forall trees t. P(t) , you prove 1T) forall i. P(Leaf i) 2T) forall l, r. ([P(l) and P(r)] ==> P(Branch l r)). In all three cases, the pattern is the same. The key thing is number 2, which says if P holds for one level of complexity, then it also holds for the next level of complexity. In the domino analogy, 2) says that the dominos are placed close enough, so that a falling domino will tip its neighbour over. Then 1) is tipping the first domino over.
So:
Assume P(l) and P(r). P(l) means "countLeaves l == the number of leaves in l" P(r) means "countLeaves r == the number of leaves in r" countLeaves (Branch l r) = countLeaves l + countLeaves r, by definition. And that is the number of leaves in "Branch l r", the sum of the number of leaves in its two subtress. Therefore P(Branch l r).
Now that we have those two cases, we are done; P holds for any Tree whatsoever.
In general you will have to do one such proof for each case of your data structure in order to prove a property for the whole thing. At each case you get to assume the property holds for all substructures.
Generally not so many steps are written down. In fact in this example, most people who do this kind of thing a lot would write "straightforward induction", and that would be the whole proof :-)
The analogy between structural induction and induction over the naturals is very strong; in fact induction over the naturals is just induction over this data structure:
data Nat = Zero | Succ Nat
Hope this helps.
Luke
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Another way to look at it is that induction is just a way to abbreviate proofs.
Lets say you have a proposition over the natural numbers that may or
may not be true; P(x).
If you can prove P(0), and given P(x) you can prove P(x+1), then for
any given natural number n, you can prove P(n):

Ryan Ingram wrote:
One point to remember is that structural induction fails to hold on infinite data structures:
As I understand it, structural induction works even for infinite data structures if you remember that the base case is always _|_. [1] Let the initial algebra functor F = const Zero | Succ We have to prove that: P(_|_) holds if P(X) holds then P(F(X)) holds Here, we see that for P(x) = (x /= Succ x), since infinity = Succ infinity = _|_ then P(_|_) does not hold (since _|_ = Succ _|_ = _|_) As a counterexample, we can prove e.g. that length (L ++ M) = length L + length M See [2] for details, except that they neglect the base case P(_|_) and start instead with the F^1 case of P([]), so their proof is valid only for finite lists. We can include the base case too: length (_|_ ++ M) = length _|_ + length M length (_|_ ) = _|_ + length M _|_ = _|_ and similarly for M = _|_ and both _|_. So this law holds even for infinite lists. [1] Richard Bird, "Introduction to Functional Programming using Haskell", p. 67 [2] http://en.wikipedia.org/wiki/Structural_induction Also note that
data Nat = Zero | Succ Nat deriving (Eq, Show)
Take P(x) to be (x /= Succ x).
P(0): Zero /= Succ Zero. (trivial)
P(x) => P(Succ x)
So, given P(x) which is: (x /= Succ x), we have to prove P(Succ x): (Succ x /= Succ (Succ x))
In the derived Eq typeclass: Succ a /= Succ b = a /= b Taking "x" for a and "Succ x" for b: Succ x /= Succ (Succ x) = x /= Succ x which is P(x).
Now, consider the following definition: infinity :: Nat infinity = Succ infinity
infinity /= Succ infinity == _|_; and if you go by definitional equality, infinity = Succ infinity, so even though P(x) holds on all natural numbers due to structural induction, it doesn't hold on this infinite number.
-- ryan _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Hi One of you chaps mentioned the Nat data type data Nat = Zero | Succ Nat Let's have add :: Nat -> Nat -> Nat add Zero n = n add (Succ m)n = Succ (add m n) Prove add m Zero = m I'm on the verge of giving up on this. :-( Cheers Paul

Am Mittwoch, 7. Mai 2008 23:27 schrieb PR Stanley:
Hi One of you chaps mentioned the Nat data type data Nat = Zero | Succ Nat
Let's have add :: Nat -> Nat -> Nat add Zero n = n add (Succ m) n = Succ (add m n)
Prove add m Zero = m
I'm on the verge of giving up on this. :-( Cheers Paul
Proposition: forall n. P(n) , where P(n): add n Zero = n Base case: P(Zero) add Zero Zero = Zero by definition of add (first clause). Induction step: Induction hypothesis is P(n) for some arbitrary, but fixed, n. Then: add (Succ n) Zero = Succ (add n Zero) -- by the second clause of add = Succ (n) -- by the induction hypothesis qed

On Wed, May 7, 2008 at 9:27 PM, PR Stanley
Hi One of you chaps mentioned the Nat data type
data Nat = Zero | Succ Nat
Let's have add :: Nat -> Nat -> Nat add Zero n = n add (Succ m)n = Succ (add m n)
Prove add m Zero = m
To prove this by induction on m, you would need to show: 1) add Zero Zero = Zero 2) If "add m Zero = m", then "add (Succ m) Zero = Succ m" Number (1) is completely trivial, nothing more needs to be said. (2) is easy, after expanding the definition. Here the P I used was P(x) := add m Zero = m, the thing we were trying to prove. (1) is a base case, P(Zero). (2) is the inductive step, "If P(m) then P(Succ m)". Hoping I don't sound patronizing: if you're still having trouble, then I suspect you haven't heard what it means to prove an "if-then" statement. Here's a silly example. We want to prove: If y = 10, then y - 10 = 0. First we *assume* the condition of the if. We can consider it true. Assume y = 10. Show y - 10 = 0. Well, y = 10, so that's equivalent to 10 - 10 = 0, which is true. Luke

Paul, Sometimes it helps to go exhaustively through every detail to be sure there is no magic going on. Proceed only if you want exhaustive detail... If it seems that people are skipping some steps in their argument, it is because they are! They already understand it so well that they forgot to add them. Here is what I believe is an excruciatingly complete proof, to assure you that no handwaving is going on. The values of Nat are defined inductively on their structure, using its initial algebra. Take the data type definition data Nat = Zero | Succ Nat There is actually an implied undefined as well, so this is really Nat = undefined | Zero | Succ Nat Just as 3 = const 3 x for any x, we can convert from pointed to pointfree notation (i.e. go from a recursive definition to a least-defined fixed point definition): f = const undefined | const Zero | Succ Nat = LFP (m -> infinity) ( f^m undefined ) [ For the notationally picky, I am using | instead of + in the functor for clarity, which causes no ambiguity.] Because we are overachievers, we will prove our theorem not just for the least-defined fixed point of f, but change the limit to a union and prove it for (f^m undefined) for every m, which includes the fixed point. Why the extra work? Because now we have an inductive argument. The base case is undefined, and each step is another application of f. DEFINITION: add Zero n = n -- Line 1 add (Succ m) n = Succ (add m n) -- Line 2 THEOREM: forall x :: Nat, add x Zero = x PROOF: By induction BASE CASE: x = f^0 undefined = undefined add undefined Zero = undefined { Line 1, strict pattern match failure in first arg } STEP CASE: Assume add x Zero = x, Prove add y Zero = y where y in f x What y are in f x? f x = (const undefined | const Zero | Succ) x = const undefined x | const Zero x | Succ x = undefined | Zero | Succ x We have to consider each of these possibilities for y. 1) add undefined Zero = undefined { Base case } 2) add Zero Zero = Zero { Def. line 1 } 3) add (Succ x) Zero = Succ (add x Zero) { Def. line 2 } = Succ x { Step case assumption } This exhausts the three cases for y. Therefore, by induction add x Zero = x for all x :: Nat Note how structural induction maps to induction over integers. You do not have to enumerate some flattened form of the domain and do induction over their enumerated order. Indeed, some domains (like binary trees) are not even countably infinite, so the induction hypothesis would not be valid when used in this way. Instead you rely on the fact that all algebraic data types already have an (at most countably infinite) complete partial order based on constructor application (or eqivalently, initial algebra composition) [and always remembering that there is an implied union in | with undefined], grounded at undefined, and that consequently you can do induction over constructor application. I hope that there are no errors in the above and that I have not left out even the smallest detail. You should be able to see from the above that structural induction works on any algebraic data type. Obviously, after the first time only a masochist would be so verbose. But the induction hypothesis does after all require a first time! :) Dan Weston PR Stanley wrote:
Hi One of you chaps mentioned the Nat data type data Nat = Zero | Succ Nat
Let's have add :: Nat -> Nat -> Nat add Zero n = n add (Succ m)n = Succ (add m n)
Prove add m Zero = m
I'm on the verge of giving up on this. :-( Cheers Paul

PR Stanley
Hi One of you chaps mentioned the Nat data type data Nat = Zero | Succ Nat
Let's have add :: Nat -> Nat -> Nat add Zero n = n add (Succ m)n = Succ (add m n)
Prove add m Zero = m
I'm on the verge of giving up on this. :-(
The important point is to learn to regard an infinite number of terms as one term, and how to mess with it without allowing individual terms to escape or dangle around. -- (c) this sig last receiving data processing entity. Inspect headers for past copyright information. All rights reserved. Unauthorised copying, hiring, renting, public performance and/or broadcasting of this signature prohibited.
participants (7)
-
Achim Schneider
-
Andrew Coppin
-
Dan Weston
-
Daniel Fischer
-
Luke Palmer
-
PR Stanley
-
Ryan Ingram