
After you grok induction over the naturals, you can start to think about structural induction, which is usually what we use in programming. They are related, and understanding one will help you understand the other (structural induction actually made more sense to me when I was learning, because I started as a programmer and then became a mathematician, so I thought in terms of data structures). Paul: I was hoping that understanding the classic mathematical concept would help me appreciate the structural computer science) variation better. I don't know what it is about induction that I'm not seeing. It's so frustrating! Deduction in spite of the complexity in some parts makes perfect sense. This, however, is a different beast! So let's say you have a tree, and we want to count the number of leaves in the tree. data Tree = Leaf Int | Branch Tree Tree countLeaves :: Tree -> Int countLeaves (Leaf _) = 1 countLeaves (Branch l r) = countLeaves l + countLeaves r We want to prove that countLeaves is correct. So P(x) means "countLeaves x == the number of leaves in x". Paul: By 'correct' presumably you mean sound. First we prove P(Leaf i), since leaves are the trees that have no subtrees. This is analogous to proving P(0) over the naturals. Paul: I'd presume 'proof' here means applying the function to one leaf to see if it returns 1. If I'm not mistaken this is establishing the base case. countLeaves (Leaf i) = 1, by definition of countLeaves. "Leaf i" has exactly one leaf, obviously. So countLeaves (Leaf i) is correct. Now to prove P(Branch l r), we get to assume that P holds for all of its subtrees, namely we get to assume P(l) and P(r). Paul: How did you arrive at this decision? Why can we assume that P holds fr all its subtrees? You can think of this as constructing an algorithm to prove P for any tree, and we have "already proved" it for l and r in our algorithm. Paul: Is this the function definition for countLeaves? This is analogous to proving P(n) => P(n+1) in the case of naturals. Paul:I thought P(n) was the induction hypothesis and P(n+1) was the proof that the formula/property holds for the subsequent element in the sequence if P(n) is true. I don't see how countLeaves l and countLeaves r are analogous to P(n) and P(n+1). So: Assume P(l) and P(r). P(l) means "countLeaves l == the number of leaves in l" P(r) means "countLeaves r == the number of leaves in r" countLeaves (Branch l r) = countLeaves l + countLeaves r, by definition. And that is the number of leaves in "Branch l r", the sum of the number of leaves in its two subtress. Therefore P(Branch l r). Now that we have those two cases, we are done; P holds for any Tree whatsoever. In general you will have to do one such proof for each case of your data structure in order to prove a property for the whole thing. At each case you get to assume the property holds for all substructures. Generally not so many steps are written down. In fact in this example, most people who do this kind of thing a lot would write "straightforward induction", and that would be the whole proof :-) The analogy between structural induction and induction over the naturals is very strong; in fact induction over the naturals is just induction over this data structure: data Nat = Zero | Succ Nat Hope this helps. Luke