The Holy Trinity of Functional Programming

Hi Folks,

In Richard Bird's book (Introduction to Functional Programming using Haskell) there is a fascinating discussion of three key aspects of functional programming. Below is my summary of his discussion.

/Roger

---------------------------------------------------------
The Holy Trinity of Functional Programming
---------------------------------------------------------

These three ideas constitute the holy trinity of functional programming:

1. User-defined recursive data types.
2. Recursively defined functions over recursive data types.
3. Proof by induction: show that some property P(n) holds for each element of a recursive data type.

Here is an example of a user-defined recursive data type. It is a declaration for the natural numbers 0, 1, 2, ...:

    data Nat = Zero | Succ Nat

The elements of this data type include:

    Zero, Succ Zero, Succ (Succ Zero), Succ (Succ (Succ Zero)), ...

To understand this, note that when creating a Nat value we have a choice of either Zero or Succ Nat. Suppose we choose Succ Nat. Now we must choose a value for the Nat in Succ Nat. Again, we have a choice of either Zero or Succ Nat. Suppose this time we choose Zero, to obtain Succ Zero.
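The declaration can be tried out directly in GHC. Here is a minimal runnable sketch; the `deriving (Show, Eq)` clause is my addition, so that values can be printed and compared:

```haskell
-- A Nat is either Zero or the successor of another Nat.
data Nat = Zero | Succ Nat deriving (Show, Eq)

-- The first few elements of the type, built by repeatedly
-- choosing Succ and finally choosing Zero.
one, two :: Nat
one = Succ Zero
two = Succ (Succ Zero)

main :: IO ()
main = do
  print Zero
  print one
  print two
```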
The ordering of the elements in the Nat data type can be specified by defining Nat to be a member of the Ord class:

    instance Ord Nat where
        Zero   < Zero   = False
        Zero   < Succ n = True
        Succ m < Zero   = False
        Succ m < Succ n = (m < n)

Here is how the Nat version of the expression 2 < 3 is evaluated:

    Succ (Succ Zero) < Succ (Succ (Succ Zero))  -- Nat version of 2 < 3
    = Succ Zero < Succ (Succ Zero)              -- by the 4th equation defining order
    = Zero < Succ Zero                          -- by the 4th equation defining order
    = True                                      -- by the 2nd equation defining order

Here is a recursively defined function over the data type; it adds two Nat elements:

    (+) :: Nat -> Nat -> Nat
    m + Zero   = m
    m + Succ n = Succ (m + n)

Here is how the Nat version of 0 + 1 is evaluated:

    Zero + Succ Zero      -- Nat version of 0 + 1
    = Succ (Zero + Zero)  -- by the 2nd equation defining +
    = Succ Zero           -- by the 1st equation defining +

Here is another recursively defined function over the data type; it subtracts two Nat elements:

    (-) :: Nat -> Nat -> Nat
    m - Zero        = m
    Succ m - Succ n = m - n

If the Nat version of 0 - 1 is executed, then the result is undefined:

    Zero - Succ Zero

The "undefined value" is denoted by this symbol: _|_ (also known as "bottom").

Important: _|_ is an element of *every* data type. So we must expand the list of elements in Nat:

    _|_, Zero, Succ Zero, Succ (Succ Zero), Succ (Succ (Succ Zero)), ...

There are still more elements of Nat. Suppose we define a function that returns a Nat. Let's call the function undefined:

    undefined :: Nat
    undefined = undefined

It is an infinitely recursive function: when invoked it never stops until the program is interrupted. This function undefined is denoted by the symbol _|_.

Recall how we defined the ordering of values in Nat:

    instance Ord Nat where
        Zero   < Zero   = False
        Zero   < Succ n = True
        Succ m < Zero   = False
        Succ m < Succ n = (m < n)
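The ordering and addition equations above can be run as ordinary functions. In this sketch I transcribe them under the names `ltNat` and `addNat` (my renaming, simply to avoid hiding the Prelude's `<` and `+`):

```haskell
data Nat = Zero | Succ Nat deriving (Show, Eq)

-- The four ordering equations, written as a plain function.
ltNat :: Nat -> Nat -> Bool
ltNat Zero     Zero     = False
ltNat Zero     (Succ _) = True
ltNat (Succ _) Zero     = False
ltNat (Succ m) (Succ n) = ltNat m n

-- The two equations defining addition.
addNat :: Nat -> Nat -> Nat
addNat m Zero     = m
addNat m (Succ n) = Succ (addNat m n)

main :: IO ()
main = do
  -- The Nat version of 2 < 3, as in the derivation above.
  print (ltNat (Succ (Succ Zero)) (Succ (Succ (Succ Zero))))
  -- The Nat version of 0 + 1.
  print (addNat Zero (Succ Zero))
```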
From that ordering we can see that Succ _|_ is an element of Nat:
    Zero < Succ undefined  -- Note: same as Zero < Succ _|_
    = True                 -- by the 2nd equation defining order

And Succ (Succ _|_ ) is an element of Nat:

    Succ Zero < Succ (Succ undefined)  -- Note: same as Succ Zero < Succ (Succ _|_ )
    = Zero < Succ undefined            -- by the 4th equation defining order
    = True                             -- by the 2nd equation defining order

And Succ (Succ (Succ _|_ )) is an element of Nat, and so forth. So the list of elements in Nat expands:

    ..., Succ (Succ (Succ _|_ )), Succ (Succ _|_ ), Succ _|_, _|_,
    Zero, Succ Zero, Succ (Succ Zero), Succ (Succ (Succ Zero)), ...

One can interpret the extra values in the following way:

- _|_ corresponds to the natural number about which there is absolutely no information.
- Succ _|_ corresponds to the natural number about which the only information is that it is greater than Zero.
- Succ (Succ _|_ ) corresponds to the natural number about which the only information is that it is greater than Succ Zero.
- And so on.

There is one further value of Nat, namely the "infinite" number:

    Succ (Succ (Succ (Succ ...)))

It can be defined by this function:

    infinity :: Nat
    infinity = Succ infinity

It is different from all the other Nat values because it is the only number for which m < infinity holds for all finite numbers m:

    Zero < infinity             = True
    Succ Zero < infinity        = True
    Succ (Succ Zero) < infinity = True
    ...

Thus, the values of Nat can be divided into three classes:

- The finite values: Zero, Succ Zero, Succ (Succ Zero), and so on.
- The partial values: _|_, Succ _|_, Succ (Succ _|_ ), and so on.
- The infinite value.

Important: the values of *every* recursive data type can be divided into three classes:

- The finite values of the data type.
- The partial values: _|_, and so on.
- The infinite values.

Although the infinite Nat value is not of much use, the same is not true of the infinite values of other data types.

Recap: We have discussed two aspects of the holy trinity of functional programming: recursive data types and recursively defined functions over those data types.
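The partial and infinite values discussed above can actually be observed in GHC, thanks to lazy evaluation. A sketch, reusing the `ltNat` transcription of the ordering (the name is mine):

```haskell
data Nat = Zero | Succ Nat

ltNat :: Nat -> Nat -> Bool
ltNat Zero     Zero     = False
ltNat Zero     (Succ _) = True
ltNat (Succ _) Zero     = False
ltNat (Succ m) (Succ n) = ltNat m n

-- The infinite number: an endless tower of Succ constructors.
infinity :: Nat
infinity = Succ infinity

main :: IO ()
main = do
  -- Zero < Succ _|_ : the second equation never inspects what is
  -- under the Succ, so the undefined argument is never forced.
  print (ltNat Zero (Succ undefined))
  -- Succ (Succ Zero) < infinity: only finitely many of infinity's
  -- Succ constructors are ever examined, so this terminates.
  print (ltNat (Succ (Succ Zero)) infinity)
```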
The third aspect is induction, which is discussed now.

Induction allows us to reason about the properties of recursively defined functions over a recursive data type. In the case of Nat the principle of induction can be stated as follows: in order to show that some property P(n) holds for each finite number n of Nat, it is sufficient to treat two cases:

Case (Zero). Show that P(Zero) holds.

Case (Succ n). Show that if P(n) holds, then P(Succ n) also holds.

As an example, let us prove this property (the identity for addition):

    Zero + n = n

Before proving this, recall that + is defined by these two equations:

    m + Zero   = m
    m + Succ n = Succ (m + n)

Proof: The proof is by induction on n.

Case (Zero). Substitute Zero for n in the equation Zero + n = n. So we have to show that Zero + Zero = Zero, which is immediate from the first equation defining +.

Case (Succ n). Assume P(n) holds; that is, assume Zero + n = n holds. This equation is referred to as the induction hypothesis. We have to show that P(Succ n) holds; that is, show that Zero + Succ n = Succ n holds. We do so by simplifying the left-hand expression:

    Zero + Succ n
    = Succ (Zero + n)  -- by the 2nd equation defining +
    = Succ n           -- by the induction hypothesis

We have proven the two cases and so we have proven that Zero + n = n holds for all finite numbers of Nat.

Full Induction. To prove that a property P holds not only for the finite numbers of Nat, but also for every partial (undefined) number, we have to prove three things:

Case ( _|_ ). Show that P( _|_ ) holds.

Case (Zero). Show that P(Zero) holds.

Case (Succ n). Show that if P(n) holds, then P(Succ n) holds also.

To prove that a property P holds just for every partial (undefined) number, we omit the second case.

To illustrate proving a property that holds for every partial number (but not for the finite numbers), let us prove the rather counterintuitive result that m + n = n for all numbers m and all partial numbers n.
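The identity just proved can be spot-checked in GHC. This is a test on the first few finite numbers, of course, not a proof; the `fromInt` helper is my addition:

```haskell
data Nat = Zero | Succ Nat deriving Eq

addNat :: Nat -> Nat -> Nat
addNat m Zero     = m
addNat m (Succ n) = Succ (addNat m n)

-- Helper (mine): build the finite Nat for a non-negative Int.
fromInt :: Int -> Nat
fromInt 0 = Zero
fromInt k = Succ (fromInt (k - 1))

main :: IO ()
main =
  -- Check Zero + n = n on the first few finite numbers.
  print (and [ addNat Zero (fromInt k) == fromInt k | k <- [0 .. 20] ])
```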
For easy reference, we repeat the definition of +:

    m + Zero   = m
    m + Succ n = Succ (m + n)

Proof: The proof is by partial number induction on n.

Case ( _|_ ). Substitute _|_ for n in the equation m + n = n. So we have to show that m + _|_ = _|_, which is true since _|_ does not match either of the patterns in the definition of +.

Case (Succ n). We assume P(n) holds; that is, assume m + n = n holds. This equation is the induction hypothesis. We have to show that P(Succ n) holds; that is, show that m + Succ n = Succ n holds. For the left-hand side we reason:

    m + Succ n
    = Succ (m + n)  -- by the 2nd equation defining +
    = Succ n        -- by the induction hypothesis

Since the right-hand side is also Succ n, we are done.

The omitted case (m + Zero = Zero) is false, which is why the property does not hold for finite numbers.

As an added bonus, having proved that an equation holds for all partial (undefined) numbers, we can assert that it holds for the infinite number too; that is, P(infinity) holds. Thus, we can now assert that m + infinity = infinity for all numbers m.

On 08/22/2012 01:27 AM, Costello, Roger L. wrote:
--------------------------------------------------------- The Holy Trinity of Functional Programming --------------------------------------------------------- These three ideas constitute the holy trinity of functional programming:
1. User-defined recursive data types. 2. Recursively defined functions over recursive data types. 3. Proof by induction: show that some property P(n) holds for each element of a recursive data type.
<snip>
Although the infinite Nat value is not of much use, the same is not true of the infinite values of other data types.
Please bear with us simpletons, but do you think you could provide a more "real world" example of an application of the "Holy Trinity" ideas? (One of those mysterious "other data types".) Not to say that recursive data types and proof by induction aren't fascinating in and of themselves... but sometimes it is difficult to make the connection to the practical programming process.

--
frigidcode.com
indicium.us

Hello Christopher,
But do you think you could provide a more "real world" example of an application of the "Holy Trinity" ideas?
A commonly cited real-world example that illustrates the first key idea (recursive data type) is a binary tree:

    data Tree = Node String Tree Tree | Leaf

The elements of this data type include:

    Leaf
    Node "Root" Leaf Leaf
    Node "Root" (Node "Left" Leaf Leaf) (Node "Right" Leaf Leaf)

And so forth.

Common functions on this data type include insertion, deletion, and traversal.

I am not sure if there is a use for the infinite value of this data type.

/Roger
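A sketch of the insertion and traversal functions Roger mentions. The binary-search-tree ordering on labels and the function names are my illustrative choices, not something fixed by the Tree declaration:

```haskell
data Tree = Node String Tree Tree | Leaf deriving Show

-- Insert a label, keeping the binary-search-tree property
-- (smaller labels to the left, larger to the right).
insert :: String -> Tree -> Tree
insert s Leaf = Node s Leaf Leaf
insert s t@(Node x l r)
  | s < x     = Node x (insert s l) r
  | s > x     = Node x l (insert s r)
  | otherwise = t  -- label already present

-- In-order traversal lists the labels in ascending order.
inorder :: Tree -> [String]
inorder Leaf         = []
inorder (Node x l r) = inorder l ++ [x] ++ inorder r

main :: IO ()
main = print (inorder (foldr insert Leaf ["Root", "Left", "Right"]))
```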

On Wed, Aug 22, 2012 at 7:38 AM, Costello, Roger L. wrote:
But do you think you could provide a more "real world" example of an application of the "Holy Trinity" ideas?
A commonly cited real-world example that illustrates the first key idea (recursive data type) is a binary tree:
I'd have used lists, since in Haskell we make use of infinite lists quite a bit; a simple example being zipping some list against [0..] (a list comprising an infinitely ascending sequence of numbers) to pair each item with its index.

--
brandon s allbery    allbery.b@gmail.com
wandering unix systems administrator (available)    (412) 475-9364 vm/sms
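Brandon's example in runnable form. The infinite list [0..] is consumed lazily, so zipping it against a finite list terminates:

```haskell
main :: IO ()
main =
  -- Pair each item with its index by zipping against the infinite
  -- list [0..]; only as many indices are produced as the finite
  -- list needs.
  print (zip [0 ..] ["a", "b", "c"])
```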

Yes, but we can also use a similar style of inductive reasoning in the face of infinite types, called Scott induction [1]. There is also a perspective that infinite types (or so-called 'codata') should be reasoned about with 'coinduction', although the proof principles there are a little more esoteric.

Cheers,
Edward

[1] http://blog.ezyang.com/2010/12/no-one-expects-the-scott-induction/

Excerpts from Brandon Allbery's message of Wed Aug 22 11:04:48 -0400 2012:
<snip>

On Wed, 22 Aug 2012, Costello, Roger L. wrote:
<snip>
A commonly cited real-world example that illustrates the first key idea (recursive data type) is a binary tree:
data Tree = Node String Tree Tree | Leaf
<snip>
I am not sure if there is a use for the infinite value of this data type.
/Roger
Let us leave aside two kinds of things:

1. _|_
2. "infinite" things

Let us define a signature Sig to be a structure like so:

1. We have a finite set Names.
2. For each name a in Names we have a non-negative integer, called the "arity" of a.

Note that we allow a name, say a, to have an arity of zero. Such names are called "constants" in some formalisms and distinguished from names with an arity greater than zero.

We define a structure, sometimes called a "model" or an "algebra with signature Sig", to be:

1. A nonempty set S.
2. For each name a of arity n we have a function f: S^n -> S, where
   a. S^n is the Cartesian product of n copies of S
   b. f is everywhere defined on S^n and f is single-valued

If we call our model M then we may use this notation:

1. The ground set of M is S, which we write as Ground(M).
2. The f associated to the name a is M(a).

We now define the set of terms, which set we call "Terms", over our signature:

1. Given a name a of arity n and a list of terms t0, t1, ... t(n-1) of length n, the list a, t0, t1, ... t(n-1) is a term.
2. Terms is the minimal set which satisfies 1.

Now, given a signature Sig, we may translate this setup into Haskell. We do an example. Let Sig be given as follows: Names is the set {a, b, c, d, e, f}. The arities are:

    a has arity 0
    b has arity 0
    c has arity 0
    d has arity 1
    e has arity 2
    f has arity 5

The set Terms over Sig is in one-to-one correspondence with the values of this type in Haskell:

    data T = Ta | Tb | Tc | Td T | Te T T | Tf T T T T T

(If I have made an error in the translation please tell me.)

So, for the special case of "term algebras", or absolutely free algebras, or "magmas", over a finite set of generators (here taken to be the names with zero arity) and a finite set of operations, we get a natural, ah, mystic word of power, isomorphism with a certain simple kind of type in Haskell. This special case is one of the paradigm cases in Type Theory.
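The example type can be written out and exercised directly. The `size` function below is my addition; it counts the nodes of the labelled tree that a term denotes, making the term-as-tree correspondence concrete:

```haskell
-- Terms over the example signature: a, b, c of arity 0;
-- d of arity 1; e of arity 2; f of arity 5.
data T = Ta | Tb | Tc | Td T | Te T T | Tf T T T T T

-- Every term is a labelled tree; count its nodes.
size :: T -> Int
size Ta             = 1
size Tb             = 1
size Tc             = 1
size (Td t)         = 1 + size t
size (Te s t)       = 1 + size s + size t
size (Tf a b c d e) = 1 + sum (map size [a, b, c, d, e])

main :: IO ()
main =
  -- A term with a Te root, a Td subtree, and a Tf subtree:
  -- 1 + 2 + 6 = 9 nodes in all.
  print (size (Te (Td Ta) (Tf Ta Tb Tc Ta Tb)))
```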
One of the standard texts on Universal Algebra, Birkhoff style:

    A Course in Universal Algebra
    ** The Millennium Edition **
    by Stanley N. Burris and H.P. Sankappanavar

is at http://www.math.uwaterloo.ca/~snburris/htdocs/ualg.html

The discussion of signatures begins on page 25, and the discussion of term algebras on page 68. (The treatment is less good for people studying programming than I remembered it as being.)

New Types workers have undertaken the large task of extending the theory of ordinary algebras to "algebras" with such things as _|_ and (+ a (+ b (+ c ... . But new phenomena appear, and, when starting the study of type theory, even in the practical case of learning Haskell, it is good to take one concept at a time.

Note that in this special case, clearly to every term of type T there corresponds a (labeled) tree. In our example the leaves, that is, nodes with zero children, are labelled with Ta, Tb, and Tc; the interior nodes with one child are labelled by Td, those with two by Te, and those with five by Tf.

So for this paradigm case, "definition by pattern matching" has a natural first cartoon: we watch a dot race down the tree of the given term, and at each node, ah, I now restrain myself from the rest of the standard rant.

And I now notice that I have not made use of the concept "Model over a Sig". Heaven forwarding, I will rant^Wpost more.

oo--JS.

PS. ad use of one infinitely downward branching tree: Often paths in this particular binary tree, which is often called "The Infinite Binary Tree", are taken, modulo a subtle equivalence relation, to be the reals between 0 and 1.
participants (5)

- Brandon Allbery
- Christopher Howard
- Costello, Roger L.
- Edward Z. Yang
- Jay Sulzberger