The Holy Trinity of Functional Programming

Hi Folks,

In Richard Bird's book (Introduction to Functional Programming using Haskell) there is a fascinating discussion of three key aspects of functional programming. Below is my summary of his discussion.

/Roger

---------------------------------------------------------
The Holy Trinity of Functional Programming
---------------------------------------------------------

These three ideas constitute the holy trinity of functional programming:

1. User-defined recursive data types.
2. Recursively defined functions over recursive data types.
3. Proof by induction: show that some property P(n) holds for each element of a recursive data type.

Here is an example of a user-defined recursive data type. It is a declaration for the natural numbers 0, 1, 2, ...:

    data Nat = Zero | Succ Nat

The elements of this data type include:

    Zero, Succ Zero, Succ (Succ Zero), Succ (Succ (Succ Zero)), ...

To understand this, note that when creating a Nat value we have a choice of either Zero or Succ Nat. Suppose we choose Succ Nat. Now we must choose a value for the Nat in Succ Nat. Again, we have a choice of either Zero or Succ Nat. Suppose this time we choose Zero, to obtain Succ Zero.
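The declaration can be tried out directly in GHC. Here is a minimal runnable sketch; the `deriving (Show, Eq)` clause is my addition, so that values can be printed and compared:

```haskell
-- A Nat is either Zero or the successor of another Nat.
data Nat = Zero | Succ Nat deriving (Show, Eq)

-- The first few elements of the type, built by repeatedly
-- choosing Succ and finally choosing Zero.
one, two :: Nat
one = Succ Zero
two = Succ (Succ Zero)

main :: IO ()
main = do
  print Zero
  print one
  print two
```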
The ordering of the elements in the Nat data type can be specified by defining Nat to be a member of the Ord class:

    instance Ord Nat where
        Zero   < Zero   = False
        Zero   < Succ n = True
        Succ m < Zero   = False
        Succ m < Succ n = (m < n)

Here is how the Nat version of the expression 2 < 3 is evaluated:

    Succ (Succ Zero) < Succ (Succ (Succ Zero))  -- Nat version of 2 < 3
    = Succ Zero < Succ (Succ Zero)              -- by the 4th equation defining order
    = Zero < Succ Zero                          -- by the 4th equation defining order
    = True                                      -- by the 2nd equation defining order

Here is a recursively defined function over the data type; it adds two Nat elements:

    (+) :: Nat -> Nat -> Nat
    m + Zero   = m
    m + Succ n = Succ (m + n)

Here is how the Nat version of 0 + 1 is evaluated:

    Zero + Succ Zero      -- Nat version of 0 + 1
    = Succ (Zero + Zero)  -- by the 2nd equation defining +
    = Succ Zero           -- by the 1st equation defining +

Here is another recursively defined function over the data type; it subtracts two Nat elements:

    (-) :: Nat -> Nat -> Nat
    m - Zero        = m
    Succ m - Succ n = m - n

If the Nat version of 0 - 1 is executed, then the result is undefined:

    Zero - Succ Zero

The "undefined value" is denoted by this symbol: _|_ (also known as "bottom").

Important: _|_ is an element of *every* data type. So we must expand the list of elements in Nat:

    _|_, Zero, Succ Zero, Succ (Succ Zero), Succ (Succ (Succ Zero)), ...

There are still more elements of Nat. Suppose we define a function that returns a Nat. Let's call the function undefined:

    undefined :: Nat
    undefined = undefined

It is an infinitely recursive function: when invoked it never stops until the program is interrupted. This function undefined is denoted by the symbol _|_.

Recall how we defined the ordering of values in Nat:

    instance Ord Nat where
        Zero   < Zero   = False
        Zero   < Succ n = True
        Succ m < Zero   = False
        Succ m < Succ n = (m < n)
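The ordering and addition equations above can be run as ordinary functions. In this sketch I transcribe them under the names `ltNat` and `addNat` (my renaming, simply to avoid hiding the Prelude's `<` and `+`):

```haskell
data Nat = Zero | Succ Nat deriving (Show, Eq)

-- The four ordering equations, written as a plain function.
ltNat :: Nat -> Nat -> Bool
ltNat Zero     Zero     = False
ltNat Zero     (Succ _) = True
ltNat (Succ _) Zero     = False
ltNat (Succ m) (Succ n) = ltNat m n

-- The two equations defining addition.
addNat :: Nat -> Nat -> Nat
addNat m Zero     = m
addNat m (Succ n) = Succ (addNat m n)

main :: IO ()
main = do
  -- The Nat version of 2 < 3, as in the derivation above.
  print (ltNat (Succ (Succ Zero)) (Succ (Succ (Succ Zero))))
  -- The Nat version of 0 + 1.
  print (addNat Zero (Succ Zero))
```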
From that ordering we can see that Succ _|_ is an element of Nat:
    Zero < Succ undefined  -- Note: same as Zero < Succ _|_
    = True                 -- by the 2nd equation defining order

And Succ (Succ _|_ ) is an element of Nat:

    Succ Zero < Succ (Succ undefined)  -- Note: same as Succ Zero < Succ (Succ _|_ )
    = Zero < Succ undefined            -- by the 4th equation defining order
    = True                             -- by the 2nd equation defining order

And Succ (Succ (Succ _|_ )) is an element of Nat, and so forth. So the list of elements in Nat expands:

    ..., Succ (Succ (Succ _|_ )), Succ (Succ _|_ ), Succ _|_, _|_,
    Zero, Succ Zero, Succ (Succ Zero), Succ (Succ (Succ Zero)), ...

One can interpret the extra values in the following way:

- _|_ corresponds to the natural number about which there is absolutely no information.
- Succ _|_ corresponds to the natural number about which the only information is that it is greater than Zero.
- Succ (Succ _|_ ) corresponds to the natural number about which the only information is that it is greater than Succ Zero.
- And so on.

There is one further value of Nat, namely the "infinite" number:

    Succ (Succ (Succ (Succ ...)))

It can be defined by this function:

    infinity :: Nat
    infinity = Succ infinity

It is different from all the other Nat values because it is the only number for which m < infinity holds for all finite numbers m:

    Zero < infinity             = True
    Succ Zero < infinity        = True
    Succ (Succ Zero) < infinity = True
    ...

Thus, the values of Nat can be divided into three classes:

- The finite values: Zero, Succ Zero, Succ (Succ Zero), and so on.
- The partial values: _|_, Succ _|_, Succ (Succ _|_ ), and so on.
- The infinite value.

Important: the values of *every* recursive data type can be divided into three classes:

- The finite values of the data type.
- The partial values: _|_, and so on.
- The infinite values.

Although the infinite Nat value is not of much use, the same is not true of the infinite values of other data types.

Recap: We have discussed two aspects of the holy trinity of functional programming: recursive data types and recursively defined functions over those data types.
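The partial and infinite values discussed above can actually be observed in GHC, thanks to lazy evaluation. A sketch, reusing the `ltNat` transcription of the ordering (the name is mine):

```haskell
data Nat = Zero | Succ Nat

ltNat :: Nat -> Nat -> Bool
ltNat Zero     Zero     = False
ltNat Zero     (Succ _) = True
ltNat (Succ _) Zero     = False
ltNat (Succ m) (Succ n) = ltNat m n

-- The infinite number: an endless tower of Succ constructors.
infinity :: Nat
infinity = Succ infinity

main :: IO ()
main = do
  -- Zero < Succ _|_ : the second equation never inspects what is
  -- under the Succ, so the undefined argument is never forced.
  print (ltNat Zero (Succ undefined))
  -- Succ (Succ Zero) < infinity: only finitely many of infinity's
  -- Succ constructors are ever examined, so this terminates.
  print (ltNat (Succ (Succ Zero)) infinity)
```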
The third aspect is induction, which is discussed now.

Induction allows us to reason about the properties of recursively defined functions over a recursive data type. In the case of Nat the principle of induction can be stated as follows: in order to show that some property P(n) holds for each finite number n of Nat, it is sufficient to treat two cases:

Case (Zero). Show that P(Zero) holds.

Case (Succ n). Show that if P(n) holds, then P(Succ n) also holds.

As an example, let us prove this property (the identity for addition):

    Zero + n = n

Before proving this, recall that + is defined by these two equations:

    m + Zero   = m
    m + Succ n = Succ (m + n)

Proof: The proof is by induction on n.

Case (Zero). Substitute Zero for n in the equation Zero + n = n. So we have to show that Zero + Zero = Zero, which is immediate from the first equation defining +.

Case (Succ n). Assume P(n) holds; that is, assume Zero + n = n holds. This equation is referred to as the induction hypothesis. We have to show that P(Succ n) holds; that is, show that Zero + Succ n = Succ n holds. We do so by simplifying the left-hand expression:

    Zero + Succ n
    = Succ (Zero + n)  -- by the 2nd equation defining +
    = Succ n           -- by the induction hypothesis

We have proven the two cases and so we have proven that Zero + n = n holds for all finite numbers of Nat.

Full Induction. To prove that a property P holds not only for the finite numbers of Nat, but also for every partial (undefined) number, we have to prove three things:

Case ( _|_ ). Show that P( _|_ ) holds.

Case (Zero). Show that P(Zero) holds.

Case (Succ n). Show that if P(n) holds, then P(Succ n) holds also.

To prove that a property P holds just for every partial (undefined) number, we omit the second case.

To illustrate proving a property that holds for every partial number (but not for the finite numbers), let us prove the rather counterintuitive result that m + n = n for all numbers m and all partial numbers n.
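The identity just proved can be spot-checked in GHC. This is a test on the first few finite numbers, of course, not a proof; the `fromInt` helper is my addition:

```haskell
data Nat = Zero | Succ Nat deriving Eq

addNat :: Nat -> Nat -> Nat
addNat m Zero     = m
addNat m (Succ n) = Succ (addNat m n)

-- Helper (mine): build the finite Nat for a non-negative Int.
fromInt :: Int -> Nat
fromInt 0 = Zero
fromInt k = Succ (fromInt (k - 1))

main :: IO ()
main =
  -- Check Zero + n = n on the first few finite numbers.
  print (and [ addNat Zero (fromInt k) == fromInt k | k <- [0 .. 20] ])
```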
For easy reference, we repeat the definition of +:

    m + Zero   = m
    m + Succ n = Succ (m + n)

Proof: The proof is by partial number induction on n.

Case ( _|_ ). Substitute _|_ for n in the equation m + n = n. So we have to show that m + _|_ = _|_, which is true since _|_ does not match either of the patterns in the definition of +.

Case (Succ n). We assume P(n) holds; that is, assume m + n = n holds. This equation is the induction hypothesis. We have to show that P(Succ n) holds; that is, show that m + Succ n = Succ n holds. For the left-hand side we reason:

    m + Succ n
    = Succ (m + n)  -- by the 2nd equation defining +
    = Succ n        -- by the induction hypothesis

Since the right-hand side is also Succ n, we are done.

The omitted case (m + Zero = Zero) is false, which is why the property does not hold for finite numbers.

As an added bonus, having proved that an equation holds for all partial (undefined) numbers, we can assert that it holds for the infinite number too; that is, P(infinity) holds. Thus, we can now assert that m + infinity = infinity for all numbers m.

On 08/22/2012 01:27 AM, Costello, Roger L. wrote:
--------------------------------------------------------- The Holy Trinity of Functional Programming --------------------------------------------------------- These three ideas constitute the holy trinity of functional programming:
1. User-defined recursive data types. 2. Recursively defined functions over recursive data types. 3. Proof by induction: show that some property P(n) holds for each element of a recursive data type.
<snip>
Although the infinite Nat value is not of much use, the same is not true of the infinite values of other data types.
Please bear with us simpletons, but do you think you could provide a more "real world" example of an application of the "Holy Trinity" ideas? (One of those mysterious "other data types".) Not to say that recursive data types and proof by induction aren't fascinating in and of themselves... but sometimes it is difficult to make the connection to the practical programming process.

--
frigidcode.com
indicium.us

Hello Christopher,
But do you think you could provide a more "real world" example of an application of the "Holy Trinity" ideas?
A commonly cited real-world example that illustrates the first key idea (recursive data type) is a binary tree:

    data Tree = Node String Tree Tree | Leaf

The elements of this data type include:

    Leaf
    Node "Root" Leaf Leaf
    Node "Root" (Node "Left" Leaf Leaf) (Node "Right" Leaf Leaf)

And so forth.

Common functions on this data type include insertion, deletion, and traversal.

I am not sure if there is a use for the infinite value of this data type.

/Roger
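A sketch of the insertion and traversal functions Roger mentions. The binary-search-tree ordering on labels and the function names are my illustrative choices, not something fixed by the Tree declaration:

```haskell
data Tree = Node String Tree Tree | Leaf deriving Show

-- Insert a label, keeping the binary-search-tree property
-- (smaller labels to the left, larger to the right).
insert :: String -> Tree -> Tree
insert s Leaf = Node s Leaf Leaf
insert s t@(Node x l r)
  | s < x     = Node x (insert s l) r
  | s > x     = Node x l (insert s r)
  | otherwise = t  -- label already present

-- In-order traversal lists the labels in ascending order.
inorder :: Tree -> [String]
inorder Leaf         = []
inorder (Node x l r) = inorder l ++ [x] ++ inorder r

main :: IO ()
main = print (inorder (foldr insert Leaf ["Root", "Left", "Right"]))
```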

On Wed, Aug 22, 2012 at 7:38 AM, Costello, Roger L. wrote:
But do you think you could provide a more "real world" example of an application of the "Holy Trinity" ideas?
A commonly cited real-world example that illustrates the first key idea (recursive data type) is a binary tree:
I'd have used lists, since in Haskell we make use of infinite lists quite a bit; a simple example being zipping some list against [0..] (a list comprising an infinitely ascending sequence of numbers) to pair each item with its index.

--
brandon s allbery    allbery.b@gmail.com
wandering unix systems administrator (available)    (412) 475-9364 vm/sms
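Brandon's example in runnable form. The infinite list [0..] is consumed lazily, so zipping it against a finite list terminates:

```haskell
main :: IO ()
main =
  -- Pair each item with its index by zipping against the infinite
  -- list [0..]; only as many indices are produced as the finite
  -- list needs.
  print (zip [0 ..] ["a", "b", "c"])
```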

Yes, but we can also use a similar style of inductive reasoning in the face of infinite types, called Scott induction [1]. There is also a perspective that infinite types (or so-called 'codata') should be reasoned about with 'coinduction', although the proof principles there are a little more esoteric.

Cheers,
Edward

[1] http://blog.ezyang.com/2010/12/no-one-expects-the-scott-induction/

Excerpts from Brandon Allbery's message of Wed Aug 22 11:04:48 -0400 2012:
<snip>

On Wed, 22 Aug 2012, Costello, Roger L. wrote:
<snip>
A commonly cited real-world example that illustrates the first key idea (recursive data type) is a binary tree:
data Tree = Node String Tree Tree | Leaf
<snip>
I am not sure if there is a use for the infinite value of this data type.
/Roger
Let us leave aside two kinds of things:

1. _|_
2. "infinite" things

Let us define a signature Sig to be a structure like so:

1. We have a finite set Names.
2. For each name a in Names we have a non-negative integer, called the "arity" of a.

Note that we allow a name, say a, to have an arity of zero. Such names are called "constants" in some formalisms and distinguished from names with an arity greater than zero.

We define a structure, sometimes called a "model" or an "algebra with signature Sig", to be:

1. A nonempty set S.
2. For each name a of arity n we have a function f: S^n -> S, where
   a. S^n is the Cartesian product of n copies of S
   b. f is everywhere defined on S^n and f is single-valued

If we call our model M then we may use this notation:

1. The ground set of M is S, which we write as Ground(M).
2. The f associated to the name a is M(a).

We now define the set of terms, which set we call "Terms", over our signature:

1. Given a name a of arity n and a list of terms t0, t1, ... t(n-1) of length n, the list a, t0, t1, ... t(n-1) is a term.
2. Terms is the minimal set which satisfies 1.

Now, given a signature Sig, we may translate this setup into Haskell. We do an example. Let Sig be given as follows: Names is the set {a, b, c, d, e, f}. The arities are:

    a has arity 0
    b has arity 0
    c has arity 0
    d has arity 1
    e has arity 2
    f has arity 5

The set Terms over Sig is in one-to-one correspondence with the values of this type in Haskell:

    data T = Ta | Tb | Tc | Td T | Te T T | Tf T T T T T

(If I have made an error in the translation please tell me.)

So, for the special case of "term algebras", or absolutely free algebras, or "magmas", over a finite set of generators (here taken to be the names with zero arity) and a finite set of operations, we get a natural, ah, mystic word of power, isomorphism with a certain simple kind of type in Haskell. This special case is one of the paradigm cases in Type Theory.
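The example type can be written out and exercised directly. The `size` function below is my addition; it counts the nodes of the labelled tree that a term denotes, making the term-as-tree correspondence concrete:

```haskell
-- Terms over the example signature: a, b, c of arity 0;
-- d of arity 1; e of arity 2; f of arity 5.
data T = Ta | Tb | Tc | Td T | Te T T | Tf T T T T T

-- Every term is a labelled tree; count its nodes.
size :: T -> Int
size Ta             = 1
size Tb             = 1
size Tc             = 1
size (Td t)         = 1 + size t
size (Te s t)       = 1 + size s + size t
size (Tf a b c d e) = 1 + sum (map size [a, b, c, d, e])

main :: IO ()
main =
  -- A term with a Te root, a Td subtree, and a Tf subtree:
  -- 1 + 2 + 6 = 9 nodes in all.
  print (size (Te (Td Ta) (Tf Ta Tb Tc Ta Tb)))
```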
One of the standard texts on Universal Algebra, Birkhoff style:

    A Course in Universal Algebra
    ** The Millennium Edition **
    by Stanley N. Burris and H.P. Sankappanavar

is at http://www.math.uwaterloo.ca/~snburris/htdocs/ualg.html

The discussion of signatures begins on page 25, and the discussion of term algebras on page 68. (The treatment is less good for people studying programming than I remembered it as being.)

New Types workers have undertaken the large task of extending the theory of ordinary algebras to "algebras" with such things as _|_ and (+ a (+ b (+ c ... . But new phenomena appear, and, when starting the study of type theory, even in the practical case of learning Haskell, it is good to take one concept at a time.

Note that in this special case, clearly to every term of type T there corresponds a (labeled) tree. In our example the leaves, that is, nodes with zero children, are labelled with Ta, Tb, and Tc; the interior nodes with one child are labelled by Td, those with two by Te, and those with five by Tf.

So for this paradigm case, "definition by pattern matching" has a natural first cartoon: we watch a dot race down the tree of the given term, and at each node, ah, I now restrain myself from the rest of the standard rant.

And I now notice that I have not made use of the concept "Model over a Sig". Heaven forwarding, I will rant^Wpost more.

oo--JS.

PS. ad use of one infinitely downward branching tree: Often paths in this particular binary tree, which is often called "The Infinite Binary Tree", are taken, modulo a subtle equivalence relation, to be the reals between 0 and 1.
participants (5)

- Brandon Allbery
- Christopher Howard
- Costello, Roger L.
- Edward Z. Yang
- Jay Sulzberger