
On Mon, Feb 09, 2009 at 05:43:03PM -0800, Tom Poliquin wrote:
1) How does recursion happen?
Being an ex-Schemer I recalled that let x = 3 get translated into something like
(lambda (x) ... ) 3
So now there's a function involved. It's clear that b:d is
(cons b d)
another function. So with appropriate plumbing I could see the thing recurses but the Haskell viewpoint eludes me.
In Haskell, thinking of let expressions as being translated into lambda expressions is not very helpful, precisely because of the special ability to have recursive let bindings. In Haskell it is perfectly OK to have something like let x = 1:y y = 0:x in ... but it is not clear how this would get translated into function application (which would come first?). It's better just to think of let as a special, primitive construct.
2) How do things get started?
It seems that Haskell could figure out that 'd' is a list. If an uninitialized (undefined?) list contains [] then things are reasonably straightforward ..
An "uninitialized" list does not contain []. I think perhaps your use of the terms "uninitialized/undefined" might betray an imperative mindset (?); in any case these are not the right words to use, since they give an incorrect picture of what is going on. Here's what actually happens: as a first example, let's consider the let-binding let x = 1:x in ... What happens here is that x is bound to a closure or 'thunk' (suspended computation) which, *when needed*, will evaluate to 1:x. This is laziness at work. x is not 'uninitialized'; it is perfectly well initialized to this thunk. It is definitely not equal to []; at this point the runtime doesn't know anything about x other than the fact that it is a list, and that there is a thunk which may be evaluated if and when the value of x is actually needed. If you don't use x anywhere in the ..., nothing will ever happen and the thunk will eventually get garbage collected (in fact, it's likely that the compiler would optimize x away and it would never be in memory at all). But suppose the value of x is needed (that is, some code pattern-matches on x). Then the thunk will be evaluated just far enough to allow the pattern-match: in particular, x will now be a list with first element 1, and remainder of the list... another thunk. At this point x is not [1]; it is [1:...] where the ... represents another thunk which will evaluate to x. This process repeats as more elements of x are needed, with each thunk evaluating just far enough to allow the necessary pattern-matches, and suspending the remainder of computation in another thunk. Now, let's look at your example: let (c,d) = (d,b:d) Of course, c and d will both be initialized to thunks which will evaluate to lists as needed. It's important to realize that from a semantic point of view, b, c, and d never change. They always represent the same, unchanging values, as defined by this let statement; it's just that operationally, only part of those values may be actually evaluated, as needed. If b has the value 7, then d = 7:d, which means d = 7:7:d, which means d = 7:7:7:d, and so on, and the runtime will expand d out as far as needed (in this case, 10 elements). You really can think of d as *being* an infinite list of 7's, only part of which is evaluated.
This is all fine. The only thing that subconsciously nags me is that we appear to 'return' each time which would imply that we would leave the closure causing 'd' to be undefined again.
One way to think about it is that we only 'return' once: when tracePlus 7 is evaluated, it returns, once and for all, a value 'c', the evaluation of which happens to be suspended; it will be evaluated further as needed. Of course, if you want, you can indeed think of execution jumping back and forth between main and tracePlus as c is evaluated bit by bit, but I think this view is unhelpful, as it suggests procedure calls in an imperative language, where local variables such as 'd' would indeed go out of scope each time tracePlus 'exited'. Instead, think of tracePlus as returning a closure, which packages up the values of b, c, and d; it is this closure which is then evaluated incrementally as execution continues.
Back to the undefined list ... if it's *not* an [] and has something to do with bottom ( _|_ ) then my eyes glaze over and I have to go watch a Woody Allen movie to reset.
A truly _undefined_ list is indeed _|_, and not []. [] is quite defined; it is the list with no elements. _|_ just means 'the completely undefined value'. However, as I hope I have made clear above, the c and d in your example are neither [] nor undefined, so this is immaterial. There is much more to say on the topic, and doubtless there are places where I've told some small fibs because the truth is more complicated. But hopefully this helps provide a useful way to understand things. -Brent