
On Thu, Mar 26, 2009 at 03:23:01PM -0700, Michael Mossey wrote:
But back to the gist of my question last night: I am aware that most examples of recursion presented in the books so far do their processing "as the recursion unwinds." In other words:
length :: [a] -> Int length [] = 0 length (x:xs) = 1 + length xs
This function goes deeper and deeper into the recursion before doing any calculation, then does all the sums "on the way back out."
Right, this is equivalent to length = foldr (+) 0 and results in expressions like (1 + (1 + (1 + (1 + ...)))), which isn't good in this particular case, since none of the additions can be performed until the very end of the list is reached, and all the sums are indeed done "on the way back out". There are some cases, however (such as when the result is some data structure that can be computed lazily, like another list) when this is exactly what you want.
Being an imperative programmer, I keep trying to write loops that accumulate "on the way down", like this:
length' :: Int -> [a] -> Int length' s [] = s length' s (x:xs) = length' (s+1) xs
length :: [a] -> Int length xs = length' 0 xs
And this is equivalent to length = foldl (+) 0 and results in expressions like (((((0 + 1) + 1) + 1) + 1) + ... ). This looks better at first glance, since the sums can start accumulating as you go. However, since Haskell is lazy, this particular version is no better, because the sums won't be evaluated until the result is needed anyway: instead of accumulating a number, you end up accumulating a giant thunk (unevaluated expression) which is only evaluated when its result is finally needed, after the call to length has already finished! So as long as you don't call length on really long lists, you might as well use your first version---if you're going to blow the stack on long lists anyway, you might as well do it in a more natural style. =) But read on... What we'd like is some way to force the accumulator to be evaluated as we recurse down the list---and this is exactly what the foldl' function (from Data.List) does, by introducing a bit of strictness. So the best way to write length is actually length = foldl' (+) 0. Whenever you see this 'accumulator' pattern over lists---some recursive function which recurses over a list while accumulating some small summary value---think foldl'.
I suppose both forms are valid, but the first form seems to be most natural in most situations I've encountered in the exercises.
The first is indeed more natural. Generally speaking, if you find yourself using accumulating parameters, there's probably a simpler way to do it, or some library function that already does exactly what you want to do. But it takes experience to learn and recognize such situations. -Brent