[Haskell-beginners] Re: folds -- help!

11 Mar 2009


      Kurt Hutchinson  writes:
...
You're thinking of a slightly different 'empty' here. You're thinking
of what happens when you reach the end of the list, after folding it
all, and there are no more elements to fold. In this example, you're
right that the acc parameter is 6. But what if the list you *first
gave to foldr* was empty? Then it would evaluate to 0, the initial
seed value.
Ok.
...
...
if f can start delivering the result without looking at its second argument,
you can start consuming the result before the fold has traversed the whole
list.
...
Ok, that isn't clearly illustrated by the example in the book:
...
...
foldl (+) 0 (1:2:3:[])
   == foldl (+) (0 + 1)             (2:3:[])
   == foldl (+) ((0 + 1) + 2)       (3:[])
   == foldl (+) (((0 + 1) + 2) + 3) []
   ==           (((0 + 1) + 2) + 3)
foldr (+) 0 (1:2:3:[])
   ==  1 +           foldr (+) 0 (2:3:[])
   ==  1 + (2 +      foldr (+) 0 (3:[])
   ==  1 + (2 + (3 + foldr (+) 0 []))
   ==  1 + (2 + (3 + 0))
In that example, it doesn't look like anything in foldr can be evaluated
until the whole fold has been completed.
You're right, that example doesn't show how you could start using the
result without fully evaluating the fold, since addition doesn't give
partial results. The concat example is better in that regard.
Ok. I guess I did understand something.  Therefore, I think 
the example in the book that uses addition with foldl and foldr is TERRIBLE.  
I think the sections on folds in RWH are a catastrophe and 
need to be rewritten.  The authors need to get rid of the "bit twiddling" 
example and provide an example like concat to clearly show the 
difference between foldl and foldr.  I think the example using foldl 
and foldr with addition should be a secondary example to demonstrate 
that sometimes it doesn't matter whether you use foldl or foldr.
...
...
...
Common examples are things like
concat = foldr (++) [],
so
concat [l1,l2,l3,l4,l5] = l1 ++ (foldr (++) [] [l2,l3,l4,l5])
and the start (l1) can be used before further reducing the fold,
So does haskell store a thunk for everything to the right of l1?
You said that when using foldr you can start "consuming" the beginning of 
the before the whole result is reduced.  I don't quite get that.
A thunk is used as a stand-in for most calculations before the result
is actually calculated. That way, if you never try to use the result,
the calculation never needs to be done, and that means less work. As
an example that ties to the concat example above, say your program
only wanted to test if the result of the concat fold was an empty
list. The function 'null' takes a list and evaluates to True or False,
based on whether the list is empty or not. So:
someFunc xs = null ( concat xs )
  where
  concat ys = foldr (++) [] ys
The 'null' function only needs to test whether the list that is the
result of foldConcat has at least one element. Let's say l1 has an
element. So it's kind of evaluated like this:
  someFunc [ l1, l2, l3, l4, l5 ]
  null (concat [ l1, l2, l3, l4, l5 ] )
  null ( l1 ++ ( thunk with rest of fold ))
  False
The rest of the fold doesn't need to be evaluated, since the beginning
part is enough for 'null' to tell that the result would have at least
one element (because l1 does). That's one way foldr can be used to
start consuming the result before the entire fold is done.
Ok.  The terminology is just a little confusing for me. When you say
"start consuming before the fold is done" it sounds to me like
somehow you are reading partial results off the front of the result
before foldr returns the entire result.  But as you explained above
what you actually mean is that foldr will return the entire result--
with part of the result being a thunk, and subsequently you can do 
operations on the result that may never require the thunked part to 
be evaluated.
...
It depends
completely on the accFunc: if it can return partial results, like
concat, then you can start consuming the result before a full
evaluation. Some accFunc's can't return partial results, like regular
addition. In that case, it's probably better to use foldl' (note the
apostrophe), which will force the thunks to be evaluated as they are
generated and so use less memory. foldl is the same as foldl' except
[foldl] does generate thunks, and then evaluates them all at the end of the
fold, so it uses a bunch of memory to store the thunks in the
meantime, which usually isn't useful.
Kurt
Thanks to you and Daniel for the great explanations.  I'm feeling a lot
better about folds now.


....
....
......
....
...
...
....
....
....
....
...
...
...
....

[Haskell-beginners] Re: folds -- help!

7stud