Re: [Haskell-cafe] generalized, tail-recursive left fold that can finish tne computation prematurely

As others have pointed out, _in principle_, foldr is not at all deficient. We can, for example, express foldl via foldr. Moreover, we can express head, tail, take, drop and even zipWith through foldr. That is, the entire list processing library can be written in terms of foldr: http://okmij.org/ftp/Algorithms.html#zip-folds That said, to express foldl via foldr, we need a higher-order fold. There are various problems with higher-order folds, related to the cost of building closures. The problems are especially severe in strict languages or strict contexts. Indeed, foldl_via_foldr f z l = foldr (\e a z -> a (f z e)) id l z first constructs the closure and then applies it to z. The closure has the same structure as the list -- it is isomorphic to the list. However, the closure representation of a list takes typically quite more space than the list. So, in strict languages, expressing foldl via foldr is a really bad idea. It won't work for big lists. BTW, this is why foldM is _left_ fold. The arguments against higher-order folds as a `big hammer' were made back in 1998 by Gibbons and Jones http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.42.1735 So, the left-fold with the early termination has a good justification. In fact, this is how Iteratees were first presented, at the DEFUN08 tutorial (part of the ICFP2008 conference). The idea of left fold with early termination is much older though. For example, Takusen (a database access framework) has been using it since 2003 or so. For a bit of history, see http://okmij.org/ftp/Streams.html#fold-stream

* oleg@okmij.org
As others have pointed out, _in principle_, foldr is not at all deficient. We can, for example, express foldl via foldr. Moreover, we can express head, tail, take, drop and even zipWith through foldr. That is, the entire list processing library can be written in terms of foldr:
http://okmij.org/ftp/Algorithms.html#zip-folds
That said, to express foldl via foldr, we need a higher-order fold. There are various problems with higher-order folds, related to the cost of building closures. The problems are especially severe in strict languages or strict contexts. Indeed,
foldl_via_foldr f z l = foldr (\e a z -> a (f z e)) id l z
first constructs the closure and then applies it to z. The closure has the same structure as the list -- it is isomorphic to the list. However, the closure representation of a list takes typically quite more space than the list. So, in strict languages, expressing foldl via foldr is a really bad idea. It won't work for big lists.
If we unroll foldr once (assuming l is not empty), we'll get \z -> foldr (\e a z -> a (f z e)) id (tail l) (f z (head l)) which is a (shallow) closure. In order to observe what you describe (a closure isomorphic to the list) we'd need a language which does reductions inside closures. Am I wrong? Roman

That said, to express foldl via foldr, we need a higher-order fold. There are various problems with higher-order folds, related to the cost of building closures. The problems are especially severe in strict languages or strict contexts. Indeed,
foldl_via_foldr f z l = foldr (\e a z -> a (f z e)) id l z
first constructs the closure and then applies it to z. The closure has the same structure as the list -- it is isomorphic to the list. However, the closure representation of a list takes typically quite more space than the list. So, in strict languages, expressing foldl via foldr is a really bad idea. It won't work for big lists.
If we unroll foldr once (assuming l is not empty), we'll get
\z -> foldr (\e a z -> a (f z e)) id (tail l) (f z (head l))
which is a (shallow) closure. In order to observe what you describe (a closure isomorphic to the list) we'd need a language which does reductions inside closures.
I should've elaborated this point. Let us consider monadic versions of foldr and foldl. First, monads, sort of emulate strict contexts, making it easier to see when closures are constructed. Second, we can easily add tracing. import Control.Monad.Trans -- The following is just the ordinary foldr, with a specialized -- type for the seed: m z foldrM :: Monad m => (a -> m z -> m z) -> m z -> [a] -> m z -- The code below is identical to that of foldr foldrM f z [] = z foldrM f z (h:t) = f h (foldrM f z t) -- foldlM is identical Control.Monad.foldM -- Its code is shown below for reference. foldlM, foldlM' :: Monad m => (z -> a -> m z) -> z -> [a] -> m z foldlM f z [] = return z foldlM f z (h:t) = f z h >>= \z' -> foldlM f z' t t1 = foldlM (\z a -> putStrLn ("foldlM: " ++ show a) >> return (a:z)) [] [1,2,3] {- foldlM: 1 foldlM: 2 foldlM: 3 [3,2,1] -} -- foldlM' is foldlM expressed via foldrM foldlM' f z l = foldrM (\e am -> am >>= \k -> return $ \z -> f z e >>= k) (return return) l >>= \f -> f z -- foldrM'' is foldlM' with trace printing foldlM'' :: (MonadIO m, Show a) => (z -> a -> m z) -> z -> [a] -> m z foldlM'' f z l = foldrM (\e am -> liftIO (putStrLn $ "foldR: " ++ show e) >> am >>= \k -> return $ \z -> f z e >>= k) (return return) l >>= \f -> f z t2 = foldlM'' (\z a -> putStrLn ("foldlM: " ++ show a) >> return (a:z)) [] [1,2,3] {- foldR: 1 foldR: 2 foldR: 3 foldlM: 1 foldlM: 2 foldlM: 3 [3,2,1] -} As we can see from the trace printing, first the whole list is traversed by foldR and the closure is constructed. Only after foldr has finished, the closure is applied to z ([] in our case), and foldl's function f gets a chance to work. The list is effectively traversed twice, which means the `copy' of the list has to be allocated -- that is, the closure that incorporates the calls to f e1, f e2, etc.

Thanks, I see now where my mistake was.
Laziness (or call by name) is needed to make the step from
(\e a z -> a (f z e))
(head l)
(foldr (\e a z -> a (f z e)) id (tail l) z)
(f z (head l))
to
\z -> foldr (\e a z -> a (f z e)) id (tail l) (f z (head l))
without evaluating foldr further.
Roman
* oleg@okmij.org
That said, to express foldl via foldr, we need a higher-order fold. There are various problems with higher-order folds, related to the cost of building closures. The problems are especially severe in strict languages or strict contexts. Indeed,
foldl_via_foldr f z l = foldr (\e a z -> a (f z e)) id l z
first constructs the closure and then applies it to z. The closure has the same structure as the list -- it is isomorphic to the list. However, the closure representation of a list takes typically quite more space than the list. So, in strict languages, expressing foldl via foldr is a really bad idea. It won't work for big lists.
If we unroll foldr once (assuming l is not empty), we'll get
\z -> foldr (\e a z -> a (f z e)) id (tail l) (f z (head l))
which is a (shallow) closure. In order to observe what you describe (a closure isomorphic to the list) we'd need a language which does reductions inside closures.
I should've elaborated this point.
Let us consider monadic versions of foldr and foldl. First, monads, sort of emulate strict contexts, making it easier to see when closures are constructed. Second, we can easily add tracing.
import Control.Monad.Trans
-- The following is just the ordinary foldr, with a specialized -- type for the seed: m z foldrM :: Monad m => (a -> m z -> m z) -> m z -> [a] -> m z -- The code below is identical to that of foldr foldrM f z [] = z foldrM f z (h:t) = f h (foldrM f z t)
-- foldlM is identical Control.Monad.foldM -- Its code is shown below for reference. foldlM, foldlM' :: Monad m => (z -> a -> m z) -> z -> [a] -> m z foldlM f z [] = return z foldlM f z (h:t) = f z h >>= \z' -> foldlM f z' t
t1 = foldlM (\z a -> putStrLn ("foldlM: " ++ show a) >> return (a:z)) [] [1,2,3]
{- foldlM: 1 foldlM: 2 foldlM: 3 [3,2,1] -}
-- foldlM' is foldlM expressed via foldrM foldlM' f z l = foldrM (\e am -> am >>= \k -> return $ \z -> f z e >>= k) (return return) l >>= \f -> f z
-- foldrM'' is foldlM' with trace printing foldlM'' :: (MonadIO m, Show a) => (z -> a -> m z) -> z -> [a] -> m z foldlM'' f z l = foldrM (\e am -> liftIO (putStrLn $ "foldR: " ++ show e) >> am >>= \k -> return $ \z -> f z e >>= k) (return return) l >>= \f -> f z
t2 = foldlM'' (\z a -> putStrLn ("foldlM: " ++ show a) >> return (a:z)) [] [1,2,3]
{- foldR: 1 foldR: 2 foldR: 3 foldlM: 1 foldlM: 2 foldlM: 3 [3,2,1] -}
As we can see from the trace printing, first the whole list is traversed by foldR and the closure is constructed. Only after foldr has finished, the closure is applied to z ([] in our case), and foldl's function f gets a chance to work. The list is effectively traversed twice, which means the `copy' of the list has to be allocated -- that is, the closure that incorporates the calls to f e1, f e2, etc.
participants (2)
-
oleg@okmij.org
-
Roman Cheplyaka