Proposal: Strict scanl, scanl1 and mapAccumL

Hi, Data.List exports strict versions of foldl an foldl1. I think it should also export these strict versions of scanl, scanl1 and mapAccumL: scanl' :: (a -> b -> a) -> a -> [b] -> [a] scanl' f q ls = q : (case ls of [] -> [] x:xs -> let q' = f q x in q' `seq` scanl f q' xs) scanl1' :: (a -> a -> a) -> [a] -> [a] scanl1' f (x:xs) = scanl' f x xs scanl1' _ [] = [] mapAccumL' :: (acc -> x -> (acc, y)) -> acc -> [x] -> (acc, [y]) mapAccumL' _ s [] = (s, []) mapAccumL' f s (x:xs) = (s'',y:ys) where (s', y ) = f s x (s'',ys) = s' `seq` mapAccumL' f s' xs Is there a good reason they're not included? Discussion deadline: 2 weeks (till Monday 26 November). Regards, Bas

On Mon, 12 Nov 2012, Bas van Dijk wrote:
Data.List exports strict versions of foldl an foldl1. I think it should also export these strict versions of scanl, scanl1 and mapAccumL:
scanl' :: (a -> b -> a) -> a -> [b] -> [a] scanl' f q ls = q : (case ls of [] -> [] x:xs -> let q' = f q x in q' `seq` scanl f q' xs)
scanl1' :: (a -> a -> a) -> [a] -> [a] scanl1' f (x:xs) = scanl' f x xs scanl1' _ [] = []
mapAccumL' :: (acc -> x -> (acc, y)) -> acc -> [x] -> (acc, [y]) mapAccumL' _ s [] = (s, []) mapAccumL' f s (x:xs) = (s'',y:ys) where (s', y ) = f s x (s'',ys) = s' `seq` mapAccumL' f s' xs
Is there a good reason they're not included?
Discussion deadline: 2 weeks (till Monday 26 November).
There was already a discussion some weeks ago: http://www.haskell.org/pipermail/libraries/2012-September/018434.html

On Mon, 12 Nov 2012, Bas van Dijk wrote:
On 12 November 2012 10:41, Henning Thielemann
wrote: There was already a discussion some weeks ago
Thanks, I missed that one. It seems it didn't reach a conclusion yet. Maybe we can continue the discussion here.
My impression is that making 'seq' available as function without a typeclass constraint was a step in the wrong direction. Then foldl' and friends were the second step in the wrong direction and scanl' would be the third step. For 'seq' I would propose we first start with a cleanly typed 'seq' and base foldl' functions on this function instead of the built-in 'seq'. But I assume that most of the time where foldl' is used, actually a deepseq-foldl' is meant. I have often seen foldl' in Haskell library code that had not the intended effect since the accumulator was a lazy pair or a Map.

On 12 November 2012 11:17, Henning Thielemann
My impression is that making 'seq' available as function without a typeclass constraint was a step in the wrong direction. Then foldl' and friends were the second step in the wrong direction and scanl' would be the third step. For 'seq' I would propose we first start with a cleanly typed 'seq' and base foldl' functions on this function instead of the built-in 'seq'.
I think I agree that putting seq in a type class would have been better (especially since we can now derive things generically). However if we are going to change this we have to change a lot of code anyway. Why not add these strict functions now and change all of them later (incl. all the strict functions in containers) when the Seq type class gets added (if that ever happens...)?
But I assume that most of the time where foldl' is used, actually a deepseq-foldl' is meant. I have often seen foldl' in Haskell library code that had not the intended effect since the accumulator was a lazy pair or a Map.
If your accumulator in foldl' is a lazy pair you should just seq the new elements before returning the pair. Not doing that is just silly. I don't see the need for deepseq there. In case your accumulator is a Map you really want to use foldl'. For example the following program results in a stack space overflow (when compiled without optimizations): import qualified Data.Map as M main = print $ M.size $ foldl (\m i -> M.insert i i m) M.empty [1..1000000] With foldl' it doesn't. Even if you have a large enough stack (-k128m) this program will still allocate a big part of your heap (to store all the intermediate 'M.insert i i m' thunks) if I'm not mistaken. foldl' also doesn't do that. Cheers, Bas

On Mon, 12 Nov 2012, Bas van Dijk wrote:
On 12 November 2012 11:17, Henning Thielemann
wrote: My impression is that making 'seq' available as function without a typeclass constraint was a step in the wrong direction. Then foldl' and friends were the second step in the wrong direction and scanl' would be the third step. For 'seq' I would propose we first start with a cleanly typed 'seq' and base foldl' functions on this function instead of the built-in 'seq'.
I think I agree that putting seq in a type class would have been better (especially since we can now derive things generically). However if we are going to change this we have to change a lot of code anyway. Why not add these strict functions now and change all of them later (incl. all the strict functions in containers) when the Seq type class gets added (if that ever happens...)?
I don't think of changing it. We could provide a package that exports "the right 'seq'" and then encourage people to use this instead of Prelude.seq.
But I assume that most of the time where foldl' is used, actually a deepseq-foldl' is meant. I have often seen foldl' in Haskell library code that had not the intended effect since the accumulator was a lazy pair or a Map.
If your accumulator in foldl' is a lazy pair you should just seq the new elements before returning the pair. Not doing that is just silly. I don't see the need for deepseq there.
Silly or not, it is simple to forget it, especially if the foldl' was written once with a simple accumulator type which is later changed to a pair type.
In case your accumulator is a Map you really want to use foldl'.
Why not a deepseq foldl'?

* Henning Thielemann
In case your accumulator is a Map you really want to use foldl'.
Why not a deepseq foldl'?
First, you may actually want your Map to contain lazy values. Second, you may know that your values are already evaluated, so deepseq is just a waste of time. deepseq is far from being cheap, since it has to traverse the whole structure, regardless of what has been already evaluated. Roman

On 12 November 2012 13:25, Henning Thielemann
I don't think of changing it. We could provide a package that exports "the right 'seq'" and then encourage people to use this instead of Prelude.seq.
I think I like this idea. So the package would export something like: module Control.Seq where import Prelude hiding (seq) import GHC.Base hiding (seq) class WHNFData a where rwhnf :: a -> () instance WHNFData [a] where rwhnf [] = () rwhnf (_:_) = () -- and all the others... seq :: WHNFData a => a -> b -> b seq a b = case rwhnf a of () -> b ($!) :: WHNFData a => (a -> b) -> a -> b f $! x = x `seq` f x force :: WHNFData a => a -> a force x = x `seq` x -- Doesn't type check unfortunately -- since the b in seq :: a -> b -> b is of kind * and not #: -- evaluate :: WHNFData a => a -> IO a -- evaluate x = IO (\s -> x `seq` (# s, x #)) However, I still think it doesn't hurt to add the strict versions now. And then in the future possibly change the whole base library at once (and other libraries like containers) to support the Seq class. Bas

On Mon, 12 Nov 2012, Bas van Dijk wrote:
On 12 November 2012 13:25, Henning Thielemann
wrote: I don't think of changing it. We could provide a package that exports "the right 'seq'" and then encourage people to use this instead of Prelude.seq.
I think I like this idea. So the package would export something like:
module Control.Seq where
import Prelude hiding (seq) import GHC.Base hiding (seq)
class WHNFData a where rwhnf :: a -> ()
instance WHNFData [a] where rwhnf [] = () rwhnf (_:_) = ()
-- and all the others...
seq :: WHNFData a => a -> b -> b seq a b = case rwhnf a of () -> b
($!) :: WHNFData a => (a -> b) -> a -> b f $! x = x `seq` f x
Yes, this looks nice!
force :: WHNFData a => a -> a force x = x `seq` x
Does this function do something?
-- Doesn't type check unfortunately -- since the b in seq :: a -> b -> b is of kind * and not #: -- evaluate :: WHNFData a => a -> IO a -- evaluate x = IO (\s -> x `seq` (# s, x #))
Maybe it can be implemented in terms of the existing 'evaluate' function but without applying 'Prelude.seq' to the 'a' typed value?

Henning Thielemann
force :: WHNFData a => a -> a force x = x `seq` x
Does this function do something?
if 'seq' adheres to the Haskell reports definition, that is seq ⊥ b = ⊥ seq a b = b, if a ≠ ⊥ 'seq a a' and 'a' must be equivalent, and thus 'force' has to be equivalent to a type-specialized 'id', i.e. force :: WHNFData a => a -> a force x = x

Yes, GHC will rewrite this to
force x = x
I'm also not sure if we really want "seq" in all of these cases, or if we
want stronger guarantees. For example, GHC can (and often will) rewrite
f x y = x `seq` y `seq` x
to
f x y = y `seq` x
This changes the order of evaluation. For pure computations this is
semantically sound. However, it can change the space behaviour of the
program, which defeats the point of using "seq" in the first place
(usually). If forcing "x" to be evaluated before "y" is the goal, then
"pseq" must be used. I think it's worthwhile to consider introducing
high-level combinators for this use case as well.
On 15 November 2012 15:38, Johan Tibell
On Thu, Nov 15, 2012 at 2:38 AM, Henning Thielemann < lemming@henning-thielemann.de> wrote:
On Mon, 12 Nov 2012, Bas van Dijk wrote:
force :: WHNFData a => a -> a force x = x `seq` x
Does this function do something?
No. It says "when x is evaluated, evaluate x".
_______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries
-- Push the envelope. Watch it bend.

On Mon, Nov 12, 2012 at 2:12 PM, Bas van Dijk
On 12 November 2012 13:25, Henning Thielemann
wrote: I don't think of changing it. We could provide a package that exports "the right 'seq'" and then encourage people to use this instead of Prelude.seq.
I think I like this idea. So the package would export something like:
module Control.Seq where
import Prelude hiding (seq) import GHC.Base hiding (seq)
class WHNFData a where rwhnf :: a -> ()
instance WHNFData [a] where rwhnf [] = () rwhnf (_:_) = ()
-- and all the others...
seq :: WHNFData a => a -> b -> b seq a b = case rwhnf a of () -> b
($!) :: WHNFData a => (a -> b) -> a -> b f $! x = x `seq` f x
For reference, older versions of Haskell defined the following class:
class Eval a where strict :: (a -> b) -> a -> b seq :: a -> b -> b strict f x = x `seq` f x The function `strict` is quite useful for making strict versions of lazy functions. Each type was automagically an instance of this class (with the notable exception of functions, there was no way to force function values). Cheers, Josef

I just realized that mapAccumL' is not needed since the caller has the
ability to force the accumlator. So please ignore that part of my
proposal. This leaves just scanl' and scanl1' as orignally proposed by
Niklas Hambüchen.
On 12 November 2012 10:09, Bas van Dijk
Hi,
Data.List exports strict versions of foldl an foldl1. I think it should also export these strict versions of scanl, scanl1 and mapAccumL:
scanl' :: (a -> b -> a) -> a -> [b] -> [a] scanl' f q ls = q : (case ls of [] -> [] x:xs -> let q' = f q x in q' `seq` scanl f q' xs)
scanl1' :: (a -> a -> a) -> [a] -> [a] scanl1' f (x:xs) = scanl' f x xs scanl1' _ [] = []
mapAccumL' :: (acc -> x -> (acc, y)) -> acc -> [x] -> (acc, [y]) mapAccumL' _ s [] = (s, []) mapAccumL' f s (x:xs) = (s'',y:ys) where (s', y ) = f s x (s'',ys) = s' `seq` mapAccumL' f s' xs
Is there a good reason they're not included?
Discussion deadline: 2 weeks (till Monday 26 November).
Regards,
Bas

On 12 November 2012 11:34, Bas van Dijk
I just realized that mapAccumL' is not needed since the caller has the ability to force the accumlator. So please ignore that part of my proposal. This leaves just scanl' and scanl1' as orignally proposed by Niklas Hambüchen.
Oops scrap that. After thinking about it more and testing it I realize the caller really doesn't have control over the evaluation order in the function passed to mapAccumL. So please consider my original proposal again. Sorry for the noise. Bas

Hi, On Mon, Nov 12, 2012 at 12:02:28PM +0100, Bas van Dijk wrote:
On 12 November 2012 11:34, Bas van Dijk
wrote: I just realized that mapAccumL' is not needed since the caller has the ability to force the accumlator. So please ignore that part of my proposal. This leaves just scanl' and scanl1' as orignally proposed by Niklas Hambüchen.
Oops scrap that. After thinking about it more and testing it I realize the caller really doesn't have control over the evaluation order in the function passed to mapAccumL. So please consider my original proposal again.
scanl' and scanl1' can be naturally defined using their non-strict counterparts like: scanl' :: (b -> a -> b) -> b -> [a] -> [b] scanl' f z xs = headStrict $ scanl f z xs scanl1' :: (a -> a -> a) -> [a] -> [a] scanl1' f xs = headStrict $ scanl1 f xs headStrict :: [a] -> [a] headStrict = foldr (\x y -> seq x (x : y)) [] So the `headStrict' function could be exported from Data.List instead. A nice thing about it is that it can also be used with other list-generating functions like `iterate' to make it strict. It's also possible to define mapAccumL' in terms of mapAccumL, but the corresponding forcing function would not look as nice as `headStrict' above, so I would think providing mapAccumL' is a good idea. Regards, Takano Akio

On Tue, 13 Nov 2012, Takano Akio wrote:
On Mon, Nov 12, 2012 at 12:02:28PM +0100, Bas van Dijk wrote:
On 12 November 2012 11:34, Bas van Dijk
wrote: I just realized that mapAccumL' is not needed since the caller has the ability to force the accumlator. So please ignore that part of my proposal. This leaves just scanl' and scanl1' as orignally proposed by Niklas Hambüchen.
Oops scrap that. After thinking about it more and testing it I realize the caller really doesn't have control over the evaluation order in the function passed to mapAccumL. So please consider my original proposal again.
scanl' and scanl1' can be naturally defined using their non-strict counterparts like:
scanl' :: (b -> a -> b) -> b -> [a] -> [b] scanl' f z xs = headStrict $ scanl f z xs
scanl1' :: (a -> a -> a) -> [a] -> [a] scanl1' f xs = headStrict $ scanl1 f xs
headStrict :: [a] -> [a] headStrict = foldr (\x y -> seq x (x : y)) []
So the `headStrict' function could be exported from Data.List instead. A nice thing about it is that it can also be used with other list-generating functions like `iterate' to make it strict.
Can this be achieved with Strategies as well?

On 13 November 2012 11:06, Takano Akio
Hi,
On Mon, Nov 12, 2012 at 12:02:28PM +0100, Bas van Dijk wrote:
On 12 November 2012 11:34, Bas van Dijk
wrote: I just realized that mapAccumL' is not needed since the caller has the ability to force the accumlator. So please ignore that part of my proposal. This leaves just scanl' and scanl1' as orignally proposed by Niklas Hambüchen.
Oops scrap that. After thinking about it more and testing it I realize the caller really doesn't have control over the evaluation order in the function passed to mapAccumL. So please consider my original proposal again.
scanl' and scanl1' can be naturally defined using their non-strict counterparts like:
scanl' :: (b -> a -> b) -> b -> [a] -> [b] scanl' f z xs = headStrict $ scanl f z xs
scanl1' :: (a -> a -> a) -> [a] -> [a] scanl1' f xs = headStrict $ scanl1 f xs
headStrict :: [a] -> [a] headStrict = foldr (\x y -> seq x (x : y)) []
So the `headStrict' function could be exported from Data.List instead. A nice thing about it is that it can also be used with other list-generating functions like `iterate' to make it strict.
It's also possible to define mapAccumL' in terms of mapAccumL, but the corresponding forcing function would not look as nice as `headStrict' above, so I would think providing mapAccumL' is a good idea.
Regards, Takano Akio
Thanks Takano, headStrict looks like a useful function. I would like to drop the proposal for adding scanl', scanl1' and mapAccumL'. Cheers, Bas

On Wed 05 Dec 2012 07:56:10 JST, Bas van Dijk wrote:
I would like to drop the proposal for adding scanl', scanl1' and mapAccumL'.
Would you instead propose a headStrict equivalent into Data.List? It seems very useful to have it there, especially because the haddocks could refer to it for those functions where forcing-by-foldr is sufficient.
participants (9)
-
Bas van Dijk
-
Henning Thielemann
-
Herbert Valerio Riedel
-
Johan Tibell
-
Josef Svenningsson
-
Niklas Hambüchen
-
Roman Cheplyaka
-
Takano Akio
-
Thomas Schilling