Proposal: Make intersperse lazier

The current implementation of Data.List.intersperse causes a space leak under certain not uncommon circumstances. Trac ticket: http://hackage.haskell.org/trac/ghc/ticket/4282 The proposed implementation, intersperse :: a -> [a] -> [a] intersperse _ [] = [] intersperse sep (x:xs) = x : go xs where go [] = [] go (y:ys) = sep : y : go ys changes the semantics from intersperse (x : _|_) = _|_ to intersperse (x : _|_) = x : _|_ apart from that, I think only the run-time behaviour is changed. Period of discussion: Two weeks, until 30 Sep. 2010. Cheers, Daniel

On Thu, 16 Sep 2010 11:50:08 -0300, Felipe Lessa
+1
+1 -- Nicolas Pouillard http://nicolaspouillard.fr

On 16 September 2010 15:39, Daniel Fischer
The current implementation of Data.List.intersperse causes a space leak under certain not uncommon circumstances. Trac ticket: http://hackage.haskell.org/trac/ghc/ticket/4282 The proposed implementation,
changes the semantics from
intersperse (x : _|_) = _|_
to
intersperse (x : _|_) = x : _|_
You mean: intersperse sep (x : _|_) = x : _|_ We would also need to change intercalate to match: intercalate sep (x : _|_) = x ++ _|_ Note that the current definition of intercalate in the base package will not need to change because it is defined in terms of intersperse, but its strictness specification does change. Assuming the proposal is updated with the corresponding change to intercalate, then I support the proposal. It is in the spirit of the Haskell98 List module for list functions to be as lazy as possible (except there are good reasons to be stricter, e.g. as in splitAt). Incidentally, the lazier version is also slightly more efficient. Duncan

On Thursday 16 September 2010 17:08:08, Duncan Coutts wrote:
On 16 September 2010 15:39, Daniel Fischer
wrote: The current implementation of Data.List.intersperse causes a space leak under certain not uncommon circumstances. Trac ticket: http://hackage.haskell.org/trac/ghc/ticket/4282 The proposed implementation,
changes the semantics from
intersperse (x : _|_) = _|_
to
intersperse (x : _|_) = x : _|_
You mean:
intersperse sep (x : _|_) = x : _|_
Oops, of course.
We would also need to change intercalate to match:
intercalate sep (x : _|_) = x ++ _|_
Note that the current definition of intercalate in the base package will not need to change because it is defined in terms of intersperse, but its strictness specification does change.
Assuming the proposal is updated with the corresponding change to intercalate,
You mean, also mention the consequences for intercalate explicitly? Or something more/else?
then I support the proposal. It is in the spirit of the Haskell98 List module for list functions to be as lazy as possible (except there are good reasons to be stricter, e.g. as in splitAt). Incidentally, the lazier version is also slightly more efficient.
Duncan

On 16 September 2010 16:41, Daniel Fischer
On Thursday 16 September 2010 17:08:08, Duncan Coutts wrote:
We would also need to change intercalate to match:
intercalate sep (x : _|_) = x ++ _|_
Note that the current definition of intercalate in the base package will not need to change because it is defined in terms of intersperse, but its strictness specification does change.
Assuming the proposal is updated with the corresponding change to intercalate,
You mean, also mention the consequences for intercalate explicitly? Or something more/else?
The point is that we are proposing to change the strictness properties of both intersperse and intercalate. Ideally the proposal should state the old and new strictness properties of both functions and give a definition for each that satisfies the strictness property. The fact that the old and new definitions for intercalate will be textually identical does not really matter, it's behaviour has certainly changed. (In principle we could go a step further and give a testable property that states that the new definitions refine the existing ones.) Duncan

On Thursday 16 September 2010 16:39:56, Daniel Fischer wrote:
The current implementation of Data.List.intersperse causes a space leak under certain not uncommon circumstances. Trac ticket: http://hackage.haskell.org/trac/ghc/ticket/4282 The proposed implementation,
intersperse :: a -> [a] -> [a] intersperse _ [] = [] intersperse sep (x:xs) = x : go xs where go [] = [] go (y:ys) = sep : y : go ys
changes the semantics from
intersperse sep (x : _|_) = _|_
to
intersperse sep (x : _|_) = x : _|_
apart from that, I think only the run-time behaviour is changed.
Period of discussion: Two weeks, until 30 Sep. 2010.
Cheers, Daniel
As Duncan pointed out, I've omitted to make explicit that this change would also affect Data.List.intercalate, changing its behaviour from intercalate sep (xs : _|_) = _|_ to intercalate sep (xs : _|_) = xs ++ _|_ I would like to include that explicitly in the proposal.

Am 16.09.2010 16:39, schrieb Daniel Fischer:
The current implementation of Data.List.intersperse causes a space leak under certain not uncommon circumstances. Trac ticket: http://hackage.haskell.org/trac/ghc/ticket/4282 The proposed implementation,
intersperse :: a -> [a] -> [a] intersperse _ [] = [] intersperse sep (x:xs) = x : go xs where go [] = [] go (y:ys) = sep : y : go ys
I support this proposal, because it is be more efficient. However, the (unfortunately so common) use of the auxiliary function "go" does not look nice to me. It looks less abstract (or declarative) just for efficiency reasons. Therefore I proposed the following implementation: intersperse :: a -> [a] -> [a] intersperse s l = case l of [] -> l x : r -> x : if null r then r else s : intersperse s r which looks more intuitive to me, has the same semantic change as below, and needs no auxiliary function. My version would check the list "r" twice: 1. by "null r" and 2. in the recursion "intersperse s r", but I would expect this to be optimized away. So my question is: Is my "intuitive" code really less efficient than the proposed code? If so, why? (If not, you may take my code.) I hate the idea having to write cryptic haskell code just to obtain efficiency. Cheers Christian
changes the semantics from
intersperse (x : _|_) = _|_
to
intersperse (x : _|_) = x : _|_
apart from that, I think only the run-time behaviour is changed.
Period of discussion: Two weeks, until 30 Sep. 2010.
Cheers, Daniel

Am 17.09.2010 10:48, schrieb Christian Maeder:
Am 16.09.2010 16:39, schrieb Daniel Fischer:
The current implementation of Data.List.intersperse causes a space leak under certain not uncommon circumstances. Trac ticket: http://hackage.haskell.org/trac/ghc/ticket/4282 The proposed implementation,
intersperse :: a -> [a] -> [a] intersperse _ [] = [] intersperse sep (x:xs) = x : go xs where go [] = [] go (y:ys) = sep : y : go ys
I support this proposal, because it is be more efficient. However, the (unfortunately so common) use of the auxiliary function "go" does not look nice to me. It looks less abstract (or declarative) just for efficiency reasons.
How about changing the ugly auxiliary function into global useful function and/or avoiding recursion at all by: concatMap (\ x -> [sep, x]) (I don't have a good name for it, maybe "prepend") Would this have an efficiency problem?
Therefore I proposed the following implementation:
intersperse :: a -> [a] -> [a] intersperse s l = case l of [] -> l x : r -> x : if null r then r else s : intersperse s r
which looks more intuitive to me, has the same semantic change as below, and needs no auxiliary function.
My version would check the list "r" twice: 1. by "null r" and 2. in the recursion "intersperse s r", but I would expect this to be optimized away.
So my question is: Is my "intuitive" code really less efficient than the proposed code? If so, why? (If not, you may take my code.)
I hate the idea having to write cryptic haskell code just to obtain efficiency.
Cheers Christian
changes the semantics from
intersperse (x : _|_) = _|_
to
intersperse (x : _|_) = x : _|_
apart from that, I think only the run-time behaviour is changed.
Period of discussion: Two weeks, until 30 Sep. 2010.
Cheers, Daniel

On 17 September 2010 09:48, Christian Maeder
I support this proposal, because it is be more efficient. However, the (unfortunately so common) use of the auxiliary function "go" does not look nice to me. It looks less abstract (or declarative) just for efficiency reasons.
Hi Christian go here, and the top-level intersperse are both in the so-called "declaration style". The algorithm has one case for the first element (no prepend - take one) then another case for the all the rest (prepend with sep - take one - recurse). So the implementation is a literal (and concise) implementation of the algorithm - is this not declarative? Best wishes Stephen

On Fri, Sep 17, 2010 at 10:48 AM, Christian Maeder
Am 16.09.2010 16:39, schrieb Daniel Fischer:
The proposed implementation,
intersperse :: a -> [a] -> [a] intersperse _ [] = [] intersperse sep (x:xs) = x : go xs where go [] = [] go (y:ys) = sep : y : go ys ... Therefore I proposed the following implementation:
intersperse :: a -> [a] -> [a] intersperse s l = case l of [] -> l x : r -> x : if null r then r else s : intersperse s r
One additional benefit about the original proposed implementation is that it applies the static argument transformation: the 'sep' argument is brought into scope of the worker function 'go' which then doesn't need to pass it to each recursion as in the original implementation. It would be interesting to see if this provides a noticeable performance benefit. Criterion anyone? Bas

Am 17.09.2010 14:21, schrieb Bas van Dijk:
On Fri, Sep 17, 2010 at 10:48 AM, Christian Maeder
wrote: Am 16.09.2010 16:39, schrieb Daniel Fischer:
The proposed implementation,
intersperse :: a -> [a] -> [a] intersperse _ [] = [] intersperse sep (x:xs) = x : go xs where go [] = [] go (y:ys) = sep : y : go ys ... Therefore I proposed the following implementation:
intersperse :: a -> [a] -> [a] intersperse s l = case l of [] -> l x : r -> x : if null r then r else s : intersperse s r
One additional benefit about the original proposed implementation is that it applies the static argument transformation: the 'sep' argument is brought into scope of the worker function 'go' which then doesn't need to pass it to each recursion as in the original implementation.
This is the point. Taking the simpler prepend function, do we did not write: prepend :: a -> [a] -> [a] prepend sep = go where go [] = [] go (x : xs) = sep : x : go xs for more efficiency (instead of the direct recursion or a concatMap or folding solution)? Cheers Christian
It would be interesting to see if this provides a noticeable performance benefit. Criterion anyone?
Bas

On Friday 17 September 2010 14:45:12, Christian Maeder wrote:
Am 17.09.2010 14:21, schrieb Bas van Dijk:
One additional benefit about the original proposed implementation is that it applies the static argument transformation: the 'sep' argument is brought into scope of the worker function 'go' which then doesn't need to pass it to each recursion as in the original implementation.
Which, according to the benchmarks I ran today is in this case actually not a benefit. (Not entirely surprising, in http://www.haskell.org/pipermail/glasgow- haskell-users/2008-June/014987.html, Max Bolingbroke wrote: "GHC performs the SAT iff the recursive call is direct (i.e. not on mutually recursive groups of functions) to reduce code expansion, and if the number of static arguments is at least 2. The reason for the last criterion is that moving a parameter to a closure implicitly adds an argument to all the functions that make reference to that variable, the implicit argument being the pointer to the closure. Eliminating an actual function argument just to add a layer of indirection via an allocated closure would be fairly pointless!" As far as I know, however, the SAT tends to be beneficial even for a single argument if that argument is a function to be applied, as in folds.)
This is the point. Taking the simpler prepend function, do we did not write:
prepend :: a -> [a] -> [a] prepend sep = go where go [] = [] go (x : xs) = sep : x : go xs
That just moves the worker from intersperse to another function, so I wouldn't expect you to find it any nicer. However, I benchmarked some more today, and the results suggest that intersperse :: a -> [a] -> [a] intersperse _ [] = [] intersperse s (x:xs) = x : foo s xs foo :: a -> [a] -> [a] foo s (x:xs) = s : x : foo s xs foo _ [] = [] is a little faster than the SATed isgo (about 4-5% in the benchmark). Compiled with optimisations, it makes no difference whether foo (of course we need a better name) is a top-level function or local to intersperse, but without optimisations, a top-level function is better. Also nice is that that gives identical core for all optimisation levels (0,1,2) [apart from the strictness/unfolding information, which is absent with -O0]. (Tested with 6.12.3 and HEAD.) So far, the above is IMO the best implementation. A good name for foo remains to be found. I don't like 'prepend' because prepend suggests only putting something before a list (with the given type, it should be (:)) and not changing anything inside. If it's not to be exported from Data.List (and I don't consider it useful enough to be), maybe intersperseLoop wouldn't be too daft. Cheers, Daniel

Am 24.09.2010 16:53, schrieb Daniel Fischer: [...]
However, I benchmarked some more today, and the results suggest that
intersperse :: a -> [a] -> [a] intersperse _ [] = [] intersperse s (x:xs) = x : foo s xs
foo :: a -> [a] -> [a] foo s (x:xs) = s : x : foo s xs foo _ [] = []
is a little faster than the SATed isgo (about 4-5% in the benchmark). Compiled with optimisations, it makes no difference whether foo (of course we need a better name) is a top-level function or local to intersperse, but without optimisations, a top-level function is better. Also nice is that that gives identical core for all optimisation levels (0,1,2) [apart from the strictness/unfolding information, which is absent with -O0]. (Tested with 6.12.3 and HEAD.)
So far, the above is IMO the best implementation. A good name for foo remains to be found.
prependToAll
I don't like 'prepend' because prepend suggests only putting something before a list (with the given type, it should be (:)) and not changing anything inside.
If it's not to be exported from Data.List (and I don't consider it useful enough to be), maybe intersperseLoop wouldn't be too daft.
I consider it useful, because the "natural" implementation: prependToAll s = foldr (\ x r -> s : x : r) [] seems to leak without optimization. Cheers Christian
Cheers, Daniel

Am 24.09.2010 17:55, schrieb Christian Maeder:
I consider it useful, because the "natural" implementation:
prependToAll s = foldr (\ x r -> s : x : r) []
seems to leak without optimization.
"leak" is wrong. My Benchmarks fail without optimization. (see attached log.txt) Christian Bench.hs: import Criterion.Main testCase f = print . length . f ',' main = do let s = replicate 100000000 'A' print $ length s defaultMain [ bench "prepend" $ testCase prepend s , bench "prependToAll" $ testCase prependToAll s ] prependToAll :: a -> [a] -> [a] prependToAll s = foldr (\ x r -> s : x : r) [] prepend :: a -> [a] -> [a] prepend s l = case l of [] -> [] x : r -> s : x : prepend s r

On Friday 24 September 2010 19:00:40, Christian Maeder wrote:
"leak" is wrong. My Benchmarks fail without optimization. (see attached log.txt)
Christian
That's probably a problem(?) in Criterion. If you move the functions you want to benchmark to their own module, compile that with -O0 and then compile the benchmarking programme with optimisations, it works properly. I always put the functions to benchmark in their own module and compile separately to prevent GHC from performing optimisations which it only does within one module (and possibly for non-exported functions). Who would've thought that is also necessary with -O0.

On Friday 24 September 2010 17:55:15, Christian Maeder wrote:
Am 24.09.2010 16:53, schrieb Daniel Fischer:
A good name for foo remains to be found.
prependToAll
Much better. I'm still not happy with the prepend, but insertBeforeEach isn't better and I can't think of anything convincing.
I don't like 'prepend' because prepend suggests only putting something before a list (with the given type, it should be (:)) and not changing anything inside.
If it's not to be exported from Data.List (and I don't consider it useful enough to be), maybe intersperseLoop wouldn't be too daft.
I consider it useful, because the "natural" implementation:
prependToAll s = foldr (\ x r -> s : x : r) []
seems to leak without optimization.
Another natural implementation would be prependToAll s = (s :) . intersperse s or, pointed, prependToAll s xs = s : intersperse s xs Those are of course a bit silly if intersperse is defined in terms of prependToAll. But I don't see where one would need that function, so I'm hesitating to propose changing the API of Data.List. If you have use cases, I would probably support it.
Cheers Christian
Cheers, Daniel

Am 24.09.2010 19:50, schrieb Daniel Fischer: [...]
prependToAll s xs = s : intersperse s xs
Those are of course a bit silly if intersperse is defined in terms of prependToAll.
But I don't see where one would need that function, so I'm hesitating to propose changing the API of Data.List. If you have use cases, I would probably support it.
You're right. Including prependToAll is unnecessary. After all we just want to fix a space leak in intersperse (with or without lokal worker). +1 for your proposal http://hackage.haskell.org/trac/ghc/ticket/4282 Christian

On Friday 17 September 2010 14:21:36, Bas van Dijk wrote:
On Fri, Sep 17, 2010 at 10:48 AM, Christian Maeder
... Therefore I proposed the following implementation:
intersperse :: a -> [a] -> [a] intersperse s l = case l of [] -> l x : r -> x : if null r then r else s : intersperse s r
One additional benefit about the original proposed implementation is that it applies the static argument transformation: the 'sep' argument is brought into scope of the worker function 'go' which then doesn't need to pass it to each recursion as in the original implementation.
It would be interesting to see if this provides a noticeable performance benefit. Criterion anyone?
Bas
Okay, if you don't then I do :) I've benchmarked a couple of variants: module Interspersing where isgo :: a -> [a] -> [a] isgo _ [] = [] isgo s (x:xs) = x : go xs where go [] = [] go (y:ys) = s : y: go ys isrec :: a -> [a] -> [a] isrec s l = case l of [] -> l (x:r) -> x : if null r then r else (s : isrec s r) ispreplc :: a -> [a] -> [a] ispreplc s (x:xs) = x : concat [[s,y] | y <- xs] ispreplc _ _ = [] isprepM :: a -> [a] -> [a] isprepM s (x:xs) = x : (xs >>= \y -> [s,y]) isprepM _ _ = [] isprepT :: a -> [a] -> [a] isprepT _ [] = [] isprepT s xs = tail $ concatMap (\x -> [s,x]) xs isprepD :: a -> [a] -> [a] isprepD s xs = case concatMap (\x -> [s,x]) xs of (_:ys) -> ys _ -> [] I've also tried with ((s:) . (:[])) instead of the lambdas, but that gives poor code (calls to (++), much slower). The explicit worker/wrapper isgo and the list comprehension version ispreplc give identical core with optimisations (-O and -O2, quite different with -O0), isprepD gets almost the same core with the obvious case after the worker, the other isprepX versions get the same worker loop inside, but with more wrapping. The direct recursion isrec produces different core, but also with a local function for the recursion. Benchmark: intersperse a '.' in a text, then compute the length. Two texts were tried, one ~1.4 million characters, the other ~6.5 million, the text was kept in memory for the entire benchmark . Results: With -O2: unsurprisingly, isgo and ispreplc have nearly identical means in each run, about 33.6 ms for the small benchmark and 153 ms for the large. isprepD is slightly slower, 33.9 ms resp 155 ms. isprepM and isprepT are a little slower again, 34.4 ms resp 157 ms. isrec lags behind, 43.4 ms resp. 193 ms. With -O1, the outcome is almost identical, isprepD is slightly slower than with -O2, now identical to isprepM/T. With -O0, the outcome is drastically different. isgo produces core almost identical to the optimised and the performance is indistinguishable. The top-level recursive isrec suffers camparatively little, the times are 51.3 ms resp. 238.8 ms. isprepD is again slightly faster than isprepM/T, but now much slower than isrec, 72.5 ms / 73.5 ms resp. 340 ms / 344 ms. The list comprehension suffers most and is now the slowest, 85.9 ms resp. 404 ms. Conclusion: isprepD/M/T were only included out of curiosity, they didn't give a reason to choose any of them in the benchmarks. The top-level recursion is clearly slower than the manual worker/wrapper, hence it's not the way to go. Unless a further good implementation is offered, the choice is between the manual worker/wrapper, isgo, and the list comprehension, ispreplc. If there's a realistic chance that base is ever compiled without optimisations, it has to be the explicit worker. If we can safely assume that base is always compiled with optimisations, it's a matter of taste.

Am 17.09.2010 20:17, schrieb Daniel Fischer:
Okay, if you don't then I do :) I've benchmarked a couple of variants:
module Interspersing where
isgo :: a -> [a] -> [a] isgo _ [] = [] isgo s (x:xs) = x : go xs where go [] = [] go (y:ys) = s : y: go ys
isrec :: a -> [a] -> [a] isrec s l = case l of [] -> l (x:r) -> x : if null r then r else (s : isrec s r)
[..]
Results: With -O2: unsurprisingly, isgo and ispreplc have nearly identical means in each run, about 33.6 ms for the small benchmark and 153 ms for the large. isprepD is slightly slower, 33.9 ms resp 155 ms. isprepM and isprepT are a little slower again, 34.4 ms resp 157 ms. isrec lags behind, 43.4 ms resp. 193 ms.
I also did some benchmarking. It made no difference if ones uses a global function "prepend" or the local "go" function. (Also prepend is not faster if written using a worker.) The function isrec seem to be rewritten to a form that does not test "r" twice: isrec2 :: a -> [a] -> [a] isrec2 s l = case l of [] -> l x : r -> myGo s x r myGo :: a -> a -> [a] -> [a] myGo s x r = x : case r of [] -> r y : t -> s : myGo s y t (making myGo local makes it worse) myGo produces a non-empty list. Therefore it is safe to change the recursive call "s : myGo s y t" to "(s :) $! myGo s y t". After this change or the change "(s :) $! isrec s r" in Daniel's isrec function, these function are almost as fast as isgo. Cheers Christian
participants (7)
-
Bas van Dijk
-
Christian Maeder
-
Daniel Fischer
-
Duncan Coutts
-
Felipe Lessa
-
Nicolas Pouillard
-
Stephen Tetley