List concatenation being O(n)

Hello, I recently re-read a bit of Real World Haskell, and I saw something that puzzles me now that I have a better understanding of the language. It concerns list concatenation being costly, and the need to use another type like DList.

[] ++ ys = ys
(x:xs) ++ ys = x : (xs ++ ys)

To me, we have here something that would be costly in a strict language, but that here, thanks to guarded recursion ((:) being non-strict), doesn't evaluate the first list until it is needed. How come RWH says "We can see that the cost of creating a new list depends on the length of the initial list", since nothing is done unless we ask for our list to be evaluated (which, admittedly, we will end up asking for)?
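To make the lazy behaviour concrete, here is a small sketch (my own, not from the thread): taking the head of xs ++ ys forces only the first (:) cell, no matter how long xs is.

```haskell
main :: IO ()
main = do
  let xs = [1 ..] :: [Int]  -- infinite list: full evaluation is impossible
      ys = [0]
  -- (++) only unrolls as far as we demand; head demands a single cell.
  print (head (xs ++ ys))   -- prints 1
```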

On 11-10-12 06:50 PM, Yves Parès wrote:
[] ++ ys = ys
(x:xs) ++ ys = x : (xs ++ ys)
To me, we have here something that would be costful in a strict language, but that here, thanks to guarded recursion ((:) being non-strict), doesn't evaluate the first list until it is needed.
So let us need the whole list. Let us print length (xs ++ ys) and count costs.

The amount of time is Θ(length xs + length ys). You then say that's the same cost as printing length xs and printing length ys. I agree.

The number of new cons cells created in due course is Θ(length xs). These cons cells would not have been created if we printed length xs and printed length ys separately. (Sure, those new cons cells may be deallocated promptly. Or maybe not.)

The cost of (++) is pretty affordable when all you do is xs ++ ys. The expensive part happens when you write

( ... ((xs_0 ++ xs_1) ++ xs_2) ... ) ++ xs_(n-1)

Printing the length of that takes time quadratic in the printed answer.
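Albert's distinction can be sketched with right- versus left-nested concatenation (a toy example of my own; the asymptotic claim is his):

```haskell
-- Right-nested: xs0 ++ (xs1 ++ (xs2 ++ ...)).
-- Each element passes through (++) once, so forcing the result
-- allocates Θ(total length) cons cells.
rightNested :: [[a]] -> [a]
rightNested = foldr (++) []

-- Left-nested: ((xs0 ++ xs1) ++ xs2) ++ ... .
-- The front of the list is re-copied by every (++), so forcing the
-- result allocates Θ(n^2) cons cells for n chunks.
leftNested :: [[a]] -> [a]
leftNested = foldl (++) []

main :: IO ()
main = do
  let chunks = replicate 1000 [1 .. 10 :: Int]
  print (length (rightNested chunks))  -- 10000, linear work
  print (length (leftNested chunks))   -- 10000 too, but quadratic work
```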

Okay, thanks. I came to wonder about that because I was implementing a Show instance (for a game grid, like Othello), and string manipulation usually comes with heavy usage of (++). I'm just using a strict left fold on the game grid (e.g. Vector (Maybe Pawn)) with a combination of show (converting the pawns to Strings) and (++). Is there a more efficient way to do this? (For instance, using Text and converting it back to String?)
The number of new cons cells created in due course is Θ(length xs). These cons cells would not have been created if we printed length xs and printed length ys separately. Okay, so the major problem comes from memory management.
The expensive part happens when you write ( ... ((xs_0 ++ xs_1) ++ xs_2) ... ) ++ xs_(n-1)
Okay, because it constantly recreates new cons cells.
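For the Show-instance use case above, one standard answer is to build the string with ShowS values (functions of type String -> String) instead of (++). A sketch, with a hypothetical Pawn type, and a plain list standing in for the Vector (Data.Vector's foldr works the same way):

```haskell
data Pawn = Black | White deriving Show  -- hypothetical pawn type

-- Each cell is rendered as a ShowS (= String -> String); composing
-- ShowS values is O(1), and applying the result to "" produces the
-- final String in one linear pass, with no left-nested (++).
showGrid :: [Maybe Pawn] -> String
showGrid grid = foldr (\cell acc -> cellS cell . acc) id grid ""
  where
    cellS (Just p) = shows p . showChar ' '
    cellS Nothing  = showString ". "

main :: IO ()
main = putStrLn (showGrid [Just Black, Nothing, Just White])
```

This is exactly the reason showsPrec has the type it does.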
2011/10/13 Albert Y. C. Lai

On Thu, Oct 13, 2011 at 3:32 PM, Yves Parès wrote:
The number of new cons cells created in due course is Θ(length xs). These cons cells would not have been created if we printed length xs and printed length ys separately. Okay, so the major problem comes from memory management.
Well, it comes from extra allocations. Since most values are immutable, most Haskell work is in the form of allocations. - Jake

On 13 October 2011 20:53, Albert Y. C. Lai
The number of new cons cells created in due course is Θ(length xs).
I was actually surprised by this, because I expected length (xs ++ ys) to fuse into one efficient loop which doesn't create cons cells at all. Unfortunately, I was mistaken, since length is defined recursively:

length :: [a] -> Int
length l = len l 0#
  where
    len :: [a] -> Int# -> Int
    len []     a# = I# a#
    len (_:xs) a# = len xs (a# +# 1#)

However, if we would define it as:

length = foldl' (\l _ -> l + 1) 0

and implemented foldl' using foldr as described here:

http://www.haskell.org/pipermail/libraries/2011-October/016895.html

then fuse = length (xs ++ ys), where for example xs = replicate 1000000 1 and ys = replicate 5000 (1 :: Int), would compile to the following totally fused core:

fuse :: Int
fuse = case $wxs 1000000 0 of ww_srS { __DEFAULT -> I# ww_srS }

$wxs :: Int# -> Int# -> Int#
$wxs = \ (w_srL :: Int#) (ww_srO :: Int#) ->
  case <=# w_srL 1 of _ {
    False -> $wxs (-# w_srL 1) (+# ww_srO 1);
    True  -> $wxs1_rs8 5000 (+# ww_srO 1)
  }

$wxs1_rs8 :: Int# -> Int# -> Int#
$wxs1_rs8 = \ (w_srA :: Int#) (ww_srD :: Int#) ->
  case <=# w_srA 1 of _ {
    False -> $wxs1_rs8 (-# w_srA 1) (+# ww_srD 1);
    True  -> +# ww_srD 1
  }

Bas
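The idea behind the linked proposal can be sketched as follows (my own simplified version; the actual patch differs in details such as inlining pragmas and rewrite rules):

```haskell
-- foldl' expressed through foldr: foldr builds a chain of functions,
-- each of which forces the accumulator before passing it on. Because
-- it goes through foldr, it becomes eligible for foldr/build fusion
-- (given the right rewrite rules), e.g. with (++) and replicate.
foldl'ViaFoldr :: (b -> a -> b) -> b -> [a] -> b
foldl'ViaFoldr f z0 xs = foldr step id xs z0
  where
    step x k z = z `seq` k (f z x)

-- length in the style Bas suggests:
lengthL :: [a] -> Int
lengthL = foldl'ViaFoldr (\n _ -> n + 1) 0

main :: IO ()
main = print (lengthL (replicate 7 'x' ++ replicate 5 'y'))  -- prints 12
```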

Wow, I don't read Core, but I get your point.
It's indeed odd that foldl' doesn't use foldr (and that sum doesn't use foldl' instead of foldl, as (+) is strict (*)), if foldr permits loop fusion.

(*) Anyway, is there a case where foldl is preferable to foldl'? It has never happened to me; I always use right folds when I want lazy evaluation, to benefit from guarded recursion.
2011/10/14 Bas van Dijk

2011/10/14 Yves Parès
(*) Anyway, is there a case where foldl is preferable to foldl'?
To quote http://www.haskell.org/haskellwiki/Foldr_Foldl_Foldl' :

"Usually the choice is between foldr and foldl', since foldl and foldl' are the same except for their strictness properties, so if both return a result, it must be the same. foldl' is the more efficient way to arrive at that result because it doesn't build a huge thunk. However, if the combining function is lazy in its first argument, foldl may happily return a result where foldl' hits an exception."

The article also contains an example.

Bas

On Friday 14 October 2011, 17:10:00, Yves Parès wrote:
Wow, I don't read Core, but I get your point. It's indeed odd that foldl' doesn't use foldr (and that sum doesn't use foldl' instead of foldl, as (+) is strict (*)), if foldr permits loop fusion.
No, it's not odd. The fusion technology isn't yet capable of handling everything you throw at it, and you get catastrophically bad results when it fails. With the current implementation, you forgo the (significant) wins at the points where fusion would have succeeded, but you get good enough worst-case behaviour everywhere.
(*) Anyway, is there a case where foldl is preferable to foldl'? It has never happened to me; I always use right folds when I want lazy evaluation, to benefit from guarded recursion.
Bas quoted and linked the Foldr Foldl Foldl' page and referred to the example. As the author of that example, I know that it is an artificial example; I thought it up to answer the question "when would foldl be better than foldl'?". I have never come across such a situation in real life.
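For reference, the example on that wiki page is, roughly, the following (reconstructed here; see the page for the authoritative version):

```haskell
import Data.List (foldl')

-- (?) pattern-matches on its second argument; whenever that argument
-- is 0, the first argument (the accumulator) is never inspected.
(?) :: Int -> Int -> Int
_ ? 0 = 0
x ? y = x * y

list :: [Int]
list = [2, 3, undefined, 5, 0]

okay :: Int
okay = foldl (?) 1 list
-- foldl builds the thunk ((((1 ? 2) ? 3) ? undefined) ? 5) ? 0;
-- the outermost (? 0) returns 0 without ever touching undefined.

boom :: Int
boom = foldl' (?) 1 list
-- foldl' forces each intermediate accumulator, so it eventually
-- evaluates (6 ? undefined), whose pattern match on the second
-- argument hits undefined and crashes.

main :: IO ()
main = print okay  -- prints 0; printing boom instead would crash
```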

On Friday 14 October 2011, 16:55:14, Bas van Dijk wrote:
Yes, that's wonderful, but it's not so wonderful for types more complicated than Int. Integer is evil enough. With

fuse = length ([1 .. 50000000] ++ [0 .. 60000000])

we get:

Prelude Fuse> fuse
Heap exhausted;
Current maximum heap size is 1258291200 bytes (1200 MB);
use `+RTS -M<size>' to increase it.

The current length has no problems:

Prelude> length ([1 .. 50000000] ++ [0 .. 60000000])
110000001
(2.55 secs, 11609850632 bytes)

Before foldl('), sum and length can be implemented in terms of foldr to get fusion, a lot still has to be done. Currently you'd get an improvement in some cases in exchange for catastrophic behaviour in many others.

Cheers,
Daniel

Yves Parès wrote:
I recently re-read a bit of Real World Haskell, and I saw something that puzzles me now that I have a better understanding of the language. It concerns list concatenation being costly, and the need to use another type like DList.

[] ++ ys = ys
(x:xs) ++ ys = x : (xs ++ ys)

To me, we have here something that would be costly in a strict language, but that here, thanks to guarded recursion ((:) being non-strict), doesn't evaluate the first list until it is needed.

How come RWH says "We can see that the cost of creating a new list depends on the length of the initial list", since nothing is done unless we ask for our list to be evaluated (which we will end up asking for)?
Well, the run-time needed to evaluate an expression depends on how much of the expression you evaluate. If you don't evaluate anything, then you don't need any time.

The cost of list concatenation refers to the following fact: assuming that xs and ys have been evaluated to normal form already, the cost of evaluating xs ++ ys to normal form is Θ(length xs) reduction steps. For difference lists, the latter cost is only Θ(1).

The best way to reason about "demand patterns" other than normal form is Okasaki's method of attributing a debt to each constructor. See also

http://apfelmus.nfshost.com/articles/debit-method.html

Best regards,
Heinrich Apfelmus

--
http://apfelmus.nfshost.com
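Heinrich's difference-list remark can be sketched directly (a minimal version of my own; the dlist package provides the real thing):

```haskell
-- A difference list represents a list by the function that prepends it.
newtype DList a = DList ([a] -> [a])

fromList :: [a] -> DList a
fromList xs = DList (xs ++)

toList :: DList a -> [a]
toList (DList f) = f []

-- Append is function composition: Θ(1), independent of either length.
append :: DList a -> DList a -> DList a
append (DList f) (DList g) = DList (f . g)

main :: IO ()
main = do
  -- Left-nested appends are now harmless; toList pays the
  -- Θ(total length) conversion cost exactly once, at the end.
  let pieces = map fromList [[1, 2], [3], [4, 5 :: Int]]
  print (toList (foldl append (fromList []) pieces))  -- prints [1,2,3,4,5]
```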
participants (6):
- Albert Y. C. Lai
- Bas van Dijk
- Daniel Fischer
- Heinrich Apfelmus
- Jake McArthur
- Yves Parès