
On Friday 14 October 2011, 16:55:14, Bas van Dijk wrote:
On 13 October 2011 20:53, Albert Y. C. Lai
wrote: The number of new cons cells created in due course is Θ(length xs).
I was actually surprised by this because I expected: length(xs++ys) to fuse into one efficient loop which doesn't create cons cells at all.
Unfortunately, I was mistaken since length is defined recursively.
length :: [a] -> Int length l = len l 0# where len :: [a] -> Int# -> Int len [] a# = I# a# len (_:xs) a# = len xs (a# +# 1#)
However, if we would define it as:
length = foldl' (l _ -> l+1) 0
And implemented foldl' using foldr as described here:
http://www.haskell.org/pipermail/libraries/2011-October/016895.html
then fuse = length(xs++ys) where for example xs = replicate 1000000 1 and ys = replicate 5000 (1::Int) would compile to the following totally fused core:
fuse :: Int fuse = case $wxs 1000000 0 of ww_srS { __DEFAULT -> I# ww_srS }
$wxs :: Int# -> Int# -> Int# $wxs = \ (w_srL :: Int#) (ww_srO :: Int#) -> case <=# w_srL 1 of _ { False -> $wxs (-# w_srL 1) (+# ww_srO 1); True -> $wxs1_rs8 5000 (+# ww_srO 1) }
$wxs1_rs8 :: Int# -> Int# -> Int# $wxs1_rs8 = \ (w_srA :: Int#) (ww_srD :: Int#) -> case <=# w_srA 1 of _ { False -> $wxs1_rs8 (-# w_srA 1) (+# ww_srD 1); True -> +# ww_srD 1 }
Yes, that's wonderful, but it's not so wonderful for types more complicated than Int. Integer is evil enough: With fuse = length ([1 .. 50000000] ++ [0 .. 60000000]) Prelude Fuse> fuse Heap exhausted; Current maximum heap size is 1258291200 bytes (1200 MB); use `+RTS -M<size>' to increase it. The current length has no problems: Prelude> length ([1 .. 50000000] ++ [0 .. 60000000]) 110000001 (2.55 secs, 11609850632 bytes) Before foldl('), sum, length can be implemented in terms of foldr to get fusion, a lot has to be done still. Currently you'd get an improvement in some cases for a catastrophic behaviour in many others. Cheers, Daniel