Let's make takeWhile fuse!

`takeWhile` doesn't do the stream fusion thing. This definition seems to fix that, at least to a great extent. I don't know how to write the necessary rules to make this definition be used in the right places. takeWhile :: forall a . (a -> Bool) -> [a] -> [a] takeWhile p xs = build tw' where tw' :: forall b . (a -> b -> b) -> b -> b tw' kons knil = foldr go knil xs where go x rest | p x = x `kons` rest | otherwise = knil Tests (performed on 7.8.3, but the definition hasn't changed in HEAD): In the trivial main = print $ length $ takeWhile (< 10000000) [(1::Int) .. 20000000] it fused completely, allocating virtually nothing, whereas using Prelude.takeWhile (on 7.8.3) required a whopping 1,360,049,352 In the more complex main = print $ length $ filter (\x -> x `rem` 3 == 0) $ takeWhile (< 10000000) $ filter even [(1::Int) .. 20000000] it fused partially, allocating 266,716,048 bytes. The current GHC definition allocates 746,716,000 bytes in this case (strangely, less than it used for the simple test)! David Feuer

How is the performance when there is no fusion to be had?
Many of the fused operations in GHC.List take care to rewrite back to
simple definitions when fusion doesn't happen. I don't know if there's
still a real motivation behind that, but it's important to check.
On Mon, Jul 21, 2014 at 10:29 PM, David Feuer
`takeWhile` doesn't do the stream fusion thing. This definition seems to fix that, at least to a great extent. I don't know how to write the necessary rules to make this definition be used in the right places.
takeWhile :: forall a . (a -> Bool) -> [a] -> [a] takeWhile p xs = build tw' where tw' :: forall b . (a -> b -> b) -> b -> b tw' kons knil = foldr go knil xs where go x rest | p x = x `kons` rest | otherwise = knil
Tests (performed on 7.8.3, but the definition hasn't changed in HEAD):
In the trivial
main = print $ length $ takeWhile (< 10000000) [(1::Int) .. 20000000]
it fused completely, allocating virtually nothing, whereas using Prelude.takeWhile (on 7.8.3) required a whopping 1,360,049,352
In the more complex
main = print $ length $ filter (\x -> x `rem` 3 == 0) $ takeWhile (< 10000000) $ filter even [(1::Int) .. 20000000]
it fused partially, allocating 266,716,048 bytes.
The current GHC definition allocates 746,716,000 bytes in this case (strangely, less than it used for the simple test)!
David Feuer _______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries
participants (2)
-
Dan Doel
-
David Feuer