
Hi, When I refactor my Segmented Seive of Eratosthenes, I'm puzzled by the performance of "foldr". Here is my original code. I realized that "sieveWith"(Integral a => ([a], [a]) -> ([a], [a]), it takes a tuple with sieving primes and prime candidates, and remove all multiplies of sieving primes in candidates, at last return a tuple with blank sieving primes and a pure primes list) in "primesFromTo" is not much readable and can be replaced by "foldr". But when I did, I find it would take about 5 seconds to sieve primes between 999900000 and 1000000000, whereas the original one just takes 1.6 seconds. The "sieveWith" function is the only place I change. I also have tried foldr', which does not much help. Is foldr slow? Or did I make any mistake? Thanks. === Origin (./main < data1.txt 1.45s user 0.11s system 99% cpu 1.562 total) === {-# OPTIONS_GHC -O2 #-} import Data.List import System.IO minus (x:xs) (y:ys) = case (compare x y) of LT -> x : minus xs (y:ys) EQ -> minus xs ys GT -> minus (x:xs) ys minus xs _ = xs intSqrt :: Integral a => a -> a intSqrt = floor . sqrt . fromIntegral primesTo :: Integral a => a -> [a] primesTo x = 2 : sieve [3, 5 .. x] where sieve ns@(n : rs) | n <= intSqrt x = n : sieve (rs `minus` [n * n, n * (n + 2) .. x]) | otherwise = ns sieve [] = [] candidatesBetween :: Integral a => a -> a -> [a] candidatesBetween x y = let x' = if odd x then x else x + 1 in [x', x' + 2 .. y] primesFromTo :: Integral a => a -> a -> [a] primesFromTo x y | x < 2 = primesTo y | otherwise = snd . sieveWith $ (primes, candidates) where primes = tail . primesTo . intSqrt $ y candidates = candidatesBetween x y sieveWith (a : as, bs'@(b : bs)) = sieveWith (as, bs' `minus` multiplies a b) where sieveWith ([], bs) = ([], bs) multiplies a b = let b' = b + ((-b) `mod` a) in [b', b' + 2 * a .. y] primesFromTo' [x, y] = primesFromTo x y main :: IO () main = do count <- fmap read getLine inputLines <- fmap (take count . lines) getContents let answers = map (primesFromTo' . map read . words) inputLines putStr . unlines . map (unlines . map show) $ answers === Origin === === New (./main < data1.txt 4.76s user 0.15s system 99% cpu 4.917 total) === {-# OPTIONS_GHC -O2 #-} import Data.List import System.IO minus (x:xs) (y:ys) = case (compare x y) of LT -> x : minus xs (y:ys) EQ -> minus xs ys GT -> minus (x:xs) ys minus xs _ = xs intSqrt :: Integral a => a -> a intSqrt = floor . sqrt . fromIntegral primesTo :: Integral a => a -> [a] primesTo x = 2 : sieve [3, 5 .. x] where sieve ns@(n : rs) | n <= intSqrt x = n : sieve (rs `minus` [n * n, n * (n + 2) .. x]) | otherwise = ns sieve [] = [] candidatesBetween :: Integral a => a -> a -> [a] candidatesBetween x y = let x' = if odd x then x else x + 1 in [x', x' + 2 .. y] primesFromTo :: Integral a => a -> a -> [a] primesFromTo x y | x < 2 = primesTo y | otherwise = sieveWith primes candidates where primes = tail . primesTo . intSqrt $ y candidates = candidatesBetween x y -- sieve a list of candidates with sieving primes -- foldr version: ./main < data1.txt 4.76s user 0.15s system 99% cpu 4.917 total sieveWith ps'@(p : ps) cs'@(c : cs) = foldr (\a b -> b `minus` (multiplies a c)) cs' ps' sieveWith [] cs = cs multiplies a b = let b' = b + ((-b) `mod` a) in [b', b' + 2 * a .. y] primesFromTo' [x, y] = primesFromTo x y main :: IO () main = do count <- fmap read getLine inputLines <- fmap (take count . lines) getContents let answers = map (primesFromTo' . map read . words) inputLines putStr . unlines . map (unlines . map show) $ answers === New === Best regards, Zhi-Qiang Lei zhiqiang.lei@gmail.com

On Thu, Feb 2, 2012 at 5:16 PM, Zhi-Qiang Lei
Hi,
When I refactor my Segmented Seive of Eratosthenes, I'm puzzled by the performance of "foldr". Here is my original code. I realized that "sieveWith"(Integral a => ([a], [a]) -> ([a], [a]), it takes a tuple with sieving primes and prime candidates, and remove all multiplies of sieving primes in candidates, at last return a tuple with blank sieving primes and a pure primes list) in "primesFromTo" is not much readable and can be replaced by "foldr".
let go a b = b `minus` (multiplies a c) foldr go cs' (p:p':ps) ==> foldr go cs' (p':ps) `minus` multiplies p c ==> (minus needs his first argument to start removing stuff) (foldr go cs' ps `minus` multiplies p' c) `minus` multiplies p c And so on and so forth... In other words, contrary to your first version, this function must develop the entire list before making its first suppression... Your version rather correspond to a foldl (be careful of strictness, using foldl it's pretty easy to get big thunks). Foldr is very useful for functions that are lazy in their second argument like (||) , (&&), (:) or others but if the function is strict in its second argument like yours (strict in b)... -- Jedaï

Thank you very much. You are right. foldl performs much better here. I thought foldr always has a performance advantage due to its tail recursion. On Feb 3, 2012, at 5:36 AM, Chaddaï Fouché wrote:
let go a b = b `minus` (multiplies a c) foldr go cs' (p:p':ps) ==> foldr go cs' (p':ps) `minus` multiplies p c ==> (minus needs his first argument to start removing stuff) (foldr go cs' ps `minus` multiplies p' c) `minus` multiplies p c And so on and so forth... In other words, contrary to your first version, this function must develop the entire list before making its first suppression...
Your version rather correspond to a foldl (be careful of strictness, using foldl it's pretty easy to get big thunks).
Foldr is very useful for functions that are lazy in their second argument like (||) , (&&), (:) or others but if the function is strict in its second argument like yours (strict in b)...
-- Jedaï
Best regards, Zhi-Qiang Lei zhiqiang.lei@gmail.com

On Sunday 05 February 2012, 11:33:35, Zhi-Qiang Lei wrote:
Thank you very much. You are right. foldl performs much better here. I thought foldr always has a performance advantage due to its tail recursion.
Firstly, tail recursion isn't automatically better in Haskell. Due to laziness, the evaluation sequence is different from what it'd be in eager languages, and not necessarily the order in which the steps are written in the code. If the recursive call to a (non tail-recursive) function is in a lazy argument position of the result, a lot of evaluation can take place before the recursive call is demanded. A tail recursive function on the other hand cannot deliver any part of the result before the recursion is completed. Secondly, foldr is not tail recursive, foldl is: foldr f z (x:xs) = x `f` (foldr f z xs) foldl f z (x:xs) = foldl f (f z x) xs Thus if 'f' is lazy in its second argument, like for example (++), the result can be partially delivered before the recursive call, in x ++ (foldr (++) [] xs) the whole list x has to be evaluated before the evaluation of foldr takes its next step. But if 'f' is strict in its second argument, like for example (+) on Int, the recursive call has to be evaluated before the top level call to 'f' can be evaluated, so a stack of function calls has to be built before any evaluation of an 'f'-call can start, and that stack has to be unwound during the evaluation of the 'f'-calls from the innermost to the outermost. That gives you two traversals: build stack, unwind. In such cases, tail recursion (a left fold) is better because then the call to 'f' can be evaluated before the recursive call - but it has to be forced, or it will generate a thunk ((...(z `f` x1) ... ) `f` xn-1) `f` xn (and when that thunk is finally evaluated it nuilds a stack of 'f'-calls, which is then unwound, so you get three traversals: build thunk, thunk -> stack, unwind - unless 'f' is lazy in its first argument, like flip (++), then the thunk may be consumed without building the stack).
On Feb 3, 2012, at 5:36 AM, Chaddaï Fouché wrote:
let go a b = b `minus` (multiplies a c) foldr go cs' (p:p':ps) ==> foldr go cs' (p':ps) `minus` multiplies p c ==> (minus needs his first argument to start removing stuff) (foldr go cs' ps `minus` multiplies p' c) `minus` multiplies p c And so on and so forth... In other words, contrary to your first version, this function must develop the entire list before making its first suppression...
Your version rather correspond to a foldl (be careful of strictness, using foldl it's pretty easy to get big thunks).
Foldr is very useful for functions that are lazy in their second argument like (||) , (&&), (:) or others but if the function is strict in its second argument like yours (strict in b)...
Best regards, Zhi-Qiang Lei zhiqiang.lei@gmail.com

Appreciate. That's favorable to understand recursion in Haskell. On Feb 5, 2012, at 9:25 PM, Daniel Fischer wrote:
On Sunday 05 February 2012, 11:33:35, Zhi-Qiang Lei wrote:
Thank you very much. You are right. foldl performs much better here. I thought foldr always has a performance advantage due to its tail recursion.
Firstly, tail recursion isn't automatically better in Haskell. Due to laziness, the evaluation sequence is different from what it'd be in eager languages, and not necessarily the order in which the steps are written in the code.
If the recursive call to a (non tail-recursive) function is in a lazy argument position of the result, a lot of evaluation can take place before the recursive call is demanded.
A tail recursive function on the other hand cannot deliver any part of the result before the recursion is completed.
Secondly, foldr is not tail recursive, foldl is:
foldr f z (x:xs) = x `f` (foldr f z xs)
foldl f z (x:xs) = foldl f (f z x) xs
Thus if 'f' is lazy in its second argument, like for example (++), the result can be partially delivered before the recursive call, in
x ++ (foldr (++) [] xs)
the whole list x has to be evaluated before the evaluation of foldr takes its next step.
But if 'f' is strict in its second argument, like for example (+) on Int, the recursive call has to be evaluated before the top level call to 'f' can be evaluated, so a stack of function calls has to be built before any evaluation of an 'f'-call can start, and that stack has to be unwound during the evaluation of the 'f'-calls from the innermost to the outermost. That gives you two traversals: build stack, unwind.
In such cases, tail recursion (a left fold) is better because then the call to 'f' can be evaluated before the recursive call - but it has to be forced, or it will generate a thunk ((...(z `f` x1) ... ) `f` xn-1) `f` xn (and when that thunk is finally evaluated it nuilds a stack of 'f'-calls, which is then unwound, so you get three traversals: build thunk, thunk -> stack, unwind - unless 'f' is lazy in its first argument, like flip (++), then the thunk may be consumed without building the stack).
Best regards, Zhi-Qiang Lei zhiqiang.lei@gmail.com
participants (3)
-
Chaddaï Fouché
-
Daniel Fischer
-
Zhi-Qiang Lei