Re: Code and Perf. Data for Prime Finders (was: Genuine Eratosthenes sieve)

As an exercise, trying to understand the beautiful paper Stream Fusion From Lists to Streams to Nothing at All Duncan Coutts, Roman Leshchinskiy and Don Stewart http://www.cse.unsw.edu.au/~dons/papers/CLS07.html http://www.cse.unsw.edu.au/~dons/streams.html I recoded my prime sieve using a pared down version of their Stream datatype; this is the simplest version I could write that achieves a significant speedup. My reaction to their paper was, if streams are better internally than lists, why not code directly in streams? Lists enjoy a serious notational advantage in Haskell, but one could imagine a language where the list notation was reserved for stream semantics. My sieve was spending half its time in "merge", so I made only the changes necessary to convert "merge" to use streams. My streams are infinite, and "merge" can be written to not use Skip, so Step goes away. Even though "nextx" and "nexty" only have one case now, using case statements is significantly faster than using let or where clauses. I'm imagining that I read about this somewhere, but if I did, it didn't sink in until I was tuning this code. I don't know if this is related to fusion optimization, or a general effect. The timings are
[Integer] -O2 1*10^6 2*10^6 3*10^6 4*10^6 5*10^6 --------------------------------------------------------- ONeillPrimes | 3.338 | 7.320 | 11.911 | 18.225 | 21.785 StreamPrimes | 3.867 | 8.405 | 13.656 | 21.542 | 37.640 BayerPrimes | 3.960 | 8.940 | 18.528 | 33.221 | 38.568
Here is the code:
{-# OPTIONS_GHC -fglasgow-exts #-}
module StreamPrimes (primes) where
-- stream code
data Stream a = forall s. Stream (s -> (a,s)) s data AStream a = A a (AStream a) | B (Stream a)
stream :: [a] -> Stream a stream xs = Stream next xs where next [] = undefined next (x:xt) = (x,xt)
astream :: [a] -> AStream a astream [] = undefined astream (x:xt) = A x $ B $ stream xt
merge :: Ord a => Stream a -> Stream a -> Stream a merge (Stream nextx vs) (Stream nexty ws) = Stream next (vt,ws,Left v) where (v,vt) = nextx vs next (xs,ys,Left x) = case nexty ys of (y,yt) -> if x < y then (x,(xs,yt,Right y)) else (y,(xs,yt,Left x)) next (xs,ys,Right y) = case nextx xs of (x,xt) -> if x < y then (x,(xt,ys,Right y)) else (y,(xt,ys,Left x))
mergeA :: Ord a => AStream a -> AStream a -> AStream a mergeA (A x xt) ys = A x (mergeA xt ys) mergeA (B xs) ys = mergeB xs ys
mergeB :: Ord a => Stream a -> AStream a -> AStream a mergeB s@(Stream next xs) ys@(A y yt) = case next xs of (x,xt) -> if x < y then A x (mergeB (Stream next xt) ys) else A y (mergeB s yt) mergeB xs (B ys) = B $ merge xs ys
-- Code for venturi :: Ord a => [[a]] -> [a]
root :: Ord a => AStream a -> [AStream a] -> [a] root (A x xt) yss = x : (root xt yss) root (B xs) (ys:yst) = root (mergeB xs ys) yst root _ _ = undefined
pair :: Ord a => [AStream a] -> [AStream a] pair (x:y:xt) = mergeA x y : (pair xt) pair _ = undefined
group :: Ord a => [AStream a] -> [AStream a] group (x:xt) = x : (group $ pair xt) group _ = undefined
venturi :: Ord a => [[a]] -> [a] venturi (x:xt) = root (astream x) $ group $ map astream xt venturi _ = undefined
-- Code for primes :: Integral a => [a]
diff :: Ord a => [a] -> [a] -> [a] diff xs@(x:xt) ys@(y:yt) = case compare x y of LT -> x : (diff xt ys) EQ -> (diff xt yt) GT -> (diff xs yt) diff _ _ = undefined
trim :: Integral a => a -> [a] -> [a] trim p = let f m x = mod x m /= 0 in filter (f p)
seed :: Integral a => [a] seed = [2,3,5,7,11,13,17]
wheel :: Integral a => [a] wheel = drop 1 [ m*j + k | j <- [0..], k <- ws ] where m = foldr1 (*) seed ws = foldr trim [1..m] seed
multiples :: Integral a => [a] -> [[a]] multiples ws = map fst $ tail $ iterate g ([], ws) where g (_,ps@(p:pt)) = ([ m*p | m <- ps ], trim p pt) g _ = undefined
primes :: Integral a => [a] primes = seed ++ (diff wheel $ venturi $ multiples wheel)
participants (1)
-
Dave Bayer