Re: Genuine Eratosthenes sieve [Was: Optimization fun]

Sorry, I'm a little late to this thread, but the topic of

sieve [] = []
sieve (x:xs) = x : sieve [y | y <- xs, y `mod` x /= 0]

(as posted by Rafael Almeida) is somewhat dear to my heart, as I wrote a paper about it last summer and sent it in to JFP. I'm still waiting for a reply, though.
Let's go back to the beginning with the classic complaint and the excuses...

Creighton Hogg wrote:
So a friend and I were thinking about making code faster in Haskell, and I was wondering if there was a way to improve the following method of generating the list of all prime numbers. It takes about 13 seconds to run, while my friend's C version took 0.1 seconds.
Here come the excuses, like this one from apfelmus,
While Haskell makes it possible to express very complicated algorithms in simple and elegant ways, you have to expect to pay a constant factor (roughly 2x-10x) when competing against the same algorithm in low-level C.

and this one from Nicolas Frisby,

I have yet to need my Haskell to perform well
Matthew Brecknell came up with something much faster, namely
primes :: [Int]
primes = 2 : filter isPrime [3,5..]
  where
    f x p r   = x < p*p || mod x p /= 0 && r
    isPrime x = foldr (f x) True primes
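As an aside, the reason this foldr can consume the infinite primes list is laziness: once x < p*p holds, (||) never forces its right operand, so the rest of the fold is simply dropped. A minimal illustration of that kind of early exit (my own toy example, not from the thread):

```haskell
-- foldr stops forcing the rest of the list as soon as the combining
-- function ignores its second argument: (&&) drops r once x < 10
-- fails, so this terminates even though the input list is infinite.
boundedBelowTen :: Bool
boundedBelowTen = foldr (\x r -> x < 10 && r) True ([1..5] ++ repeat 100)
-- boundedBelowTen evaluates to False
```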
FWIW, another way to write this code (without needing to think about how "fold bails early") is

primes = 2 : [x | x <- [3,5..], all (\p -> x `mod` p > 0) (factorsToTry x)]
  where factorsToTry x = takeWhile (\p -> p*p <= x) primes

Both of these algorithms best the "sieve" we began with and run quickly, but as you can see (possibly more clearly from my rephrasing), this algorithm is not actually the Sieve of Eratosthenes -- it's actually a classic naive primes algorithm which checks a number for primality by trying to divide it by every prime up to its square root. But that's okay, because our initial algorithm ISN'T THE REAL SIEVE EITHER. Markus Fischmann hits the nail on the head when he says
The characteristic of a sieve is that it uses the already-found primes to generate a list of non-primes, which is then removed from a list of candidates for primeness.
But then we get distracted by a discussion about avoiding division. It's true that the real sieve avoids division, but it is NOT true that every algorithm that avoids division is the sieve. The thread ends with this algorithm from Yitzchak Gale:
-- Delete the elements of the first list from the second list,
-- where both lists are assumed sorted and without repetition.
deleteOrd :: Ord a => [a] -> [a] -> [a]
deleteOrd xs@(x:xs') ys@(y:ys')
  | x > y     = y : deleteOrd xs ys'
  | x < y     = deleteOrd xs' ys
  | otherwise = deleteOrd xs' ys'
deleteOrd _ ys = ys
sieve (x:xs) = x : sieve (deleteOrd [x+x,x+x+x..] xs)
sieve _ = []
primes = sieve [2..]
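For what it's worth, the pieces above assemble directly into a runnable program; here they are transcribed into one self-contained snippet (the type signatures on sieve and primes are my own additions), which can be checked against the first few primes:

```haskell
-- Yitzchak Gale's deleteOrd sieve, transcribed verbatim; only the
-- type signatures on sieve and primes are added.

-- Delete the elements of the first list from the second list,
-- where both lists are assumed sorted and without repetition.
deleteOrd :: Ord a => [a] -> [a] -> [a]
deleteOrd xs@(x:xs') ys@(y:ys')
  | x > y     = y : deleteOrd xs ys'
  | x < y     = deleteOrd xs' ys
  | otherwise = deleteOrd xs' ys'
deleteOrd _ ys = ys

-- Cross off every multiple of each newly found prime x, starting at 2x.
sieve :: Integral a => [a] -> [a]
sieve (x:xs) = x : sieve (deleteOrd [x+x,x+x+x..] xs)
sieve _      = []

primes :: [Integer]
primes = sieve [2..]
```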
Which seems reasonable, until you realize that every prime p we come up with is still "considered" by k deleteOrd filters, where k is the number of primes that preceded it.

So, let's recap: the original algorithm is beautiful and simple, but it is NOT the actual Sieve of Eratosthenes -- NOT because it uses modulus, but because fundamentally, at the highest level, it is a different algorithm. At the risk of beating a dead horse, let's see why it's not the real sieve.

What makes the sieve an efficient algorithm are the details of *what* gets "crossed off", *when*, and *how many times*. Suppose that we are finding the first 100 primes (i.e., 2 through 541), and have just discovered that 17 is prime. We will begin crossing off at 289 (i.e., 17 * 17) and cross off the multiples 289, 306, 323, ..., 510, 527, making fifteen crossings off in total. Notice that we cross off 306 (17 * 18), even though it is a multiple of both 2 and 3 and has thus already been crossed off twice. (The starting point of 17 * 17 is a pleasing, but actually *minor*, optimization for the *genuine* sieve.)

The algorithm is efficient because each composite number, c, gets crossed off f times, where f is the number of unique prime factors of c less than sqrt(c). The average value of f increases slowly, being less than 3 for the first 10^12 composites, and less than 4 for the first 10^34.

None of the algorithms we've discussed corresponds to the above algorithm. It is not merely that they don't perform "optimizations", such as starting at the square of the prime, or that some of them use a divisibility check rather than a simple increment. Rather, at a fundamental level, they all work differently than the real sieve. Following our earlier example, after finding that 17 is prime, the "phony" algorithm will check to see if 19 is divisible by 17 (in the case of Yitzchak's algorithm, divisibility is checked by comparing against 17*2), followed by 23, 29, 31, ..., 523, 527, checking a total of ninety-nine numbers for divisibility by 17. In fact, even if it did (somehow) begin at 289, it would examine all forty-five numbers that are not multiples of the primes prior to 17 (289, 293, 307, ..., 523, 527). It is also worth realizing that the set of numbers examined by the phony algorithm is not a superset of the ones examined by the real sieve (as we saw, the real algorithm crossed off 306, which the phony algorithm skips). In general, the speed of the phony algorithm depends on the number of primes that are *not* factors of each composite, whereas the speed of the real algorithm depends on the number of (unique) primes that *are*.

Hopefully at this point I've piqued your interest. Maybe you're wondering what our original sieve algorithm is, if it isn't the real sieve and the mod issue is just a red herring...? Maybe you're wondering what the real sieve looks like, coded in Haskell. Or maybe you're still not convinced about any of it. If so, I invite you to read a draft of my JFP paper, available at:

http://www.cs.hmc.edu/~oneill/papers/Sieve-JFP.pdf

Best Regards,

Melissa.
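The message above teases what the genuine sieve looks like in Haskell. The paper itself uses a lazy priority queue; as a rough stand-in (my own sketch, not the paper's code), here is an incremental version using Data.Map, where the map sends each upcoming composite to the primes that will cross it off:

```haskell
import qualified Data.Map as Map

-- A sketch (mine, not the paper's code) of an incremental genuine
-- sieve. The Map sends each upcoming composite to the primes that
-- will cross it off; only odd candidates are considered, and each
-- prime p is first scheduled at p*p, then stepped by 2*p to stay odd.
primes :: [Integer]
primes = 2 : sieve [3,5..] Map.empty
  where
    sieve (x:xs) table = case Map.lookup x table of
        -- x was never scheduled for crossing off: it is prime
        Nothing -> x : sieve xs (Map.insert (x*x) [x] table)
        -- x is composite: reschedule each prime at its next odd multiple
        Just ps -> sieve xs (foldr reschedule (Map.delete x table) ps)
      where
        reschedule p = Map.insertWith (++) (x + 2*p) [p]
```

Each composite is visited once per unique prime factor scheduled on it, matching the crossing-off account above; the Map operations add a logarithmic bookkeeping factor per step.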

You took my quote entirely out of context. In particular, you omitted
the rest of the sentence "but I'm sure that day will come." My
statement was no excuse by any stretch of the imagination--I was
initially confused when I saw it in your post and then a bit offended.
The original intent of my statement was as a preface to the fact that
I enjoyed reading the discussion and appreciated everyone's
involvement. On that note, thanks for participating.
I recognize the particular "excuse" you intended my quote to
represent, but I think it was inappropriate to chop my quote to suit
your needs--it makes me seem short-sighted in the eyes of the
community when, ironically enough, it was actually part of a
prediction that I will need performance from my Haskell someday.
I do feel a little defamed, but I don't want to go off topic on the
list, especially for personal issues. Just try to consider the effect
it will have on others when looking for quotes in the future.
Thanks and no worries,
Nick
On 2/19/07, Melissa O'Neill wrote:
[...]
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Melissa O'Neill wrote:
FWIW, another way to write this code (without needing to think about how "fold bails early") is
primes = 2 : [x | x <- [3,5..], all (\p -> x `mod` p > 0) (factorsToTry x)]
  where factorsToTry x = takeWhile (\p -> p*p <= x) primes
Well, you certainly thought of the fact that 'all' bails early :)
[...] But that's okay, because our initial algorithm ISN'T THE REAL SIEVE EITHER. [...] The thread ends with this algorithm from Yitzchak Gale:
[...] sieve (x:xs) = x : sieve (deleteOrd [x+x,x+x+x..] xs) sieve _ = []
Which seems reasonable, until you realize that every prime p we come up with is still "considered" by k deleteOrd filters, where k is the number of primes that preceeded it. [...] What makes the sieve an efficient algorithm are the details of *what* gets "crossed off", *when*, and *how many times*.
Ah, thank you for that fresh slap in the face. Somehow, the name 'sieve' clouded the fact that the point of a sieve is how to represent it as a data structure in order to avoid crossing the same number off too many times. Iterating 'deleteOrd' organizes the "crossing off" in linear time, whereas logarithmic time is clearly conceivable. Though I think that the `mod`-algorithm counts as a sieve, it just does a bad job at organizing the "crossing off".
The algorithm is efficient because each composite number, c, gets crossed off f times, where f is the number of unique factors of c less than sqrt(c). The average value for f increases slowly, being less than 3 for the first 10^12 composites, and less than 4 for the first 10^34.
This analysis misses the fact that managing the "cross-off table" invariably needs log(#primes <= c) ~ log(c) extra time for every c. I'm not quite sure, but I think that this has to be multiplied (not added) by f. The other algorithms don't have an extra factor, although f is clearly larger for them.

Regards,
apfelmus

apfelmus wrote:
I think that the `mod`-algorithm counts as sieve, it just does a bad job at organizing the "crossing off".
I'd say that it might count as "a sieve", but *algorithmically* it is not "The Sieve of Eratosthenes". If you abstract away the algorithmic details to a mathematical essence, you can argue they're the same, but I'd argue that the algorithmic details are what make people want to use the Sieve of Eratosthenes in the first place. Algorithmically, when you say "remove all the multiples of 17", it really matters *how* you do it (cf. an abstract mathematical perspective, where we might not care). I'd argue that Eratosthenes did say how to do it, and doing it a different way is misleading enough that we should NOT call the resulting code "The Sieve of Eratosthenes".

Yitzchak Gale wrote:
We did not cross out any number more than once. But we did consider each multiple of every prime, moving on if we found it already crossed off. My algorithm does exactly the same.
Actually it doesn't. Every composite gets handled by a stack of deleteOrd filters, each one trying to remove multiples of one prime. Whether you write

sieve (x:xs) = x : sieve (deleteOrd [x+x,x+x+x..] xs)

or

sieve (x:xs) = x : sieve (filter (\c -> c `mod` x > 0) xs)

you're essentially doing the same amount of work. Both ``deleteOrd [x+x,x+x+x..]'' and ``filter (\c -> c `mod` x > 0)'' do exactly the same job -- your version goes faster because it avoids division, but that's all.

As I showed in my earlier post, with the worked example of removing all the multiples of 17:

- Eratosthenes's sieve never says the equivalent of, "Hmm, I wonder if 19 is a multiple of 17" -- but both the basic "sieve" we began with and the deleteOrd version do.

- Eratosthenes's sieve crosses-off/looks-at 459 (i.e., 17 * 27), even though we crossed it off already when we crossed off multiples of 3 -- whether you call that crossing it off a second time or merely stepping over it, it still needs to alight on 459 to get to 493 (i.e., 17 * 29), which hasn't been crossed off yet.

The first way of crossing off multiples of 17 is inefficient -- it checks each composite against all the primes up to its smallest factor. That is exactly what the classic naive primes algorithm does. The second method is efficient -- it touches each composite once for each unique factor it has. Of course, the first is easy to implement as a one-liner and the second isn't, but (sadly) that doesn't make the first way right and the second way wrong.

Yitzchak Gale also wrote:
It is true that I have never read Eratosthenes in the original, but my deleteOrd algorithm was meant to reproduce faithfully the sieve algorithm as I was taught it in grade school.
I don't know which version you learned in grade school. If you learned it as a mathematical abstraction (e.g., as a way to define what the primes are), you have faithfully reproduced that abstraction. If you learned it as a technique for people to use to efficiently produce lists of primes, you have *not* reproduced it, because I'll bet in grade school you never checked 19 for divisibility by 17. The bait and switch occurs when people remember only the mathematical abstraction and then look for elegant ways to recreate (only) that abstraction. If you do that in a way that is fundamentally different at an algorithmic level, it shouldn't come as a surprise when that implementation doesn't run as efficiently as the original algorithm. You also shouldn't be surprised when someone complains that you haven't really implemented that original algorithm.
A few days ago, I taught the sieve to my 6 year old daughter, the same way I learned it. She loved it!
Unless you broke out the set notation with your six-year-old, you probably explained it operationally. Thus you explained the algorithm, not a mathematical abstraction drawn from the algorithm. As such, I'm sure you taught her the real thing. And if you want to teach her cool Haskell one-liners, the "sieve" example that began this thread is delightfully brief. But don't expect it to run quickly. And you'd be doing her a disservice and propagating confusion if you told her that it's algorithmically the same as the genuine Sieve of Eratosthenes.

Melissa.
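One way to make the efficiency distinction in the message above concrete is to count operations. The sketch below is entirely my own (the names and bounds are arbitrary illustrations, not from the thread); it tallies the crossings-off a genuine sieve performs up to a bound n against the divisibility checks the trial-division approach performs:

```haskell
-- Reference list of primes, via trial division.
primes :: [Int]
primes = 2 : [x | x <- [3,5..], all (\p -> x `mod` p > 0)
                                    (takeWhile (\p -> p*p <= x) primes)]

-- Crossings-off made by the genuine sieve up to n: for each prime p
-- with p*p <= n, one crossing per multiple of p in [p*p .. n].
genuineCrossings :: Int -> Int
genuineCrossings n =
  sum [ n `div` p - (p - 1) | p <- takeWhile (\p -> p*p <= n) primes ]

-- Divisibility checks made by the trial-division algorithm up to n:
-- each candidate x is tested against successive primes until one
-- divides it, or until the primes exceed sqrt x.
naiveChecks :: Int -> Int
naiveChecks n = sum [ go x (takeWhile (\p -> p*p <= x) primes) | x <- [3..n] ]
  where
    go _ []     = 0
    go x (p:ps) | x `mod` p == 0 = 1
                | otherwise      = 1 + go x ps
```

Already at n = 1000 the trial-division count is noticeably larger than the crossing-off count, and the gap widens as n grows.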

Hi Melissa, You wrote:
- Eratosthenes's sieve never says the equivalent of, "Hmm, I wonder if 19 is a multiple of 17" -- but both the basic "sieve" we began with and the deleteOrd version do
So then maybe I taught my daughter the wrong thing. When she does 17, she moves ahead one number at a time. When she gets to a multiple of 17, she crosses it off. So yes, she does consider whether 19 is a multiple of 17. I tried to get her to use my way of deciding what is the next multiple. Then she could do something a little more like what you are saying by using the layout of the sieve to jump ahead more quickly to the next multiple. But she insists on using apfelmus' way.
- Eratosthenes's sieve crosses-off/looks-at 459 (i.e., 17 * 27), even though we crossed it off already when we crossed off multiples of 3 -- whether you call that crossing it off a second time, or merely stepping over it, it still needs to alight on 459 to get to 493 (17 * 29), which hasn't been crossed off yet
True, the deleteOrd algorithm has that extra optimization. So I guess my sieve is actually:

crossOff :: Ord a => [a] -> [Either a a] -> [Either a a]
crossOff xs@(x:xs') ys@(y:ys')
  | x < y'    = crossOff xs' ys
  | x > y'    = y : crossOff xs ys'
  | otherwise = Left y' : crossOff xs' ys'
  where y' = fromEither y
crossOff _ ys = ys

sieve (Right x:xs) = Right x : sieve (crossOff [x+x,x+x+x..] xs)
sieve (Left x:xs)  = Left x : sieve xs

primes = catRights $ sieve $ map Right [2..]

fromEither = either id id

catRights :: [Either a b] -> [b]
catRights (Right x:xs) = x : catRights xs
catRights (_:xs)       = catRights xs
catRights _            = []

That is the Sieve of Yitzchak. My daughter is using the Sieve of Apfelmus. I guess you can still claim that neither is the Sieve of Eratosthenes.

Regards,
Yitz
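For completeness, here is the Sieve of Yitzchak assembled into a self-contained snippet (the transcription and the added type signatures are mine), which can be checked against the first few primes:

```haskell
-- The "Sieve of Yitzchak": composites are marked Left rather than
-- deleted, so nothing is removed twice; later passes merely step
-- over entries that are already marked.
fromEither :: Either a a -> a
fromEither = either id id

crossOff :: Ord a => [a] -> [Either a a] -> [Either a a]
crossOff xs@(x:xs') ys@(y:ys')
  | x < y'    = crossOff xs' ys
  | x > y'    = y : crossOff xs ys'
  | otherwise = Left y' : crossOff xs' ys'
  where y' = fromEither y
crossOff _ ys = ys

sieve :: Integral a => [Either a a] -> [Either a a]
sieve (Right x : xs) = Right x : sieve (crossOff [x+x,x+x+x..] xs)
sieve (Left  x : xs) = Left  x : sieve xs
sieve _              = []

catRights :: [Either a b] -> [b]
catRights (Right x : xs) = x : catRights xs
catRights (_       : xs) = catRights xs
catRights []             = []

primes :: [Integer]
primes = catRights (sieve (map Right [2..]))
```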

Yitzchak Gale wrote:
Melissa O'Neill wrote:
- Eratosthenes's sieve never says the equivalent of, "Hmm, I wonder if 19 is a multiple of 17" -- but both the basic "sieve" we began with and the deleteOrd version do
So then maybe I taught my daughter the wrong thing. When she does 17, she moves ahead one number at a time. When she gets to a multiple of 17, she crosses it off. So yes, she does consider whether 19 is a multiple of 17.
Yes, thanks Yitzchak, that's exactly the point. In order to cross off a number, one has to find it and then cross it off. The trouble comes from the fact that both notions are somewhat interchangeable. In a sense, 'deleteOrd' searches all subsequent numbers to find the next one to cross off. At the same time, it considers, for every subsequent number, whether it should be crossed off or not. Melissa's priority queue walks all numbers and decides in O(log n) time whether the scrutinee is the next number to be crossed off. It's the 'find' part that is algorithmically expensive.

The point about "Eratosthenes's sieve" is that it does not specify algorithmically how to find the next number to be crossed off. It does not even define how to store (crossed-off) numbers; it stores them "on paper". Of course, the computer needs to know both, and there is freedom of interpretation in how to implement them. But I'm glad that Melissa pointed out that this implementation has to be chosen consciously.

So, I argue that Yitzchak's 'deleteOrd' is a genuine "sieve of Eratosthenes". The naive 'filter (\x -> x `mod` p /= 0)' is one as well, though I agree that the latter deserves more a name like "transposed sieve of Eratosthenes".

Regards,
apfelmus

Hi Melissa, I enjoyed your article. I especially like your trick of starting with p*p. You wrote:
Which seems reasonable, until you realize that every prime p we come up with is still "considered" by k deleteOrd filters, where k is the number of primes that preceeded it.
It is true that I have never read Eratosthenes in the original, but my deleteOrd algorithm was meant to reproduce faithfully the sieve algorithm as I was taught it in grade school. We did not cross out any number more than once. But we did consider each multiple of every prime, moving on if we found it already crossed off. My algorithm does exactly the same.

I do not deny that primes can be found more efficiently. But I believe that my algorithm is exactly what I was taught as the sieve. So it feels genuine to me.

A few days ago, I taught the sieve to my 6-year-old daughter, the same way I learned it. She loved it! She is currently working on memorizing the multiplication tables, so the sieve is intriguing to her. I'm not sure lazy priority queues would be quite so intriguing, though. I hope I have not poisoned her mind.

Regards,
Yitz

i have to say that i find the folklore sieve quite acceptable as a sieve, whereas the faster test-against-primes is obviously different. but the discussion has prompted me to write yet another sieve, perhaps more acceptable to purists. instead of using (mod p) repeatedly and filtering directly, we zero out multiples of p, for all p, then filter out non-zero elements in a second phase:

primes = 2 : filter (>2) (foldr sieve id primes [0..])

-- start sieve for p at position p*p, zeroing position off=(p-1)*(p-1)
sieve p f xs@(off:_) = 0 : tail (applyAt (p*p-off) (f . sieve' p) xs)

-- zero positions that are multiples of p
sieve' p = applyAt p (\(_:b) -> sieve' p (0:b))

-- like splitAt, followed by (++), where only the suffix is operated on
-- infinite lists, non-negative offset
applyAt 0 f xs = f xs
applyAt n f (x:xs) = x : applyAt (n-1) f xs

it seems to be a lot faster than the folklore sieve, and within a factor of 2 of the faster test-against-primes (given the repeated use of applyAt, I guess it could be sped up further by using a piecewise contiguous list representation, aka polymorphic ByteString).

claus
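For those who want to load it into GHCi, here is the zero-out sieve again as a self-contained snippet (the transcription and the added type signatures are mine), with a quick check against the first few primes:

```haskell
-- Claus Reinke's zero-out sieve: multiples of each prime are
-- overwritten with 0, and non-zero entries survive a final filter.
primes :: [Integer]
primes = 2 : filter (>2) (foldr sieve id primes [0..])

-- start sieve for p at position p*p; the "0 : tail" zeroes position
-- off, the square of the previous prime, where this sieve was installed
sieve :: Integer -> ([Integer] -> [Integer]) -> [Integer] -> [Integer]
sieve p f xs@(off:_) = 0 : tail (applyAt (p*p - off) (f . sieve' p) xs)

-- zero every p-th position from the current one
sieve' :: Integer -> [Integer] -> [Integer]
sieve' p = applyAt p (\(_:b) -> sieve' p (0:b))

-- like splitAt followed by (++), where only the suffix is modified;
-- infinite lists, non-negative offset
applyAt :: Integer -> ([a] -> [a]) -> [a] -> [a]
applyAt 0 f xs = f xs
applyAt n f (x:xs) = x : applyAt (n-1) f xs
```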
participants (5)
- apfelmus@quantentunnel.de
- Claus Reinke
- Melissa O'Neill
- Nicolas Frisby
- Yitzchak Gale