
Am Montag 04 Januar 2010 13:25:47 schrieb Will Ness:
Daniel Fischer
writes: Am Sonntag 03 Januar 2010 09:54:37 schrieb Will Ness:
Daniel Fischer
writes: But there's a lot of list constructuion and deconstruction necessary for the Euler sieve.
yes. Unless, of course, s smart compiler recognizes there's only one consumer for the values each multiples-list is producing, and keeps it stripped down to a generator function, and its current value. I keep thinkig a smart compiler could eliminate all these "span" calls and replace them with just some pointers manipulating...
Of course I'm no smart compiler, but I don't see how it could be even possible to replace the span calls with pointer manipulation when dealing with lazily generated (infinite, if we're really mean) lists. Even when you're dealing only with strict finite lists, it's not trivial to do efficiently.
I keep thinking that storage duplication with span, filter etc. is not really necessary, and can be replaced with some pointer chasing - especially when there's only one consumer present for the generated values.
What I mean is thinking of lists in terms of produce/consumer paradigm, as objects supporting the { pull, peek } interface, keeping the generator inside that would produce the next value on 'pull' request and keep it ready for any 'peek's.
Euler's sieve is
sieve (p:ps) xs = h ++ sieve ps (t `minus` map (p*) [p,p+2..]) where (h,t) = span (< p*p) xs
Not quite. That's a variant of the postponed filters, it crosses off e.g. 45 twice, once as 3*15 and once as 5*9 (okay, it has already been removed by the first, so let's say 45 appears in two lists of numbers to be removed if present). Euler's sieve is never attempting to remove a number more than once, that's the outstanding feature of it. Unfortunately, that also makes it hard to implement efficiently. The problem is that you can't forget the primes between p and p^2, you must remember them for the removal of multiples of p later on. An (inefficient but faithful) implementation would be euler ks@(p:rs) = p : euler (rs `minus` map (*p) ks) Its performance is really horrible though. A much better implementation is primes = 2:3:euler hprs [] (scanl (+) 5 (cycle [2,4])) where hprs = 5:drop 3 primes euler (p:ps) acc cs = h ++ euler ps (tail (acc ++ h)) (t `minus` comps) where (h,t) = span (< p*p) cs comps = map (*p) (acc ++ cs) still really bad. I don't yet see how to eat the cake and have it here.
I think that remark was meant to apply to composite removal, not Turner's sieve.
It is right there on page 2, right when the Turner's sieve is presented and discussed. The only explanation that I see is that she thought of it in regards to the imperative code, just as her analysis concentrates only on calculation aspects of the imperative code itself.
To be fair, she writes:
"Let us first describe the original “by hand” sieve algorithm as practiced by Eratosthenes. ... The starting point of p^2 is a pleasing but minor optimization, which can be made because lower multiples will have already been crossed off when we found the primes prior to p. ... (This optimization does not affect the time complexity of the sieve, however, so its absence from the code in Section 1 *is not our cause for worry*.)"
A-HA!
But its absense from _*that*_ _*code*_ WAS the *major* cause for worry, as dealing with it worked wonders on its complexity and constant factors.
I think you're overinterpreting it. Sure, it's not perfectly worded, but her concern was primarily that it's not an Eratosthenes sieve but trial division (and woefully inefficient at that), I think. Now, because it's trial division and not Eratosthenes, it has a major impact, thus the absence of the optimisation is a reinforcing consequence of the original cause for worry. I may be overdefending her, though, we can't really know what she meant.