
Hi Steve,
Steve
Hi Will,
I had previously tested the Melissa O'Neil prime function (using the primes 0.1.1 package) and found it to be fast (for a functional approach), but with high memory use.
To fully test a prime function, I think you should: 1. Test for more than the first 10^6 primes. Generating all primes to 10^6 is quite easy for modern computers. Most prime generating functions will run in less than 1 sec and look fast. Try generating all primes to 10^7 and 10^8 then you will see how 'fast' these lazy functional methods really are. 2. Measure memory use. As you move above 10^6 and test primes up to 10^7, 10^8, and 10^9, memory use becomes very important. A prime function with excessive memory use will soon consume all of the system's memory.
Here's another primes function which performs quite well.
primes :: [Int] primes = 2 : filter (isPrime primes) [3,5..] where isPrime (p:ps) n | mod n p == 0 = False | p*p > n = True | otherwise = isPrime ps n isPrime [] _ = False -- never used, but avoids compiler warning
Here's some results from my PC, for generating primes to 10^8.
10**6 10**7 10**8 secs MiB secs MiB secs MiB ------------------------------- 1 0.01 0 0.1 2 2 14 2 0.56 7 11.1 43 270 306 3 0.61 7 11.8 44 260 342 4 0.43 36 5.4 345 900 not finished
1 using a Haskell ST Array 2 your primes function 3 my primes function, listed above 4 Melissa O'Neil method from primes 0.1.1 package
To summarise the results from the tests I've run: - if you want fast functional primes, forget it, its not possible. - if you just want fast primes, then use C, or a Haskell array. - if performance does not matter and slow primes are acceptable, then use the purely functional approach.
Regards, Steve
you just have a fast PC that's all, :) so a million is not enough for you. My old laptop is 50 times slower. :) Seriously though, your conclusions seem entirely plausible to me. My goal here was to have a Haskell code for primes, that is decent. Nothing more. The reason your code is slightly slower is of course that in effect it recalculates the needed primes prefix for each new candidate. If you could somehow thread its length through, it might have sped it up some more. I've just tried it and it was twice slower than mine. (?) I didn't use the [Int] signature in both. Maybe it's not real issues we're dealing with here, but compiler/OS/CPU issues? (or have you've forgotten to put the [Int] signature into my code too, when tested? It runs twice as fast with it). Although your code has an advantage that it is very easy to add the wheel optimization to it. BTW I don't know about the code in the package, but the one in the article makes terrible faux-pas of adding each prime into the queue as soon as it is produced; this could easily account for a memory blow-up you're seeing. What's really needed, is to plug a decent PQ implementation into my framework, which does absolute minimum of repeated calculations it seems. What I have now, is this: qprimes = 2: 3: sieve emptyQ primes' 5 where primes' = tail qprimes sieve q (p:ps) x = h ++ sieve (addQ q' (2*p) (p*p+2*p)) ps (p*p+2) where (h,q') = noComps q [x,x+2..p*p-2] ...... The main deficiency of list-based sieves, as I've now came to realize and formulate in simple enough terms for myself to understand, is that they work with each number in isolation, and thus must test primes as well as composites. Testing composites on average is cheap, because most of them will have small factors; testing primes is costly. Imperative sieves avoid that by working over spans of numbers at once, so they get their primes for free, when they "see" gaps in produced/marked composites (I repeat myself here, but am not sure if you've read this my explanation in other posts). What counts here, is the direct access to random memory - or the numbers written out on a blackboard. Melissa's PQ approach tries to emulate that. Not crossing them off, but seeing the gaps in between. One is tempted to treat Haskell as high-level executable definition, hoping for a compiler to turn it into the fastest possible low-level code. I know I can translate it into C by hand fairly well; why wouldn't the compiler? It seems Haskell compilers have much room for improvement. :)