
Am Mittwoch, 28. November 2007 22:31 schrieb Andrew Coppin:
There are problems for which it's important to know how many times a given prime factor occurs. And there are other problems where it is merely necessary to know which primes are factors. I would say it's useful to have *both* functions available...
Yup
By the way, I'm now using a new and much uglier prime seive:
size = 1000000 :: Word64
calc_primes :: IO [Word64] calc_primes = do grid <- newArray (2,size) True seive 2 grid where seive :: Word64 -> IOUArray Word64 Bool -> IO [Word64] seive p g = do mapM_ (\n -> writeArray g n False) [p, 2*p .. size] mp' <- next (p+1) g case mp' of Nothing -> return [p] Just p' -> do ps <- seive p' g return (p:ps)
next :: Word64 -> IOUArray Word64 Bool -> IO (Maybe Word64) next p g = do if p == size then return Nothing else do t <- readArray g p if t then return (Just p) else next (p+1) g
Seems moderately fast. At least, I can find the 10,001st prime in under 1 second...
One thing: since You check the array bounds, the system needn't check them again, use unsafeWrite and unsafeRead. And use Int for the index, that would be MUCH faster. Another thing: you can stop sieving when p*p > size, another speedup Third thing: In my experience mapM_ is relatively slow, hand coded loops are faster (though much uglier), but YMMV Fourth thing: sieve only odd numbers Fifth thing: better use an STUArray, don't drag IO in if it's not necessary.
The obvious downside of course is that you must know how big to make the array at the start of your program, and you must compute the entire array before you can use any of it. Both limitations not precent in the lazy recursive list implementation...
The size of the array can easily be estimated by the prime number theorem, pi(x) ~ x/log x, where pi(x) is the number of primes not exceeding x, so for 10,001 primes, find x with x/log x about 10,000, add a bit for safety, a bound of 120,000 will do. If you want the n-th prime, you don't want to use the smaller primes anyway, if you need all primes, chances are that prime generation will take negligible time in comparison to your main algo anyway. Keep enjoying Project Euler, Daniel