Re: [Haskell-cafe] Another optimization question

17 May 2008


      On Sat, May 17, 2008 at 10:40 PM, Daniel Fischer
 wrote:
...
Am Samstag, 17. Mai 2008 19:52 schrieb anton muhin:
...
On Sat, May 17, 2008 at 8:19 PM, Jeroen  wrote:
...
Hi, I know there's been quite some performance/optimization post lately,
so I hope there still room for one more. While solving a ProjectEuler
problem (27), I saw a performance issue I cannot explain. I narrowed it
down to the following code (never mind that 'primes' is just [1..],
the problem is the same or worse with real primes):
primes :: [Int]
primes = [1..]
isPrime :: Int -> Bool
isPrime x = isPrime' x primes
   where isPrime' x (p:ps) | x == p = True
| x > p = isPrime' x ps
                           | otherwise = False
main = print $ length (filter (== True) (map isPrime [1..5000]))
$ time ./experiment1
5000
real    0m4.037s
user    0m3.378s
sys     0m0.060s
All good, but if I change isPrime to the simpeler
isPrime x = elem x (takeWhile (<= x) primes)
it takes twice as long:
time ./experiment2
5000
real    0m7.837s
user    0m6.532s
sys     0m0.141s
With real primes, it even takes 10 times as long.
I tried looking at the output of ghc -ddump-simpl,
as suggested in a previous post, but it's a bit over
my head (newby here...).
Any suggestions?
Just a thought: in your first approach you compare any element of the
list once.  In second---twice: once to check if <= x and second time
to check if it is equal to x.  That's a hypothesis,
I thought so, too, but I couldn't reproduce the behaviour, so I'm not sure
what happens. In fact, compiling without optimisations, the first version
takes almost twice as long as the second. Compiled with -O2, the second takes
about 13% more time.
Why not -O3?
...
Using a real list of primes,
What's the size of the real list?
...
dafis@linux:~/EulerProblems/Testing> ghc --make experiment -o expleriment3
[1 of 1] Compiling Main             ( experiment.hs, experiment.o )
Linking experiment3 ...
dafis@linux:~/EulerProblems/Testing> time ./experiment3
669
real    0m0.222s
user    0m0.220s
sys     0m0.000s
dafis@linux:~/EulerProblems/Testing> ghc --make experiment -o experiment4
[1 of 1] Compiling Main             ( experiment.hs, experiment.o )
Linking experiment4 ...
dafis@linux:~/EulerProblems/Testing> time ./experiment4
669
real    0m0.299s
user    0m0.290s
sys     0m0.000s
But
dafis@linux:~/EulerProblems/Testing> ghc -O2 --make experiment -o experiment3
[1 of 1] Compiling Main             ( experiment.hs, experiment.o )
Linking experiment3 ...
dafis@linux:~/EulerProblems/Testing> ghc -O2 --make experiment -o experiment4
[1 of 1] Compiling Main             ( experiment.hs, experiment.o )
Linking experiment4 ...
dafis@linux:~/EulerProblems/Testing> time ./experiment3
669
real    0m0.053s
user    0m0.040s
sys     0m0.010s
dafis@linux:~/EulerProblems/Testing> time ./experiment4
669
real    0m0.257s
user    0m0.250s
sys     0m0.010s
Wow!
I've no idea what optimising did to the first version, but apparently it
couldn't do much for the second.
...
but another
implementation of isPrime:
isPrime x = (== x) $ head $ dropWhile (< x) primes
With -O2, this is about 20% slower than the Jeroen's first version, without
optimisations 50% faster.
Strange.
Well, head has its overhead as well.  Cf. two variants:

firstNotLess :: Int -> [Int] -> Int
firstNotLess s (x:xs) = if x < s then firstNotLess s xs else x

dropLess :: Int -> [Int] -> [Int]
dropLess s l@(x:xs) = if x < s then dropLess s xs else l

isPrime :: Int -> Bool
isPrime x = x == (firstNotLess x primes)

isPrime' :: Int -> Bool
isPrime' x = x == (head $ dropLess x primes)

On my box firstNotLess gives numbers pretty close (if not better) than
Jeroen's first variant, while head $ dropLess notably worse.
...
isPrime :: Int -> Bool
isPrime x = go primes
     where
       go (p:ps) = case compare x p of
                       LT -> False
                       EQ -> True
                       GT -> go ps
does best (on my box), with and without optimisations (very very slightly with
-O2) for a list of real primes, but not for [1 .. ].
And what happens for [1..]?
...
However, more than can be squished out of fiddling with these versions could
be gained from a better algorithm.
Definitely.
yours,
anton.