Re: [Haskell-cafe] Re: Another optimization question

18 May 2008

      Any chances you're using Integer instead of Int?  On my box switch to
Integer is quite expensive (as one could expect).

yours,
anton.

On Sun, May 18, 2008 at 9:45 AM, Jeroen  wrote:
...
Daniel Fischer  writes:
...
Am Samstag, 17. Mai 2008 22:48 schrieb anton muhin:
...
Why not -O3?
As far as I know - and Brandon S.Allbery said so, too - -O3 isn't better than
-O2.
...
...
Using a real list of primes,
What's the size of the real list?
arbitrary, however, since it's [Int], it will actually be at most 105097565
primes long, but since only numbers up to 5000 are checked, only 670 primes
will ever be considered - When I check numbers 1 to 50000 (5133 primes, so
5134 primes evaluated), with -O0 / -O2, it's
Jeroen 1 : 14.5 s / 2.4 s
Jeroen 2 : 52.5 s / 49.7 s
(== x) . head . dropWhile (< x) : 8.2 s /4.1 s
go primes : 5.5 s / 2.5 s
So Jeroen 1 is the best to be optimised :)
...
...
...
but another
implementation of isPrime:
isPrime x = (== x) $ head $ dropWhile (< x) primes
With -O2, this is about 20% slower than the Jeroen's first version,
without optimisations 50% faster.
Strange.
Well, head has its overhead as well.  Cf. two variants:
firstNotLess :: Int -> [Int] -> Int
firstNotLess s (x:xs) = if x < s then firstNotLess s xs else x
dropLess :: Int -> [Int] -> [Int]
dropLess s l@(x:xs) = if x < s then dropLess s xs else l
isPrime :: Int -> Bool
isPrime x = x == (firstNotLess x primes)
isPrime' :: Int -> Bool
isPrime' x = x == (head $ dropLess x primes)
On my box firstNotLess gives numbers pretty close (if not better) than
for primes up to 50000, that's
6.8 s / 2.1 s with -O0 / -O2 respectively on mine
...
Jeroen's first variant, while head $ dropLess notably worse.
5.8 s / 2.4 s here.
...
...
isPrime :: Int -> Bool
isPrime x = go primes
     where
       go (p:ps) = case compare x p of
                       LT -> False
                       EQ -> True
                       GT -> go ps
does best (on my box), with and without optimisations (very very slightly
with -O2) for a list of real primes, but not for [1 .. ].
And what happens for [1..]?
With -O2, Jeroen 1 was best (1.62), nex firstNotLess (1.63), head . dropLess
(1.74), then in close succesion Jeroen 2 (1.92), go primes (1.96) and head .
dropWhile (1.99),
with -O0, it's
head . dropWhile (1.7 s, YES, that actually is faster with -O0 than with
-O2!), head . dropLess (2.0), Jeroen 2 and firstNotLess (2.1 s), go primes
(2.3 s), Jeroen 1 (3.2 s).
Weirder and weirder.
...
...
However, more than can be squished out of fiddling with these versions
could be gained from a better algorithm.
Definitely.
yours,
anton.
Thanks for the responses so far!
I only tested with -O2 and my primes implementation
is the Sieve of Eratosthenes and has signature
primes :: Integral a => [a]
What's also quite strange is that experiment2 is about
10 times time slower than experiment1 when
using primes (with the Eratosthenes formula) instead of [1..].
I redid the experiments with -prof and the output was quite revealing:
experiment1 (fastest):
total time  =        2.64 secs   (132 ticks @ 20 ms)
total alloc =     323,356 bytes  (excludes profiling overheads)
individual    inherited
COST CENTRE   entries  %time %alloc   %time %alloc
MAIN                 0   0.0    0.0   100.0  100.0
 main                1   0.0    0.5     0.0    0.5
 CAF                 4   0.0    0.0   100.0   99.0
 primes             1   9.8   61.9     9.8   61.9
 main               0   0.0   37.1    90.2   37.1
  isPrime        5000  90.2    0.0    90.2    0.0
 CAF                 4   0.0    0.4     0.0    0.4
experiment2 (slowest):
total time  =        6.12 secs   (306 ticks @ 20 ms)
total alloc = 350,473,356 bytes  (excludes profiling overheads)
individual    inherited
COST CENTRE entries  %time %alloc   %time %alloc
MAIN              0   0.0    0.0   100.0  100.0
 main             1   0.0    0.0     0.0    0.0
 CAF              4   0.0    0.0   100.0  100.0
 primes          1   0.0    0.1     0.0    0.1
 main            0   0.0    0.0   100.0   99.9
  isPrime     5000 100.0   99.9   100.0   99.9
 CAF              4   0.0    0.0     0.0    0.0
Would this be only because isPrime of experiment 2 builds
this temporary list (takeWhile) all the time?
Jeroen Baekelandt
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe