
On Sat, Sep 11, 2010 at 5:13 AM, michael rice
Which of these would be more costly for a long list?
f :: [Int] -> [Int] f [x] = [x] f (x:xs) = x + (head xs) : f xs
f :: [Int] -> [Int] f [x] = [x] f (x:y:xs) = x + y : f (y:xs)
Use Criterion[1] to find out: module Main where import Criterion.Main f1, f2, f3, f4 :: [Int] -> [Int] f1 [x] = [x] f1 (x:xs) = x + head xs : f1 xs f2 [x] = [x] f2 (x:y:xs) = x + y : f2 (y:xs) f3 [x] = [x] f3 (x:xs@(y:_)) = x + y : f3 xs f4 [x] = [x] f4 (x:y:xs) = x + y : go y xs where go x [] = [x] go x (y:xs) = x + y : go y xs benchMark s f = bench s $ whnf (\n -> sum $ f [1..n]) 1000000 main = defaultMain [ benchMark "f1" f1 , benchMark "f2" f2 , benchMark "f3" f3 , benchMark "f4" f4 ] now compile and run it: $ ghc --make Benchmark.hs -O2 -o benchmark ... $ ./benchmark warming up estimating clock resolution... mean is 24.29944 us (40001 iterations) found 1405 outliers among 39999 samples (3.5%) 721 (1.8%) high mild 684 (1.7%) high severe estimating cost of a clock call... mean is 1.844233 us (45 iterations) found 2 outliers among 45 samples (4.4%) 2 (4.4%) high severe benchmarking f1 collecting 100 samples, 1 iterations each, in estimated 7.917595 s bootstrapping with 100000 resamples mean: 80.04258 ms, lb 79.85129 ms, ub 80.24094 ms, ci 0.950 std dev: 1.000711 ms, lb 878.8460 us, ub 1.179558 ms, ci 0.950 found 1 outliers among 100 samples (1.0%) variance introduced by outliers: 0.990% variance is unaffected by outliers benchmarking f2 collecting 100 samples, 1 iterations each, in estimated 8.171391 s bootstrapping with 100000 resamples mean: 83.13315 ms, lb 82.93615 ms, ub 83.33348 ms, ci 0.950 std dev: 1.017999 ms, lb 904.5153 us, ub 1.174008 ms, ci 0.950 variance introduced by outliers: 0.990% variance is unaffected by outliers benchmarking f3 collecting 100 samples, 1 iterations each, in estimated 8.297014 s bootstrapping with 100000 resamples mean: 82.66586 ms, lb 82.34780 ms, ub 83.39774 ms, ci 0.950 std dev: 2.339937 ms, lb 976.2940 us, ub 4.133495 ms, ci 0.950 found 9 outliers among 100 samples (9.0%) 7 (7.0%) high mild 2 (2.0%) high severe variance introduced by outliers: 0.998% variance is unaffected by outliers benchmarking f4 collecting 100 samples, 1 iterations each, in estimated 8.080888 s bootstrapping with 100000 resamples mean: 80.80089 ms, lb 80.61719 ms, ub 80.99542 ms, ci 0.950 std dev: 968.1706 us, lb 872.7758 us, ub 1.097217 ms, ci 0.950 variance introduced by outliers: 0.990% variance is unaffected by outliers So to summarize from fastest to slowest: f1: mean: 80.04258 ms f4: mean: 80.80089 ms f3: mean: 82.66586 ms f2: mean: 83.13315 ms To find out why f1 is the fastest you can look at the core using ghc-core[2]: $ ghc-core -- -O2 Benchmark.hs f1 :: [Int] -> [Int] GblId f1 = \ (ds_d1h0 :: [Int]) -> case ds_d1h0 of _ { [] -> f11; : x_a12V ds1_d1h1 -> case ds1_d1h1 of _ { [] -> : @ Int x_a12V ([] @ Int); : ipv_s1hv ipv1_s1hw -> : @ Int (case x_a12V of _ { I# x1_a1k6 -> case ipv_s1hv of _ { I# y_a1ka -> I# (+# x1_a1k6 y_a1ka) } }) (f1_$sf1 ipv1_s1hw ipv_s1hv) } } f1_$sf1 :: [Int] -> Int -> [Int] GblId f1_$sf1 = \ (sc_s1FL :: [Int]) (sc1_s1FM :: Int) -> case sc_s1FL of _ { [] -> : @ Int sc1_s1FM ([] @ Int); : ipv_s1hv ipv1_s1hw -> : @ Int (case sc1_s1FM of _ { I# x_a1k6 -> case ipv_s1hv of _ { I# y_a1ka -> I# (+# x_a1k6 y_a1ka) } }) (f1_$sf1 ipv1_s1hw ipv_s1hv) } f4 :: [Int] -> [Int] GblId f4 = \ (ds_d1gl :: [Int]) -> case ds_d1gl of _ { [] -> f41; : x_a13f ds1_d1gm -> case ds1_d1gm of _ { [] -> : @ Int x_a13f ([] @ Int); : y_a13h xs_a13i -> : @ Int (plusInt x_a13f y_a13h) (f4_go y_a13h xs_a13i) } } f4_go = \ (x_a13k :: Int) (ds_d1gu :: [Int]) -> case ds_d1gu of _ { [] -> : @ Int x_a13k ([] @ Int); : y_a13m xs_a13n -> : @ Int (plusInt x_a13k y_a13m) (f4_go y_a13m xs_a13n) } f3 :: [Int] -> [Int] GblId f3 = \ (ds_d1gD :: [Int]) -> case ds_d1gD of _ { [] -> f31; : x_a13b ds1_d1gE -> case ds1_d1gE of _ { [] -> : @ Int x_a13b ([] @ Int); : y_a13e ds2_d1gF -> : @ Int (plusInt x_a13b y_a13e) (f3_$sf3 ds2_d1gF y_a13e) } } f3_$sf3 :: [Int] -> Int -> [Int] GblId f3_$sf3 = \ (sc_s1G3 :: [Int]) (sc1_s1G4 :: Int) -> case sc_s1G3 of _ { [] -> : @ Int sc1_s1G4 ([] @ Int); : y_a13e ds_d1gF -> : @ Int (plusInt sc1_s1G4 y_a13e) (f3_$sf3 ds_d1gF y_a13e) } f2 :: [Int] -> [Int] GblId f2 = \ (ds_d1gP :: [Int]) -> case ds_d1gP of _ { [] -> f21; : x_a137 ds1_d1gQ -> case ds1_d1gQ of _ { [] -> : @ Int x_a137 ([] @ Int); : y_a139 xs_a13a -> : @ Int (plusInt x_a137 y_a139) (f2_$sf2 xs_a13a y_a139) } } f2_$sf2 :: [Int] -> Int -> [Int] GblId f2_$sf2 = \ (sc_s1FU :: [Int]) (sc1_s1FV :: Int) -> case sc_s1FU of _ { [] -> : @ Int sc1_s1FV ([] @ Int); : y_a139 xs_a13a -> : @ Int (plusInt sc1_s1FV y_a139) (f2_$sf2 xs_a13a y_a139) } The reason that f1 is faster than the rest is that GHC is somehow able to unpack the Ints and use the more efficient +# instead of the slower plusInt. I don't immediately see the reason for the time difference between f4, f3 and f2. The inner loops all seem equivalent. Regards, Bas [1] http://hackage.haskell.org/package/criterion [2] http://hackage.haskell.org/package/ghc-core