
Hello Folks,

There's been some discussion recently about the pros and cons of various coding styles, particularly whether stack-greedy or heap-greedy is best, and how (or if) GHC's stack management in particular should affect all this. One concrete example is the problem of implementing an eager take function.

Here are some real numbers, measured with GHC 6.8.2 under Windows XP on an AMD Athlon 64 1.8 GHz. The source code can be found here:

  http://homepages.nildram.co.uk/~ahey/Test1.zip

Four possible implementations have been tested:

-- Uses O(n) stack
stackGobbler :: Int -> [x] -> [x]
stackGobbler 0 _      = []
stackGobbler _ []     = []
stackGobbler n (x:xs) = let xs' = stackGobbler (n-1) xs
                        in  xs' `seq` (x:xs')

-- Uses O(n) heap instead, O(1) stack
heapGobbler :: Int -> [x] -> [x]
heapGobbler = heapGobbler' []
 where heapGobbler' rxs 0 _      = reverse rxs
       heapGobbler' rxs _ []     = reverse rxs
       heapGobbler' rxs n (x:xs) = heapGobbler' (x:rxs) (n-1) xs

-- Neil's O(n) heap version, O(1) stack
neilGobbler :: Int -> [x] -> [x]
neilGobbler n xs = length res `seq` res
 where res = take n xs

-- Continuation-passing O(n) heap version, O(1) stack
cpGobbler :: Int -> [x] -> [x]
cpGobbler = f id
 where f c 0 _      = c []
       f c _ []     = c []
       f c n (x:xs) = f (\xs' -> c (x:xs')) (n-1) xs

There are 16 tests for each, parameterised by p = 0..15. Each test takes 63*(2^p) elements from a test list of the same length, and is repeated 2^(25-p) times, so in total 63*(2^25) elements are processed in each case (independent of p). A rough sketch of the kind of driver this implies is given at the end of this post.

Here are the results, in units of myCpuTimePrecision = 15625000000 (the cpuTimePrecision exported by System.CPUTime is wrong for me). To convert these to actual time-per-element figures you need to multiply each by about 7.4 ps, i.e. 15625000000 ps / (63*2^25) elements (I think :-). All tests were run with fixed stack and heap sizes of 16 and 256 MiB respectively.

  p    stack    heap    neil      cp
 ------------------------------------
  0     1793    2684    4937    2593
  1     1860    2673    4897    2584
  2     1910    2673    4825    2578
  3     1927    2659    4819    2575
  4     1946    2657    4813    2574
  5     1950    2656    5048    2578
  6     1960    2711    5036    2627
  7     1976    2730    5126    2643
  8     2072    2900    5197    2813
  9     2439    3044    5153    2974
 10     2685    3275    5371    3199
 11     2760    3384    5466    3321
 12     2930    3487    5525    3444
 13     3181    3648    5813    3698
 14     3698    3973    6417    4031
 15     4727    4987    7964    5224

So, pretty much as I expected. For smallish lists stackGobbler is easily the fastest, heapGobbler and cpGobbler are pretty similar, and neilGobbler is the slowest (sorry Neil :-). The performance of all of them degrades as p increases. I guess this is not too surprising, but stackGobbler seems to degrade faster, so that at p=15 there's not much difference between it and heapGobbler/cpGobbler. I'm not sure what it is that causes stackGobbler to be "unfairly" penalised this way, but I'm reminded of this post from John Meacham:

  http://haskell.org/pipermail/glasgow-haskell-users/2007-May/012470.html

The other big problem with stackGobbler in practice is the risk of stack overflow. For p=15 it would not work at all with GHC's default stack limit.

Regards
--
Adrian Hey
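
P.S. For reference, here is a rough sketch of the kind of driver the test description above implies. It is not the actual harness from Test1.zip; runTest, forceList and the use of getCPUTime are purely illustrative, and it assumes the gobbler definitions above are in scope.

import System.CPUTime (getCPUTime)
import Control.Monad  (forM_)

-- Force the whole result list (spine and elements) so the gobbler
-- really does its work eagerly on every repetition.
forceList :: [x] -> ()
forceList = foldr seq ()

-- Run one gobbler for a given p: take 63*2^p elements from a list of
-- the same length, repeated 2^(25-p) times, and return the CPU time
-- taken (in picoseconds, as reported by getCPUTime).
runTest :: (Int -> [Int] -> [Int]) -> Int -> IO Integer
runTest gobbler p = do
  let n    = 63 * 2 ^ p      -- elements taken per repetition
      reps = 2 ^ (25 - p)    -- repetitions, so n*reps stays constant
  t0 <- getCPUTime
  forM_ [1 .. reps] $ \i ->
    -- rebuild the input each time (and make it depend on i) so GHC
    -- can't share one result across all the repetitions
    forceList (gobbler n (i : [1 .. n])) `seq` return ()
  t1 <- getCPUTime
  return (t1 - t0)

main :: IO ()
main =
  forM_ [0 .. 15] $ \p -> do
    t <- runTest stackGobbler p   -- likewise for heapGobbler, neilGobbler, cpGobbler
    putStrLn (show p ++ ": " ++ show t)

The only subtle point is the one in the comment: if the input and result were the same on every repetition, GHC could share the work, so the input is made to depend on the repetition index.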