
Hello Folks,

There's been some discussion recently about the pros and cons of various coding styles, particularly whether stack-greedy or heap-greedy is best, and how (or if) GHC's stack management in particular should affect all this. One concrete example is the problem of implementing an eager take function.

Here are some real numbers, measured with GHC 6.8.2 under Windows XP on an AMD Athlon 64 1.8 GHz. The source code can be found here:

  http://homepages.nildram.co.uk/~ahey/Test1.zip

Four possible implementations have been tested:

-- Uses O(n) stack
stackGobbler :: Int -> [x] -> [x]
stackGobbler 0 _      = []
stackGobbler _ []     = []
stackGobbler n (x:xs) = let xs' = stackGobbler (n-1) xs
                        in  xs' `seq` (x:xs')

-- Uses O(n) heap instead, O(1) stack
heapGobbler :: Int -> [x] -> [x]
heapGobbler = heapGobbler' []
 where heapGobbler' rxs 0 _      = reverse rxs
       heapGobbler' rxs _ []     = reverse rxs
       heapGobbler' rxs n (x:xs) = heapGobbler' (x:rxs) (n-1) xs

-- Neil's O(n) heap version, O(1) stack
neilGobbler :: Int -> [x] -> [x]
neilGobbler n xs = length res `seq` res
 where res = take n xs

-- Continuation-passing O(n) heap version, O(1) stack
cpGobbler :: Int -> [x] -> [x]
cpGobbler = f id
 where f c 0 _      = c []
       f c _ []     = c []
       f c n (x:xs) = f (\xs' -> c (x:xs')) (n-1) xs

There are 16 tests for each, parameterised by p = 0..15. Each test takes 63*(2^p) elements from a test list of the same length, and is repeated 2^(25-p) times, so in total 63*(2^25) elements are processed in each case (independent of p). A rough sketch of the kind of driver this implies is given at the end of this post.

Here are the results, in units of myCpuTimePrecision = 15625000000 (the cpuTimePrecision exported by System.CPUTime is wrong for me). To convert these to actual time-per-element figures you need to multiply each by about 7.4 ps, i.e. 15625000000 ps / (63*2^25) elements (I think :-). All tests were run with fixed stack and heap sizes of 16 and 256 MiB respectively.

  p    stack    heap    neil      cp
 ------------------------------------
  0     1793    2684    4937    2593
  1     1860    2673    4897    2584
  2     1910    2673    4825    2578
  3     1927    2659    4819    2575
  4     1946    2657    4813    2574
  5     1950    2656    5048    2578
  6     1960    2711    5036    2627
  7     1976    2730    5126    2643
  8     2072    2900    5197    2813
  9     2439    3044    5153    2974
 10     2685    3275    5371    3199
 11     2760    3384    5466    3321
 12     2930    3487    5525    3444
 13     3181    3648    5813    3698
 14     3698    3973    6417    4031
 15     4727    4987    7964    5224

So, pretty much as I expected. For smallish lists stackGobbler is easily the fastest, heapGobbler and cpGobbler are pretty similar, and neilGobbler is the slowest (sorry Neil :-). The performance of all of them degrades as p increases. I guess this is not too surprising, but stackGobbler seems to degrade faster, so that at p=15 there's not much difference between it and heapGobbler/cpGobbler. I'm not sure what it is that causes stackGobbler to be "unfairly" penalised this way, but I'm reminded of this post from John Meacham:

  http://haskell.org/pipermail/glasgow-haskell-users/2007-May/012470.html

The other big problem with stackGobbler in practice is the risk of stack overflow. For p=15 it would not work at all with GHC's default stack limit.

Regards
--
Adrian Hey
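
P.S. For reference, here is a rough sketch of the kind of driver the test description above implies. It is not the actual harness from Test1.zip; runTest, forceList and the use of getCPUTime are purely illustrative, and it assumes the gobbler definitions above are in scope.

import System.CPUTime (getCPUTime)
import Control.Monad  (forM_)

-- Force the whole result list (spine and elements) so the gobbler
-- really does its work eagerly on every repetition.
forceList :: [x] -> ()
forceList = foldr seq ()

-- Run one gobbler for a given p: take 63*2^p elements from a list of
-- the same length, repeated 2^(25-p) times, and return the CPU time
-- taken (in picoseconds, as reported by getCPUTime).
runTest :: (Int -> [Int] -> [Int]) -> Int -> IO Integer
runTest gobbler p = do
  let n    = 63 * 2 ^ p      -- elements taken per repetition
      reps = 2 ^ (25 - p)    -- repetitions, so n*reps stays constant
  t0 <- getCPUTime
  forM_ [1 .. reps] $ \i ->
    -- rebuild the input each time (and make it depend on i) so GHC
    -- can't share one result across all the repetitions
    forceList (gobbler n (i : [1 .. n])) `seq` return ()
  t1 <- getCPUTime
  return (t1 - t0)

main :: IO ()
main =
  forM_ [0 .. 15] $ \p -> do
    t <- runTest stackGobbler p   -- likewise for heapGobbler, neilGobbler, cpGobbler
    putStrLn (show p ++ ": " ++ show t)

The only subtle point is the one in the comment: if the input and result were the same on every repetition, GHC could share the work, so the input is made to depend on the repetition index.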