
Jesper Louis Andersen wrote:
On Thu, Mar 4, 2010 at 7:16 PM, Neil Brown wrote:
However, one thing I've found is that the libraries have noticeably different behaviour in terms of the amount of garbage created.
In fact, CML relies on the garbage collector for some of its implementation constructions. John H. Reppy's "Concurrent Programming in ML" is worth a read if you haven't already. My guess is that the Haskell implementation of CML is bloody expensive. It is based on the paper at http://www.cs.umd.edu/~avik/projects/cmllch/ , where Chaudhuri first constructs an abstract machine for CML and then maps it onto Haskell's MVar and forkIO constructions.
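For concreteness, the MVar/forkIO substrate gives you roughly this kind of rendezvous; the sketch below is just my own toy illustration of a synchronous channel, not the cml package's actual event encoding:

    import Control.Concurrent (forkIO)
    import Control.Concurrent.MVar

    -- A toy synchronous (rendezvous) channel built directly on MVars:
    -- send blocks until a receiver has taken the value, and vice versa.
    newtype SyncChan a = SyncChan (MVar a, MVar ())

    newSyncChan :: IO (SyncChan a)
    newSyncChan = do
      dat <- newEmptyMVar
      ack <- newEmptyMVar
      return (SyncChan (dat, ack))

    send :: SyncChan a -> a -> IO ()
    send (SyncChan (dat, ack)) x = putMVar dat x >> takeMVar ack

    recv :: SyncChan a -> IO a
    recv (SyncChan (dat, ack)) = do
      x <- takeMVar dat
      putMVar ack ()
      return x

    main :: IO ()
    main = do
      ch <- newSyncChan
      _ <- forkIO (send ch "ping")
      recv ch >>= putStrLn

The real library layers CML's events and choice on top of a substrate like this, which presumably accounts for a lot of the allocation in the figures below.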
CML is indeed the library with the most markedly different behaviour. In Haskell, the CML package manages to produce timings like this for fairly simple benchmarks:

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time    2.47s  (  2.49s elapsed)
  GC    time   59.43s  ( 60.56s elapsed)
  EXIT  time    0.00s  (  0.01s elapsed)
  Total time   61.68s  ( 63.07s elapsed)

  %GC time      96.3%  (96.0% elapsed)

  Alloc rate    784,401,525 bytes per MUT second

  Productivity   3.7% of total user, 3.6% of total elapsed

I knew from reading the code that CML's implementation would do something like this, although I do wonder if it triggers some pathological case in the GC.

The problem is that when I benchmark the program, it seems to finish in decent time, then spends 60 seconds doing GC before finally terminating! So I need some way of timing that reflects this; I wonder if just timing the entire run (and making the benchmarks long enough not to be swallowed by program start-up times, etc.) is the best thing to do.

A secondary issue is whether I should include CML at all, considering the timings!

Thanks,

Neil
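P.S. For anyone reproducing this: the breakdown above is the summary GHC prints when the program is run with +RTS -s. A minimal sketch of the distinction I'm worried about, timing only the benchmark body, which misses GC work done after it, might look like this (benchmark is a stand-in for the real workload):

    import Data.Time.Clock (diffUTCTime, getCurrentTime)

    -- Stand-in for the real CML workload being measured.
    benchmark :: IO ()
    benchmark = return ()

    main :: IO ()
    main = do
      start <- getCurrentTime
      benchmark
      stop <- getCurrentTime
      -- Only covers the mutator work; any GC that happens after this
      -- point (e.g. at program exit) is not included in the figure.
      putStrLn ("benchmark body: " ++ show (stop `diffUTCTime` start))

Timing the whole process externally (e.g. time ./Bench +RTS -s) would capture the post-benchmark GC as well, which is why timing the entire run may be the fairer comparison.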