
Hello,

I'm writing a library for dealing with binders and I want to benchmark it against de Bruijn indices, locally nameless, HOAS, etc. One of my benchmarks consists of

  1. generating a big term \x.t
  2. substituting u for x in t

The part I want to benchmark is 2. In particular, I would like that:

  a. \x.t is already evaluated when I run 2 (I don't want to measure the performance of the generator);
  b. the action of substituting u for x in t is measured as if I had to fully evaluate the result (by printing the resulting term, for instance).

After looking at what was available on Hackage, I set my mind on strictbench, which basically calls (rnf x `seq` print "") and then uses benchpress to measure the pure computation x. Since I wanted (a), my strategy was (schematically):

  let t = genterm
  rnf t `seq` print ""
  bench (subst u t)

I got numbers I didn't expect, so I ran the following program:

  let t = genterm
  print t
  bench (subst u t)

and then I got different numbers! They were closer to what I think they should be, so I may be happy with them, but all of this seems to indicate that rnf doesn't behave as intended.

Then I tried something different: I wrote two programs (the second sketch in the P.S. spells this out). The first one prints the result of (subst u t) and then benchmarks printing the already-evaluated result:

  let t = genterm
  let x = subst u t
  print x
  bench (print x)

I recorded its numbers and then ran the second program:

  let t = genterm
  bench (print (subst u t))

got its numbers, and subtracted the former from the latter. By doing so, I'm sure that I get realistic numbers, at least: since I print the whole resulting term, I have a visual "proof" that it has been evaluated. But this is not very satisfactory.

Does anyone have an idea why calling rnf before the bench doesn't seem to "cache" the result the way calling show does? (My NFData instances follow the scheme described in the strictbench documentation; see the first sketch in the P.S.) If not, do you think that measuring (computation + pretty-printing time) - (pretty-printing time) is OK?

Regards,
Paul
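
P.S. To make the first set-up concrete, here is a minimal, self-contained sketch. Term, genterm and the NFData instance are placeholder stand-ins, not my actual library; the instance just forces every field component-wise, following the scheme the strictbench documentation describes, and I use evaluate from Control.Exception to tie the forcing to the IO sequence:

  import Control.DeepSeq (NFData (..))
  import Control.Exception (evaluate)

  -- Placeholder term type standing in for the library's representation.
  data Term = Var String
            | Lam String Term
            | App Term Term
    deriving Show

  -- Component-wise instance: rnf forces every field, so rnf walks the
  -- whole term and leaves it in normal form.
  instance NFData Term where
    rnf (Var v)   = rnf v
    rnf (Lam v b) = rnf v `seq` rnf b
    rnf (App f a) = rnf f `seq` rnf a

  -- Stand-in generator: \x. ((...(x x)...) x), a big left spine of
  -- applications.
  genterm :: Term
  genterm = Lam "x" (go (100000 :: Int))
    where
      go 0 = Var "x"
      go n = App (go (n - 1)) (Var "x")

  main :: IO ()
  main = do
    let t = genterm
    -- Force t to normal form before any timing starts; evaluate makes
    -- the forcing an IO action of its own, instead of hanging it off an
    -- unrelated print with seq.
    _ <- evaluate (rnf t)
    putStrLn "t forced"
    -- ... benchmark of (subst u t) would go here ...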
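
And here is a sketch of the two-program subtraction measurement, reusing Term and genterm from the sketch above. subst is a naive capture-unaware substitution that merely stands in for the operation under test, and I time with getCPUTime instead of benchpress only to keep the sketch dependency-free; main1 and main2 are built as two separate executables (e.g. with ghc -O -main-is main1) so that GHC cannot share the substituted term between the two measurements:

  import System.CPUTime (getCPUTime)
  import Text.Printf (printf)

  -- Naive, capture-unaware substitution of u for the variable v; a
  -- stand-in for the real operation being benchmarked.
  subst :: Term -> String -> Term -> Term
  subst u v (Var w)   | w == v    = u
                      | otherwise = Var w
  subst u v (Lam w b) | w == v    = Lam w b
                      | otherwise = Lam w (subst u v b)
  subst u v (App f a) = App (subst u v f) (subst u v a)

  -- CPU time of an IO action, in seconds (getCPUTime is in picoseconds).
  timeIO :: IO a -> IO Double
  timeIO act = do
    start <- getCPUTime
    _     <- act
    end   <- getCPUTime
    return (fromIntegral (end - start) / 1e12)

  -- Program 1: print the fully evaluated result once, then time a second
  -- print; this measures pretty-printing alone.
  main1 :: IO ()
  main1 = do
    let t = genterm
        x = subst (Var "y") "x" t
    print x
    d <- timeIO (print x)
    printf "printing alone: %.3fs\n" d

  -- Program 2: time substitution and printing together; subtracting
  -- main1's number from this one estimates the substitution itself.
  main2 :: IO ()
  main2 = do
    let t = genterm
    d <- timeIO (print (subst (Var "y") "x" t))
    printf "subst + printing: %.3fs\n" d

(A single run and CPU time are simplifications of the sketch; the real measurements go through benchpress.)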