Re: [Haskell-cafe] How to correctly benchmark code with Criterion?

18 Oct 2012

On 18 October 2012 13:15, Janek S. wrote:
...
...
Something like this might work, not sure what the canonical way is.
(...)
This is basically the same as the answer I was given on SO. My concerns about this solution are:
- rnf requires its parameter to belong to NFData type class. This is not the case for some data
structures like Repa arrays.
For unboxed arrays of primitive types WHNF = NF.  That is, once the
array is constructed all its elements will be in WHNF.
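A small sketch of the difference, assuming the boxed and unboxed arrays from the "array" package rather than Repa (the names boxed, unboxed and hitsBottom are made up for this illustration). An undefined element is only hit when the array is evaluated to WHNF if the array is unboxed:

```haskell
import Control.Exception (SomeException, evaluate, try)
import Data.Array (Array)
import Data.Array.Unboxed (UArray, listArray)

-- Boxed array: elements are stored as thunks, so WHNF does not touch them.
boxed :: Array Int Int
boxed = listArray (0, 2) [1, undefined, 3]

-- Unboxed array: elements are stored as raw values, so constructing
-- the array has to evaluate every element.
unboxed :: UArray Int Int
unboxed = listArray (0, 2) [1, undefined, 3]

-- True if evaluating the argument to WHNF throws an exception.
hitsBottom :: a -> IO Bool
hitsBottom x =
  either (\e -> const True (e :: SomeException)) (const False)
    <$> try (evaluate x)

main :: IO ()
main = do
  hitsBottom boxed   >>= print  -- elements left unevaluated
  hitsBottom unboxed >>= print  -- elements forced during construction
```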
...
- evaluate only evaluates its argument to WHNF - is this enough? If I have a tuple containing two
lists won't this only evaluate the tuple constructor and leave the lists as thunks? This is
actually the case in my code.
That is why you use "rnf" from the NFData type class. You use
"evaluate" to kick-start rnf which then goes ahead and evaluates
everything (assuming the NFData instance has been defined correctly.)
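As a minimal sketch of that combination (mkInput is a hypothetical stand-in for the real data builder, not the code from the original question): "rnf input" is a unit value whose evaluation walks the whole structure, and "evaluate" makes sure that walk happens at this point in the IO sequence.

```haskell
import Control.DeepSeq (rnf)
import Control.Exception (evaluate)

-- Hypothetical input: a tuple of two lists, as in the question.
mkInput :: Int -> ([Int], [Int])
mkInput n = (take 6 [n ..], take 2048 [n * 2 ..])

main :: IO ()
main = do
  let input = mkInput 1
  -- evaluate alone would stop at the tuple constructor (WHNF);
  -- going through rnf forces both lists and all their elements.
  evaluate (rnf input)
  print (length (fst input), length (snd input))
```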
...
As I said previously, it seems that Criterion somehow evaluates the data so that time needed for
its creation is not included in the benchmark. I modified my dataBuild function to look like this:
dataBuild gen = unsafePerformIO $ do
    let x = (take 6 $ randoms gen, take 2048 $ randoms gen)
    threadDelay 1000000
    return x
When I ran the benchmark, criterion estimated the time needed to complete it to be over 100 seconds
(which means that threadDelay worked and was used as a basis for estimation), but the benchmark
finished much faster and there was no difference in the final result compared to the normal
dataBuild function. This suggests that once data was created and used for estimation, the
dataBuild function was not used again. The main question is: is this observation correct? In this
question on SO:
http://stackoverflow.com/questions/6637968/how-to-use-criterion-to-measure-p...
one of the answers says that there is no automatic memoization, while it looks as if in fact the
values of dataBuild are memoized. I have a feeling that I am misunderstanding something.
If you bind an expression to a variable and then reuse that variable,
the expression is only evaluated once. That is, in "let x = expr in
..." the expression is only evaluated once. However, if you have "f y
= let x = expr in ..." then the expression is evaluated once per
function call.
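A rough illustration of the two cases, using Debug.Trace to make evaluation visible on stderr (shared and perCall are names invented for this sketch). Note that with optimisation enabled GHC may float a binding out of a function when it does not depend on the argument, so here the inner binding deliberately depends on y:

```haskell
import Debug.Trace (trace)

-- Bound once at the top level: evaluated at most once, then reused.
shared :: Int
shared = trace "evaluating shared" (sum [1 .. 1000])

-- Bound inside a function body: evaluated again on every call,
-- although x is still shared between its two uses within one call.
perCall :: Int -> Int
perCall y =
  let x = trace "evaluating x in perCall" (sum [1 .. y])
  in x + x

main :: IO ()
main = do
  print (shared + shared)  -- the trace message for shared appears once
  print (perCall 10)       -- the trace message for x appears here ...
  print (perCall 20)       -- ... and again here
```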
...
...
I don't know if you have already read them,
but Tibell's slides on High Performance Haskell are pretty good:
http://www.slideshare.net/tibbe/highperformance-haskell
There is a section at the end where he runs several tests using Criterion.
I skimmed the slides and slide 59 seems to show that my concerns regarding WHNF might be true.
It's usually safe if you benchmark a function. However, you most
likely want the result to be in normal form.  The "nf" function does this for
you. So, if your benchmark function has type "f :: X -> ([Double],
Double)", your benchmark will be:

  bench "f" (nf f input)

The first run will evaluate the input (and discard the runtime) and
all subsequent runs will evaluate the result to normal form. For repa
you can use deepSeqArray [1] if your array is not unboxed:

  bench "f'" (whnf (\x -> f x `deepSeqArray` ()) input)

[1]: http://hackage.haskell.org/packages/archive/repa/3.2.2.2/doc/html/Data-Array...
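Put together, a complete benchmark file might look roughly like the following. This is only a sketch: f is a made-up function with the type mentioned above, and the input is a plain list rather than the random data from the original question. The "whnf" variant is included only for comparison, since it stops at the tuple constructor and so leaves most of the work undone.

```haskell
import Control.DeepSeq (rnf)
import Control.Exception (evaluate)
import Criterion.Main (bench, defaultMain, nf, whnf)

-- Hypothetical function under test.
f :: [Double] -> ([Double], Double)
f xs = (map (* 2) xs, sum xs)

main :: IO ()
main = do
  let input = [1 .. 2048] :: [Double]
  -- Force the input up front so building it is not part of the measurement.
  evaluate (rnf input)
  defaultMain
    [ bench "f (nf)"   (nf f input)    -- forces both tuple components fully
    , bench "f (whnf)" (whnf f input)  -- forces only the tuple constructor
    ]
```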

Thomas Schilling