
Dear list,

during the past few days I have spent a lot of time trying to figure out how to write Criterion benchmarks so that the results don't get skewed by lazy evaluation. I want to benchmark different versions of an algorithm doing numerical computations on a vector. For that I need to create an input vector containing a few thousand elements. I decided to create random data, but that really doesn't matter - I could just as well have used infinite lists instead of random ones.

My problem is that I am not certain whether I am creating my benchmark correctly. I wrote a function that creates the data like this:

  dataBuild :: RandomGen g => g -> ([Double], [Double])
  dataBuild gen = (take 6 $ randoms gen, take 2048 $ randoms gen)

And I create the benchmark like this:

  bench "Lists" $ nf L.benchThisFunction (L.dataBuild gen)

The question is how to generate the data so that its evaluation is not included in the benchmark. I already asked this question on StackOverflow ( http://stackoverflow.com/questions/12896235/how-to-create-data-for-criterion... ) and got an answer suggesting evaluate + force. After spending a day testing this approach I came to the conclusion that it does not seem to influence the results of the benchmark in any way (I tried things like unsafePerformIO + threadDelay). On the other hand, I looked into the sources of criterion and I see that the benchmarked code is run like this:

  evaluate (rnf (f x))

I am a Haskell newbie and perhaps don't interpret this correctly, but to me it looks as though criterion does not evaluate the possibly unevaluated parameter x before running the benchmark, but instead evaluates only the final result. Can someone explain how exactly this works and how I should write my benchmarks so that the results are correct?

Janek
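
For concreteness, here is a minimal sketch of the evaluate + force approach suggested on StackOverflow, assuming the criterion, deepseq and random packages; benchThisFunction below is a hypothetical stand-in for the real function under test, not the actual code from the question:

  import Control.DeepSeq (force)
  import Control.Exception (evaluate)
  import Criterion.Main (bench, defaultMain, nf)
  import System.Random (RandomGen, newStdGen, randoms)

  dataBuild :: RandomGen g => g -> ([Double], [Double])
  dataBuild gen = (take 6 $ randoms gen, take 2048 $ randoms gen)

  -- hypothetical stand-in for the real function being benchmarked
  benchThisFunction :: ([Double], [Double]) -> Double
  benchThisFunction (xs, ys) = sum xs + sum ys

  main :: IO ()
  main = do
    gen <- newStdGen
    -- force the input to normal form in IO before handing it to criterion,
    -- so that building the lists happens before timing starts
    input <- evaluate (force (dataBuild gen))
    defaultMain [ bench "Lists" $ nf benchThisFunction input ]

The idea of the sketch is that evaluate (force ...) runs before defaultMain, so the lists are fully built up front, and the same already-evaluated input is shared across every iteration that criterion times.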