On Wed, Feb 6, 2013 at 2:09 AM, Simon Marlow <marlowsd@gmail.com> wrote:
This is slightly off topic, but I wanted to plant this thought in people's brains: we shouldn't place much significance in the average of a bunch of benchmarks (even the geometric mean), because it assumes that the benchmarks have a sensible distribution, and we have no reason to expect that to be the case. For example, in the results above, we wouldn't expect a 14.7% reduction in runtime to be seen in a typical program.
Using the median might be slightly more useful, which here would be something around 0% for runtime, though still technically dodgy. When I get around to it I'll modify nofib-analyse to report medians instead of GMs.