Are you really sure that your results are correct? I obviously did all my tests with -O2 on.
Please rerun your tests on the new "framework". Also note that this one contains tests for three cases:
- sorted
- revsorted
- randomized
From previous results I can guess that with randomized input Yhc's code will be ~3 times slower then the fastest quicksort out there.
But I'm not going to run O(n^2) algorithm to compare with O(n log n) - and this is the case for (rev?)sorted inputs.


Christopher Skrzêtnicki

On Tue, Mar 11, 2008 at 5:14 AM, Richard A. O'Keefe <ok@cs.otago.ac.nz> wrote:

On 11 Mar 2008, at 12:27 pm, Krzysztof Skrzêtnicki wrote:

> I've written little framework to work on. See sortbench.hs and
> sortbench.py attachments.
> Furthermore, I checked Yhc's implementation of sort: it is indeed
> very fast:

I took his earlier code and plugged yhc's sort into it.
Compiling with ghc -O2 using GHC 6.8.2, I found the yhc code (basically
variant of natural merge) to be considerably slower than some of the
alternatives.
(...)
When I run Krzystztof's test harness (which I have currently brought
up to
25 different sorting functions) with a list of the form [1..N] instead
of
a random list, suddenly all the variants of merge sort come out ahead of
all the variants of quick sort.  In fact his best version of quicksort,
qsort_iv, comes out fully 1155 times slower than the YHC algorithm on a
list of 10,000 ordered integers.  That can be improved by spending a bit
of effort on choosing a good pivot, but then the quicksort algorithms
are
no longer so competitive for randomised inputs.

This paper looks interesting, I'm going to implement those tests.

The classic "Engineering a Quicksort" paper by Bentley and McIlroy from
Software : Practice & Experience recommends a whole bunch of
distribution
shapes (one run, two runs, sawtooth, organ pipes, random, ...) that
should
be benchmarked before drawing too many conclusions.

This is the right point. A further work will be to add different input generators to run them too.


It is also wise to try more than one data type.  How do the different
algorithms compare on random samples from a Scrabble dictionary?  (Why
that particular question?  Because I mean to try it.)