
I had an idea, and it's kind of complicated, so I was wondering if I could get a sanity check from someone before putting *too* much time into it. Specifically, I was thinking about a sort of incremental sorting of sequences (in Data.Sequence). The idea is very closely related to the incremental sorting described by Navarro and Paredes ( http://www.dcc.uchile.cl/~gnavarro/ps/algor09.pdf ). The basic concept: sequences are represented as Hinze-Paterson 2-3 finger trees. They are fairly lazy, so building them from the top down tends to be a good thing. Suppose I have a sequence to be sorted. To start building the result sequence, I need to know which elements go in the first digit and which go in the last digit. To do this, I need two order statistics, starting with one separating the first digit from the rest, then one separating the last digit from the rest. Once I have separated these, I can produce the top of the sequence. The digit separation is the same all down the spine. Within digits, 2-3 trees split similarly. I can get the order statistics using median-of-medians. As Navarro and Paredes (and presumably others) describe, median-of-medians partially sorts the sequence in the process of finding the desired order statistic. Navarro and Paredes use a stack of indices into an array to take advantage of this, but I need things rather more incremental. What I was thinking is that I could use a finger tree of sequences (a sequence of sequences making it easy to find where a particular total length is reached) to separate the different segments from each other. Each time I need to split on a particular order statistic, I can isolate the necessary segment, use median-of-medians (splitting it into pieces), and then append the pieces onto either side as appropriate. Does anyone have a sense of 1. Whether this is sane and/or 2. What the chances are that its performance will be competitive with either of the current Data.Sequence sorting functions and/or 3. How I should actually implement median-of-medians for this particular application (as there seem to be very few Haskell implementations of the algorithm, and I've found no optimized ones)? Thanks, David Feuer