
Matthias Görgens wrote:
Thanks. I heard about the hylo-, ana- and catamorphisms before, but never explicitly used them. Time to get started.
You did use them explicitly :) , namely in treeSort = bootstrap partitionOnMedian bootstrap f = Fix . helper . f where helper = fmap (Fix . helper . f) partitionOnMedian :: (Ord a) => (S.Seq a) -> BTreeRaw a (S.Seq a) since bootstrap f = ana f In particular, we have this slight reformulation bootstrap f = helper' where helper = fmap helper' helper' = Fix . helper . f and hence bootstrap f = helper' where helper' = Fix . fmap helper' . f and thus bootstrap f = Fix . fmap (bootstrap f) . f which is the definition of ana f .
And yet another question: One can get the median in deterministic linear time. For quicksort choosing the median as pivot keeps the O(n log n) average running time and brings down the expected worst case, too. Do you know of a (natural) way to combine selecting the median and doing the quicksort, so that you don't compare unnecessarily?
The way to de-randomize quickselect is to calculate medians of medians. I.e. solve the problem for smaller instances first. I suspect if we follow this strategy with the reified quicksort call-trees, the de-randomized quicksort will look a lot like mergesort.
Interesting question! (Of which I don't know the answer of. ;) ) I never understood the median of median method because I always thought that it calculates an approximate median of approximate medians, which I now realize is not the case. I concur that there should be something better than "recursively wasting" comparisons in quickselect just for finding the median pivot in quicksort, especially since both are the same algorithm modulo lazy evaluation. Concerning quicksort looking like mergesort, it seems like a good idea to somehow merge the groups of 5 from the median-of-medians calculation. However, there is an important difference between quicksort and mergesort, namely lazy quicksort is able to produce the batch of the first k smallest elements in O(n + k log k) total time [1] while lazy mergesort needs O(n + k log n) total time [2] There is something about quicksort that makes it "better" than mergesort. [1] Proof goes like this: First, quicksort chooses a sublist of smallest items recursively until a list of the smallest k items remains, this takes O(n) time. Then the list consisting of the smallest k items is sorted in Θ(k log k). [2] Mergesort builds a tournament tree in Θ(n) time and then performs k tournaments that take O(log n) time each, so the the second phase is O(k log n). I need to think about this. Regards, apfelmus -- http://apfelmus.nfshost.com