I was thinking it should offer a randomized version (taking a generator), since randomized median algorithms provide the best expected performance. It could also offer a deterministic version using some variant of median-of-medians, intended for long lists. I guess it probably should retain the naive version for short lists. Some benchmarking would suggest a good cutoff. Has anyone come up with a better practical deterministic O(n) algorithm since median-of-medians? I saw a paper by Dorit Dor on reducing the number of comparisons to a bit under 3n, which also showed a lower bound of a bit over 2n, but the algorithm she gives strikes me as far too complex to be practical.
In my experience, doing much better than the naive algorithm for median is surprisingly hard, and involves a choice from a range of trade-offs. Did you have a particular better algorithm in mind?
If you did, you could write it, and contact the package author with a patch.
You also may be able to find something of use in Edward Kmett's order-statistics package: http://hackage.haskell.org/package/order-statistics
Cheers,
Gershom
On 9/1/12 3:26 PM, David Feuer wrote:
The median function in the hstats package uses a naive O(n log n)
algorithm. Is there another package providing an O(n) option? If not,
what would it take to get the package upgraded?
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe