
On Wed, Feb 24, 2010 at 4:39 PM, Barton C Massey
I think in the end everyone agreed that putting nubOrd in Data.Set (and nubInt in Data.IntSet?) would be a good thing. I just never got it done; I was burned out by the process, and I really struggled with getting Darcs to produce a reasonable-sized patch that could be submitted.
That sounds odd. You ask for help in #darcs?
I'll pick it up again if folks want, or someone else can. I've attached a nubOrd patch I had lying around; it's against ancient libraries and isn't even a context diff, but it should be sufficient to reconstruct what needs to be done. It includes QuickCheck tests and benchmarked as quite fast IIRC.
Some of the false positives in sort.txt, BTW, will be because the values often need to be sorted anyway. In this case, the speed win of nubOrd over (map head . group . sort) isn't so clear---it probably depends on the degree of replication in the input list. Somebody should do some benchmarking, probably. On the other hand, nubOrd preserves the input ordering, which is occasionally handy.
Looking over the various versions of 'nub' I've collected and my notes, I wonder if we might not be best off with 3 nubs: the current stable Eq, a stable Ord (using Sets), and an unstable Ord.
I have to wonder if some sort of nubOrdWith that uses a specified equality test wouldn't be a good thing also,
Hm. Seems kind of redundant with the Eq typeclass, and I don't think I saw any uses which were equivalent to nubOrdWith.
but I hate to spend any more time beating on this proposal.
Bart Massey bart@cs.pdx.edu
It just needs someone to take the time to put together a criterion benchmark module and figure out which of the bikesheds are best. It only took a year or so for Control.Monad.void, so no reason we couldn't have faster nubs in the 2011 GHC, -- gwern