
2008/8/26 Adrian Hey
Seeing as practically all Eq instances are also Ord instances, at the very least we could have O(n*(log n)) definitions for ..
nub :: Ord a => [a] -> [a] nub = nubBy compare
nubBy :: (a -> a -> Ordering) -> [a] -> [a] nubBy cmp xs ys = -- something using an AVL tree perhaps.
While I agree that it would be handy to have Ord-specific versions of these, I'd just like to reiterate/expand on something which Neil mentioned: map head . group . sort has the additional effect of: 1) Demanding the entire input list before it can produce even a single element, and 2) Sorting the result rather than keeping things in the order they first occurred in the input. A correct implementation of nub which made use of Ord would maintain (say) a Data.Set of elements already seen, as it traversed the list lazily, producing elements in the output as soon as new elements were seen in the input, and no later. This of course guarantees that you return them in the order that they're first seen as well. This is still O(n log n), but reduces correctly to O(k log k) when only k elements of the input are needed to get the desired number of elements of the resulting list. Please don't make nub any stricter! On a side note related to the request for inclusion of complexities, since Haskell is a lazy language, we really ought to have complexities written in terms of the demanded portion of the result. Of course, Data.Set and Data.Map are (structure) strict, so it doesn't affect them so much, but it would certainly be nice for the functions in Data.List to know the answer to "if If the input is size n and I only demand k of the elements of the result, then what is the complexity?", especially for things like sort, where a lazy implementation can, for instance, make the head of the list available in just O(n) time. - Cale