
Hi
I was looking at the definitions of nub (and hence nubBy) recently in connection with a trac ticket and realised that this is O(n^2) complexity! Ouch!
Complexity theory plus the Eq context makes that inevitable. You can't have nub over Eq in anything less than O(n^2)
I was going to say I sure hope nobody uses this in real programs, but I thought I'd better check first and I see that darcs seems to use it in a few places. Hmm..
I use it all the time, its dead handy. There are 12 instances in Hoogle, for example. If profiling later shows it to be a problem, I'd fix it - but I can't ever actually remember that being the case. Premature optimisation is the root of all evil.
Seeing as practically all Eq instances are also Ord instances, at the very least we could have O(n*(log n)) definitions for ..
nub :: Ord a => [a] -> [a] nub = nubBy compare
nubBy :: (a -> a -> Ordering) -> [a] -> [a] nubBy cmp xs ys = -- something using an AVL tree perhaps.
..or even better using the generalised trie stuff Jamie Brandon has been working on.
Having nubOrd is a great idea, I even proposed it previously, but people disagreed with me. Propose it and add it by all means.
So could nub and nubBy be deprecated and an appropriate health warning added to the Haddock?
No. They should say O(n^2) in the haddock documentation, but O(n^2) /= useless. Thanks Neil