
Hello Folks,

I was looking at the definitions of nub (and hence nubBy) recently in connection with a trac ticket and realised that they have O(n^2) complexity! Ouch! I was going to say I sure hope nobody uses this in real programs, but I thought I'd better check first, and I see that darcs seems to use it in a few places. Hmm..

How did we ever get stuff like this into the standard libs? I can only imagine it's a relic from the days when Haskell was used for research or pedagogical purposes only (not for real programs). Seeing as practically all Eq instances are also Ord instances, at the very least we could have O(n*(log n)) definitions like..

 nub :: Ord a => [a] -> [a]
 nub = nubBy compare

 nubBy :: (a -> a -> Ordering) -> [a] -> [a]
 nubBy cmp xs = -- something using an AVL tree perhaps

..or even better, using the generalised trie stuff Jamie Brandon has been working on.

Of course I'm not actually advocating this, as it's probably bad policy to have a simple standard lib depend on any complex non-standard lib. But if it just isn't possible to implement some functions with reasonable efficiency using simple definitions only, then I think they really shouldn't be there at all. Instead we should make users "roll their own" using whatever complex non-standard lib they want.

So could nub and nubBy be deprecated and an appropriate health warning added to the Haddock? In the meantime I think I'll put appropriate alternative definitions in the next AVL release, and ask Jamie Brandon to consider doing the same in his generalised tries lib (GMap).

Also, looking at a few other functions there, like union(By) and intersection(By), makes me quite suspicious. Maybe we need a thorough performance audit to get rid of implementations that are so insanely inefficient they should *never* be used.

Regards
--
Adrian Hey
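[Editor's note: for concreteness, the O(n*(log n)) alternative discussed above can be sketched with Data.Set from the containers package. The name ordNub is made up here for illustration; it is not part of the standard libs, and this is just one possible implementation, not the AVL- or trie-based one the post proposes.]

```haskell
import qualified Data.Set as Set

-- A sketch of O(n*(log n)) duplicate removal for Ord types.
-- Like nub, it keeps the first occurrence of each element and
-- works lazily on the input list; unlike nub, each element costs
-- O(log n) via a Set of elements seen so far, not O(n).
-- (ordNub is a hypothetical name, not a standard library function.)
ordNub :: Ord a => [a] -> [a]
ordNub = go Set.empty
  where
    go _    []     = []
    go seen (x:xs)
      | x `Set.member` seen = go seen xs
      | otherwise           = x : go (Set.insert x seen) xs
```

Note that `Set.toList . Set.fromList` would also be O(n*(log n)) but sorts the output; the accumulator version above preserves the original order, matching nub's behaviour.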