
Er.. that last might be to hard. But Data.IntSet can support its own nub
version.
On Jan 1, 2015 8:42 AM, "David Feuer"
I think ordNub in Data.Set, hashNub in Data.HashSet or somewhere, and bitsNub in Data.Bits (for FiniteBits things). On Jul 14, 2013 7:21 AM, "Niklas Hambüchen"
wrote: tldr: nub is abnormally slow, we shouldn't use it, but we do.
As you might know, Data.List.nub is O(n²). (*)
As you might not know, almost *all* practical Haskell projects use it, and that in places where an Ord instance is given, e.g. happy, Xmonad, ghc-mod, Agda, darcs, QuickCheck, yesod, shake, Cabal, haddock, and 600 more (see https://github.com/nh2/haskell-ordnub).
I've taken the Ord-based O(n * log n) implementation from yi using a Set:
ordNub :: (Ord a) => [a] -> [a] ordNub l = go empty l where go _ [] = [] go s (x:xs) = if x `member` s then go s xs else x : go (insert x s) xs
and put benchmarks on
http://htmlpreview.github.io/?https://github.com/nh2/haskell-ordnub/blob/1f0... (compare `nub` vs `ordNub`).
`ordNub` is not only in a different complexity class, but even seems to perform better than nub for very small numbers of actually different list elements (that's the numbers before the benchmark names).
(The benchmark also shows some other potential problem: Using a state monad to keep the set instead of a function argument can be up to 20 times slower. Should that happen?)
What do you think about ordNub?
I've seen a proposal from 5 years ago about adding a *sort*Nub function started by Neil, but it just died.
(*) The mentioned complexity is for the (very common) worst case, in which the number of different elements in the list grows with the list (alias you don't have an N element list with always only 5 different things inside).
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe