
Niklas Hambüchen
writes: On 13/10/13 21:42, AntC wrote: ... If you use the Set library, that fact may be very visible! Because Set re-sequences the whole list, as per its Ord instance.
But List.nub preserves the list sequence (except for omitting duplicates).
I mean *exactly* what you say here.
ordNub behaves has the same behaviour as nub, while (Set.toList . Set.fromList) doesn't.
That's great, thank you.
[BTW I am still less than convinced that overall a Set-based ordNub is significantly more efficient. I suspect it depends on how big is your list.]
What do you mean?
ordNub is clearly in a different complexity class, ...
Yes, I'm not disputing that.
... and the benchmarks that I provided show not only this, but also that ordNub is *always* faster than nub, even for singleton lists.
Thanks Niklas, I hadn't spotted those benchmarks back in July. I'm surprised at that result for singletons (and for very small numbers of elements which are in fact each different). Especially because List's `nub` uses `filter` ==> fold, which should be tail-recursive. It seems to me that for small numbers, the Set-based approach still requires comparing each element to each other. Plus there's the overhead for building the Set and inserting each element into it -- where `insert` again walks the Set to find the insertion point. Then here's a further possible optimisation, instead of making separate calls to `member` and `insert`: * Make a single call to insert' :: (Ord a) => a -> Set a -> (Bool, Set a) * The Bool returns True if already a member. * Else returns an updated Set in the snd, with the element inserted.