
Jonathan Cast wrote:
On 10 Mar 2008, at 4:00 AM, Adrian Hey wrote:
Neil Mitchell wrote:
2) What does it do with duplicate elements in the list? I expect it deletes them. To avoid this, you'd need to use something like fromListWith, keeping track of how many duplicates there are, and expanding at the end. That would be wrong. Consider: data Foo = Foo Int Int instance Ord Foo where compare (Foo a _) (Foo b _) = compare a b
I would consider such an Ord instance to be hopelessly broken, and I don't think it's the responsibility of authors of functions with Ord constraints in their sigs (such as sort) to consider such possibilities or specify and control the behaviour of their behaviour for such instances. Trying to do this is what leads to horrors such as the "left biasing" of Data.Map (for example).
Data.Map is implicitly using such an Ord instance behind the scenes, and I think it has to to maintain its own invariants. If I take the `union' of two maps that take the same key to different values, I have to decide which value to use, even if every Ord instance supplied by my clients is 100% Adrian-compliant.
The biasing policy for Data.Map/Set is refering to (Set) elements, or (Map) keys, not the associated values (in a Map). So during an insertion op, if an "equal" element/key is found the Set/Map the biasing policy tells me which of the two "equal" elements/keys in incorporated in the resulting Set/Map. So there's an implied acceptance of the posibility that the choice is significant and that the two elements/keys can be both "equal" and "not equal" at the same time. This is crazy IMO. Implementors should not have to offer an guarantees about this and should be perfectly free to make their own unspecified choice regarding which of two "equal" values is used in any expression (on space efficiency grounds say). For example, the left biasing of insertion on Data.Set forces the implementation to burn O(log n) heap space creating a new "equal" set, even if the set already contains an old element that is "equal" to the new element. I would argue that in this situation it should be perfectly correct to simply return the old set instead. Regards -- Adrian Hey