
Hello All, I'm top posting because I'm getting bored and frustrated with this thread and I don't want to respond detail to everything Aaron has said below. Aaron: Where are you getting this equivalence stuff from? Searching the report for the word "equivalence" the only remotely relevant section seems to be in para. 17.6.. "When the “By” function replaces an Eq context by a binary predicate, the predicate is assumed to define an equivalence" Which is fair enough, but this is talking about the argument of "By" functions. The Haskell wiki refers me to wikipedia, which contains the words "In Haskell, a class Eq intended to contain types that admit equality would be declared in the following way" http://en.wikipedia.org/wiki/Type_class Not that this is necessarily authoritative, but it seems to be contaradicting some peoples interpretation. Also, on page 60 of the report I find the words "Even though the equality is taken at the list type.." So I don't know if all this is really is the correct reading of the report, but if so would like to appeal to movers and shakers in the language definition to please decide exactly what the proper interpretation of standard Eq and Ord "class laws" are and in the next version of the report give an explanation of these in plain English using terms that people without a Mathematics degree are likely to understand. Aaron's interpretation may indeed be very correct and precise, but I fear in reality this is just going to be incomprehensible to many programmers and a terrible source of bugs in "real world" code. I cite previous "left biasing" bugs Data.Map as evidence. I would ask for any correct Eq instance something like the law: (x==y) = True implies x=y (and vice-versa) which implies f x = f y for all definable f which implies (f x == f y) = True (for expression types which are instances of Eq). This pretty much requires structural equality for concrete types. For abstract types you can do something different provided functions which can give different answers for two "equal" arguments are not exposed. Anything else is just wrong (according to the language specification, even if it can be right in some mathematical sense). Before anyone jumps down my throat, I remind you that this is a request, not an assertion! :-) On the subject of ambiguity, bugs and maxima, I see we have in Data.List -- | 'maximum' returns the maximum value from a list, -- which must be non-empty, finite, and of an ordered type. -- It is a special case of 'Data.List.maximumBy', which allows the -- programmer to supply their own comparison function. maximum :: (Ord a) => [a] -> a maximum [] = errorEmptyList "maximum" maximum xs = foldl1 max xs -- | The 'maximumBy' function takes a comparison function and a list -- and returns the greatest element of the list by the comparison function. -- The list must be finite and non-empty. maximumBy :: (a -> a -> Ordering) -> [a] -> a maximumBy _ [] = error "List.maximumBy: empty list" maximumBy cmp xs = foldl1 max xs where max x y = case cmp x y of GT -> x _ -> y So I believe I'm correct in saying that maximumBy returns the last of several possible maximum elements of the list. This obviously needs specifying in the Haddock. Because maximumBy documentation is ambiguous in this respect, so is the overloaded maximum documentation. At least you can't figure it out from the Haddock. Despite this ambiguity, the statement that maximum is a special case of maximumBy is true *provided* max in the Ord instance is defined the way Aaron says is should be: (x==y = True) implies max x y = y. But it could be be made unconditionally true using.. maximum :: Ord a => [a] -> a maximum [] = error "List.maximum: empty list" maximum xs = maximumBy compare xs AFAICT, the original report authors either did not perceive an ambiguity in maximum, or just failed to notice and resolve it. If there is no ambiguity this could be for 2 reasons. 1 - It doesn't matter which maximum is returned because: (x==y) = True implies x=y 2 - It does matter, and the result is guaranteed to be the last maximum in all cases because: (x==y) = True implies max x y = y But without either of the above, it is unsafe to assume maximum = maximumBy compare Regarding the alleged "max law" this too is not mentioned in the Haddock for the Ord class, nor is it a "law" AFAICT from reading the report. The report (page 83) just says that the default methods are "reasonable", but presumably not manditory in any semantic sense. This interpretation also seems to be the intent of this from the second para. of Chapter 8: "The default method definitions, given with class declarations, constitute a specification only of the default method. They do not constitute a specification of the meaning of the method in all instances." I wouldn't dispute that the default definition is reasonable, but it's certainly not clear to me from the report that it's something that I can safely assume for all Ord instances. In fact AFAICS the report quite clearly telling me *not* to assume this. But I have to assume *something* for maximum to make sense, so I guess that must be: (x==y) = True implies x=y IOW "I have no idea if it's the first or last maximum that is returned, but who cares?" Again, the report doesn't make it clear that the (==) law above holds (at least not on page 82). But I think in the absence of any explicit statement to the contary I think most programmers would assume that it does apply. I think this is quite reasonable and I have no intention of changing my programming habits to cope with weird instances for which: (x == y) = True does not imply x=y or max x y is not safely interchangeble with max y x. I'm not saying some people are not right to want classes with more mathematically inspired "laws", but I see nothing in the report to indicate to me that Eq/Ord are those classes and consequently that the "naive" programmers interpretation of (==) is incorrect. Rather the contrary in fact. Regards -- Adrian Hey Aaron Denney wrote:
On 2008-03-12, Adrian Hey
wrote: Aaron Denney wrote:
On 2008-03-11, Adrian Hey
wrote: Having tried this approach myself too (with the clone) I can confirm that *this way lies madness*, so in future I will not be making any effort to define or respect "sane", unambiguous and stable behaviour for "insane" Eq/Ord instances for any lib I produce or hack on. Instead I will be aiming for correctness and optimal efficiency on the assumption that Eq and Ord instances are sensible. Good. But sensible only means that the Eq and Ord instances agree, not that x == y => f x == f y. So can I assume that max x y = max y x?
No. You can, however, assume that max x y == max y x. (Okay, this fails on Doubles, because of NaN. I'll agree that the Eq and Ord instances for Double are not sane.)
If not, how can I tell if I've made the correct choice of argument order.
When calling, or when defining max?
It depends on what types you're using, and which equivalence and ordering relations are being used.
When calling, and when it might matter which representative of an equivalence class comes back out (such as in sorts) you have no choice but to look at the documentation or implementation of max.
The Haskell report guarantees that x == y => max x y = y (and hence max y x = x), and the opposite choice for min. This is to ensure that (min x y, max x y) = (x,y) or (y,x). IOW, the report notices that choice of representatives for equivalence classes matters in some circumstances, and makes it easy to do the right thing. This supports the reading that Eq a is not an absolute equality relation, but an equivalence relation.
If I can't tell then I guess I have no alternative but document my arbitrary choice in the Haddock, and probably for the (sake of completeness) provide 2 or more alternative definitions of the "same" function which use a different argument order.
When defining max, yes, you must take care to make sure it useable for cases when Eq is an equivalence relation, rather than equality.
If you're writing library code, then it won't generally know whether Eq means true equality rather than equivalence. If this would let you optimize things, you need some other way to communicate this.
The common typeclasses are for generic, parameterizable polymorphism. Equivalence is a more generally useful notion than equality, so that's what I want captured by the Eq typeclass.
And no, an overloaded sort doesn't belong in Ord, either. If the implementation is truly dependent on the types in non-trivial, non-susbstitutable ways (i.e. beyond a substition of what <= means), then they should be /different/ algorithms.
It would be possible to right an "Equal a" typeclass, which does guarantee actual observable equality (but has no methods). Then you can write one equalSort (or whatever) of type equalSort :: (Eq a, Ord a, Equal a) => [a] -> [a] that will work on any type willing to guarantee this, but rightly fail on types that only define an equivalence relation.
A stable sort is more generally useful than an unstable one. It's composable for radix sorting on compound structures, etc. Hence we want to keep this guarantee.