
On Tue, Sep 20, 2011 at 6:05 PM, Chris Smith wrote:
> There's nothing *wrong* with pragmatism, but in any case, we seem to agree on this. As I said earlier, we ought to impose a (rather arbitrary) total order on Float and Double, and then offer comparison with IEEE semantics as a separate set of functions when they are needed. (I wonder if OCaml-style (<.) and (>.) and such are used anywhere.)
I think the only point of disagreement here is that I'm advocating the introduction of a partial ordering class (for which floating point values could be given a proper instance according to IEEE semantics) rather than treating floats as a special case. I would prefer going a step further and having two distinct total order classes, to distinguish meaningful total orders from nonsensical ones like an arbitrary order imposed on Float and Double, but perhaps that seems excessive to other people.
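To make that concrete, here's a minimal sketch of what such a partial ordering class might look like. The names (PartialOrdering, PartialOrd, pcompare, PNC) are mine and purely illustrative, not a concrete proposal:

    data PartialOrdering = PLT | PEQ | PGT | PNC  -- PNC: not comparable
        deriving (Eq, Show)

    class PartialOrd a where
        pcompare :: a -> a -> PartialOrdering

    -- Per IEEE semantics, NaN is ordered with respect to nothing,
    -- itself included, so any comparison involving it yields PNC.
    instance PartialOrd Double where
        pcompare x y
            | isNaN x || isNaN y = PNC
            | x < y              = PLT
            | x > y              = PGT
            | otherwise          = PEQ

A lawful total order class would then sit alongside this, with the arbitrary Float/Double order living only in whichever class admits it honestly.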
> It's clear to me that Enum for Float means something coherent. If you're looking for a meaning independent of the instance, I'd argue you ought to be surprised if you find one, not the other way around. Why not look for a meaning for Monoid that's independent of the instance? There isn't one; instead, there are some rules that the instance is expected to satisfy, but there are plenty of types that have many possible Monoid instances, and we pick one and leave you to use newtypes if you wanted a different one.
I have to disagree here. Monoid has a very clear, narrow, type-independent meaning: the eponymous algebraic structure. The minimal definition of the class is a value and a binary operation; this is a very small interface, and the laws expected of an instance nearly exhaust the properties of these definitions, either by specifying behavior (e.g., associativity) or by deliberately not specifying it (is the binary operation commutative? not in general, but it could be). Simply by satisfying the type signature, any instance is going to at least vaguely resemble a valid one, and checking the laws is straightforward.

On the other hand, Enum has conversions to and from Int and a host of interdefined operations with, at best, loose guidelines for how they should behave. Does "toEnum . fromEnum = id" hold? Not in general. Does "succ . fromEnum = fromEnum . succ" hold? Probably not. I think. What do enumFrom, enumFromThen, &c. mean? Whatever the instance author thought made sense, I suppose, since they're only defined as "what list range syntax desugars to". For types that also have a Bounded instance there are further requirements, mostly relating to where runtime errors should be produced (gosh, that helps). (A couple of these failures are demonstrated below.)

Consider this: how many Enum instances do you think override the default definitions, not for efficiency, but in ways that give different results? How many Monoid instances do you think override mconcat in a way that gives a different answer than "foldr mappend mempty"?

Here's a thought experiment. Imagine that, instead of Monoid, we had a type class called "Summarize" used mostly to desugar some sort of built-in summation syntax. The main function used is "summarize :: (Summarize a) => [a] -> a", the class is described as a generalized "sum", and the motivating examples are all independent of the order of elements in the list (because addition is commutative, right?). But nowhere is it specified what the behavior of the class should be, other than that it desugars the syntax in some way that presumably makes sense. It's not required that "summarize []" produce an identity value, it's not required that summarizing repeatedly be associative, it's not required that reordering the list give the same summary, and so on. Most instances do have all these properties, of course, but then someone writes a library with an extremely non-commutative instance of Summarize, we get a -cafe thread complaining about it, and then I write a very long and tedious message all about how Summarize is underspecified, has no clear meaning, and should probably be explicitly defined as some sort of monoid, either commutative or more general. But I digress.

The ambiguity of Monoid is purely that many types have multiple ways to fulfill the very precise requirements of the class. The ambiguity of Enum is that it isn't clear what, if anything, the requirements even are, and nothing rules out a wide variety of equally valid instances other than a vague notion of which one "makes sense", a point on which reasonable people may disagree! Possibly a better example would be MonadPlus, for which (if memory serves me) there's similar ambiguity about the laws an instance should follow, with inconsistency even in the standard library as to which interpretation is chosen, resulting in actual confusion about what code should do.
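The Enum failures above aren't hypothetical, by the way; this is roughly how the standard instance for Double behaves (results from memory, easily re-checked in GHCi):

    -- fromEnum for Double truncates, so the round trip loses information:
    toEnum (fromEnum (3.7 :: Double)) :: Double
    -- => 3.0, not 3.7

    -- And truncation toward zero means succ fails to commute with fromEnum:
    fromEnum (succ (-0.5 :: Double))   -- => 0
    succ (fromEnum (-0.5 :: Double))   -- => 1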
> I'm not saying that Enum must be left exactly as is... but I *am* saying that the ability to use floating point types in list ranges is important enough to save. For all its faults, at least the current language can do that. When the solution to the corner cases is to remove a pervasive and extremely useful feature, I start to get worried!
I have no desire to remove useful features. What I don't like is when features behave inconsistently in unclear ways between two cases that I would expect to be equivalent; the more useful the feature is, the more troubling this becomes. At best, this makes generic functions defined on the class nearly useless, because you have no idea what they even mean out of context; at worst, it creates serious bugs due to invalid assumptions, as I think is demonstrated by the (blatantly incorrect) Ord instance for floats creating the illusion of data loss in standard data structures.

Given that a major purpose of Enum is to translate numeric ranges, the fact that it can behave dramatically differently for different numeric types strikes me as deeply problematic, and an endless source of bugs in potentia (both problems are demonstrated below). In fact, I would (and will, should the opportunity arise) actively advise people new to the language to avoid the list range syntax when floating point types are involved because of the pitfalls, or at least to use it only in the [x, y..] form.
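Here's a small sketch of both problems, assuming the standard Prelude and the containers package (the printed results are from memory):

    import qualified Data.Map as Map

    main :: IO ()
    main = do
        -- The same-looking range means different things at different types:
        print ([1, 3 .. 10] :: [Int])     -- [1,3,5,7,9]
        print ([1, 3 .. 10] :: [Double])  -- [1.0,3.0,5.0,7.0,9.0,11.0]

        -- The broken Ord instance (NaN compares GT to everything,
        -- itself included) makes Data.Map lose track of an entry:
        let nan = 0 / 0 :: Double
            m   = Map.insert nan "here" (Map.fromList [(1, "one"), (2, "two")])
        print (Map.size m)        -- 3: the entry was stored...
        print (Map.lookup nan m)  -- Nothing: ...but it can never be found

The Double range overshoots its upper bound because enumFromThenTo for fractional types runs to the limit plus half the increment, which is precisely the kind of per-instance folklore I'm complaining about.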
> Yes, I could see (somehow in small steps that preserve backward compatibility for reasonable periods) building some kind of clearer relationship between Ord, Enum, and Ix, possibly separating Enum from a new Range class that represents the desugaring of list ranges, or whatever... but this idea of "I don't think this expresses a deep underlying relationship independent of type, so let's just delete it without regard to how useful it is" is very short-sighted.
Having a deep underlying meaning for type classes isn't just for the sake of elegance; a well-defined, consistent meaning removes a great deal of cognitive load when working with code, because it narrows dramatically the context required to know what an expression means. Writ large, this is the principle behind equational reasoning and parametricity, which are the most powerful concepts available for reasoning about Haskell code. Type classes with unclear semantics undermine this, and while an Enum constraint may not be as nefarious as, say, Typeable would be, it's arguably closer to that than to something simple and coherent like Monoid or Functor.

Alas, these properties are as fragile as they are useful. Take the humble, harmless "show" function, for instance. One might occasionally think it would be handy to have an ambient implementation, allowing a value of any type to be converted to a string, even if only as a dummy value like "<<function>>". But allowing this without a Show constraint suffices to destroy the guarantees of parametricity, as surely as does any function with "unsafe" in its name! A terrible price for such a trifling convenience.

"Civilization advances by extending the number of operations which we can perform without thinking about them," as Whitehead said. Deep underlying meaning has a deep utility of its own, but only to the extent that it is kept absolute.

...and that is, at egregious length, why I find Enum dissatisfying.

- C.
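P.S. To illustrate the parametricity point: an unconstrained show can't actually be written in pure Haskell (which is exactly the point), so here is a sketch simulating one with a class of my own invention, to show what dropping the constraint would cost:

    -- Stand-in for a hypothetical "ambient show"; imagine it needed
    -- no constraint and worked at every type.
    class AmbientShow a where
        ambientShow :: a -> String

    instance AmbientShow Int where
        ambientShow = show

    -- Parametricity promises that any total function f :: a -> a is the
    -- identity. If ambientShow were constraint-free, this would have the
    -- type a -> a, yet its behavior depends on the value passed in:
    f :: AmbientShow a => a -> a
    f x = if ambientShow x == "42" then error "inspected!" else x

With the constraint, the type honestly advertises the inspection; without it, every free theorem about polymorphic functions quietly dies.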