instance Enum Double considered not entirely great?

older
Much faster complex monad stack...

Evan Laforge

20 Sep 2011 20 Sep '11

5:09 a.m.

So I've been excising occurrences of the [n..m] syntax from my code. The reason is that I had been using [0..5] as a shorthand for [0, 1, 2, 3, 4, 5], mostly in tests. This works for integral types and it appears to work for floating point types. Then I tried switching to a fixed point format, and discovered my mistake. Enum is supposed to enumerate every value between the two points, and the result is memory exhaustion. I had already been switching from enumFrom because they accumulate floating point errors, in favor of a couple of more appropriate functions: -- | Enumerate an inclusive range. Uses multiplication instead of successive -- addition to avoid loss of precision. -- -- Also it doesn't require an Enum instance. range :: (Num a, Ord a) => a -> a -> a -> [a] range start end step = go 0 where go i | val > end = [] | otherwise = val : go (i+1) where val = start + (i*step) -- | Like 'range', but includes the end. range_end :: (Num a, Ord a) => a -> a -> a -> [a] range_end start end step = go 0 where go i | val >= end = [end] | otherwise = val : go (i+1) where val = start + (i*step) -- | Infinite range. range_ :: (Num a) => a -> a -> [a] range_ start step = go 0 where go i = start + (i*step) : go (i+1) So I think the Enum instances for Float and Double are sort of a trap. They misled me into thinking the .. syntax is the range function defaulting to a step of 1, led to precision errors I had to debug then fix (and I've noticed lead to the odd stack overflow question), then led to more disaster when I switched to an Enum instance that lived up to Enum's real purpose. The problem is that the [..] syntax is convenient and tempting, especially for tests, wanting a numeric range (rather than enumeration) is very common, and as far as I know the 'range' function isn't in the standard library. So it seems like one of these things that seem convenient in the short term but come around and bite you later. Using the right functions from the start avoids all that. Is there any support for the idea of removing Enum instances for floating point numbers?

Show replies by date

Chris Smith

20 Sep 20 Sep

2:59 p.m.

On Mon, 2011-09-19 at 22:09 -0700, Evan Laforge wrote:

...

Then I tried switching to a fixed point format, and discovered my mistake. Enum is supposed to enumerate every value between the two points, and the result is memory exhaustion.

I'm not sure where you read that "Enum is supposed to enumerate every value between the two points". It's not in the API documentation, and I don't see it in the Haskell Report. The better way to look at this is that the notion of `succ` and `pred` is dependent on the type, much like `mappend` has no particular meaning until a Monoid instance is given for the type. It's fairly well established, though undocumented, that Num types ought to have succ = (+1) and pred = (subtract 1), so if your fixed point type doesn't do that, I'd suggest it is the problematic part of this. It would be a shame if we lost an occasionally useful and easy to read language feature because of concerns with the disadvantages of some hypothetical bad implementation.

...

Is there any support for the idea of removing Enum instances for floating point numbers?

I certainly hope not. Instead, perhaps the issue should be brought up with the fixed-point number library you're using, and they could fix their Enum instance to be more helpful. -- Chris

Jake McArthur

3:34 p.m.

New subject: instance Enum Double considered not entirely great?

On Tue, Sep 20, 2011 at 10:59 AM, Chris Smith wrote:

...

I certainly hope not. Instead, perhaps the issue should be brought up with the fixed-point number library you're using, and they could fix their Enum instance to be more helpful.

I'm the author of the library in question. I intentionally chose the semantics for Enum as it is, although I'm not sure I'm prepared to say that Enum "should" have this behavior in general. Regardless, enumerating all values makes more sense to me, and we still have enumFromThen and enumFromThenTo, so no power is lost anyway. Float and Double are the only types I know of that don't enumerate all values by default in enumFrom and enumFromTo. - Jake

Jake McArthur

3:38 p.m.

New subject: instance Enum Double considered not entirely great?

On Tue, Sep 20, 2011 at 10:59 AM, Chris Smith wrote:

...

The better way to look at this is that the notion of `succ` and `pred` is dependent on the type, much like `mappend` has no particular meaning until a Monoid instance is given for the type. It's fairly well established, though undocumented, that Num types ought to have succ = (+1) and pred = (subtract 1), so if your fixed point type doesn't do that, I'd suggest it is the problematic part of this.

I forgot to address this in my last email. I disagree that it is well established for succ = (+1) and pred = subtract 1 for Num instances. In fact, for types that are also instances of Bounded where the Num instances have overflow this is required to be *not* true. Not realizing this is a often a source of performance problems or even bugs in Haskell programs. - Jake

Ketil Malde

3:39 p.m.

New subject: instance Enum Double considered not entirely great?

Chris Smith writes:

...

It would be a shame if we lost an occasionally useful and easy to read

You forgot "confusing"? Expecting Enum to enumerate all inhabitants of a type seems very reasonable to me, and seems to hold for all non-floating point types. A numeric range [a..a+n] might be expected to have a+n+1 elements, but that doesn't hold either for Float and Double. I think Enum for floating point values is broken - but it is reality, so we need to deal with it.

...

Instead, perhaps the issue should be brought up with the fixed-point number library you're using, and they could fix their Enum instance to be more helpful.

Or just avoid Enum, and define "range" or something similar instead. -k -- If I haven't seen further, it is by standing in the footprints of giants

Daniel Fischer

4:16 p.m.

New subject: instance Enum Double considered not entirely great?

On Tuesday 20 September 2011, 17:39:49, Ketil Malde wrote:

...

Chris Smith writes:

...
It would be a shame if we lost an occasionally useful and easy to read

You forgot "confusing"? Expecting Enum to enumerate all inhabitants of a type seems very reasonable to me, and seems to hold for all non-floating point types.

And Rational (more generally, Ratio a). (Why does everybody forget that?) Enumerating all inhabitants of a type (within some range) is only possible if there are well-defined successors and predecessors (modulo bounds). For Double and Float, there are (excepting NaNs), so it could be done, but arguably that would be *far less* useful than the current instances. For Rational, no such luck.

...

A numeric range [a..a+n] might be expected to have a+n+1 elements, but that doesn't hold either for Float and Double. I think Enum for floating point values is broken

Yes, it is. Like Eq and Ord.

...

- but it is reality, so we need to deal with it.

Like Eq and Ord, it's just too damn convenient to have it. So much nicer to write [0, 0.25 .. 1000] instead of numericEnumFromThenTo 0 0.25 1000 and (x /= y) instead of doublesDifferentOrNaN x y

Paterson, Ross

4:47 p.m.

New subject: instance Enum Double considered not entirely great?

Daniel Fischer writes:

...

...
A numeric range [a..a+n] might be expected to have a+n+1 elements, but that doesn't hold either for Float and Double. I think Enum for floating point values is broken

Yes, it is. Like Eq and Ord.

.. only more so. And the brokenness has infected Rational: try [1,3..20]::[Rational]

Casey McCann

7:28 p.m.

New subject: instance Enum Double considered not entirely great?

On Tue, Sep 20, 2011 at 12:47 PM, Paterson, Ross wrote:

...

Daniel Fischer writes:

...
...
A numeric range [a..a+n] might be expected to have a+n+1 elements, but that doesn't hold either for Float and Double. I think Enum for floating point values is broken

Yes, it is. Like Eq and Ord.

.. only more so. And the brokenness has infected Rational: try

[1,3..20]::[Rational]

I actually think the brokenness of Ord for floating point values is worse in many ways, as demonstrated by the ability to insert a value into a Data.Set.Set and have other values "disappear" from the set as a result. Getting an unexpected element in a list doesn't really seem as bad as silently corrupting entire data structures. For numeric ranges of fractional values, I expect that what one typically wants is either "the two end points, and N intermediate values evenly spaced" or "the starting value, and successive increments up to some cut-off", with the default increment being 1 or -1 as appropriate. The current Enum instance splits the difference and gets you some of both, except that you might get the wrong values or something past the cut-off. Similarly, the current Ord instance splits the difference between a coherent total order (what Ord instances should be) and the semantically correct partial order (what floats should have, as defined in the IEEE standard). - C.

Chris Smith

7:48 p.m.

On Tue, 2011-09-20 at 15:28 -0400, Casey McCann wrote:

...

I actually think the brokenness of Ord for floating point values is worse in many ways, as demonstrated by the ability to insert a value into a Data.Set.Set and have other values "disappear" from the set as a result.

Definitely Ord is worse. I'd very much like to see the Ord instance for Float and Double abandon the IEEE semantics and just put "NaN" somewhere in there -- doesn't matter where -- and provide new functions for the IEEE semantics. As for Enum, if someone were to want a type class to represent an enumeration of all the values of a type, then such a thing is reasonable to want. Maybe you can even reasonably wish it were called Enum. But it would be the *wrong* thing to use as a desugaring for list range notation. List ranges are very unlikely to be useful or even meaningful for most such enumerations (what is [ Red, Green .. LightPurple]?); and conversely, as we've seen in this thread, list ranges *are* useful in situations where they are not a suitable way of enumerating all values of a type. -- Chris

Jake McArthur

8:22 p.m.

New subject: instance Enum Double considered not entirely great?

On Tue, Sep 20, 2011 at 3:48 PM, Chris Smith wrote:

...

But it would be the *wrong* thing to use as a desugaring for list range notation. List ranges are very unlikely to be useful or even meaningful for most such enumerations (what is [ Red, Green .. LightPurple]?); and conversely, as we've seen in this thread, list ranges *are* useful in situations where they are not a suitable way of enumerating all values of a type.

This makes me wonder if maybe the reason this discussion is happening at all is that we don't have a well-defined meaning for what Enum *is*. At this point, it seems like the only answer is that it's whatever the instance says it is, which I find unsatisfying. What exactly does Enum enumerate? To me, the list syntax sugar looks like I'm specifying bounds, so it makes sense to include all values within those bounds (and honestly, having instances for Float, Double, and Rational sounds like a mistake, given this), but clearly that is not what it means to everybody. What does it mean to you? What makes the current behavior more useful than the proposed behavior? You say we've seen that this behavior is useful in this thread, but I'm not sure what it is we have seen. - Jake

Alexander Solla

8:40 p.m.

New subject: instance Enum Double considered not entirely great?

On Tue, Sep 20, 2011 at 1:22 PM, Jake McArthur wrote:

...

On Tue, Sep 20, 2011 at 3:48 PM, Chris Smith wrote:

...
But it would be the *wrong* thing to use as a desugaring for list range notation. List ranges are very unlikely to be useful or even meaningful for most such enumerations (what is [ Red, Green .. LightPurple]?); and conversely, as we've seen in this thread, list ranges *are* useful in situations where they are not a suitable way of enumerating all values of a type.

This makes me wonder if maybe the reason this discussion is happening at all is that we don't have a well-defined meaning for what Enum *is*.

Enum is the class that represents enumerable types. In other words, the class of things that can be injected into the natural numbers. These types inherit an order from the natural numbers, ordering by images under this injection. Now, we might not like that order, and it might not agree with an Ord instance, but it exists.

roconnor＠theorem.ca

21 Sep 21 Sep

3:33 a.m.

On Tue, 20 Sep 2011, Alexander Solla wrote:

...

On Tue, Sep 20, 2011 at 1:22 PM, Jake McArthur wrote: On Tue, Sep 20, 2011 at 3:48 PM, Chris Smith wrote: > But it would be the *wrong* thing to use as a desugaring for list range > notation. List ranges are very unlikely to be useful or even meaningful > for most such enumerations (what is [ Red, Green .. LightPurple]?); and > conversely, as we've seen in this thread, list ranges *are* useful in > situations where they are not a suitable way of enumerating all values > of a type.

This makes me wonder if maybe the reason this discussion is happening at all is that we don't have a well-defined meaning for what Enum *is*.

Enum is the class that represents enumerable types. In other words, the class of things that can be injected into the natural numbers. These types inherit an order from the natural numbers, ordering by images under this injection.

Now, we might not like that order, and it might not agree with an Ord instance, but it exists.

For what it's worth, at some point in time I was sketching a proposal to split the Enum class into two classes because I felt that two distinct ideas were being conflated. Unfortunately this was years ago and I have forgotten what the details I was thinking. Perhaps someone can reconstruct a proposal along these lines. -- Russell O'Connor http://r6.ca/ ``All talk about `theft,''' the general counsel of the American Graphophone Company wrote, ``is the merest claptrap, for there exists no property in ideas musical, literary or artistic, except as defined by statute.''

Casey McCann

6:31 p.m.

New subject: instance Enum Double considered not entirely great?

On Tue, Sep 20, 2011 at 11:33 PM, wrote:

...

For what it's worth, at some point in time I was sketching a proposal to split the Enum class into two classes because I felt that two distinct ideas were being conflated. Unfortunately this was years ago and I have forgotten what the details I was thinking. Perhaps someone can reconstruct a proposal along these lines.

Considering the desugaring of list range syntax in general, rather than the Enum class as such, I would argue for *three* ideas, which are all supported with varying degrees of success by the current implementation: 1) Exhaustive enumeration of a finite range, where the desired meaning of [a..z] is almost exactly that of "Data.Ix.range (a, z)". 2) Iterative generation of a sequence, where the desired meaning of [a, b..z] is iterating a function implicitly defined by the offset between a and b, plus an optional takeWhile using some predicate determined by z. The nature of the offset, predicate, &c. would be defined on a per-type basis, possibly including a default offset for when b isn't specified, but personally I'd rather just disallow that in this case. 3) Evenly-spaced divisions of an infinite range, where the desired meaning of [a,b..z] assumes that the distance from a to b evenly divides the distance from a to z, and the result is a list containing (1 + (z-a)/(b-a)) elements such that all differences between successive elements are equal, with a and z present as the first and last elements. For most types other than fractional numbers and floats, the third interpretation isn't well-defined and the first coincides both with an Ix instance (if one exists) and with the second interpretation using the smallest nonzero offset. Note that the first interpretation does not require a total ordering, and in fact the Ord constraint on Ix is somewhat misleading and nonsensical. As such, the first interpretation naturally extends to more general ranges than what the second can describe. For rationals, floats, approximations of the reals, or any other type with a conceptually infinite number of values in a range, the first interpretation isn't well-defined, and the second and third interpretations should coincide when all three parameters are equal, ignoring rounding errors and inexact representations. The current Enum class attempts to be something like an ill-defined mixture of all three, and where the interpretations don't coincide, the disagreement between them is a likely source of bugs. Worse still, the instance for floating point values mixes the naively expected results of both the second and third in a very counterintuitive way: the "enum to" value at the end behaves neither as an upper bound (the sequence may exceed it in an effort to avoid rounding errors) nor as a final element (it may not be in the sequence at all, even if it has an exact floating point representation). This seems needlessly confusing to me and is arguably broken no matter which way you slice it. My thoughts are that the first interpretation is most naturally suited to list range syntax, that the second would be better served by a slightly different syntax to make the predicate more explicit, and that the third bugs the crap out of me because it's really very useful but I can't think of a concise and unambiguous syntax for it. - C.

Brandon Allbery

6:41 p.m.

New subject: instance Enum Double considered not entirely great?

On Wed, Sep 21, 2011 at 14:31, Casey McCann wrote:

...

My thoughts are that the first interpretation is most naturally suited to list range syntax, that the second would be better served by a slightly different syntax to make the predicate more explicit, and that the third bugs the crap out of me because it's really very useful but I can't think of a concise and unambiguous syntax for it.

Based on what you said, I'm wondering if the first gets basic fromTo syntax, the third gets fromThenTo syntax, and the second strikes me as a simplified form of list comprehension and might possibly be phrased as a cross between range and comprehension. Although the most "correct" such cross has an ambiguity with the comma... can we still use | as the delimiter, read as "such that"? ([a .. z | filter]) -- brandon s allbery allbery.b@gmail.com wandering unix systems administrator (available) (412) 475-9364 vm/sms

Casey McCann

7:11 p.m.

New subject: instance Enum Double considered not entirely great?

On Wed, Sep 21, 2011 at 2:41 PM, Brandon Allbery wrote:

...

On Wed, Sep 21, 2011 at 14:31, Casey McCann wrote:

...
My thoughts are that the first interpretation is most naturally suited to list range syntax, that the second would be better served by a slightly different syntax to make the predicate more explicit, and that the third bugs the crap out of me because it's really very useful but I can't think of a concise and unambiguous syntax for it.

Based on what you said, I'm wondering if the first gets basic fromTo syntax, the third gets fromThenTo syntax, and the second strikes me as a simplified form of list comprehension and might possibly be phrased as a cross between range and comprehension. Although the most "correct" such cross has an ambiguity with the comma... can we still use | as the delimiter, read as "such that"? ([a .. z | filter])

Hmm. I actually wrote "(..) better served by some variation on a list comprehension" there at first before editing it to be more non-committal. Interesting to see someone else immediately go for the same idea. Anyway, I think this can already be expressed using GHC's generalized list comprehensions, but the result is more verbose than I would like for this particular very common case. My first thought on resolving ambiguity is to rely on having something distinct following a "..", e.g. desugaring "[a, b.. | < z]" as "takeWhile (< z) [a, b..]", where anything ending in ".. ]" is taken to be an infinite iterated sequence. This is only slightly more verbose than the current form, arguably more readable, and certainly more explicit. Would need to be more clearly specified what forms the predicate expression could have, however. The fromThenTo syntax doesn't seem entirely satisfactory for the third case, because it creates ambiguity if the step size doesn't evenly divide the range size. Having the first and last elements appear exactly as given in the sequence and having the interval sizes be as consistent as possible are pretty much the entire purpose here, so I'm not sure how to reconcile that. Perhaps rounding the specified interval to the nearest divisor? Kind of a hack, but seems to best approximate the intent (as well as being resilient in the face of imprecision, which is also important). - C.

Evan Laforge

20 Sep 20 Sep

8:42 p.m.

New subject: instance Enum Double considered not entirely great?

...

This makes me wonder if maybe the reason this discussion is happening at all is that we don't have a well-defined meaning for what Enum *is*. At this point, it seems like the only answer is that it's

I agree, that's why I suggested a separate 'range' function. I think the "all values in this range" definition is reasonable (and incidentally, Data.Fixed.Binary's Enum coincides with Data.Fixed, so I think its the floats that are the odd ones out). Enum is explicitly not a numeric notion because it's meant to work for any enumeration style ADT, so it seems clear to me it's about enumeration rather than a numeric range. There's clearly a lot of overlap between the two interpretations, but we have Double taking Enum as a numeric range, Data.Fixed taking it as an enumeration, integral types overlap so you can't tell, and 'data C = Red | Green | Blue deriving Enum' is clearly taking it as enumeration. Some people have suggested that it's simply an ambiguous class which has multiple interpretations, like Monoid, and thus you just need to know which one you're dealing with. Fair enough, but is that a good thing? I think it makes it harder to move between numeric types. I don't mind the Monoid thing because I haven't had to move between two Monoid types and expect them to work similarly, but float -> fixed is kind of like String -> Text in that its a likely axis of change. I would also be pretty surprised if Text came up with a different interpretation of Monoid than String :) On Tue, Sep 20, 2011 at 9:16 AM, Daniel Fischer wrote:

...

...
- but it is reality, so we need to deal with it.

Like Eq and Ord, it's just too damn convenient to have it. So much nicer to write [0, 0.25 .. 1000] instead of numericEnumFromThenTo 0 0.25 1000 and (x /= y) instead of doublesDifferentOrNaN x y

Interestingly, "too damn convenient" was one of my motivations *against* the instance. It's not even that much more convenient, if you write it 'range' then 'range 0 1000 0.25' is not much longer than [0, 0.25 .. 1000], but, I'd argue the latter is a trap, and has a good chance of causing a subtle bug in your app eventually if it gets big enough. The problem is that the list syntax is built-in and looks nice and so is very tempting.

Chris Smith

9:35 p.m.

On Tue, 2011-09-20 at 16:22 -0400, Jake McArthur wrote:

...

This makes me wonder if maybe the reason this discussion is happening at all is that we don't have a well-defined meaning for what Enum *is*.

Certainly, we don't have a type-independent definition for Enum. I'm not sure whether it's possible to obtain that or not. Keep in mind that for plenty of common type classes, this is not possible. For example, consider Monoid. By writing a monoid instance, you're making the rather ridiculous claim that you are specifying *the* way to define a Monoid on that type, when of course there are more than one, and there's no formal way to say what that Monoid should do independent of whatever operation happens to be most commonly used for combining values of that type. Same for many Functor or Applicative or Monad instances. So yes, we don't know how to define a type-independent meaning... but I'm completely sure that it would be a mistake to start dropping useful things from Haskell just because we're unable to put our finger on a formalism for describing them precisely without assigning type-dependent meanings.

...

What exactly does Enum enumerate?

I'd say that just like the case with Monoid, Enum means whatever it's most useful for it to mean with respect to some particular type. We could probably be a little more specific with laws that we expect the instance to follow, such as: enumFromTo a b == enumFromThenTo a (succ a) b and so on. But it's not always possible to define a notion

...

To me, the list syntax sugar looks like I'm specifying bounds, so it makes sense to include all values within those bounds (and honestly, having instances for Float, Double, and Rational sounds like a mistake, given this)

It's unclear to me how you get from (has bounds) to (must include *everything* in those bounds). I'd definitely agree that for instances of Enum that are also instances of Ord, you'd expect (all (>= a) [a ..]) and related properties.

...

What does it mean to you? What makes the current behavior more useful than the proposed behavior?

[...]

...

You say we've seen that this behavior is useful in this thread, but I'm not sure what it is we have seen.

More specifically, what I said is that we've seen that list range notation is useful in some situations where a complete enumeration of possible values is not useful, or where such an enumeration isn't the same one we'd hope for out of a list range. What I meant was that we've noticed that the current behavior on Float is incompatible with being a complete enumeration. I'm taking it for granted that the current behavior on Float is useful; I honestly don't see how you could argue with that. People use it all the time; I used it just this morning. Of course it's useful. -- Chris

Evan Laforge

10:05 p.m.

New subject: instance Enum Double considered not entirely great?

...

complete enumeration. I'm taking it for granted that the current behavior on Float is useful; I honestly don't see how you could argue with that. People use it all the time; I used it just this morning. Of course it's useful.

It was useful but not as useful as a 'range' function. I thought it was quite useful and used it even though I *already knew* about 'last [0, 0.1 .. 2] > 2'. But of course this eventually lead to a tricky bug, followed by a "never again" and witch hunt for all uses of [..] with floats :) Maybe it's just me, but would guess that your students would find it perfectly clear and not confusing at all... until they found out about that little thing with 'last' above... *then* they might be confused. Maybe this is just an argument that Double's enumFromThenTo should use multiplication instead of successive addition?

Donn Cave

8:22 p.m.

New subject: instance Enum Double considered not entirelygreat?

Quoth Chris Smith , ...

...

As for Enum, if someone were to want a type class to represent an enumeration of all the values of a type, then such a thing is reasonable to want. Maybe you can even reasonably wish it were called Enum. But it would be the *wrong* thing to use as a desugaring for list range notation. List ranges are very unlikely to be useful or even meaningful for most such enumerations (what is [ Red, Green .. LightPurple]?); and conversely, as we've seen in this thread, list ranges *are* useful in situations where they are not a suitable way of enumerating all values of a type.

It isn't a life or death requirement, but I have dealt with enum values that are ordered and that could usefully be represented in a range. Even your example -- given a set of hues, it's sensible to order them following the spectrum, so [Cyan .. YellowGreen] might represent the hues in the set that are at all in the green range. I'm frankly using Enum as an equivalent to C "enum", symbolic integers, where the range is more or less implicit as by default the integer values are [0..]. While I (vaguely) understand that Haskell's Enum is different, the C idiom should at least testify that there's some practical value in the notion of a range of Enum, or something like it. Though not with floating point values, of course. Donn

Casey McCann

9:28 p.m.

New subject: instance Enum Double considered not entirely great?

On Tue, Sep 20, 2011 at 3:48 PM, Chris Smith wrote:

...

On Tue, 2011-09-20 at 15:28 -0400, Casey McCann wrote:

...
I actually think the brokenness of Ord for floating point values is worse in many ways, as demonstrated by the ability to insert a value into a Data.Set.Set and have other values "disappear" from the set as a result.

Definitely Ord is worse. I'd very much like to see the Ord instance for Float and Double abandon the IEEE semantics and just put "NaN" somewhere in there -- doesn't matter where -- and provide new functions for the IEEE semantics.

It should be first, to make floating point values consistent with applying Maybe to a numeric type. Personally, I contend that the most correct solution is to distinguish between meaningful ordering relations and ones used for algorithmic convenience. As another example, the type (Integer, Integer), regarded as Cartesian coordinates, has no meaningful ordering at all but does have an obvious arbitrary total order (i.e., the current Ord instance). For purposes like implementing Data.Set.Set, we don't care at all whether the ordering used makes any particular kind of sense; we care only that it is consistent and total. For semantically-meaningful comparisons, we want the semantically-correct answer and no other. For types with no meaningful order at all, or with a meaningful total order that we can use, there is no ambiguity, but floating point values have both a semantic partial order and an obvious arbitrary total order which disagree about NaN. In the true spirit of compromise the current Ord instance fails to implement both, ensuring that things work incorrectly all the time rather than half the time. That said, in lieu of introducing multiple new type classes, note that the Haskell Report specifically describes Ord as representing a total order[0], so the current instances for floating point values seem completely indefensible. Since removing the instances entirely is probably not a popular idea, the least broken solution would be to define NaN as equal to itself and less than everything else, thus accepting the reality of Ord as the "meaningless arbitrary total order" type class I suggested above and leaving Haskell bereft of any generic semantic comparisons whatsoever. Ah, pragmatism.

...

As for Enum, if someone were to want a type class to represent an enumeration of all the values of a type, then such a thing is reasonable to want. Maybe you can even reasonably wish it were called Enum. But it would be the *wrong* thing to use as a desugaring for list range notation. List ranges are very unlikely to be useful or even meaningful for most such enumerations (what is [ Red, Green .. LightPurple]?); and conversely, as we've seen in this thread, list ranges *are* useful in situations where they are not a suitable way of enumerating all values of a type.

It's not clear that Enum, as it stands, actually means anything coherent at all. Consider again my example of integer (x, y) coordinates. Naively, what would [(0, 0) .. (3, 3)] appear to mean? Why, obviously it's the sixteen points whose coordinates range from 0 to 3, except it isn't because Enum isn't defined on pairs and doesn't work that way anyhow. Could we describe this range with an iteration function and a comparison? No, because the Ord instance here is intrinsically nonsensical. And yet, the intent is simple and useful, so why don't we have a standard type class for it?[1] This would seem to be the obvious, intuitive interpretation for range syntax with starting and ending values. To the extent that Enum can be given a coherent interpretation (which requires ignoring many existing instances), it seems to describe types with unary successor/predecessor operations. As such, instances for Rational, floating point values, and the like are patently nonsensical and really should be removed. An obvious generalization would be to define Enum based on an "increment" operation of some sort, in which case those instances could be defined reasonably with a default increment of 1, which is merely dubious, rather than ridiculous. The increment interpretation would be very natural for infinite lists defined with open ranges and an optional step size. Absent from the above is any interpretation of expressions like [0,2 ..11], which are ill-defined anyway, as evidenced by that expression producing lists of different lengths depending on what type is chosen for the numeric literals. Myself, I'm content to declare that use of range syntax a mistake in general, and insist that an unbounded range and something like takeWhile be used instead. - C. [0]: See here: http://www.haskell.org/onlinereport/haskell2010/haskellch6.html#x13-1290006.... [1]: Spoiler: We do.

Chris Smith

10:05 p.m.

On Tue, 2011-09-20 at 17:28 -0400, Casey McCann wrote:

...

Since removing the instances entirely is probably not a popular idea, the least broken solution would be to define NaN as equal to itself and less than everything else, thus accepting the reality of Ord as the "meaningless arbitrary total order" type class I suggested above and leaving Haskell bereft of any generic semantic comparisons whatsoever. Ah, pragmatism.

There's nothing *wrong* with pragmatism, but in any case, we seem to agree on this. As I said earlier, we ought to impose a (rather arbitrary) total order on Float and Double, and then offer comparison with IEEE semantics as a separate set of functions when they are needed. (I wonder if Ocaml-style (<.) and (>.) and such are used anywhere.)

...

It's not clear that Enum, as it stands, actually means anything coherent at all.

It's clear to me that Enum for Float means something coherent. If you're looking for a meaning independent of the instance, I'd argue you ought to be surprised if you find one, not the other way around. Why not look for a meaning for Monoid that's independent of the instance? There isn't one; instead, there are some rules that the instance is expected to satisfy, but there are plenty of types that have many possible Monoid instances, and we pick one and leave you to use newtypes if you wanted a different one. I'm not saying that Enum must be left exactly as is... but I *am* saying that the ability to use floating point types in list ranges is important enough to save. For all its faults, at least the current language can do that. When the solution to the corner cases is to remove a pervasive and extremely useful feature, I start to get worried! Yes, I could see (somehow in small steps that preserve backward compatibility for reasonable periods) building some kind of clearer relationship between Ord, Enum, and Ix, possibly separating Enum from a new Range class that represents the desugaring of list ranges, or whatever... but this idea of "I don't think this expresses a deep underlying relationship independent of type, so let's just delete it without regard to how useful it is" is very short-sighted. -- Chris Smith

Maciej Marcin Piechotka

10:38 p.m.

New subject: instance Enum Double considered not entirely great?

On Tue, 2011-09-20 at 16:05 -0600, Chris Smith wrote:

...

On Tue, 2011-09-20 at 17:28 -0400, Casey McCann wrote:

...
Since removing the instances entirely is probably not a popular idea, the least broken solution would be to define NaN as equal to itself and less than everything else, thus accepting the reality of Ord as the "meaningless arbitrary total order" type class I suggested above and leaving Haskell bereft of any generic semantic comparisons whatsoever. Ah, pragmatism.

There's nothing *wrong* with pragmatism, but in any case, we seem to agree on this. As I said earlier, we ought to impose a (rather arbitrary) total order on Float and Double, and then offer comparison with IEEE semantics as a separate set of functions when they are needed. (I wonder if Ocaml-style (<.) and (>.) and such are used anywhere.)

Doesn't seems so: http://www.haskell.org/hoogle/?hoogle=%3E. http://www.haskell.org/hoogle/?hoogle=%3C. +1 for: class Eq a => IEEEEq a where (<.) :: a -> a -> Bool (>.) :: a -> a -> Bool Regards

Daniel Fischer

11:23 p.m.

New subject: instance Enum Double considered not entirely great?

On Wednesday 21 September 2011, 00:38:12, Maciej Marcin Piechotka wrote:

...

+1 for:

class Eq a => IEEEEq a where (<.) :: a -> a -> Bool (>.) :: a -> a -> Bool

Regards

-1 for the class name, too easy to miscount the Es. And perhaps it would be better to add the IEEE compliant(?) comparisons to the RealFloat class. class (RealFrac a, Floating a) => RealFloat a where ... -- lots of stuff we already have (==.) :: a -> a -> Bool (<.) :: a -> a -> Bool ... However, I don't particularly like adding just a dot, that's too easily overlooked. On the other hand, I don't have a compelling idea either. (.==.), (.<.) would at least double the chances to spot the difference. Anyway, +1 for an Ord instance with a total order (consistent with the natural order where applicable) and putting the IEEE comparisons somewhere else.

Evan Laforge

11:27 p.m.

New subject: instance Enum Double considered not entirely great?

On Tue, Sep 20, 2011 at 4:23 PM, Daniel Fischer wrote:

...

On Wednesday 21 September 2011, 00:38:12, Maciej Marcin Piechotka wrote:

...
+1 for:

class Eq a => IEEEEq a where (<.) :: a -> a -> Bool (>.) :: a -> a -> Bool

Regards

-1 for the class name, too easy to miscount the Es.

Aww, I was looking forward to being able to use "i quadruple-e q" in conversation.

Maciej Marcin Piechotka

21 Sep 21 Sep

12:27 a.m.

New subject: instance Enum Double considered not entirely great?

On Wed, 2011-09-21 at 01:23 +0200, Daniel Fischer wrote:

...

On Wednesday 21 September 2011, 00:38:12, Maciej Marcin Piechotka wrote:

...
+1 for:

class Eq a => IEEEEq a where (<.) :: a -> a -> Bool (>.) :: a -> a -> Bool

Regards

-1 for the class name, too easy to miscount the Es.

s/for:/for something like/

...

And perhaps it would be better to add the IEEE compliant(?) comparisons to the RealFloat class.

class (RealFrac a, Floating a) => RealFloat a where ... -- lots of stuff we already have (==.) :: a -> a -> Bool (<.) :: a -> a -> Bool ...

However, I don't particularly like adding just a dot, that's too easily overlooked. On the other hand, I don't have a compelling idea either. (.==.), (.<.) would at least double the chances to spot the difference.

Anyway, +1 for an Ord instance with a total order (consistent with the natural order where applicable) and putting the IEEE comparisons somewhere else.

Felipe Almeida Lessa

1:59 a.m.

New subject: instance Enum Double considered not entirely great?

On Tue, Sep 20, 2011 at 7:38 PM, Maciej Marcin Piechotka wrote:

...

+1 for:

class Eq a => IEEEEq a where (<.) :: a -> a -> Bool (>.) :: a -> a -> Bool

We already have this but it is hidden inside a library. It's called PartialOrd [1] and it's on the logfloat package. class PartialOrd a where cmp :: a -> a -> Maybe Ordering ... http://hackage.haskell.org/packages/archive/logfloat/0.12.1/doc/html/Data-Nu... -- Felipe.

Casey McCann

1:49 a.m.

New subject: instance Enum Double considered not entirely great?

On Tue, Sep 20, 2011 at 6:05 PM, Chris Smith wrote:

...

There's nothing *wrong* with pragmatism, but in any case, we seem to agree on this. As I said earlier, we ought to impose a (rather arbitrary) total order on Float and Double, and then offer comparison with IEEE semantics as a separate set of functions when they are needed. (I wonder if Ocaml-style (<.) and (>.) and such are used anywhere.)

I think the only point of disagreement here is that I'm advocating the introduction of a partial ordering class (for which floating point values could be given a proper instance according to IEEE semantics) rather than treating floats as a special case. I would prefer going a step further and having two distinct total order classes to distinguish meaningful total orders from nonsense ones like for Float and Double, but perhaps that seems excessive to other people.

...

It's clear to me that Enum for Float means something coherent. If you're looking for a meaning independent of the instance, I'd argue you ought to be surprised if you find one, not the other way around. Why not look for a meaning for Monoid that's independent of the instance? There isn't one; instead, there are some rules that the instance is expected to satisfy, but there are plenty of types that have many possible Monoid instances, and we pick one and leave you to use newtypes if you wanted a different one.

I have to disagree here. Monoid has a very clear, narrow, type-independent meaning: the eponymous algebraic structure. The minimal definition of the class is a value and a binary operation; this is a very small interface, and the laws expected of an instance nearly exhaust the properties of these definitions, either by specifying behavior (e.g., associativity) or by deliberately not specifying (is the binary operation commutative? not in general, but it could be). Simply by satisfying the type signature, any instance is going to at least vaguely resemble a valid one, and checking the laws is straightforward. On the other hand, Enum has conversions to and from Int and a host of interdefined operations with at best loose guidelines for how they should behave. Does "toEnum . fromEnum = id" hold? Not in general. Does "succ . fromEnum = fromEnum . succ" hold? Probably not. I think. What do enumFrom, enumFromThen, &c. mean? What the instance author thought made sense, I suppose, since they're only defined as "what list range syntax desugars to". In the case of types that also have a Bounded instance there are further requirements, mostly relating to where runtime errors should be produced (gosh, that helps). Consider this: How many Enum instances do you think override the default definitions, not for efficiency, but in ways that give different results? How many Monoid instances do you think override mconcat in a way that gives a different answer than "foldr mappend mempty"? Here's a thought experiment. Imagine that, instead of Monoid, we had a type class called "Summarize" used mostly to desugar some sort of built in summation syntax. The main function used is "summarize :: (Summarize a) => [a] -> a", the class is described as a generalized "sum", and the motivating examples are all independent of the order of elements in the list (because addition is commutative, right). But nowhere is it specified what the behavior of the class should be, other than that it desugars the syntax in some way that presumably makes sense. It's not required that "summarize []" produce an identity value, it's not required that summarizing repeatedly should be associative, it's not required that reordering the list give the same summary, and so on. Most instances do have all these properties of course, but then someone makes a library with an extremely non-commutative instance for Summarize and we get a -cafe thread complaining about it and then I write a very long and tedious message all about how Summarize is underspecified and has no clear meaning and probably should be explicitly defined as some sort of monoid, either commutative or more general. But I digress. The ambiguity from Monoid is purely that many types have multiple ways to fulfill the very precise requirements of the class. The ambiguity of Enum is that it isn't clear what, if anything, the requirements even are, and nothing rules out a wide variety of equally valid instances other than a vague notion of which one "makes sense", a point on which reasonable people may disagree! Possibly a better example would be MonadPlus, for which (if memory serves me) there's some similar ambiguity about the laws an instance should follow, with inconsistency even in the standard library as to which interpretation is chosen, and resulting in actual confusion about what code should do.

...

I'm not saying that Enum must be left exactly as is... but I *am* saying that the ability to use floating point types in list ranges is important enough to save. For all its faults, at least the current language can do that. When the solution to the corner cases is to remove a pervasive and extremely useful feature, I start to get worried!

I have no desire to remove useful features. What I don't like is when features behave inconsistently in unclear ways between two cases that I would expect to be equivalent; the more useful the feature is, the more troubling this becomes. At best this results in generic functions defined on the class being nearly useless because you have no idea what they even mean out of context; at worst it creates serious bugs due to invalid assumptions, as I think is demonstrated by the (blatantly incorrect) Ord instance for floats causing the illusion of data loss in standard data structures. Given that a major purpose of Enum is to translate numeric ranges, the fact that it can have dramatically different behavior for different numeric types strikes me as deeply problematic, and an endless source of bugs in potentia. In fact, I would (and will, should the opportunity arise) actively advise people new to the language to avoid the list range syntax when floating point types are involved because of the pitfalls, or to at least only use it in the [x, y..] form.

...

Yes, I could see (somehow in small steps that preserve backward compatibility for reasonable periods) building some kind of clearer relationship between Ord, Enum, and Ix, possibly separating Enum from a new Range class that represents the desugaring of list ranges, or whatever... but this idea of "I don't think this expresses a deep underlying relationship independent of type, so let's just delete it without regard to how useful it is" is very short-sighted.

Having a deep underlying meaning for type classes isn't just for the sake of elegance; having a well-defined, consistent meaning removes a great deal of cognitive load in working with code because it narrows dramatically the context required to know what an expression means. Writ large this is the principle behind equational reasoning and parametricity, which are the most powerful concepts available for reasoning about Haskell code. Type classes with unclear semantics undermine this, and while an Enum constraint may not be as nefarious as, say, Typeable would be, it's arguably closer to that than to something simple and coherent like Monoid or Functor. Alas, these properties are as fragile as they are useful. Take the humble and harmless "show" function, for instance. One might occasionally think that it would be handy to have an ambient implementation, allowing a value of any type to be converted to a string, even if only as a dummy value like "<<function>>". But allowing this without a Show constraint suffices to destroy the guarantees of parametricity, as surely as does any function with "unsafe" in its name! A terrible price for such a trifling convenience. "Civilization advances by extending the number of operations which we can perform without thinking about them." Deep underlying meaning has a deep utility of its own, but only to the extent to which it is kept absolute. ...and that is, at egregious length, why I find Enum dissatisfying. - C.

Evan Laforge

20 Sep 20 Sep

9:56 p.m.

New subject: instance Enum Double considered not entirely great?

...

I actually think the brokenness of Ord for floating point values is worse in many ways, as demonstrated by the ability to insert a value into a Data.Set.Set and have other values "disappear" from the set as a result. Getting an unexpected element in a list doesn't really seem as bad as silently corrupting entire data structures.

Whoah, that's scary. What are some examples of this happening? Does this mean it's unsafe to store Doubles in a Map?

Casey McCann

10:20 p.m.

New subject: instance Enum Double considered not entirely great?

On Tue, Sep 20, 2011 at 5:56 PM, Evan Laforge wrote:

...

...
I actually think the brokenness of Ord for floating point values is worse in many ways, as demonstrated by the ability to insert a value into a Data.Set.Set and have other values "disappear" from the set as a result. Getting an unexpected element in a list doesn't really seem as bad as silently corrupting entire data structures.

Whoah, that's scary. What are some examples of this happening? Does this mean it's unsafe to store Doubles in a Map?

Well, you can safely store Doubles in a Map as long as you use a key type with a bug-free Ord instance. :] Anyway, the problem with the Ord instance on floating point values is that it attempts (but fails anyway) to implement the ordering semantics given by the IEEE spec, which requires that all ordering comparisons return false if either argument is NaN, and that NaN /= NaN returns true. This is not the behavior normally expected from an Ord instance! Adding insult to injury, the "compare" function returns a value of type Ordering (which assumes a consistent total order), so the instance contradicts itself: "compare (0/0) (0/0)" gives GT, but "0/0 > 0/0" is false. This plays havoc with the search tree used internally by Set and Map, the result being that if you have any NaN values in the data structure, you may not be able to find other values anymore. Because NaN values never compare equal to themselves, I'm not sure if it's even possible to remove them from the structure, and because of tree rebalancing I'm not sure how to predict what the impact of one or more NaNs would be over multiple operations on the data structure. In short: Using Doubles in a Set, or as the key to a Map, should be regarded as a bug until proven otherwise (i.e., proving that NaN will never be inserted). If you'd like to see an explicit demonstration (which you can try in GHCi yourself!) see here: http://stackoverflow.com/questions/6399648/what-happens-to-you-if-you-break-... where I use it as an example of why it's important for type class instances to obey the relevant laws. - C.

Daniel Fischer

10:58 p.m.

New subject: instance Enum Double considered not entirely great?

On Wednesday 21 September 2011, 00:20:09, Casey McCann wrote:

...

This plays havoc with the search tree used internally by Set and Map, the result being that if you have any NaN values in the data structure, you may not be able to find other values anymore. Because NaN values never compare equal to themselves, I'm not sure if it's even possible to remove them from the structure,

filter (not . isNaN) resp. filterWithKey (\k _ -> not $ isNaN k)

...

and because of tree rebalancing I'm not sure how to predict what the impact of one or more NaNs would be over multiple operations on the data structure.

Yuck. Don't even try to predict that (unless you absolutely have to).

...

In short: Using Doubles in a Set, or as the key to a Map, should be regarded as a bug until proven otherwise (i.e., proving that NaN will never be inserted).

If you'd like to see an explicit demonstration (which you can try in GHCi yourself!) see here: http://stackoverflow.com/questions/6399648/what-happens-to-you-if-you-br eak-the-monad-laws/6399798#6399798 where I use it as an example of why it's important for type class instances to obey the relevant laws.

Nice and short.

Casey McCann

11:23 p.m.

New subject: instance Enum Double considered not entirely great?

On Tue, Sep 20, 2011 at 6:58 PM, Daniel Fischer wrote:

...

On Wednesday 21 September 2011, 00:20:09, Casey McCann wrote:

...
Because NaN values never compare equal to themselves, I'm not sure if it's even possible to remove them from the structure,

filter (not . isNaN)

resp.

filterWithKey (\k _ -> not $ isNaN k)

Er, right. Yes, of course. I'm not sure what I was thinking there. :] Though that still leaves the question of any damage done in the meantime, unless the filtering would repair the tree in the process.

...

...
and because of tree rebalancing I'm not sure how to predict what the impact of one or more NaNs would be over multiple operations on the data structure.

Yuck. Don't even try to predict that (unless you absolutely have to).

Agreed. The consequence of not trying, however, is that it isn't viable to let things slide at all--every insertion must be checked for NaNs, because otherwise you lose any guarantee that the tree will be valid next time you use it. One can imagine a similar data structure designed to be resilient and predictable in the face of ill-behaved comparisons, but surely it would be easier to just fix the problem instances!

...

...
If you'd like to see an explicit demonstration (which you can try in GHCi yourself!) see here: http://stackoverflow.com/questions/6399648/what-happens-to-you-if-you-br eak-the-monad-laws/6399798#6399798 where I use it as an example of why it's important for type class instances to obey the relevant laws.

Nice and short.

Yes, and credit where due for the original example. :] Don't recall which -cafe thread that came from, though. - C.

Daniel Fischer

21 Sep 21 Sep

12:20 a.m.

New subject: instance Enum Double considered not entirely great?

On Wednesday 21 September 2011, 01:23:48, Casey McCann wrote:

...

On Tue, Sep 20, 2011 at 6:58 PM, Daniel Fischer

wrote:

...
On Wednesday 21 September 2011, 00:20:09, Casey McCann wrote:

...
Because NaN values never compare equal to themselves, I'm not sure if it's even possible to remove them from the structure,

filter (not . isNaN)

resp.

filterWithKey (\k _ -> not $ isNaN k)

Er, right. Yes, of course. I'm not sure what I was thinking there. :] Though that still leaves the question of any damage done in the meantime, unless the filtering would repair the tree in the process.

With fromList . Prelude.filter (not . isNaN) . toList and the corresponding for Maps, you'll get a valid NaN-less tree. However, damage done in the meantime generally couldn't be undone.

...

...
...
and because of tree rebalancing I'm not sure how to predict what the impact of one or more NaNs would be over multiple operations on the data structure.

Yuck. Don't even try to predict that (unless you absolutely have to).

Agreed. The consequence of not trying, however, is that it isn't viable to let things slide at all--every insertion must be checked for NaNs, because otherwise you lose any guarantee that the tree will be valid next time you use it.

Yes, where NaNs matter, you always have to check (well, unless you *know* that your calculations don't produce any NaNs). Btw, -0.0 can be problematic too.

...

One can imagine a similar data structure designed to be resilient and predictable in the face of ill-behaved comparisons, but surely it would be easier to just fix the problem instances!

Except that people might expect IEEE semantics for (==), (<) etc. However, nowadays I tend to think that making the Eq and Ord instances well-behaved (wrt the class contract) and having separate IEEE comparisons would overall be preferable. There is still the question whether all NaNs should be considered equal or not [and where Ord should place NaNs].

...

...
...
If you'd like to see an explicit demonstration (which you can try in GHCi yourself!) see here: http://stackoverflow.com/questions/6399648/what-happens-to-you-if-you -br eak-the-monad-laws/6399798#6399798 where I use it as an example of why it's important for type class instances to obey the relevant laws.

Nice and short.

Yes, and credit where due for the original example. :] Don't recall which -cafe thread that came from, though.

Google suggests "Exception for NaN" from May.

Casey McCann

2:18 a.m.

New subject: instance Enum Double considered not entirely great?

On Tue, Sep 20, 2011 at 8:20 PM, Daniel Fischer wrote:

...

Yes, where NaNs matter, you always have to check (well, unless you *know* that your calculations don't produce any NaNs). Btw, -0.0 can be problematic too.

How so? As far as I can tell Ord and Eq treat it as equal to 0.0 in every way, which is correct and shouldn't break any expected behavior. I don't think it's required that distinguishable values be unequal, and while I imagine arguments could be made both ways on whether that would be a good idea, I don't see any way that could cause problems in code polymorphic on instances of Eq or Ord, which is the main concern to my mind.

...

Except that people might expect IEEE semantics for (==), (<) etc.

Yes, but probably fewer people than expect Map and Set to work correctly. :]

...

However, nowadays I tend to think that making the Eq and Ord instances well-behaved (wrt the class contract) and having separate IEEE comparisons would overall be preferable. There is still the question whether all NaNs should be considered equal or not [and where Ord should place NaNs].

IEEE semantics are incompatible with Ord regardless. The problem can be fixed by changing Ord, removing the instance completely, or changing the instance to ignore the IEEE spec. I think the latter is the least bad option in the big picture. I still don't see why it makes sense to add separate IEEE comparisons instead of just adding a standard partial order class, though. Surely posets are common enough to justify the abstraction, and it surprises me that one isn't already included. No doubt there are at least three or four different partial ordering classes on Hackage already. As for where Ord should place NaN, I still suggest it be the least element, to be consistent with the Ord instance for Maybe. If different NaNs are unequal, that may change matters.

...

Google suggests "Exception for NaN" from May.

Ah, yes, wherein someone suggested that comparing to NaN should be a runtime error rather than give incorrect results. A strictly more correct approach, but not one I find satisfactory... - C.

Richard O'Keefe

3:47 a.m.

New subject: instance Enum Double considered not entirely great?

On 21/09/2011, at 2:18 PM, Casey McCann wrote:

...

I still don't see why it makes sense to add separate IEEE comparisons instead of just adding a standard partial order class, though.

In any mathematical partial order, we expect x `le` x to be a law. But in IEEE arithmetic, if x is a NaN, x `le` x is false. I don't see how to reconcile these. I agree that a standard partial order class would be nice.

Felipe Almeida Lessa

4 a.m.

New subject: instance Enum Double considered not entirely great?

On Wed, Sep 21, 2011 at 12:47 AM, Richard O'Keefe wrote:

...

In any mathematical partial order, we expect x `le` x to be a law. But in IEEE arithmetic, if x is a NaN, x `le` x is false. I don't see how to reconcile these.

I agree that a standard partial order class would be nice.

logfloat package's PartialOrd solution is to have le :: PartialOrd a => a -> a -> Maybe Bool In IEEE arithmetic, if x is a NaN and y is anything else, x `le` y == y `le` x == Nothing. Cheers, -- Felipe.

Casey McCann

1:57 p.m.

New subject: instance Enum Double considered not entirely great?

On Tue, Sep 20, 2011 at 11:47 PM, Richard O'Keefe wrote:

...

On 21/09/2011, at 2:18 PM, Casey McCann wrote:

...
I still don't see why it makes sense to add separate IEEE comparisons instead of just adding a standard partial order class, though.

In any mathematical partial order, we expect x `le` x to be a law. But in IEEE arithmetic, if x is a NaN, x `le` x is false. I don't see how to reconcile these.

Ah, true. There is an obvious way to reconcile this that almost suffices, and is what I'd had in mind--simply declare that, just as positive and negative zero are distinct values but identified with each other by the ordering, let NaN be "disidentified" with itself. Essentially this treats NaN as representing an unbounded collection of distinct, but indistinguishable and incomparable, values, where you never end up getting the same one twice. This interpretation is self-consistent so long as the expressions being compared are distinct to begin with, but now that you point it out explicitly I realize it not only can't be justified when comparing syntactically identical terms, but that given equivalent expressions it would imply that a pure function gives different results each time, which is not in any way a satisfactory result of something that's trying to *improve* the semantics involved! So that's a bust. Bother. Specialized comparisons providing IEEE semantics seems the best option after all, then. I'd still like to see a standard partial order type class, but apparently it wouldn't help in this case. - C.

Daniel Fischer

4:09 a.m.

New subject: instance Enum Double considered not entirely great?

On Wednesday 21 September 2011, 04:18:38, Casey McCann wrote:

...

On Tue, Sep 20, 2011 at 8:20 PM, Daniel Fischer

wrote:

...
Yes, where NaNs matter, you always have to check (well, unless you *know* that your calculations don't produce any NaNs). Btw, -0.0 can be problematic too.

How so? As far as I can tell Ord and Eq treat it as equal to 0.0 in every way,

Yes. Which can be inconvenient if you are interested in whether you got a -0.0, so if that's the case, you can't simply use (== -0.0). Okay, problematic is a too strong word, but it's another case that may require special treatment.

...

which is correct and shouldn't break any expected behavior. I don't think it's required that distinguishable values be unequal,

But desirable, IMO.

...

and while I imagine arguments could be made both ways on whether that would be a good idea, I don't see any way that could cause problems in code polymorphic on instances of Eq or Ord,

It wouldn't do that, as far as I'm aware.

...

which is the main concern to my mind.

...
Except that people might expect IEEE semantics for (==), (<) etc.

Yes, but probably fewer people than expect Map and Set to work correctly. :]

True.

...

...
However, nowadays I tend to think that making the Eq and Ord instances well-behaved (wrt the class contract) and having separate IEEE comparisons would overall be preferable. There is still the question whether all NaNs should be considered equal or not [and where Ord should place NaNs].

IEEE semantics are incompatible with Ord regardless. The problem can be fixed by changing Ord, removing the instance completely, or changing the instance to ignore the IEEE spec. I think the latter is the least bad option in the big picture.

Agreed.

...

I still don't see why it makes sense to add separate IEEE comparisons

Pure and simple: speed. That is what the machine instructions, and hence the primops, deliver.

...

instead of just adding a standard partial order class, though. Surely posets are common enough to justify the abstraction, and it surprises me that one isn't already included. No doubt there are at least three or four different partial ordering classes on Hackage already.

As for where Ord should place NaN, I still suggest it be the least element, to be consistent with the Ord instance for Maybe.

Seems reasonable.

...

If different NaNs are unequal, that may change matters.

Yeah, if there are a lot of them, it might be better to put them at the end [that is, make them larger than any non-NaN].

...

...
Google suggests "Exception for NaN" from May.

Ah, yes, wherein someone suggested that comparing to NaN should be a runtime error rather than give incorrect results. A strictly more correct approach, but not one I find satisfactory...

Umm, 'more correct' only in some sense. Definitely unsatisfactory. do x <- someList y <- someComputation guard (not $ isNaN y) z <- someOtherComputation return (if z < 3 then foo z else bar z)

Ketil Malde

7:43 a.m.

New subject: instance Enum Double considered not entirely great?

Daniel Fischer writes:

...

...
...
Btw, -0.0 can be problematic too.

...

...
How so? As far as I can tell Ord and Eq treat it as equal to 0.0 in every way,

...

Yes. Which can be inconvenient if you are interested in whether you got a -0.0, so if that's the case, you can't simply use (== -0.0).

For instance, somebody might have the idea to store floating point values in a HashSet, which might (or might not) produce a different result from the regular Set in this case. Conversely, you might get different values out of the set than the ones you put into it, which could be surprising. IMO, it's definitely a good practice to avoid otherwise distinguishable values comparing as equal. -k -- If I haven't seen further, it is by standing in the footprints of giants

Casey McCann

6:39 p.m.

New subject: instance Enum Double considered not entirely great?

On Wed, Sep 21, 2011 at 12:09 AM, Daniel Fischer wrote:

...

Yes. Which can be inconvenient if you are interested in whether you got a -0.0, so if that's the case, you can't simply use (== -0.0). Okay, problematic is a too strong word, but it's another case that may require special treatment.

Hmm. I was going to suggest that it's not a major concern so long as the distinction can't be observed without using functions specific to floating point values, since that preserves consistent behavior for polymorphic functions, but... that's not true, because the sign is preserved when dividing by zero! So we currently have the following behavior: 0 == (-0) = True 1/0 == 1/(-0) = False signum (-0) = 0.0 signum (1/0) = 1.0 signum (1/(-0)) = -1.0 All of which is, I believe, completely correct according to IEEE semantics, but seems to cause very awkward problems for any sensible semantics of Haskell's type classes. ...sigh.

...

...
which is correct and shouldn't break any expected behavior. I don't think it's required that distinguishable values be unequal,

But desirable, IMO.

I'm ambivalent. I can see it making sense for truly equivalent values, where there's a reasonable expectation that anything using them should give the same answer, or when there's a clearly-defined normal form that values may be reduced to. But as demonstrated above, this isn't the case with signed zeros if Num is available as well as Eq.

...

...
I still don't see why it makes sense to add separate IEEE comparisons

Pure and simple: speed. That is what the machine instructions, and hence the primops, deliver.

Oh, I assume the IEEE operations would be available no matter what, possibly as separate operations monomorphic to Float and Double, that they'd be used to define the partial ordering instance, and could be imported directly from some appropriate module. But as it turns out the partial ordering isn't valid anyway, so I retract this whole line of argument.

...

...
Ah, yes, wherein someone suggested that comparing to NaN should be a runtime error rather than give incorrect results. A strictly more correct approach, but not one I find satisfactory...

Umm, 'more correct' only in some sense. Definitely unsatisfactory.

More correct in the very narrow sense of producing fewer incorrect answers, according to Haskell semantics. :] That it would produce fewer answers in general and a great deal more bottoms is another matter. Certainly not useful, and in fact actively counterproductive given that the whole purpose of silent NaNs is to allow computations to proceed without handling exceptions at every step along the way. I'm becoming increasingly convinced that the only strictly coherent approach in the overall scheme of things would be to banish floating point values from most of the standard libraries except where they can be given correct implementations according to Haskell semantics, and instead provide a module (not re-exported by the Prelude) that gives operations using precise IEEE semantics and access to all the expected primops and such. As you said above, the importance of floating point values is for speed, and the IEEE semantics are designed to support that. So I'm happy to consider floats as purely a performance optimization that should only be used when number crunching is actually a bottleneck. Let Rational be the default fractional type instead and save everyone a bunch of headaches. - C.

Daniel Fischer

8:06 p.m.

New subject: instance Enum Double considered not entirely great?

On Wednesday 21 September 2011, 20:39:09, Casey McCann wrote:

...

On Wed, Sep 21, 2011 at 12:09 AM, Daniel Fischer

wrote:

...
Yes. Which can be inconvenient if you are interested in whether you got a -0.0, so if that's the case, you can't simply use (== -0.0). Okay, problematic is a too strong word, but it's another case that may require special treatment.

Hmm. I was going to suggest that it's not a major concern so long as the distinction can't be observed without using functions specific to floating point values, since that preserves consistent behavior for polymorphic functions, but... that's not true, because the sign is preserved when dividing by zero! So we currently have the following behavior:

0 == (-0) = True 1/0 == 1/(-0) = False signum (-0) = 0.0 signum (1/0) = 1.0 signum (1/(-0)) = -1.0

All of which is, I believe, completely correct according to IEEE semantics,

Yup.

...

but seems to cause very awkward problems for any sensible semantics of Haskell's type classes.

Well, that's something you risk whenever you have an Eq instance regarding some non-identical values as equal. Some function may distinguish between them, cf. e.g. showTree in Data.Set/Map for a non-floating-point example.

...

...sigh.

...
...
which is correct and shouldn't break any expected behavior. I don't think it's required that distinguishable values be unequal,

But desirable, IMO.

I'm ambivalent. I can see it making sense for truly equivalent values, where there's a reasonable expectation that anything using them should give the same answer, or when there's a clearly-defined normal form that values may be reduced to.

Yes, it's not an absolute, but if your Eq instance declares distinguishable values equal, you better have a very good reason for it. The reason for Data.Set/Map is good enough, I think. -0.0 == 0.0 is borderline. If Double/Float get Eq and Ord instances avoiding the NaN poison, I'd prefer to distinguish -0.0 from 0.0 too, leaving the identification to the IEEE comparisons.

...

But as demonstrated above, this isn't the case with signed zeros if Num is available as well as Eq.

...
...
I still don't see why it makes sense to add separate IEEE comparisons

Pure and simple: speed. That is what the machine instructions, and hence the primops, deliver.

Oh, I assume the IEEE operations would be available no matter what, possibly as separate operations monomorphic to Float and Double, that

That too, but I want to keep the polymorphic variants available, it's easier to change a few type signatures near the top than hunting through the entire project to replace eqDouble with eqFloat etc. and recompile everything.

...

they'd be used to define the partial ordering instance, and could be imported directly from some appropriate module.

But as it turns out the partial ordering isn't valid anyway, so I retract this whole line of argument.

...
...
Ah, yes, wherein someone suggested that comparing to NaN should be a runtime error rather than give incorrect results. A strictly more correct approach, but not one I find satisfactory...

Umm, 'more correct' only in some sense. Definitely unsatisfactory.

More correct in the very narrow sense of producing fewer incorrect answers, according to Haskell semantics. :] That it would produce fewer answers in general and a great deal more bottoms is another matter. Certainly not useful, and in fact actively counterproductive given that the whole purpose of silent NaNs is to allow computations to proceed without handling exceptions at every step along the way.

Quite.

...

I'm becoming increasingly convinced that the only strictly coherent approach in the overall scheme of things would be to banish floating point values from most of the standard libraries except where they can

Hmm. I don't particularly like that idea. Correctly handling floating point numbers isn't trivial - So What? They're extremely useful, they deserve their place. Put a bumper over the sharpest edges, write "Enter at your own risk" on the garage door, that's enough.

...

be given correct implementations according to Haskell semantics, and instead provide a module (not re-exported by the Prelude) that gives operations using precise IEEE semantics and access to all the expected primops and such. As you said above, the importance of floating point values is for speed, and the IEEE semantics are designed to support that. So I'm happy to consider floats as purely a performance optimization that should only be used when number crunching is actually a bottleneck.

...

Let Rational be the default fractional type instead and save everyone a bunch of headaches.

If only things were so easy. You can't satisfactorily define functions like sqrt, exp, log, sin, cos ... for Rational, so for a large class of tasks you need floating point numbers (yes, one could also use arbitrary precision numbers of some kind), regardless of performance considerations. And unfortunately even plain arithmetic quickly leads to huge numbers with Rational. When dealing with fractional numbers, you can only choose which headache you prefer. Each is the lesser evil for some tasks.

Daniel Fischer

20 Sep 20 Sep

10:21 p.m.

New subject: instance Enum Double considered not entirely great?

On Tuesday 20 September 2011, 23:56:53, Evan Laforge wrote:

...

...
I actually think the brokenness of Ord for floating point values is worse in many ways, as demonstrated by the ability to insert a value into a Data.Set.Set and have other values "disappear" from the set as a result. Getting an unexpected element in a list doesn't really seem as bad as silently corrupting entire data structures.

Whoah, that's scary. What are some examples of this happening? Does this mean it's unsafe to store Doubles in a Map?

Too lazy to work out the details, but since NaN ? x = False for ? any of <, >, <=, >=, (==, /=) and compare is defined on terms of these, all results of compare involving a NaN are GT. member and insert in Data.set use compare to find out where to go, so inserting NaNs puts them at the max position. Insert a couple, and rebalancing can put one above non-NaN values, oops.

Brandon Allbery

21 Sep 21 Sep

1:15 a.m.

New subject: instance Enum Double considered not entirely great?

On Tue, Sep 20, 2011 at 17:56, Evan Laforge wrote:

...

...
I actually think the brokenness of Ord for floating point values is worse in many ways, as demonstrated by the ability to insert a value into a Data.Set.Set and have other values "disappear" from the set as

Whoah, that's scary. What are some examples of this happening? Does this mean it's unsafe to store Doubles in a Map?

It's unsafe to use them as a key ((Set a) being effectively (Map a ())), but that should be pretty obvious anyway given that (==) isn't reliable for Double. -- brandon s allbery allbery.b@gmail.com wandering unix systems administrator (available) (412) 475-9364 vm/sms

Chris Smith

20 Sep 20 Sep

4:24 p.m.

On Tue, 2011-09-20 at 17:39 +0200, Ketil Malde wrote:

...

You forgot "confusing"?

I didn't forget it; whether it's confusing or not depends on the perspective you're coming from. The kids in my beginning programming class are using Enum (via the list syntactic sugar) on Float and don't get confused... so perhaps we ought to ask what the cause of the confusion is.

...

Expecting Enum to enumerate all inhabitants of a type seems very reasonable to me, and seems to hold for all non-floating point types.

Floating point (and fixed point, for that matter) types approximate real numbers, which of course have no possible enumeration of all values. Even if you want to say they approximate rational numbers, it remains the case that the rationals have no linearly ordered enumeration of all their values, which would be needed to be compatible with the approximation. It seems to me particularly pointless to define an Enum instance that focuses on, above all else, the inaccuracy of that approximation. Incidentally, you can add Rational to the list of types that define Enum that way and don't enumerate all possible values. And the Haskell Report gives a generic implementation of Enum in terms of Num, which behaves that way. Perhaps I was understating the case in saying the behavior was established but undocumented; rather, it's explicitly documented in the Haskell Report, just not as a requirement for programmer-defined instances of the Num class (because that's not the job of the Report).

...

Or just avoid Enum, and define "range" or something similar instead.

If Haskell defined list syntax in terms of something that's not called Enum, that would be fine. Renaming is never all that big a deal. But the list sugar is a big deal, and I don't think there's any point at all in leaving the list sugar associated with something as minor as building a representation of the inaccuracy of your approximations. -- Chris

Ketil Malde

10:04 p.m.

New subject: instance Enum Double considered not entirely great?

...

...
You forgot "confusing"?

...

I didn't forget it

Sorry, I should have written "omitted".

...

perhaps we ought to ask what the cause of the confusion is.

tr.v. e·nu·mer·at·ed, e·nu·mer·at·ing, e·nu·mer·ates 1. To count off or name one by one; list: A spokesperson enumerated the strikers' demands. 2. To determine the number of; count. Regardless of how easily the children you teach get it, to me enumerating is a different thing from repeatedly adding one.

...

...
Expecting Enum to enumerate all inhabitants of a type seems very reasonable to me, and seems to hold for all non-floating point types.

...

Floating point (and fixed point, for that matter) types approximate real numbers, which of course have no possible enumeration of all values. [..] It seems to me particularly pointless to define an Enum instance that focuses on, above all else, the inaccuracy of that approximation.

Yes. But we need an Enum instance to get the syntactic sugar of [1..10], so one is defined anyway.

...

Perhaps I was understating the case in saying the behavior was established but undocumented; rather, it's explicitly documented in the Haskell Report.

Absolutely, it is, as I've said, the reality - like it or not.

...

...
Or just avoid Enum, and define "range" or something similar instead.

...

If Haskell defined list syntax in terms of something that's not called Enum, that would be fine. Renaming is never all that big a deal. But the list sugar is a big deal, and I don't think there's any point at all in leaving the list sugar associated with something as minor as building a representation of the inaccuracy of your approximations.

I must admit I don't understand this comment. If the fixpoint library wants to provide the functionality (producing all values between two points), and can't/shouldn't use Enum, surely it must provide a different function, and let go of the list sugar? -k -- If I haven't seen further, it is by standing in the footprints of giants

Chris Smith

10:11 p.m.

On Wed, 2011-09-21 at 00:04 +0200, Ketil Malde wrote:

...

...
If Haskell defined list syntax in terms of something that's not called Enum, that would be fine. Renaming is never all that big a deal. But the list sugar is a big deal, and I don't think there's any point at all in leaving the list sugar associated with something as minor as building a representation of the inaccuracy of your approximations.

I must admit I don't understand this comment. If the fixpoint library wants to provide the functionality (producing all values between two points), and can't/shouldn't use Enum, surely it must provide a different function, and let go of the list sugar?

Sorry to be unclear. I mean that instead of removing a useful instance, if people find the use of Enum for Float to be objectionable, then perhaps (via language extensions, deprecation, all the usual backward compatibility slow-change stuff) the desugaring of list ranges should be changed to not use something with a name you'd object to, rather than just removing the feature. In any case, as long as Enum *is* the backing for list desugaring, it seems like a mistake to define instances that are completely unuseful for list desugaring. -- Chris

Richard O'Keefe

21 Sep 21 Sep

3:29 a.m.

New subject: instance Enum Double considered not entirely great?

On 21/09/2011, at 2:59 AM, Chris Smith wrote:

...

On Mon, 2011-09-19 at 22:09 -0700, Evan Laforge wrote:

...
Then I tried switching to a fixed point format, and discovered my mistake. Enum is supposed to enumerate every value between the two points, and the result is memory exhaustion.

% ghci Prelude> [1.0..2.0]::[Double] [1.0,2.0] (..) for Doubles is using (+1), not nextAfter, and is NOT enumerating every value between 1.0 and 2.0

...

import Ratio Prelude Ratio> [1%2..7%2] :: [Ratio Int] [1 % 2,3 % 2,5 % 2,7 % 2]

(..) for (Ratio a) is using (+1), and is NOT enumerating the infinitely many values between 1.5 and 3.5. Why should your fixed point format behave any differently?

...

I certainly hope not. Instead, perhaps the issue should be brought up with the fixed-point number library you're using, and they could fix their Enum instance to be more helpful.

So Chris Smith's response is absolutely right here; the problem is the fixed-point library implementing enumeration incompatibly with the built-in numeric types.

Jake McArthur

11:42 a.m.

New subject: instance Enum Double considered not entirely great?

On Tue, Sep 20, 2011 at 11:29 PM, Richard O'Keefe wrote:

...

On 21/09/2011, at 2:59 AM, Chris Smith wrote:

...
On Mon, 2011-09-19 at 22:09 -0700, Evan Laforge wrote:

...
Then I tried switching to a fixed point format, and discovered my mistake. Enum is supposed to enumerate every value between the two points, and the result is memory exhaustion.

% ghci Prelude> [1.0..2.0]::[Double] [1.0,2.0]

(..) for Doubles is using (+1), not nextAfter, and is NOT enumerating every value between 1.0 and 2.0

...
import Ratio Prelude Ratio> [1%2..7%2] :: [Ratio Int] [1 % 2,3 % 2,5 % 2,7 % 2]

(..) for (Ratio a) is using (+1), and is NOT enumerating the infinitely many values between 1.5 and 3.5.

Why should your fixed point format behave any differently?

Evan's claim was that Double and Ratio are doing the incorrect thing; the evidence you gave may support your point, but it supports his as well. My claim now, and I think Evan agrees although I am not sure, is that Double and Ratio shouldn't be instances of Enum at all, since enumerating (a simulation of) the reals and enumerating the rationals in order is nonsensical. I also find that toEnum . fromEnum /= id annoying; anything that relies on it, like EnumSet/EnumMap [1], goes down the toilet. Other things I think are reasonable to expect are also broken; for example, toEnum . succ . fromEnum /= succ (granted, it is reasonable to expect this to be broken considering that toEnum . fromEnum is broken). With fixed point numbers, it makes sense to have an Enum instance. Enumeration is reasonable because most applications for fixed point arithmetic do *not* want to pretend that they are real numbers; you almost always want to be aware of the current precision and whether you might overflow or need more precision. This situation is no different from Word or Int. toEnum and fromEnum are also inverses. No expectations are violated here unless you have already gotten used to the broken Float, Double, and Rational instances. - Jake [1] http://www.haskell.org/haskellwiki/EnumSet_EnumMap

Richard O'Keefe

22 Sep 22 Sep

11:02 p.m.

New subject: instance Enum Double considered not entirely great?

On 21/09/2011, at 11:42 PM, Jake McArthur wrote:

...

With fixed point numbers, it makes sense to have an Enum instance.

What is the use case?

...

Enumeration is reasonable because most applications for fixed point arithmetic do *not* want to pretend that they are real numbers;

But that does not mean you want to pretend they are integers, and having an Enum instance is basically about pretending to be integers.

...

you almost always want to be aware of the current precision and whether you might overflow or need more precision.

There are at least two defensible understandings of what a fixed point number means. One is appropriate for finance, which is that the numbers are exact rational numbers of the form m/b^n for integer m, n and integer b > 1. (For example, when I was born, it made sense to think of money as m/960, where m is the number of farthings, giving you a fixed point number representing pounds.) On that reading, addition, subtraction, multiplication, integer quotient, and remainder are exact, and each other division has to be given an explicit rounding method. It is difficult to fit this understanding into Haskell (although given the fact that it _is_ possible to do type-level arithmetic, not _impossible_). The real problem is fitting it into the class system, because (+) :: Fixed m -> Fixed n -> Fixed (Max m n) (*) :: Fixed m -> Fixed n -> Fixed (Plus m n), while compare :: Fixed m -> Fixed n -> Ordering makes sense for any (types representing naturals) m, n. The other understanding is appropriate for engineering (think of ADCs and DACs) and is that the numbers are approximate. That seems to be what you have in mind. Across the spectrum of programming languages, other understandings also exist: I'm aware of one programming language where "fixed" point numbers are limited to 31 digits of precision and morph into a weird sort of floating point rather than go over the precision limit, and another where fixed point numbers are really arbitrary precision rationals that *print* to limited precision (OUCH).

...

This situation is no different from Word or Int. toEnum and fromEnum are also inverses. No expectations are violated here unless you have already gotten used to the broken Float, Double, and Rational instances.

Let's face it, Enum badly needs some revision. We have toEnum :: Int -> a fromEnum :: a -> Int and yet we have instance Enum Integer How is _that_ supposed to work? Or instance Enum Int64 on a system where Int is 32 bits? And yet '..' syntax makes perfect sense for any size of integer. I do think that '..' syntax for Float and Double could be useful, but the actual definition is such that, well, words fail me. [1.0..3.5] => [1.0,2.0,3.0,4.0] ???? Why did anyone ever think _that_ was a good idea? I would love to see a law that (Ord t, Enum t) => (∀a, b :: t) x ∈ [a..b] ⇒ a ≤ x && x ≤ b -- not valid (sigh) This is not the same as a law that (Ord t, Enum t) => (∀a, b, x :: t) a ≤ x && x ≤ b ⇒ x ∈ [a..b] -- not valid (ho hum) As things currently stand, neither of these laws is valid. It is even easy to find a value for b :: Double such that [b..b] is empty. Not good. As things c

Jake McArthur

23 Sep 23 Sep

12:05 a.m.

New subject: instance Enum Double considered not entirely great?

On Thu, Sep 22, 2011 at 7:02 PM, Richard O'Keefe wrote:

...

On 21/09/2011, at 11:42 PM, Jake McArthur wrote:

...
With fixed point numbers, it makes sense to have an Enum instance.

What is the use case?

I'm not quite sure how to answer this. I'm speaking in a fairly general sense.

...

But that does not mean you want to pretend they are integers, and having an Enum instance is basically about pretending to be integers.

Even if you are trying to fit your explanation of Enum to the existing behavior, I disagree because ([0.0, 0.1 .. 1.0] :: Double) is inconsistent with that explanation. I think the only reasonable characterization of Enum's current behavior is that it provides some way to take same-sized steps through a range of values, and that's about all it promises.

...

There are at least two defensible understandings of what a fixed point number means. One is appropriate for finance, which is that the numbers are exact rational numbers of the form m/b^n for integer m, n and integer b > 1. [...]

The other understanding is appropriate for engineering (think of ADCs and DACs) and is that the numbers are approximate. That seems to be what you have in mind.

I have both in mind, although it's actually the former style of fixed point arithmetic that I'm more interested in (and I disagree that there is such a distinction between finance and engineering... precision is very important in engineering). The former is not as difficult as you have characterized it, and my fixed-point package provides an interface similar to (although still different from) what you described. Care is taken to make the precision clear in the types and documentation. There isn't currently much choice for rounding methods for division, but this is something I intend to correct in the future. Evan also pointed out some bugs and other lacking functionality which I intend to address this weekend. Fixed point arithmetic is really only useful as an approximation when you have a statically bounded range of values to work in; otherwise, you might as well just use floating point arithmetic anyway.

...

Let's face it, Enum badly needs some revision. [...]

Agreed.

...

As things c

I think your email got cut off here. - Jake

Chris Smith

25 Sep 25 Sep

5:20 p.m.

Would it be an accurate summary of this thread that people are asking for (not including quibbles about naming and a few types): class Ord a => Enum a where succ :: a -> a pred :: a -> a fromEnum :: a -> Int(eger) toEnum :: Int(eger) -> a -- No instance for Float/Double class Ord a => Range a where rangeFromTo :: a -> a -> [a] -- subsumes Ix.range / Enum.enumFromTo rangeFromThenTo :: a -> a -> a -> [a] inRange :: (a, a) -> a -> Bool -- Does have instances for Float/Double. List ranges desugar to this. -- Also has instances for tuples class Range a => InfiniteRange a where -- [1] rangeFrom :: a -> [a] rangeFromThen :: a -> a -> [a] -- Has instances for Float/Double -- No instances for tuples class Range a => Ix a where index :: (a, a) -> a -> Int rangeSize :: (a, a) -> Int -- Again no instances for Float/Double. Having an instance here implies -- that the rangeFrom* are "complete", containing all 'inRange' values class (RealFrac a, Floating a) => RealFloat a where ... -- existing stuff (.<.), (.<=.), (.>.), (.>=.), (.==.) :: a -> a -> Bool -- these are IEEE semantics when applicable instance Ord Float where ... -- real Ord instance where NaN has a place There would be the obvious properties stated for types that are instances of both Enum and Range, but this allows for non-Enum types to still be Range instances. If there's general agreement on this, then we at least have a proposal, and one that doesn't massively complicate the existing system. The next step, I suppose would be to implement it in an AltPrelude module and (sadly, since Enum is changing meaning) a trivial GHC language extension. Then the real hard work of convincing more people to use it would start. If that succeeds, the next hard work would be finding a compatible way to make the transition... I'm not happy with InfiniteRange, but I imagine the alternative (runtime errors) won't be popular in the present crowd. -- Chris

Daniel Fischer

8:41 p.m.

New subject: instance Enum Double considered not entirely great?

On Sunday 25 September 2011, 19:20:52, Chris Smith wrote:

...

Would it be an accurate summary of this thread that people are asking for (not including quibbles about naming and a few types):

Not quite, I'm afraid.

...

class Ord a => Enum a where succ :: a -> a pred :: a -> a fromEnum :: a -> Int(eger) toEnum :: Int(eger) -> a -- No instance for Float/Double

I'm not in favour of introducing an Ord constraint here. For data WeekDay = Sunday ... data Month = January ... an Ord instance would be dubious, but Enum is plenty fine.

...

class Ord a => Range a where rangeFromTo :: a -> a -> [a] -- subsumes Ix.range / Enum.enumFromTo rangeFromThenTo :: a -> a -> a -> [a] inRange :: (a, a) -> a -> Bool -- Does have instances for Float/Double. List ranges desugar to this. -- Also has instances for tuples

Don't mix range and arithmetic sequences. I want arithmetic sequences for Double, Float and Rational, but not range. (For Float and Double one could implement range [all values between the given bounds, in increasing order, would be the desired/expected semantics for that, I think?], but I'm rather sure that's not what one does normally want, and for Rational, you can't even implement it.) Also, I doubt whether rangeFromThenTo is a useful addition to range, I don't see how it would be natural for tuples. (The Ix instance for tuples doesn't use the lexicographic ordering, but the box-partial order - presumably so would the Range instance, so the 'distance' between two tuples would depend on the given bounds. Using the box-partial order is fine for range, but seems weird for blahFromThenTo.)

...

class Range a => InfiniteRange a where -- [1] rangeFrom :: a -> [a] rangeFromThen :: a -> a -> [a] -- Has instances for Float/Double -- No instances for tuples

class Range a => Ix a where index :: (a, a) -> a -> Int rangeSize :: (a, a) -> Int

-- Again no instances for Float/Double. Having an instance here implies -- that the rangeFrom* are "complete", containing all 'inRange' values

Ho Hum. So Range would continue the same ambiguity/confusion that started this thread, albeit in mitigated form. Separating range from arithmetic (or 'fixed-step-size') sequences is cleaner (we'd lose default methods anyway, you need Enum or Num && Ord for them, but we now have numericEnumFrom* to make Enum instances for Num types easier, we could move the current default methods out of the class to have enumEnumFrom* so that writing instances for Enum types would be easier).

...

class (RealFrac a, Floating a) => RealFloat a where ... -- existing stuff (.<.), (.<=.), (.>.), (.>=.), (.==.) :: a -> a -> Bool -- these are IEEE semantics when applicable

instance Ord Float where ... -- real Ord instance where NaN has a place

Yes. I have pondered leaving Eq and Ord for Double and Float as is and providing a newtype wrapper with container/sort-safe instances, but that'd be cumbersome, people wouldn't know they exist and (when) they have to use them, urk. Also, although it's a change in behaviour, it doesn't badly break backwards compatibility., as far as I can see (I hope x /= x isn't heavily used as a NaN test). So yes, definitely yes.

...

There would be the obvious properties stated for types that are instances of both Enum and Range, but this allows for non-Enum types to still be Range instances.

If there's general agreement on this, then we at least have a proposal, and one that doesn't massively complicate the existing system. The next step, I suppose would be to implement it in an AltPrelude module and (sadly, since Enum is changing meaning) a trivial GHC language extension. Then the real hard work of convincing more people to use it would start. If that succeeds, the next hard work would be finding a compatible way to make the transition...

I'm not happy with InfiniteRange, but I imagine the alternative (runtime errors) won't be popular in the present crowd.

Chris Smith

9:13 p.m.

...

Don't mix range and arithmetic sequences. I want arithmetic sequences for Double, Float and Rational, but not range. (For Float and Double one could implement range [all values between the given bounds, in increasing order, would be the desired/expected semantics for that, I think?],

Okay, fine, I tried. Obviously, I'm opposed to just flat removing features from the language, especially when they are so useful that they are being used without any difficulty at all by the 12 year olds I'm teaching right now. Someone (sorry, not me) should really write up the proposed change to Ord for Float/Double and shepherd them through the haskell-prime process. That one shouldn't even be controversial; there's already an isNaN people should be using for NaN checks, and any code relying on the current behavior is for all intents and purposes broken anyway. The only question is whether to add the new methods to RealFloat (breaking on the bizarre off chance that someone has written a nonstandard RealFloat instance), or add a new IEEE type class. -- Chris Smith

Daniel Fischer

10:08 p.m.

New subject: instance Enum Double considered not entirely great?

On Sunday 25 September 2011, 23:13:47, Chris Smith wrote:

...

...
Don't mix range and arithmetic sequences. I want arithmetic sequences for Double, Float and Rational, but not range. (For Float and Double one could implement range [all values between the given bounds, in increasing order, would be the desired/expected semantics for that, I think?],

Okay, fine, I tried. Obviously, I'm opposed to just flat removing features from the language, especially when they are so useful that they are being used without any difficulty at all by the 12 year olds I'm teaching right now.

Agreed. But If we want a change to remove a wart, we should try to remove it completely. We can still settle for "make it smaller" if it doesn't work out.

...

Someone (sorry, not me) should really write up the proposed change to Ord for Float/Double

Okay.

...

and shepherd them through the haskell-prime process.

Uh oh. I ope that can be done with a libraries proposal. (Ian says yes :-D)

...

That one shouldn't even be controversial; there's already an isNaN people should be using for NaN checks, and any code relying on the current behavior is for all intents and purposes broken anyway. The only question is whether to add the new methods to RealFloat (breaking on the bizarre off chance that someone has written a nonstandard RealFloat instance), or add a new IEEE type class.

Add to RealFloat, default to the Eq/Ord functions, I'd say. But that's not the only question. Is -0.0 == 0.0 or not? I lean towards no because of 1/x, but I'm not wedded to that. And: distinguish NaNs or identify them all? I lean towards identifying them all, I've never cared for whether they come from 0/0, Infinity - Infinity or what, but I could be convinced.

Richard O'Keefe

26 Sep 26 Sep

9:34 p.m.

New subject: instance Enum Double considered not entirely great?

On 26/09/2011, at 11:08 AM, Daniel Fischer wrote:

...

And: distinguish NaNs or identify them all? I lean towards identifying them all, I've never cared for whether they come from 0/0, Infinity - Infinity or what, but I could be convinced.

There are very many bit patterns that count as NaNs, but only two classes worth distinguishing: qNaNs and sNaNs. The payload bits have no cross-platform significance; if it comes to that, there are two opposing conventions for which value of the is-it-quiet bit, so it's not clear if any distinction is worth making.

Yitzchak Gale

3:52 p.m.

New subject: instance Enum Double considered not entirely great?

Chris Smith wrote:

...

class Ord a => Range a where...

Before adding a completely new Range class, I would suggest considering Paul Johnson's Ranged-sets package: http://hackage.haskell.org/package/Ranged-sets Ranges have many more natural operations, and interactions with other classes, than you mention. The Ranged-sets package is well thought out and general. It distinguishes carefully between discrete and non-discrete types, and provides appropriate instances for the standard types. Thanks, Yitz

Chris Smith

5:54 p.m.

On Mon, 2011-09-26 at 18:52 +0300, Yitzchak Gale wrote:

...

Chris Smith wrote:

...
class Ord a => Range a where...

Before adding a completely new Range class, I would suggest considering Paul Johnson's Ranged-sets package:

Well, my goal was to try to find a minimal and simple answer that doesn't break anything or add more complexity. So I don't personally find the idea of adding multiple *more* type classes appealing. In any case, it doesn't make much difference either way. It's clear that no one is going to be satisfied here, so there's really no point in making any change. In fact, if this conversation leads to changes, it looks like it will just break a bunch of code and make Haskell harder to use. -- Chris Smith

Roman Leshchinskiy

28 Sep 28 Sep

9:56 p.m.

New subject: instance Enum Double considered not entirely great?

On 25/09/2011, at 18:20, Chris Smith wrote:

...

class Ord a => Range a where rangeFromTo :: a -> a -> [a] -- subsumes Ix.range / Enum.enumFromTo rangeFromThenTo :: a -> a -> a -> [a] inRange :: (a, a) -> a -> Bool -- Does have instances for Float/Double. List ranges desugar to this. -- Also has instances for tuples

class Range a => InfiniteRange a where -- [1] rangeFrom :: a -> [a] rangeFromThen :: a -> a -> [a] -- Has instances for Float/Double -- No instances for tuples

I realise I'm slightly late to the discussion but IMO, the rangeFrom* (or enumFrom*) functions shouldn't be methods. Rather, a redesign of Enum should ensure that they can be defined generically for all types. The rationale is that other data structures (like arrays) want to provide similar functions without having to go through lists. Roman

Ivan Lazar Miljenovic

10:23 p.m.

New subject: instance Enum Double considered not entirely great?

On 29 September 2011 07:56, Roman Leshchinskiy wrote:

...

On 25/09/2011, at 18:20, Chris Smith wrote:

...
class Ord a => Range a where rangeFromTo :: a -> a -> [a] -- subsumes Ix.range / Enum.enumFromTo rangeFromThenTo :: a -> a -> a -> [a] inRange :: (a, a) -> a -> Bool -- Does have instances for Float/Double. List ranges desugar to this. -- Also has instances for tuples

class Range a => InfiniteRange a where -- [1] rangeFrom :: a -> [a] rangeFromThen :: a -> a -> [a] -- Has instances for Float/Double -- No instances for tuples

I realise I'm slightly late to the discussion but IMO, the rangeFrom* (or enumFrom*) functions shouldn't be methods. Rather, a redesign of Enum should ensure that they can be defined generically for all types. The rationale is that other data structures (like arrays) want to provide similar functions without having to go through lists.

Wouldn't this require something like the ListLike class? -- Ivan Lazar Miljenovic Ivan.Miljenovic@gmail.com IvanMiljenovic.wordpress.com

Roman Leshchinskiy

10:50 p.m.

New subject: instance Enum Double considered not entirely great?

On 28/09/2011, at 23:23, Ivan Lazar Miljenovic wrote:

...

On 29 September 2011 07:56, Roman Leshchinskiy wrote:

...
On 25/09/2011, at 18:20, Chris Smith wrote:

...
class Ord a => Range a where rangeFromTo :: a -> a -> [a] -- subsumes Ix.range / Enum.enumFromTo rangeFromThenTo :: a -> a -> a -> [a] inRange :: (a, a) -> a -> Bool -- Does have instances for Float/Double. List ranges desugar to this. -- Also has instances for tuples

class Range a => InfiniteRange a where -- [1] rangeFrom :: a -> [a] rangeFromThen :: a -> a -> [a] -- Has instances for Float/Double -- No instances for tuples

I realise I'm slightly late to the discussion but IMO, the rangeFrom* (or enumFrom*) functions shouldn't be methods. Rather, a redesign of Enum should ensure that they can be defined generically for all types. The rationale is that other data structures (like arrays) want to provide similar functions without having to go through lists.

Wouldn't this require something like the ListLike class?

Not at all. You could have something like: class Enum a where enumFromToSize :: a -> a -> Integer advance :: a -> Integer -> a ... And then [x..y] would desugar to map (advance x) (enumFromTo_Integer 0 $ enumFromToSize x y) where enumFromTo_Integer would be primitive. Of course, it's possible to design a much more efficient interface but this should give a general idea. An added benefit would be that you could generate the sequence in parallel (which is quite crucial for, e.g., DPH). Basically, the requirements would be that you can get the size of a range and compute the nth element of a range (or, equivalently, split the range) in constant time. Are there any Enum instances which don't satisfy this (apart from the broken floating point instances which *could* satisfy this)? As it stands, none of the array libraries that I've participated in designing and writing can use the Enum class properly (or, in the case of DPH, at all). For instance, vector has 230 lines of code (including comments) and 16 rules to implement enumFromTo (the vector version) halfway efficiently when the element type is known statically. I haven't bothered with enumFromThenTo so far. Interestingly, GHC's *list* library has to jump through similar hoops to make enumFromTo and enumFromThenTo work with foldr/build fusion (again, only when the element type is known statically). IMO, making enumFromThen and friends into methods just doesn't work, not even for lists really. Roman

Chris Smith

23 Sep 23 Sep

4:06 a.m.

On Fri, 2011-09-23 at 11:02 +1200, Richard O'Keefe wrote:

...

I do think that '..' syntax for Float and Double could be useful, but the actual definition is such that, well, words fail me. [1.0..3.5] => [1.0,2.0,3.0,4.0] ???? Why did anyone ever think _that_ was a good idea?

In case you meant that as a question, the reason is this: Prelude> [0.1, 0.2 .. 0.3] [0.1,0.2,0.30000000000000004] Because of rounding error, an implementation that meets your proposed law would have left out 0.3 from that sequence, when of course it was intended to be there. This is messy for the properties you want to state, but it's almost surely the right thing to do in practice. If the list is longer, then the most likely way to get it right is to follow the behavior as currently specified. Of course it's messy, but the world is a messy place, especially when it comes to floating point arithmetic. If you can clear this up with a better explanation of the properties, great! But if you can't, then we ought to reject the kind of thinking that would remove useful behavior when it doesn't fit some theoretical properties that looked nice until you consider the edge cases. -- Chris

Evan Laforge

5:09 a.m.

New subject: instance Enum Double considered not entirely great?

...

If you can clear this up with a better explanation of the properties, great! But if you can't, then we ought to reject the kind of thinking that would remove useful behavior when it doesn't fit some theoretical properties that looked nice until you consider the edge cases.

It's not just theoretical properties, I've spent hours tracking down a bug due to this.

Richard O'Keefe

26 Sep 26 Sep

8:02 a.m.

New subject: instance Enum Double considered not entirely great?

On 23/09/2011, at 4:06 PM, Chris Smith wrote:

...

On Fri, 2011-09-23 at 11:02 +1200, Richard O'Keefe wrote:

...
I do think that '..' syntax for Float and Double could be useful, but the actual definition is such that, well, words fail me. [1.0..3.5] => [1.0,2.0,3.0,4.0] ???? Why did anyone ever think _that_ was a good idea?

In case you meant that as a question, the reason is this:

Prelude> [0.1, 0.2 .. 0.3] [0.1,0.2,0.30000000000000004]

That shows why it is a *BAD* idea. 0.3 comes out as 0.29999999999999998890 so the final value is clearly and unambiguously *outside* the requested range.

...

Because of rounding error, an implementation that meets your proposed law would have left out 0.3 from that sequence, when of course it was intended to be there.

But the output shown does NOT include 0.3 in the sequence. 0.3 `elem` [0.1, 0.2 .. 0.3] is False.

...

This is messy for the properties you want to state, but it's almost surely the right thing to do in practice.

I flatly deny that. I have access to several programming languages that offer 'REAL DO', including Fortran, R, and Smalltalk. They all do the same thing; NONE of them overshoots the mark. If I *wanted* the range to be enlarged a little bit, I would enlarge it myself: [0.1, 0.2 .. 0.3+0.001] perhaps.

...

If the list is longer, then the most likely way to get it right is to follow the behavior as currently specified.

I don't see the length of the list as having much relevance; if the bug shows up in a list of length 3, it is clearly not likely to be any better for longer lists. This is NOT by any stretch of the imagination, it is a BUG. If you have used REAL DO in almost any other programming language, you will be shocked and dismayed by its behaviour in Haskell. Programming constructs that are implemented to do what would probably meant if you were an idiot instead of what you *asked* for are dangerous.

...

If you can clear this up with a better explanation of the properties, great! But if you can't, then we ought to reject the kind of thinking that would remove useful behavior when it doesn't fit some theoretical properties that looked nice until you consider the edge cases.

I don't see any useful behaviour here. I see an implausibly motivated bug and while I _have_ written REAL DO in the past (because some languages offer only one numeric type), I cannot imagine wishing to do so in Haskell, thanks to this bug. What I want now is a compiler option, on by default, to assure me that I am *not* using floating point numeration in Haskell.

Lennart Augustsson

4:57 p.m.

New subject: instance Enum Double considered not entirely great?

I totally agree with you. Haskell is very broken when it comes to [x..y] for floating point. It's an attempt to make it more "friendly" for naive users, but there is no way FP can be made friendly. Any such attempts will fail, so make it usable for people who understand FP instead. -- Lennart On Mon, Sep 26, 2011 at 10:02 AM, Richard O'Keefe wrote:

...

On 23/09/2011, at 4:06 PM, Chris Smith wrote:

...
On Fri, 2011-09-23 at 11:02 +1200, Richard O'Keefe wrote:

...
I do think that '..' syntax for Float and Double could be useful, but the actual definition is such that, well, words fail me. [1.0..3.5] => [1.0,2.0,3.0,4.0] ???? Why did anyone ever think _that_ was a good idea?

In case you meant that as a question, the reason is this:

Prelude> [0.1, 0.2 .. 0.3] [0.1,0.2,0.30000000000000004]

That shows why it is a *BAD* idea. 0.3 comes out as 0.29999999999999998890 so the final value is clearly and unambiguously *outside* the requested range.

...
Because of rounding error, an implementation that meets your proposed law would have left out 0.3 from that sequence, when of course it was intended to be there.

But the output shown does NOT include 0.3 in the sequence.

0.3 `elem` [0.1, 0.2 .. 0.3]

is False.

...
This is messy for the properties you want to state, but it's almost surely the right thing to do in practice.

I flatly deny that. I have access to several programming languages that offer 'REAL DO', including Fortran, R, and Smalltalk. They all do the same thing; NONE of them overshoots the mark.

If I *wanted* the range to be enlarged a little bit, I would enlarge it myself: [0.1, 0.2 .. 0.3+0.001] perhaps.

...
If the list is longer, then the most likely way to get it right is to follow the behavior as currently specified.

I don't see the length of the list as having much relevance; if the bug shows up in a list of length 3, it is clearly not likely to be any better for longer lists. This is NOT by any stretch of the imagination, it is a BUG. If you have used REAL DO in almost any other programming language, you will be shocked and dismayed by its behaviour in Haskell.

Programming constructs that are implemented to do what would probably meant if you were an idiot instead of what you *asked* for are dangerous.

...
If you can clear this up with a better explanation of the properties, great! But if you can't, then we ought to reject the kind of thinking that would remove useful behavior when it doesn't fit some theoretical properties that looked nice until you consider the edge cases.

I don't see any useful behaviour here. I see an implausibly motivated bug and while I _have_ written REAL DO in the past (because some languages offer only one numeric type), I cannot imagine wishing to do so in Haskell, thanks to this bug. What I want now is a compiler option, on by default, to assure me that I am *not* using floating point numeration in Haskell.

_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Lennart Augustsson

4:53 p.m.

New subject: instance Enum Double considered not entirely great?

If you do [0.1, 0.2 .. 0.3] it should leave out 0.3. This is floating point numbers and if you don't understand them, then don't use them. The current behaviour of .. for floating point is totally broken, IMO. -- Lennart On Fri, Sep 23, 2011 at 6:06 AM, Chris Smith wrote:

...

On Fri, 2011-09-23 at 11:02 +1200, Richard O'Keefe wrote:

...
I do think that '..' syntax for Float and Double could be useful, but the actual definition is such that, well, words fail me. [1.0..3.5] => [1.0,2.0,3.0,4.0] ???? Why did anyone ever think _that_ was a good idea?

In case you meant that as a question, the reason is this:

Prelude> [0.1, 0.2 .. 0.3] [0.1,0.2,0.30000000000000004]

Because of rounding error, an implementation that meets your proposed law would have left out 0.3 from that sequence, when of course it was intended to be there. This is messy for the properties you want to state, but it's almost surely the right thing to do in practice. If the list is longer, then the most likely way to get it right is to follow the behavior as currently specified. Of course it's messy, but the world is a messy place, especially when it comes to floating point arithmetic.

If you can clear this up with a better explanation of the properties, great! But if you can't, then we ought to reject the kind of thinking that would remove useful behavior when it doesn't fit some theoretical properties that looked nice until you consider the edge cases.

-- Chris

_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Chris Smith

5:50 p.m.

On Mon, 2011-09-26 at 18:53 +0200, Lennart Augustsson wrote:

...

If you do [0.1, 0.2 .. 0.3] it should leave out 0.3. This is floating point numbers and if you don't understand them, then don't use them. The current behaviour of .. for floating point is totally broken, IMO.

I'm curious, do you have even a single example of when the current behavior doesn't do what you really wanted anyway? Why would you write an upper bound of 0.3 on a list if you don't expect that to be included in the result? I understand that you can build surprising examples with stuff that no one would really write... but when would you really *want* the behavior that pretends floating point numbers are an exact type and splits hairs? I'd suggest that if you write code that depends on whether 0.1 + 0.1 + 0.1 <= 0.3, for any reason other than to demonstrate rounding error, you're writing broken code. So I don't understand the proposal to change this notation to create a bunch of extra broken code. -- Chris

Richard O'Keefe

27 Sep 27 Sep

12:14 a.m.

New subject: instance Enum Double considered not entirely great?

On 27/09/2011, at 6:50 AM, Chris Smith wrote:

...

On Mon, 2011-09-26 at 18:53 +0200, Lennart Augustsson wrote:

...
If you do [0.1, 0.2 .. 0.3] it should leave out 0.3. This is floating point numbers and if you don't understand them, then don't use them. The current behaviour of .. for floating point is totally broken, IMO.

I'm curious, do you have even a single example of when the current behavior doesn't do what you really wanted anyway? Why would you write an upper bound of 0.3 on a list if you don't expect that to be included in the result?

Because upper bounds are *UPPER BOUNDS* and are NOT as a rule included in the result. If you write [0,2..9] you - DO expect 0 in the result - DON'T expect 9 in the result - would be outraged if 10 were in the result. You would "write an upper bound of 0.3 on a list" if you wanted the list to be bounded above by 0.3.

...

I understand that you can build surprising examples with stuff that no one would really write... but when would you really *want* the behavior that pretends floating point numbers are an exact type and splits hairs?

No, the *existing* behaviour is the behaviour that pretends floating point numbers are exact and splits hairs and fudges results. #include int main(void) { double x; int i; for (i = 0; (x = 0.0 + i*0.1) <= 0.3; i++) printf("%.20f\n", x); return 0; } Output: 0.00000000000000000000 0.10000000000000000555 0.20000000000000001110 Just as expected. An x > 0.3 in the output would be wrong.

Donn Cave

2:54 a.m.

New subject: instance Enum Double considered not entirelygreat?

Quoth "Richard O'Keefe" , [ ... re " Why would you write an upper bound of 0.3 on a list if you don't expect that to be included in the result? " ]

...

Because upper bounds are *UPPER BOUNDS* and are NOT as a rule included in the result. If you write [0,2..9] you - DO expect 0 in the result - DON'T expect 9 in the result - would be outraged if 10 were in the result.

Pardon the questions from the gallery, but ... I can sure see that 0.3 shouldn't be included in the result by overshooting the limit (i.e., 0.30000000000000004), and the above expectations about [0,2..9] are obvious enough, but now I have to ask about [0,2..8] - would you not expect 8 in the result? Or is it not an upper bound? Donn

Chris Smith

3:55 a.m.

New subject: instance Enum Double considered not entirelygreat?

On Mon, 2011-09-26 at 19:54 -0700, Donn Cave wrote:

...

Pardon the questions from the gallery, but ... I can sure see that 0.3 shouldn't be included in the result by overshooting the limit (i.e., 0.30000000000000004), and the above expectations about [0,2..9] are obvious enough, but now I have to ask about [0,2..8] - would you not expect 8 in the result? Or is it not an upper bound?

Donn, [0, 2 .. 8] should be fine no matter the type, because integers of those sizes are all exactly representable in all of the types involved. The reason for the example with a step size of 0.1 is that 0.1 is actually an infinitely repeating number in base 2 (because the denominator has a prime factor of 5). So actually *none* of the exact real numbers 0.1, 0.2, or 0.3 are representable with floating point types. The corresponding literals actually refer to real numbers that are slightly off from those. Furthermore, because the step size is not *exactly* 0.1, when it's added repeatedly in the sequence, the result has some (very small) drift due to repeated rounding error... just enough that by the time you get in the vacinity of 0.3, the corresponding value in the sequence is actually *neither* the rational number 0.3, *nor* the floating point literal 0.3. Instead, it's one ulp larger than the floating point literal because of that drift. So there are two perspectives here. One is that we should think in terms of exact values of the type Float, which means we'd want to exclude it, because it's larger than the top end of the range. The other is that we should think of approximate values of real numbers, in which case it's best to pick the endpoint closest to the stated one, to correct for what's obviously unintended drift due to rounding. So that's what this is about: do we think of Float as an approximate real number type, or as an exact type with specific values. If the latter, then "of course" you exclude the value that's larger than the upper range. If the former, then using comparison operators like '<' implies a proof obligation that the result of the computation remains stable (loosely speaking, the function continuous) at that boundary despite small rounding error in either direction. In that case, creating a language feature where, in the *normal* case of listing the last value you expect in the list, 0.3 (as an approximate real number) is excluded from this list just because of technicalities about the representation is an un-useful implementation, to say the least, and makes it far more difficult to satisfy that proof obligation. Personally, I see floating point values as approximate real numbers. Anything else in unrealistic: the *fact* of the matter is that no one is reasoning about ulps or exact rational values when they use Float and Double. In practice, even hardware implementations of some floating point functions have indeterminate results in the exact sense. Often, the guarantee provided by an FPU is that the result will be within one ulp of the correct answer, which means the exact value of the answer isn't even known! So, should we put all floating point calculations in the IO monad because they aren't pure functions? Or can we admit that people use floating point to approximate reals and accept the looser reasoning? -- Chris Smith

Richard O'Keefe

4:35 a.m.

New subject: instance Enum Double considered not entirelygreat?

On 27/09/2011, at 4:55 PM, Chris Smith wrote:

...

So there are two perspectives here. One is that we should think in terms of exact values of the type Float, which means we'd want to exclude it, because it's larger than the top end of the range. The other is that we should think of approximate values of real numbers, in which case it's best to pick the endpoint closest to the stated one, to correct for what's obviously unintended drift due to rounding.

My perspective on this is neither of the above. My perspective is that if I say I want numbers in the range L..U, anything outside that range is wrong and could be catastrophically wrong.

...

So that's what this is about: do we think of Float as an approximate real number type, or as an exact type with specific values.

That may be the way you see it, but it's not the way I see it. The way I see it is that the *order* properties are more fundamental than the *arithmetic* properties, and that it is far less important to produce value which is approximately the end point (after all, [0,2..9] doesn't include 10) than it is to ensure that all the results are in the interval. And I would be *delighted* if Haskell's compare raised an exception if either argument were a NaN; that's what Real.compare does in SML. Floating point *operations* are approximate; floating point *numbers* are, according to the relevant standards (IEEE, LIA), precise.

Donn Cave

7:29 a.m.

New subject: instance Enum Double considered notentirelygreat?

Quoth Chris Smith , ...

...

So that's what this is about: do we think of Float as an approximate real number type, or as an exact type with specific values. If the latter, then "of course" you exclude the value that's larger than the upper range. If the former, then using comparison operators like '<' implies a proof obligation that the result of the computation remains stable (loosely speaking, the function continuous) at that boundary despite small rounding error in either direction. In that case, creating a language feature where, in the *normal* case of listing the last value you expect in the list, 0.3 (as an approximate real number) is excluded from this list just because of technicalities about the representation is an un-useful implementation, to say the least, and makes it far more difficult to satisfy that proof obligation.

It doesn't appear to me to be a technicality about the representation - the value we're talking about excluding is not just represented as greater than 0.3, it is greater than 0.3 when applied in computations. For example you can subtract 0.3 and get a nonzero value (5.55e-17.) Isn't the problem here with ranges really that they're computed in a way that doesn't work for FP? I mean, when I specify [0.1,0.2..0.5], I really do mean to include 0.5 among those values, as you surmise - so I imply a computation that actually produces that value, i.e., 0.5::Double. The disappointment with iterative addition is not that its fifth value [should be] omitted because it's "technically" greater, it's that range generation via iterative addition does not yield the values I specified. In other words, a range is a shorthand expression, where the correct value is the one that would have resulted from evaluating the corresponding list of constant expressions. If we don't know how to generate that list correctly for Double, then I suppose range shouldn't be supported, but I thinking I'm seeing hints that we may indeed know how to do this? (I sure don't! though of course for a specific case it can be easy, e.g., (map (/ 10.0) [1..5]).) Donn

Chris Smith

3:49 p.m.

New subject: instance Enum Double considered notentirelygreat?

On Tue, 2011-09-27 at 00:29 -0700, Donn Cave wrote:

...

It doesn't appear to me to be a technicality about the representation - the value we're talking about excluding is not just represented as greater than 0.3, it is greater than 0.3 when applied in computations.

Sure, the exact value is greater than 0.3. But to *predict* that, you have to know quite a bit about the technicalities of how floating point values are represented. For example, you need to know that 0.1 has no exact representation as a floating point number, and that the closest approximation is greater than the exact real number 0.1, and that the difference is great enough that adding it twice adds up to a full ulp of error.

...

For example you can subtract 0.3 and get a nonzero value (5.55e-17.)

Again, if you're working with floating point numbers and your program behaves in a significantly different way depending on whether you get 0 or 5.55e-17 as a result, then you're doing something wrong.

...

The disappointment with iterative addition is not that its fifth value [should be] omitted because it's "technically" greater, it's that range generation via iterative addition does not yield the values I specified.

I certainly don't agree that wanting the exact value from a floating point type is a reasonable expectation. The *only* way to recover those results is to do the math with the decimal or rational values instead of floating point numbers. You'll get the rounding error from floating point regardless of how you do the computation, because the interval just isn't really 0.1. The difference between those numbers is larger than 0.1, and when you step by that interval, you won't hit 0.5. You could calculate the entire range using Rational and then convert each individual value after the fact. That doesn't seem like a reasonable default, since it has a runtime performance cost. Of course you're welcome to do it when that's what you need.

...

last ([0.1, 0.2 .. 0.5]) == 0.5 False

...

last (map fromRational [0.1, 0.2 .. 0.5]) == 0.5 True