Repair to floating point enumerations?

Dear Haskell-Primers (and libraries). Recently, Phil Wadler has pointed out a weird anomaly in the Haskell'98 Prelude, regarding numeric enumerations for Floats/Doubles: Prelude> [0, 0.3 .. 1.1] [0.0,0.3,0.6,0.899999,1.2] What is odd is that the last value of the list is much larger than the specified termination value. But this is exactly as specified by the Haskell'98 Report. http://haskell.org/onlinereport/basic.html#sect6.3.4 "For Float and Double, the semantics of the enumFrom family is given by the rules for Int above, except that the list terminates when the elements become greater than e3+i/2 for positive increment i, or when they become less than e3+i/2 for negative i. We have discussed this question (and related ones, such as whether Float and Double belong in the Enum class at all) several times before, and I do not wish to rehash all of those points again e.g.: http://www.cse.unsw.edu.au/~dons/haskell-1990-2000/msg07289.html http://www.haskell.org/pipermail/haskell/2001-October/008218.html http://www.haskell.org/pipermail/haskell/2002-October/010607.html Phil proposes that, although retaining the instances of Enum for Float and Double, we simplify the definitions of the numericEnumFrom family: numericEnumFromThenTo :: (Fractional a, Ord a) => a -> a -> a -> [a] numericEnumFrom = iterate (+1) numericEnumFromThen n m = iterate (+(m-n)) n numericEnumFromTo n m = takeWhile (<= m) (numericEnumFrom n) numericEnumFromThenTo n m p = takeWhile (<= p) (numericEnumFromThen n m) The particular feature of note is that the odd fudge factor of (1/2 * the increment) is removed. The inexact nature of floating point numbers would therefore cause a specification like [ 0.0, 0.1 .. 0.3 ] to yield the sequence [ 0.0, 0.1, 0.2 ] that is, to omit the upper bound, because (3 * 0.1) is actually represented as 0.30000000000004, strictly greater than 0.3. Phil argues that this behaviour is more desirable: "the simple fix is that the user must add a suitable epsilon to the upper bound. The key word here is *suitable*. The old definitions included completely bizarre and often highly unsuitable choices of epsilon." This proposal seems to me to improve the consistency of the enumeration syntax across both the integral and floating types. Some users may still be surprised, but the surprise will be easier to explain. I am bringing this question to the attention of all who are interested in Haskell Prime, because it seems like a sensible and well-reasoned change. Discussion on whether to adopt this proposal for H' is welcome. But as maintainer and bug-fixer of the Haskell'98 Report, I have also been asked whether we should make this change retrospectively to the Haskell'98 language (as a "typo"). Since it involves not merely an ordinary library function, but a Prelude function, and moreover a function that is used in the desugaring of syntax, it is less clear to me whether to alter Haskell'98. Thoughts? Regards, Malcolm

Hi Malcolm, The current behaviour does sound like a bug, and the revised behaviour does sound like a fix - and one that has a sensible explanation if people trip over it. In general having floating point be a member of classes such as Eq has some obvious problems, but I realise is a necessity for practical programming. Given that we have Eq Double, then Enum Double seems no worse. If we don't alter H98 then a valid program under H98 vs H' will give different results without any warning - that seems really bad. In addition, having two instances around for one typeclass/type pair in a big library (base) which are switched with some flag seems like a nightmare for compiler writers. So I think a good solution would be to fix H98 as a typo, and include it in H'. Thanks Neil
-----Original Message----- From: libraries-bounces@haskell.org [mailto:libraries-bounces@haskell.org] On Behalf Of Malcolm Wallace Sent: 15 October 2008 10:41 am To: haskell-prime@haskell.org Cc: libraries@haskell.org Subject: Repair to floating point enumerations?
Dear Haskell-Primers (and libraries).
Recently, Phil Wadler has pointed out a weird anomaly in the Haskell'98 Prelude, regarding numeric enumerations for Floats/Doubles:
Prelude> [0, 0.3 .. 1.1] [0.0,0.3,0.6,0.899999,1.2]
What is odd is that the last value of the list is much larger than the specified termination value. But this is exactly as specified by the Haskell'98 Report.
http://haskell.org/onlinereport/basic.html#sect6.3.4
"For Float and Double, the semantics of the enumFrom family is given by the rules for Int above, except that the list terminates when the elements become greater than e3+i/2 for positive increment i, or when they become less than e3+i/2 for negative i.
We have discussed this question (and related ones, such as whether Float and Double belong in the Enum class at all) several times before, and I do not wish to rehash all of those points again e.g.:
http://www.cse.unsw.edu.au/~dons/haskell-1990-2000/msg07289.html http://www.haskell.org/pipermail/haskell/2001-October/008218.html http://www.haskell.org/pipermail/haskell/2002-October/010607.html
Phil proposes that, although retaining the instances of Enum for Float and Double, we simplify the definitions of the numericEnumFrom family:
numericEnumFromThenTo :: (Fractional a, Ord a) => a -> a -> a -> [a] numericEnumFrom = iterate (+1) numericEnumFromThen n m = iterate (+(m-n)) n numericEnumFromTo n m = takeWhile (<= m) (numericEnumFrom n) numericEnumFromThenTo n m p = takeWhile (<= p) (numericEnumFromThen n m)
The particular feature of note is that the odd fudge factor of (1/2 * the increment) is removed. The inexact nature of floating point numbers would therefore cause a specification like
[ 0.0, 0.1 .. 0.3 ]
to yield the sequence
[ 0.0, 0.1, 0.2 ]
that is, to omit the upper bound, because (3 * 0.1) is actually represented as 0.30000000000004, strictly greater than 0.3.
Phil argues that this behaviour is more desirable: "the simple fix is that the user must add a suitable epsilon to the upper bound. The key word here is *suitable*. The old definitions included completely bizarre and often highly unsuitable choices of epsilon."
This proposal seems to me to improve the consistency of the enumeration syntax across both the integral and floating types. Some users may still be surprised, but the surprise will be easier to explain.
I am bringing this question to the attention of all who are interested in Haskell Prime, because it seems like a sensible and well-reasoned change. Discussion on whether to adopt this proposal for H' is welcome.
But as maintainer and bug-fixer of the Haskell'98 Report, I have also been asked whether we should make this change retrospectively to the Haskell'98 language (as a "typo"). Since it involves not merely an ordinary library function, but a Prelude function, and moreover a function that is used in the desugaring of syntax, it is less clear to me whether to alter Haskell'98.
Thoughts?
Regards, Malcolm _______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries
============================================================================== Please access the attached hyperlink for an important electronic communications disclaimer: http://www.credit-suisse.com/legal/en/disclaimer_email_ib.html ==============================================================================

On Wed, 2008-10-15 at 11:25 +0100, Mitchell, Neil wrote:
Hi Malcolm,
The current behaviour does sound like a bug, and the revised behaviour does sound like a fix - and one that has a sensible explanation if people trip over it. In general having floating point be a member of classes such as Eq has some obvious problems, but I realise is a necessity for practical programming. Given that we have Eq Double, then Enum Double seems no worse.
If we don't alter H98 then a valid program under H98 vs H' will give different results without any warning - that seems really bad. In addition, having two instances around for one typeclass/type pair in a big library (base) which are switched with some flag seems like a nightmare for compiler writers. So I think a good solution would be to fix H98 as a typo, and include it in H'.
I would take the contrary position and say H98 should be left alone and the change should be proposed for H'. The argument is that H98 is done and the revised report was published 7 years ago. Changing H98 now just doesn't seem to make much sense. If we're talking about changing the meaning of 'valid' programs then doing that at a boundary like H98 -> H' seems much more sensible than having to explain the difference in H98-pre2008 programs vs H98-post2008. People will not expect all programs to be exactly the same between H98 and H' (otherwise there would be little need for a new standard). Yes H' is mostly compatible extensions but not all of it. Duncan

On Wed, Oct 15, 2008 at 10:41:25AM +0100, Malcolm Wallace wrote:
Dear Haskell-Primers (and libraries).
Recently, Phil Wadler has pointed out a weird anomaly in the Haskell'98 Prelude, regarding numeric enumerations for Floats/Doubles:
Prelude> [0, 0.3 .. 1.1] [0.0,0.3,0.6,0.899999,1.2]
What is odd is that the last value of the list is much larger than the specified termination value. But this is exactly as specified by the Haskell'98 Report.
http://haskell.org/onlinereport/basic.html#sect6.3.4
"For Float and Double, the semantics of the enumFrom family is given by the rules for Int above, except that the list terminates when the elements become greater than e3+i/2 for positive increment i, or when they become less than e3+i/2 for negative i.
We have discussed this question (and related ones, such as whether Float and Double belong in the Enum class at all) several times before, and I do not wish to rehash all of those points again e.g.:
http://www.cse.unsw.edu.au/~dons/haskell-1990-2000/msg07289.html http://www.haskell.org/pipermail/haskell/2001-October/008218.html http://www.haskell.org/pipermail/haskell/2002-October/010607.html
Phil proposes that, although retaining the instances of Enum for Float and Double, we simplify the definitions of the numericEnumFrom family:
numericEnumFromThenTo :: (Fractional a, Ord a) => a -> a -> a -> [a] numericEnumFrom = iterate (+1) numericEnumFromThen n m = iterate (+(m-n)) n numericEnumFromTo n m = takeWhile (<= m) (numericEnumFrom n) numericEnumFromThenTo n m p = takeWhile (<= p) (numericEnumFromThen n m)
The particular feature of note is that the odd fudge factor of (1/2 * the increment) is removed. The inexact nature of floating point numbers would therefore cause a specification like
[ 0.0, 0.1 .. 0.3 ]
to yield the sequence
[ 0.0, 0.1, 0.2 ]
that is, to omit the upper bound, because (3 * 0.1) is actually represented as 0.30000000000004, strictly greater than 0.3.
Phil argues that this behaviour is more desirable: "the simple fix is that the user must add a suitable epsilon to the upper bound. The key word here is *suitable*. The old definitions included completely bizarre and often highly unsuitable choices of epsilon."
This proposal seems to me to improve the consistency of the enumeration syntax across both the integral and floating types. Some users may still be surprised, but the surprise will be easier to explain.
I am bringing this question to the attention of all who are interested in Haskell Prime, because it seems like a sensible and well-reasoned change. Discussion on whether to adopt this proposal for H' is welcome.
But as maintainer and bug-fixer of the Haskell'98 Report, I have also been asked whether we should make this change retrospectively to the Haskell'98 language (as a "typo"). Since it involves not merely an ordinary library function, but a Prelude function, and moreover a function that is used in the desugaring of syntax, it is less clear to me whether to alter Haskell'98.
Thoughts?
It sounds like a bad fix to me. It seems important that the [0.0,0.1.. x] notation should work correctly in the common cases. And the common case really is that the final value is intended as an exact multiple of the increment. Why not look for a heuristic that gets the common cases right, rather than going with an elegant wrong solution? After all, these enumerations are most often used by people who neither care nor know how they're implemented, but who most likely would prefer if haskell worked as well as matlab, python, etc. One reasonable option would be to actually take into account the expected roundoff error (which isn't hard to compute for simple sums like this). It would also be a good idea to reduce that roundoff error by using multiplication rather than addition to create the enumeration (which also allows us to ensure that finite enumerations terminate). e.g. it's a shame that length [1 .. 1e8 :: Float] fails to terminate. Admittedly, this is a stupid thing to compute, but it's also stupid to add many small numbers together in sequence, since it always serves to increase the roundoff error. It's true that most people's C code would be just as naive, but we're writing Haskell, and you're talking about the Prelude, which should be written intelligently, not using the simplest code, in such a way that it won't bite programmers. David

I'm sorry, but people who write [0.0,0.1 .. x] where x is a multiple
of 0.1 get exactly what they deserve if x is not included. Floating
point numbers are not the real numbers, and the sooner they learn that
the better. We can fudge this all we like, but 0.1 is never going to
be exactly representable as a binary floating point number no matter
what we do.
On Wed, Oct 15, 2008 at 3:44 PM, David Roundy
On Wed, Oct 15, 2008 at 10:41:25AM +0100, Malcolm Wallace wrote:
Dear Haskell-Primers (and libraries).
Recently, Phil Wadler has pointed out a weird anomaly in the Haskell'98 Prelude, regarding numeric enumerations for Floats/Doubles:
Prelude> [0, 0.3 .. 1.1] [0.0,0.3,0.6,0.899999,1.2]
What is odd is that the last value of the list is much larger than the specified termination value. But this is exactly as specified by the Haskell'98 Report.
http://haskell.org/onlinereport/basic.html#sect6.3.4
"For Float and Double, the semantics of the enumFrom family is given by the rules for Int above, except that the list terminates when the elements become greater than e3+i/2 for positive increment i, or when they become less than e3+i/2 for negative i.
We have discussed this question (and related ones, such as whether Float and Double belong in the Enum class at all) several times before, and I do not wish to rehash all of those points again e.g.:
http://www.cse.unsw.edu.au/~dons/haskell-1990-2000/msg07289.html http://www.haskell.org/pipermail/haskell/2001-October/008218.html http://www.haskell.org/pipermail/haskell/2002-October/010607.html
Phil proposes that, although retaining the instances of Enum for Float and Double, we simplify the definitions of the numericEnumFrom family:
numericEnumFromThenTo :: (Fractional a, Ord a) => a -> a -> a -> [a] numericEnumFrom = iterate (+1) numericEnumFromThen n m = iterate (+(m-n)) n numericEnumFromTo n m = takeWhile (<= m) (numericEnumFrom n) numericEnumFromThenTo n m p = takeWhile (<= p) (numericEnumFromThen n m)
The particular feature of note is that the odd fudge factor of (1/2 * the increment) is removed. The inexact nature of floating point numbers would therefore cause a specification like
[ 0.0, 0.1 .. 0.3 ]
to yield the sequence
[ 0.0, 0.1, 0.2 ]
that is, to omit the upper bound, because (3 * 0.1) is actually represented as 0.30000000000004, strictly greater than 0.3.
Phil argues that this behaviour is more desirable: "the simple fix is that the user must add a suitable epsilon to the upper bound. The key word here is *suitable*. The old definitions included completely bizarre and often highly unsuitable choices of epsilon."
This proposal seems to me to improve the consistency of the enumeration syntax across both the integral and floating types. Some users may still be surprised, but the surprise will be easier to explain.
I am bringing this question to the attention of all who are interested in Haskell Prime, because it seems like a sensible and well-reasoned change. Discussion on whether to adopt this proposal for H' is welcome.
But as maintainer and bug-fixer of the Haskell'98 Report, I have also been asked whether we should make this change retrospectively to the Haskell'98 language (as a "typo"). Since it involves not merely an ordinary library function, but a Prelude function, and moreover a function that is used in the desugaring of syntax, it is less clear to me whether to alter Haskell'98.
Thoughts?
It sounds like a bad fix to me. It seems important that the [0.0,0.1.. x] notation should work correctly in the common cases. And the common case really is that the final value is intended as an exact multiple of the increment.
Why not look for a heuristic that gets the common cases right, rather than going with an elegant wrong solution? After all, these enumerations are most often used by people who neither care nor know how they're implemented, but who most likely would prefer if haskell worked as well as matlab, python, etc.
One reasonable option would be to actually take into account the expected roundoff error (which isn't hard to compute for simple sums like this).
It would also be a good idea to reduce that roundoff error by using multiplication rather than addition to create the enumeration (which also allows us to ensure that finite enumerations terminate). e.g. it's a shame that length [1 .. 1e8 :: Float] fails to terminate. Admittedly, this is a stupid thing to compute, but it's also stupid to add many small numbers together in sequence, since it always serves to increase the roundoff error. It's true that most people's C code would be just as naive, but we're writing Haskell, and you're talking about the Prelude, which should be written intelligently, not using the simplest code, in such a way that it won't bite programmers.
David _______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries

Here's a counter-proposal: numericEnumFromThenTo :: RealFloat a => a -> a -> a -> [a] numericEnumFrom 0 = map fromInteger [1..] numericEnumFrom n = map ((n+).fromInteger) [1..] numericEnumFromThen n m = map (\x -> n+fromInteger x*(m-n)) [1..] numericEnumFromTo n m = takeWhile (<= m*(1 + epsilon)) (numericEnumFrom n) numericEnumFromThenTo n m p = takeWhile (<= p*(1 + 2*epsilon)) (numericEnumFromThen n m) epsilon :: RealFloat a => a epsilon = 1/2^(floatDigits (undefined :: a)) This uses quite a reasonable approximation for the roundoff error, and has the advantage of not inappropriately returning _|_. It does sometimes create duplicate entries in the list, but I think that is better that returning an infinite list of duplicate entries as the code proposed below does. And yes, the fuzzy comparison is a bit ugly, but at least it means that every user is not forced to implement fuzzy comparison in their quick and dirty code (which is the only thing this syntax is good for). David On Wed, Oct 15, 2008 at 03:55:09PM +0100, Lennart Augustsson wrote:
I'm sorry, but people who write [0.0,0.1 .. x] where x is a multiple of 0.1 get exactly what they deserve if x is not included. Floating point numbers are not the real numbers, and the sooner they learn that the better. We can fudge this all we like, but 0.1 is never going to be exactly representable as a binary floating point number no matter what we do.
On Wed, Oct 15, 2008 at 3:44 PM, David Roundy
wrote: On Wed, Oct 15, 2008 at 10:41:25AM +0100, Malcolm Wallace wrote:
Dear Haskell-Primers (and libraries).
Recently, Phil Wadler has pointed out a weird anomaly in the Haskell'98 Prelude, regarding numeric enumerations for Floats/Doubles:
Prelude> [0, 0.3 .. 1.1] [0.0,0.3,0.6,0.899999,1.2]
What is odd is that the last value of the list is much larger than the specified termination value. But this is exactly as specified by the Haskell'98 Report.
http://haskell.org/onlinereport/basic.html#sect6.3.4
"For Float and Double, the semantics of the enumFrom family is given by the rules for Int above, except that the list terminates when the elements become greater than e3+i/2 for positive increment i, or when they become less than e3+i/2 for negative i.
We have discussed this question (and related ones, such as whether Float and Double belong in the Enum class at all) several times before, and I do not wish to rehash all of those points again e.g.:
http://www.cse.unsw.edu.au/~dons/haskell-1990-2000/msg07289.html http://www.haskell.org/pipermail/haskell/2001-October/008218.html http://www.haskell.org/pipermail/haskell/2002-October/010607.html
Phil proposes that, although retaining the instances of Enum for Float and Double, we simplify the definitions of the numericEnumFrom family:
numericEnumFromThenTo :: (Fractional a, Ord a) => a -> a -> a -> [a] numericEnumFrom = iterate (+1) numericEnumFromThen n m = iterate (+(m-n)) n numericEnumFromTo n m = takeWhile (<= m) (numericEnumFrom n) numericEnumFromThenTo n m p = takeWhile (<= p) (numericEnumFromThen n m)
The particular feature of note is that the odd fudge factor of (1/2 * the increment) is removed. The inexact nature of floating point numbers would therefore cause a specification like
[ 0.0, 0.1 .. 0.3 ]
to yield the sequence
[ 0.0, 0.1, 0.2 ]
that is, to omit the upper bound, because (3 * 0.1) is actually represented as 0.30000000000004, strictly greater than 0.3.
Phil argues that this behaviour is more desirable: "the simple fix is that the user must add a suitable epsilon to the upper bound. The key word here is *suitable*. The old definitions included completely bizarre and often highly unsuitable choices of epsilon."
This proposal seems to me to improve the consistency of the enumeration syntax across both the integral and floating types. Some users may still be surprised, but the surprise will be easier to explain.
I am bringing this question to the attention of all who are interested in Haskell Prime, because it seems like a sensible and well-reasoned change. Discussion on whether to adopt this proposal for H' is welcome.
But as maintainer and bug-fixer of the Haskell'98 Report, I have also been asked whether we should make this change retrospectively to the Haskell'98 language (as a "typo"). Since it involves not merely an ordinary library function, but a Prelude function, and moreover a function that is used in the desugaring of syntax, it is less clear to me whether to alter Haskell'98.
Thoughts?
It sounds like a bad fix to me. It seems important that the [0.0,0.1.. x] notation should work correctly in the common cases. And the common case really is that the final value is intended as an exact multiple of the increment.
Why not look for a heuristic that gets the common cases right, rather than going with an elegant wrong solution? After all, these enumerations are most often used by people who neither care nor know how they're implemented, but who most likely would prefer if haskell worked as well as matlab, python, etc.
One reasonable option would be to actually take into account the expected roundoff error (which isn't hard to compute for simple sums like this).
It would also be a good idea to reduce that roundoff error by using multiplication rather than addition to create the enumeration (which also allows us to ensure that finite enumerations terminate). e.g. it's a shame that length [1 .. 1e8 :: Float] fails to terminate. Admittedly, this is a stupid thing to compute, but it's also stupid to add many small numbers together in sequence, since it always serves to increase the roundoff error. It's true that most people's C code would be just as naive, but we're writing Haskell, and you're talking about the Prelude, which should be written intelligently, not using the simplest code, in such a way that it won't bite programmers.
David _______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries
_______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries

I agree with David, we should be using multiplication, not addition. However, I think that under the law of least surprise, we should require that for all a,b,z: all (\x -> x >= a && x < z || x <= a && x > z) [a,b..z]. For example, anything in the neighborhood of this is just unfair, even if it's within David's fudge factor: Prelude> map (\x -> 1 / (x-0.6)) [0,0.1..0.55] [-1.6666666666666667,-2.0,-2.5,-3.333333333333334,-5.000000000000001,-10.000000000000002,Infinity] If I want to include the terminating value, then what I really want is probably some f such that: f (6 :: Int) 0 0.55 = [0,0.11,0.22,0.33,0.44,0.55] --Lane On Wed, 15 Oct 2008, David Roundy wrote:
Here's a counter-proposal:
numericEnumFromThenTo :: RealFloat a => a -> a -> a -> [a] numericEnumFrom 0 = map fromInteger [1..] numericEnumFrom n = map ((n+).fromInteger) [1..] numericEnumFromThen n m = map (\x -> n+fromInteger x*(m-n)) [1..] numericEnumFromTo n m = takeWhile (<= m*(1 + epsilon)) (numericEnumFrom n) numericEnumFromThenTo n m p = takeWhile (<= p*(1 + 2*epsilon)) (numericEnumFromThen n m)
epsilon :: RealFloat a => a epsilon = 1/2^(floatDigits (undefined :: a))
This uses quite a reasonable approximation for the roundoff error, and has the advantage of not inappropriately returning _|_. It does sometimes create duplicate entries in the list, but I think that is better that returning an infinite list of duplicate entries as the code proposed below does.
And yes, the fuzzy comparison is a bit ugly, but at least it means that every user is not forced to implement fuzzy comparison in their quick and dirty code (which is the only thing this syntax is good for).
David
On Wed, Oct 15, 2008 at 03:55:09PM +0100, Lennart Augustsson wrote:
I'm sorry, but people who write [0.0,0.1 .. x] where x is a multiple of 0.1 get exactly what they deserve if x is not included. Floating point numbers are not the real numbers, and the sooner they learn that the better. We can fudge this all we like, but 0.1 is never going to be exactly representable as a binary floating point number no matter what we do.
On Wed, Oct 15, 2008 at 3:44 PM, David Roundy
wrote: On Wed, Oct 15, 2008 at 10:41:25AM +0100, Malcolm Wallace wrote:
Dear Haskell-Primers (and libraries).
Recently, Phil Wadler has pointed out a weird anomaly in the Haskell'98 Prelude, regarding numeric enumerations for Floats/Doubles:
Prelude> [0, 0.3 .. 1.1] [0.0,0.3,0.6,0.899999,1.2]
What is odd is that the last value of the list is much larger than the specified termination value. But this is exactly as specified by the Haskell'98 Report.
http://haskell.org/onlinereport/basic.html#sect6.3.4
"For Float and Double, the semantics of the enumFrom family is given by the rules for Int above, except that the list terminates when the elements become greater than e3+i/2 for positive increment i, or when they become less than e3+i/2 for negative i.
We have discussed this question (and related ones, such as whether Float and Double belong in the Enum class at all) several times before, and I do not wish to rehash all of those points again e.g.:
http://www.cse.unsw.edu.au/~dons/haskell-1990-2000/msg07289.html http://www.haskell.org/pipermail/haskell/2001-October/008218.html http://www.haskell.org/pipermail/haskell/2002-October/010607.html
Phil proposes that, although retaining the instances of Enum for Float and Double, we simplify the definitions of the numericEnumFrom family:
numericEnumFromThenTo :: (Fractional a, Ord a) => a -> a -> a -> [a] numericEnumFrom = iterate (+1) numericEnumFromThen n m = iterate (+(m-n)) n numericEnumFromTo n m = takeWhile (<= m) (numericEnumFrom n) numericEnumFromThenTo n m p = takeWhile (<= p) (numericEnumFromThen n m)
The particular feature of note is that the odd fudge factor of (1/2 * the increment) is removed. The inexact nature of floating point numbers would therefore cause a specification like
[ 0.0, 0.1 .. 0.3 ]
to yield the sequence
[ 0.0, 0.1, 0.2 ]
that is, to omit the upper bound, because (3 * 0.1) is actually represented as 0.30000000000004, strictly greater than 0.3.
Phil argues that this behaviour is more desirable: "the simple fix is that the user must add a suitable epsilon to the upper bound. The key word here is *suitable*. The old definitions included completely bizarre and often highly unsuitable choices of epsilon."
This proposal seems to me to improve the consistency of the enumeration syntax across both the integral and floating types. Some users may still be surprised, but the surprise will be easier to explain.
I am bringing this question to the attention of all who are interested in Haskell Prime, because it seems like a sensible and well-reasoned change. Discussion on whether to adopt this proposal for H' is welcome.
But as maintainer and bug-fixer of the Haskell'98 Report, I have also been asked whether we should make this change retrospectively to the Haskell'98 language (as a "typo"). Since it involves not merely an ordinary library function, but a Prelude function, and moreover a function that is used in the desugaring of syntax, it is less clear to me whether to alter Haskell'98.
Thoughts?
It sounds like a bad fix to me. It seems important that the [0.0,0.1.. x] notation should work correctly in the common cases. And the common case really is that the final value is intended as an exact multiple of the increment.
Why not look for a heuristic that gets the common cases right, rather than going with an elegant wrong solution? After all, these enumerations are most often used by people who neither care nor know how they're implemented, but who most likely would prefer if haskell worked as well as matlab, python, etc.
One reasonable option would be to actually take into account the expected roundoff error (which isn't hard to compute for simple sums like this).
It would also be a good idea to reduce that roundoff error by using multiplication rather than addition to create the enumeration (which also allows us to ensure that finite enumerations terminate). e.g. it's a shame that length [1 .. 1e8 :: Float] fails to terminate. Admittedly, this is a stupid thing to compute, but it's also stupid to add many small numbers together in sequence, since it always serves to increase the roundoff error. It's true that most people's C code would be just as naive, but we're writing Haskell, and you're talking about the Prelude, which should be written intelligently, not using the simplest code, in such a way that it won't bite programmers.
David _______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries
_______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries
Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries

Christopher Lane Hinson wrote:
I agree with David, we should be using multiplication, not addition. However, I think that under the law of least surprise, we should require that for all a,b,z:
all (\x -> x >= a && x < z || x <= a && x > z) [a,b..z].
so that [0,0.1..0.3] doesn't include the terminating value that's a little more than the literal 0.3?
For example, anything in the neighborhood of this is just unfair, even if it's within David's fudge factor:
Prelude> map (\x -> 1 / (x-0.6)) [0,0.1..0.55] [-1.6666666666666667,-2.0,-2.5,-3.333333333333334,-5.000000000000001,-10.000000000000002,Infinity]
but that's a significant fudge, 0.5 versus 0.55 versus 0.6 -- right? -Isaac

On Thu, Oct 16, 2008 at 05:31:50PM -0400, Isaac Dupree wrote:
Christopher Lane Hinson wrote:
I agree with David, we should be using multiplication, not addition. However, I think that under the law of least surprise, we should require that for all a,b,z:
all (\x -> x >= a && x < z || x <= a && x > z) [a,b..z].
so that [0,0.1..0.3] doesn't include the terminating value that's a little more than the literal 0.3?
We could do what octave appears to do, which is to set anything within the fudge factor (but greater than the upper bound) equal to the upper bound. That seems like a reasonable option to me. David

David Roundy
On Thu, Oct 16, 2008 at 05:31:50PM -0400, Isaac Dupree wrote:
Christopher Lane Hinson wrote:
I agree with David, we should be using multiplication, not addition. However, I think that under the law of least surprise, we should require that for all a,b,z:
all (\x -> x >= a && x < z || x <= a && x > z) [a,b..z].
so that [0,0.1..0.3] doesn't include the terminating value that's a little more than the literal 0.3?
We could do what octave appears to do, which is to set anything within the fudge factor (but greater than the upper bound) equal to the upper bound. That seems like a reasonable option to me.
I've not been following this closely, but I've not seen a clear definition of what this is all for. Without that, we're going to see a succession of fudges that will work for some notions of what the notation means and fail for others. Assuming z>=a, must the resulting list include a? include z exactly when? If a or z is a literal, does include here mean "include the value represented by the literal" or some other nearby value? Must the step (l!!(i+1)-l!!i) always be the same? Must it equal (b-a)? (I can even imagine that for some applications, a list that looked like this: [0.0, 1.0e-2, 2.0e-2, 3.0e-2, 4.0e-2, 5.0e-2, 6.0000000000000005e-2, 7.0e-2, 8.0e-2, 9.0e-2, 9.999999999999999e-2, 0.10999999999999999, 0.11] would be preferable to one that included only one of the last two elements). Whatever is decided for floats, the definition must be consistent with that for discrete quantities. I'm not sure this is possible. Since floating point numbers are full of surprises for anyone who is used to real numbers (and even a few for people who are quite familiar with floats), the principle of least surprise would surely indicate no instance of Enum for floats. -- Jón Fairbairn Jon.Fairbairn@cl.cam.ac.uk

On 15/10/2008, at 20:41, Malcolm Wallace wrote:
Phil proposes that, although retaining the instances of Enum for Float and Double, we simplify the definitions of the numericEnumFrom family:
numericEnumFromThenTo :: (Fractional a, Ord a) => a -> a -> a -> [a] numericEnumFrom = iterate (+1) numericEnumFromThen n m = iterate (+(m-n)) n numericEnumFromTo n m = takeWhile (<= m) (numericEnumFrom n) numericEnumFromThenTo n m p = takeWhile (<= p) (numericEnumFromThen n m)
I'd like to raise the following two points in this context. Firstly, an array library which attempts to provide reasonable counterparts to the list functions would want to define array versions of enumFromTo and enumFromThenTo. To be efficient, it must be able to determine the expected number of elements reasonably fast (in constant time). Secondly, a parallel array library (such as the one provided by DPH) also needs to be able to generate, say, the first half of the array on one processor and the second half on another. This requires enumFromTo to obey some form of the following law for some f and g: enumFromTo m n = enumFromTo m (f m n) ++ enumFromTo (g m n) n It might make sense to try to provide this for Float and Double for the sake of consistency with future libraries. I'm not sure how easy it is to do, though. Roman

Malcolm Wallace wrote:
Phil proposes that, although retaining the instances of Enum for Float and Double, we simplify the definitions of the numericEnumFrom family:
numericEnumFromThenTo :: (Fractional a, Ord a) => a -> a -> a -> [a] numericEnumFrom = iterate (+1) numericEnumFromThen n m = iterate (+(m-n)) n numericEnumFromTo n m = takeWhile (<= m) (numericEnumFrom n) numericEnumFromThenTo n m p = takeWhile (<= p) (numericEnumFromThen n m)
I'll leave it to the floating-point experts to decide exactly what to do here for Haskell' (but I note that David Roundy's version looks better than the iterate version above, because the errors won't accumulate).
But as maintainer and bug-fixer of the Haskell'98 Report, I have also been asked whether we should make this change retrospectively to the Haskell'98 language (as a "typo"). Since it involves not merely an ordinary library function, but a Prelude function, and moreover a function that is used in the desugaring of syntax, it is less clear to me whether to alter Haskell'98.
We definitely can't make breaking changes to Haskell 98; this would be much more than a "typo". However, this does give us a problem if we decide to make a change here for H', as Neil points out, because there would be two mutually-incompatible instances for Enum Float. We'd need to have a clear distinction between programs that are Haskell 98 and those that are not, with only the former allowed to use the haskell98 package. Cheers, Simon

Hi
Bringing up an old thread, but this issue just bit me in real life:
[0, 2.5 .. 3.75 ] = [0, 2.5, 5.0]
I was comparative testing some C++ and some Haskell, and its a shame
when its the Haskell that is clearly wrong! I don't think any end
decision was reached in this thread, but I think this is sufficiently
incorrect to call the Haskell 98 behaviour a bug.
Thanks
Neil
On Wed, Oct 22, 2008 at 10:11 AM, Simon Marlow
Malcolm Wallace wrote:
Phil proposes that, although retaining the instances of Enum for Float and Double, we simplify the definitions of the numericEnumFrom family:
numericEnumFromThenTo :: (Fractional a, Ord a) => a -> a -> a -> [a] numericEnumFrom = iterate (+1) numericEnumFromThen n m = iterate (+(m-n)) n numericEnumFromTo n m = takeWhile (<= m) (numericEnumFrom n) numericEnumFromThenTo n m p = takeWhile (<= p) (numericEnumFromThen n m)
I'll leave it to the floating-point experts to decide exactly what to do here for Haskell' (but I note that David Roundy's version looks better than the iterate version above, because the errors won't accumulate).
But as maintainer and bug-fixer of the Haskell'98 Report, I have also been asked whether we should make this change retrospectively to the Haskell'98 language (as a "typo"). Since it involves not merely an ordinary library function, but a Prelude function, and moreover a function that is used in the desugaring of syntax, it is less clear to me whether to alter Haskell'98.
We definitely can't make breaking changes to Haskell 98; this would be much more than a "typo".
However, this does give us a problem if we decide to make a change here for H', as Neil points out, because there would be two mutually-incompatible instances for Enum Float. We'd need to have a clear distinction between programs that are Haskell 98 and those that are not, with only the former allowed to use the haskell98 package.
Cheers, Simon
_______________________________________________ Haskell-prime mailing list Haskell-prime@haskell.org http://www.haskell.org/mailman/listinfo/haskell-prime
participants (11)
-
Christopher Lane Hinson
-
David Roundy
-
Duncan Coutts
-
Isaac Dupree
-
Jon Fairbairn
-
Lennart Augustsson
-
Malcolm Wallace
-
Mitchell, Neil
-
Neil Mitchell
-
Roman Leshchinskiy
-
Simon Marlow