Re: Is logBase right?

On Sat, 2009-08-22 at 13:03 -0400, haskell-cafe-request@haskell.org wrote:
Message: 10 Date: Sat, 22 Aug 2009 11:24:21 +0200 From: Roberto L?pez
Subject: [Haskell-cafe] Re: Is logBase right? To: haskell-cafe@haskell.org Message-ID: Content-Type: text/plain; charset="ISO-8859-1" If 4.0 / 2.0 was 1.9999999999999999999998, it would be ok?
The real value of log10 1000 is 3 (3.0). It can be represented with accuracy and it should be.
You get the accuracy value in Perl, but there is the same problem in Python. It's a bit discouraging.
There is *not* the same problem in Python: $ python Python 2.6.2 (r262:71600, Jul 9 2009, 23:16:53) [GCC 4.4.0 20090506 (Red Hat 4.4.0-4)] on linux2 Type "help", "copyright", "credits" or "license" for more information.
import math math.log10(1000) 3.0
Recent work in Python 3 (and Python 2.6) has improved the handling of floating point numbers, and addresses exactly the problem that Roberto has raised. I see no reason why Haskell could not improve its handling of floating point numbers by using similar techniques. Steve

2009/8/23 Steve
On Sat, 2009-08-22 at 13:03 -0400, haskell-cafe-request@haskell.org wrote:
Message: 10 Date: Sat, 22 Aug 2009 11:24:21 +0200 From: Roberto L?pez
Subject: [Haskell-cafe] Re: Is logBase right? To: haskell-cafe@haskell.org Message-ID: Content-Type: text/plain; charset="ISO-8859-1" If 4.0 / 2.0 was 1.9999999999999999999998, it would be ok?
The real value of log10 1000 is 3 (3.0). It can be represented with accuracy and it should be.
You get the accuracy value in Perl, but there is the same problem in Python. It's a bit discouraging.
There is *not* the same problem in Python: $ python Python 2.6.2 (r262:71600, Jul 9 2009, 23:16:53) [GCC 4.4.0 20090506 (Red Hat 4.4.0-4)] on linux2 Type "help", "copyright", "credits" or "license" for more information.
import math math.log10(1000) 3.0
import math math.log(1000,10) 2.9999999999999996
Recent work in Python 3 (and Python 2.6) has improved the handling of floating point numbers, and addresses exactly the problem that Roberto has raised.
I see no reason why Haskell could not improve its handling of floating point numbers by using similar techniques.
You mean introducing a "log10" function into the definition of the Floating class ? That might be a proposal for Haskell Prime.
Steve
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
-- Eugene Kirpichov Web IR developer, market.yandex.ru

On Sun, 2009-08-23 at 15:12 +0400, Eugene Kirpichov wrote:
There is *not* the same problem in Python: $ python Python 2.6.2 (r262:71600, Jul 9 2009, 23:16:53) [GCC 4.4.0 20090506 (Red Hat 4.4.0-4)] on linux2 Type "help", "copyright", "credits" or "license" for more information.
import math math.log10(1000) 3.0
import math math.log(1000,10) 2.9999999999999996
That is surprising. I think its a Python bug - the results should be consistent.
Recent work in Python 3 (and Python 2.6) has improved the handling of floating point numbers, and addresses exactly the problem that Roberto has raised.
I see no reason why Haskell could not improve its handling of floating point numbers by using similar techniques.
You mean introducing a "log10" function into the definition of the Floating class ? That might be a proposal for Haskell Prime.
round(697.04157958254996, 10) gave 697.04157958259998 which is closer to 697.0415795826
I was not thinking of the log10 function. I was thinking of the changes mentioned in http://docs.python.org/3.1/whatsnew/3.1.html where it says "Python now uses David Gay’s algorithm for finding the shortest floating point representation that doesn’t change its value. This should help mitigate some of the confusion surrounding binary floating point numbers." Also, I had a problem using floating point in Python where than the desired result of 697.0415795825 Its been fixed in the latest versions of Python:
round(697.04157958254996, 10) 697.0415795825
I'm not sure what the equivalent in Haskell is. Is there a function for rounding to a number of decimal digits ? I came up with this: roundN :: Double -> Int -> Double roundN n ndigits = fromIntegral (round $ n * m) / m where m = 10 ^ ndigits ghci> roundN 697.04157958254996 10 697.0415795826 which is not the desired result. Steve

Hello Steve, Sunday, August 23, 2009, 5:27:48 PM, you wrote: ghci>> roundN 697.04157958254996 10
697.0415795826
afair, double has 13 decimal digits precision, so your 697.04157958254996 represented by another value. you just looking for miracles -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

On Sun, 2009-08-23 at 17:36 +0400, Bulat Ziganshin wrote:
Hello Steve,
Sunday, August 23, 2009, 5:27:48 PM, you wrote:
ghci>> roundN 697.04157958254996 10
697.0415795826
afair, double has 13 decimal digits precision, so your 697.04157958254996 represented by another value. you just looking for miracles
I think its 16 or 17 (significant not decimal digits). Its a miracle that Python achieves! But Haskell does not. Python uses doubles too, and for certain operations it uses arbitrary precision floating point (I think). Steve

Am Sonntag 23 August 2009 15:27:48 schrieb Steve:
On Sun, 2009-08-23 at 15:12 +0400, Eugene Kirpichov wrote:
There is *not* the same problem in Python: $ python Python 2.6.2 (r262:71600, Jul 9 2009, 23:16:53) [GCC 4.4.0 20090506 (Red Hat 4.4.0-4)] on linux2 Type "help", "copyright", "credits" or "license" for more information.
import math math.log10(1000)
3.0
import math math.log(1000,10)
2.9999999999999996
That is surprising. I think its a Python bug - the results should be consistent.
On the other hand, it is also desirable to have log(a,b) == log(a)/log(b). If you have a special implementation of log10, you must decide which consistency you want to break, log(a,10) == log10(a) or log(a,10) == log(a)/log(10).
Recent work in Python 3 (and Python 2.6) has improved the handling of floating point numbers, and addresses exactly the problem that Roberto has raised.
I see no reason why Haskell could not improve its handling of floating point numbers by using similar techniques.
You mean introducing a "log10" function into the definition of the Floating class ? That might be a proposal for Haskell Prime.
I was not thinking of the log10 function. I was thinking of the changes mentioned in http://docs.python.org/3.1/whatsnew/3.1.html where it says "Python now uses David Gay’s algorithm for finding the shortest floating point representation that doesn’t change its value. This should help mitigate some of the confusion surrounding binary floating point numbers."
Note however that that only concerns the output, not the underlying value. It doesn't change the fact that log(1000)/log(10) (== log(1000,10)) < 3 in Python, too.
Also, I had a problem using floating point in Python where
round(697.04157958254996, 10)
gave 697.04157958259998 which is closer to 697.0415795826 than the desired result of 697.0415795825
Well, 10^10*(697.04157958254996 :: Double) == 6970415795825.5, that gets rounded to 6970415795826 under both, round-half-up and round-half-to-even, and (6970415795826/10^10 :: Double) == 697.04157958259998 Though ghci displays it as 697.0415795826, unlike Python.
Its been fixed in the latest versions of Python:
round(697.04157958254996, 10)
697.0415795825
I'm not sure what the equivalent in Haskell is. Is there a function for rounding to a number of decimal digits ? I came up with this:
roundN :: Double -> Int -> Double roundN n ndigits = fromIntegral (round $ n * m) / m where m = 10 ^ ndigits
ghci> roundN 697.04157958254996 10 697.0415795826
which is not the desired result.
Prelude Numeric> let roundN :: Double -> Int -> Double; roundN x d = let { m = 10^(d+1); i = floor (m*x); r = i `mod` 10; j = if r < 5 then i-r else i-r+10 } in fromInteger j/m roundN :: Double -> Int -> Double roundN x d = fromInteger j / m where m = 10^(d+1) i = floor (m*x) r = i `mod` 10 j | r < 5 = i-r | otherwise = i+10-r ghci> roundN 697.04157958254996 10 697.0415795825 I haven't treated negative input, but, more importantly, floating point representation being what it is, there are cases where this doesn't do what you want/expect either.
Steve

Steve
Also, I had a problem using floating point in Python where
round(697.04157958254996, 10) gave 697.04157958259998
Its been fixed in the latest versions of Python:
round(697.04157958254996, 10) 697.0415795825
ghci> roundN 697.04157958254996 10 697.0415795826
Is there something special with this number? Python 2.6.2 (release26-maint, Apr 19 2009, 01:56:41) [GCC 4.3.3] on linux2 Type "help", "copyright", "credits" or "license" for more information.
697.04157958259998 697.04157958259998 12345.678901234567890 12345.678901234567
GHCi, version 6.8.2: http://www.haskell.org/ghc/ :? for help Loading package base ... linking ... done. Prelude> 697.04157958259998 697.0415795826 Prelude> 12345.678901234567890 12345.678901234567 So, Python manages to keep more decimals than GHC for your number, but for other numbers, the precision appears to be the same. -k -- If I haven't seen further, it is by standing in the footprints of giants

Prelude> toRational 697.04157958259998
3065621287177675 % 4398046511104
Prelude> toRational 697.0415795826
3065621287177675 % 4398046511104
As you can see, both numbers are represented by the same Double.
Haskell prints a Double with the simplest number that converts back to
the same bit pattern.
So there's no "keep more decimals", just a matter of string conversion.
-- Lennart
On Tue, Aug 25, 2009 at 7:11 PM, Ketil Malde
Steve
writes: Also, I had a problem using floating point in Python where
round(697.04157958254996, 10) gave 697.04157958259998
Its been fixed in the latest versions of Python:
round(697.04157958254996, 10) 697.0415795825
ghci> roundN 697.04157958254996 10 697.0415795826
Is there something special with this number?
Python 2.6.2 (release26-maint, Apr 19 2009, 01:56:41) [GCC 4.3.3] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> 697.04157958259998 697.04157958259998 >>> 12345.678901234567890 12345.678901234567
GHCi, version 6.8.2: http://www.haskell.org/ghc/ :? for help Loading package base ... linking ... done. Prelude> 697.04157958259998 697.0415795826 Prelude> 12345.678901234567890 12345.678901234567
So, Python manages to keep more decimals than GHC for your number, but for other numbers, the precision appears to be the same.
-k -- If I haven't seen further, it is by standing in the footprints of giants _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Lennart Augustsson
Prelude> toRational 697.04157958259998 3065621287177675 % 4398046511104 Prelude> toRational 697.0415795826 3065621287177675 % 4398046511104
As you can see, both numbers are represented by the same Double. Haskell prints a Double with the simplest number that converts back to the same bit pattern. So there's no "keep more decimals", just a matter of string conversion.
What puzzled me (and the parent article indicated), was that Python appeared to be able to work with more precision, and thus be more "numerically correct" than GHC. Since it's all machine precision floating point, this is even more strange, and I couldn't reproduce the behavior for any other numbers than the one used in the example. One test that I *didn't* make is this one: Python 2.6.2 (release26-maint, Apr 19 2009, 01:56:41) [GCC 4.3.3] on linux2 Type "help", "copyright", "credits" or "license" for more information.
697.0415795826 697.04157958259998
So yes - it's just a matter of Python printing the number as 697.04157958259998, while GHC prints it as 697.0415795826. Thanks for clarifying, and apologies for the confusion.
Its been fixed in the latest versions of Python:
round(697.04157958254996, 10) 697.0415795825
I get (Python 2.6.2):
round(697.04157958254996,10) 697.04157958259998
ghci> roundN 697.04157958254996 10 697.0415795826
I suppose it is likely that 697.04157958254996 is closer to the real decimal value of this particular number than GHC's representation, which is 697.04157958255. Whether this is a problem is a different matter, and I'm curious what programs usefully depend on one behavior or the other in decimal rounding near the limit of precision. -k -- If I haven't seen further, it is by standing in the footprints of giants

On Aug 28, 2009, at 03:24 , Ketil Malde wrote:
What puzzled me (and the parent article indicated), was that Python appeared to be able to work with more precision, and thus be more "numerically correct" than GHC. Since it's all machine precision floating point, this is even more strange, and I couldn't reproduce the behavior for any other numbers than the one used in the example.
-fexcess-precision or other gcc compile options? -- brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu electrical and computer engineering, carnegie mellon university KF8NH

You're absolutely right. It would be easy to change logBase to have
special cases for, say, base 2 and base 10, and call the C library
functions for those. In fact, I think it's a worth while change,
since it's easy and get's better results for some cases.
-- Lennart
On Sun, Aug 23, 2009 at 12:41 PM, Steve
On Sat, 2009-08-22 at 13:03 -0400, haskell-cafe-request@haskell.org wrote:
Message: 10 Date: Sat, 22 Aug 2009 11:24:21 +0200 From: Roberto L?pez
Subject: [Haskell-cafe] Re: Is logBase right? To: haskell-cafe@haskell.org Message-ID: Content-Type: text/plain; charset="ISO-8859-1" If 4.0 / 2.0 was 1.9999999999999999999998, it would be ok?
The real value of log10 1000 is 3 (3.0). It can be represented with accuracy and it should be.
You get the accuracy value in Perl, but there is the same problem in Python. It's a bit discouraging.
There is *not* the same problem in Python: $ python Python 2.6.2 (r262:71600, Jul 9 2009, 23:16:53) [GCC 4.4.0 20090506 (Red Hat 4.4.0-4)] on linux2 Type "help", "copyright", "credits" or "license" for more information.
import math math.log10(1000) 3.0
Recent work in Python 3 (and Python 2.6) has improved the handling of floating point numbers, and addresses exactly the problem that Roberto has raised.
I see no reason why Haskell could not improve its handling of floating point numbers by using similar techniques.
Steve
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

On Sun, Aug 23, 2009 at 01:25:14PM +0200, Lennart Augustsson wrote:
You're absolutely right. It would be easy to change logBase to have special cases for, say, base 2 and base 10, and call the C library functions for those. In fact, I think it's a worth while change, since it's easy and get's better results for some cases.
For a short-term solution there's Don's cmath[1]. [1] http://hackage.haskell.org/packages/archive/cmath/0.3/doc/html/Foreign-C-Mat... -- Felipe.

On Sun, 23 Aug 2009, Lennart Augustsson wrote:
You're absolutely right. It would be easy to change logBase to have special cases for, say, base 2 and base 10, and call the C library functions for those. In fact, I think it's a worth while change, since it's easy and get's better results for some cases.
I think, the current implementation should left as it is. For fractional bases, no one would easily detect such imprecise results and report them as problem. So, it seems like people need a logarithm of integers, so they should be supplied with a special logarithm function for integers. For the other use cases, where 10 as base is one choice amongst a continuous set of rational numbers it would not be a problem to give the imprecise result. In the general case I would not accept a speed loss due to a check against 2 and 10 as base. In dynamically typed languages like Python this might be different, because their users might not care much about types. It may not be important for them, whether a number is an integer or a floating point number that is accidentally integral. However, Python distinguishes between these two kinds of integers, but only dynamically.

I don't really care much one way or the other, but since C (math.h)
provides functions for base 2 and base 10 with some additional
accuracy, I wouldn't mind using them. For a constant base I'd expect
the extra comparison to be constant folded, so that's ok. For a
non-constant base there would be a small penalty.
-- Lennart
On Tue, Aug 25, 2009 at 3:20 PM, Henning
Thielemann
On Sun, 23 Aug 2009, Lennart Augustsson wrote:
You're absolutely right. It would be easy to change logBase to have special cases for, say, base 2 and base 10, and call the C library functions for those. In fact, I think it's a worth while change, since it's easy and get's better results for some cases.
I think, the current implementation should left as it is. For fractional bases, no one would easily detect such imprecise results and report them as problem. So, it seems like people need a logarithm of integers, so they should be supplied with a special logarithm function for integers. For the other use cases, where 10 as base is one choice amongst a continuous set of rational numbers it would not be a problem to give the imprecise result. In the general case I would not accept a speed loss due to a check against 2 and 10 as base.
In dynamically typed languages like Python this might be different, because their users might not care much about types. It may not be important for them, whether a number is an integer or a floating point number that is accidentally integral. However, Python distinguishes between these two kinds of integers, but only dynamically.
participants (9)
-
Brandon S. Allbery KF8NH
-
Bulat Ziganshin
-
Daniel Fischer
-
Eugene Kirpichov
-
Felipe Lessa
-
Henning Thielemann
-
Ketil Malde
-
Lennart Augustsson
-
Steve