floating point operations and representation

I have two questions about using the Double data type and the operations in the Floating typeclass on a computer that uses IEEE floating point numbers. I notice that the Floating class only provides "log" (presumably log base 'e') and "logBase" (which, in the latest source that I see for GHC is defined as "log y / log x"). However, in C, the "math.h" library provides specific "log2" and "log10" functions, for extra precision. A test on IEEE computers (x86 and x86-64), shows that for a range of 64-bit "double" values, the answers in C do differ (in the last bit) if you use "log2(x)" and "log10(x)" versus "log (x) / log(2)" and "log(x) / log(10)". I am under the restriction that I need to write Haskell programs using Double which mimic existing C/C++ programs or generated data sets, and get the same answers. (It's silly, but take it as a given requirement.) If the C programs are using "log2", then I need "log2" in the Haskell, or else I run the risk of not producing the same answers. My first thought is to import "log2" and "log10" through the FFI. I was wondering if anyone on Haskell-Cafe has already done this and/or has a better suggestion about how to get specialized "log2" and "log10" (among the many specialized functions that the "math.h" library provides, for better precision -- for now, I'm just concerned with "log2" and "log10"). My second question is how to get at the IEEE bit representation for a Double. I am already checking "isIEEE n" in my source code (and "floatRadix n == 2"). So I know that I am operating on hardware that implements floating point numbers by the IEEE standard. I would like to get at the 64 bits of a Double. Again, I can convert to a CDouble and use the FFI to wrap a C function which casts the "double" to a 64-bit number and returns it. But I'm wondering if there's not a better way to do this natively in Haskell/GHC (perhaps some crazy use of the Storable typeclass?). Or if someone has already tackled this problem with FFI, that would be interesting to know. Thanks. Jacob

quark:
I have two questions about using the Double data type and the operations in the Floating typeclass on a computer that uses IEEE floating point numbers.
I notice that the Floating class only provides "log" (presumably log base 'e') and "logBase" (which, in the latest source that I see for GHC is defined as "log y / log x"). However, in C, the "math.h" library provides specific "log2" and "log10" functions, for extra precision. A test on IEEE computers (x86 and x86-64), shows that for a range of 64-bit "double" values, the answers in C do differ (in the last bit) if you use "log2(x)" and "log10(x)" versus "log (x) / log(2)" and "log(x) / log(10)".
You could consider binding directly to the C functions, if needed, {-# OPTIONS -fffi -#include "math.h" #-} import Foreign.C.Types foreign import ccall unsafe "math.h log10" c_log10 :: CDouble -> CDouble log10 :: Double -> Double log10 x = realToFrac (c_log10 (realToFrac x)) main = mapM_ (print . log10) [1..10] Also, is there any difference if you compile with -fvia-C or -fexcess-precision (or both)?
My second question is how to get at the IEEE bit representation for a Double. I am already checking "isIEEE n" in my source code (and "floatRadix n == 2"). So I know that I am operating on hardware that implements floating point numbers by the IEEE standard. I would like to get at the 64 bits of a Double. Again, I can convert to a CDouble and use the FFI to wrap a C function which casts the "double" to a 64-bit number and returns it. But I'm wondering if there's not a better way to do this natively in Haskell/GHC (perhaps some crazy use of the Storable typeclass?). Or if someone has already tackled this problem with FFI, that would be interesting to know.
The FFI is a good way. You can just bind to any C code linked with your code. There's some similar code for messing with doubles and longs in the mersenne-random package you might be able to use for inspiration: http://code.haskell.org/~dons/code/mersenne-random/ -- Don

On Wed, Mar 12, 2008 at 9:27 PM, Don Stewart
You could consider binding directly to the C functions, if needed,
{-# OPTIONS -fffi -#include "math.h" #-}
import Foreign.C.Types
foreign import ccall unsafe "math.h log10" c_log10 :: CDouble -> CDouble
log10 :: Double -> Double log10 x = realToFrac (c_log10 (realToFrac x))
Actually, you could almost certainly just use foreign import ccall unsafe "math.h log10" log10 :: Double -> Double since in ghc CDouble and Double are identical. It's a bit sloppier, but shouldn't cause any trouble. And I've no idea how realToFrac is implemented, but would worry about it behaving oddly... for instance when given NaNs. David

Hello David, Monday, March 17, 2008, 7:59:09 PM, you wrote:
foreign import ccall unsafe "math.h log10" c_log10 :: CDouble -> CDouble
log10 :: Double -> Double log10 x = realToFrac (c_log10 (realToFrac x))
It's a bit sloppier, but shouldn't cause any trouble. And I've no idea how realToFrac is implemented, but would worry about it behaving oddly... for instance when given NaNs.
it should be nop (no operation) in such cases -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

On 17 Mar 2008, bulat.ziganshin@gmail.com wrote:
Hello David,
Monday, March 17, 2008, 7:59:09 PM, you wrote:
foreign import ccall unsafe "math.h log10" c_log10 :: CDouble -> CDouble
log10 :: Double -> Double log10 x = realToFrac (c_log10 (realToFrac x))
It's a bit sloppier, but shouldn't cause any trouble. And I've no idea how realToFrac is implemented, but would worry about it behaving oddly... for instance when given NaNs.
it should be nop (no operation) in such cases
A related issue: http://hackage.haskell.org/trac/ghc/ticket/2110 Presumably everyone is aware of decodeFloat :: (RealFloat a) => a -> (Integer, Int) which really is a canonical representation of a floating point number. As for realToFrac, this really isn't okay: *GHCi> 0/0 NaN *GHCi> realToFrac (0/0) -Infinity Also, this one might surprise a few people. *GHCi> realToFrac (realToFrac 0.2 :: Ratio Int) -Infinity *GHCi> realToFrac 0.2 :: Ratio Int (-858993459)%0 Jed

daveroundy:
On Wed, Mar 12, 2008 at 9:27 PM, Don Stewart
wrote: You could consider binding directly to the C functions, if needed,
{-# OPTIONS -fffi -#include "math.h" #-}
import Foreign.C.Types
foreign import ccall unsafe "math.h log10" c_log10 :: CDouble -> CDouble
log10 :: Double -> Double log10 x = realToFrac (c_log10 (realToFrac x))
Actually, you could almost certainly just use
foreign import ccall unsafe "math.h log10" log10 :: Double -> Double
since in ghc CDouble and Double are identical.
It's a bit sloppier, but shouldn't cause any trouble. And I've no idea how realToFrac is implemented, but would worry about it behaving oddly... for instance when given NaNs.
It's a no-op (from the core I was looking at), and then you're not worrying about coercing newtypes. -- Don

On Mon, Mar 17, 2008 at 12:59:09PM -0400, David Roundy wrote:
foreign import ccall unsafe "math.h log10" log10 :: Double -> Double
since in ghc CDouble and Double are identical.
It's a bit sloppier, but shouldn't cause any trouble. And I've no idea how realToFrac is implemented, but would worry about it behaving oddly... for instance when given NaNs.
Yes. 'realToFrac' is inherently pretty broken and should be avoided whenever possible. It is not all all obvious to me what the correct primitive should be.. but we really should do something better for haskell'. relying on RULES firing as ghc currently does doesn't seem ideal.. hmm.. maybe a 'FloatMax' type and have 'fromFloatMax' and 'toFloatMax' in 'Fractional' and 'Real' respectively? hmm.. hc has 'fromDouble' and 'toDouble' there, but jhc also supports a 'Float128' type (when the underlying hardware does). so this still isn't quite right. John -- John Meacham - ⑆repetae.net⑆john⑈

On 2008-03-17, John Meacham
On Mon, Mar 17, 2008 at 12:59:09PM -0400, David Roundy wrote:
foreign import ccall unsafe "math.h log10" log10 :: Double -> Double
since in ghc CDouble and Double are identical.
It's a bit sloppier, but shouldn't cause any trouble. And I've no idea how realToFrac is implemented, but would worry about it behaving oddly... for instance when given NaNs.
Yes. 'realToFrac' is inherently pretty broken and should be avoided whenever possible. It is not all all obvious to me what the correct primitive should be.. but we really should do something better for haskell'. relying on RULES firing as ghc currently does doesn't seem ideal..
hmm.. maybe a 'FloatMax' type and have 'fromFloatMax' and 'toFloatMax' in 'Fractional' and 'Real' respectively? hmm.. hc has 'fromDouble' and 'toDouble' there, but jhc also supports a 'Float128' type (when the underlying hardware does). so this still isn't quite right.
Well, the whole numeric hierarchy needs to be redone to support proper mathematical structures like groups, rings, and fields. Once that's done, this might end up being clarified a bit. -- Aaron Denney -><-

"Jacob Schwartz"
A test on IEEE computers (x86 and x86-64), shows that for a range of 64-bit "double" values, the answers in C do differ (in the last bit) if you use "log2(x)" and "log10(x)" versus "log (x) / log(2)" and "log(x) / log(10)".
I think this may also depend on C compiler and optimization level. Perhaps now everybody uses SSE to do math, but earlier Intel FPU architectures did floating point with 80-bit registers, so the accuracy of the result could depend on whether an intermediate result was flushed to memory (by a context switch). Equality on floating point is tricky, if I were you, I'd round the results before comparing. -k -- If I haven't seen further, it is by standing in the footprints of giants

On Mar 13, 2008, at 5:12 , Ketil Malde wrote:
Perhaps now everybody uses SSE to do math, but earlier Intel FPU architectures did floating point with 80-bit registers, so the accuracy of the result could depend on whether an intermediate result was flushed to memory (by a context switch).
Double operations in IO, anyone? :/ -- brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu electrical and computer engineering, carnegie mellon university KF8NH

Jacob Schwartz
I have two questions about using the Double data type and the operations in the Floating typeclass on a computer that uses IEEE floating point numbers.
I notice that the Floating class only provides "log" (presumably log base 'e') and "logBase" (which, in the latest source that I see for GHC is defined as "log y / log x"). However, in C, the "math.h" library provides specific "log2" and "log10" functions, for extra precision. A test on IEEE computers (x86 and x86-64), shows that for a range of 64-bit "double" values, the answers in C do differ (in the last bit) if you use "log2(x)" and "log10(x)" versus "log (x) / log(2)" and "log(x) / log(10)".
This is to be expected in about 25% of cases, as the GHC definition rounds two times, whereas the implementation provided in the SUN math library (file lib/msun/src/e_log10.c on FreeBSD) uses a representation in two doubles for log(10) to avoid one of the rounding errors: * Return the base 10 logarithm of x * * Method : * Let log10_2hi = leading 40 bits of log10(2) and * log10_2lo = log10(2) - log10_2hi, * ivln10 = 1/log(10) rounded. * Then * n = ilogb(x), * if(n<0) n = n+1; * x = scalbn(x,-n); * log10(x) := n*log10_2hi + (n*log10_2lo + ivln10*log(x)) -- Dipl.-Math. Wilhelm Bernhard Kloke Institut fuer Arbeitsphysiologie an der Universitaet Dortmund Ardeystrasse 67, D-44139 Dortmund, Tel. 0231-1084-257 PGP: http://vestein.arb-phys.uni-dortmund.de/~wb/mypublic.key

Wow, you have a tough mission if you want to replicate the bit level answers
for double (btw, hi Jacob).
Libraries differ for transcendental function, and even worse, CPUs differ.
You may get different answers on an Intel and and AMD.
That said, I think your best bet is to import log2 and log10 from C and use
those.
Haskell sadly lacks an efficient way to go from a Double to its bit pattern.
:(
-- Lennart
On Thu, Mar 13, 2008 at 12:35 AM, Jacob Schwartz
I have two questions about using the Double data type and the operations in the Floating typeclass on a computer that uses IEEE floating point numbers.
I notice that the Floating class only provides "log" (presumably log base 'e') and "logBase" (which, in the latest source that I see for GHC is defined as "log y / log x"). However, in C, the "math.h" library provides specific "log2" and "log10" functions, for extra precision. A test on IEEE computers (x86 and x86-64), shows that for a range of 64-bit "double" values, the answers in C do differ (in the last bit) if you use "log2(x)" and "log10(x)" versus "log (x) / log(2)" and "log(x) / log(10)".
I am under the restriction that I need to write Haskell programs using Double which mimic existing C/C++ programs or generated data sets, and get the same answers. (It's silly, but take it as a given requirement.) If the C programs are using "log2", then I need "log2" in the Haskell, or else I run the risk of not producing the same answers. My first thought is to import "log2" and "log10" through the FFI. I was wondering if anyone on Haskell-Cafe has already done this and/or has a better suggestion about how to get specialized "log2" and "log10" (among the many specialized functions that the "math.h" library provides, for better precision -- for now, I'm just concerned with "log2" and "log10").
My second question is how to get at the IEEE bit representation for a Double. I am already checking "isIEEE n" in my source code (and "floatRadix n == 2"). So I know that I am operating on hardware that implements floating point numbers by the IEEE standard. I would like to get at the 64 bits of a Double. Again, I can convert to a CDouble and use the FFI to wrap a C function which casts the "double" to a 64-bit number and returns it. But I'm wondering if there's not a better way to do this natively in Haskell/GHC (perhaps some crazy use of the Storable typeclass?). Or if someone has already tackled this problem with FFI, that would be interesting to know.
Thanks.
Jacob _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

On Thu, 13 Mar 2008, Lennart Augustsson wrote:
Wow, you have a tough mission if you want to replicate the bit level answers for double (btw, hi Jacob). Libraries differ for transcendental function, and even worse, CPUs differ. You may get different answers on an Intel and and AMD. That said, I think your best bet is to import log2 and log10 from C and use those.
Haskell sadly lacks an efficient way to go from a Double to its bit pattern. :(
You can do nasty casting using STArray: http://slavepianos.org/rd/sw/sw-78/Sound/OpenSoundControl/Cast.hs

On Mar 12, 2008, at 8:35 PM, Jacob Schwartz wrote:
My second question is how to get at the IEEE bit representation for a Double.
My (rhetorical) question on this front isn't "how do I get the representation," but "why is it so hard and non-portable to get the representation sensibly"? A candidate for standardization, surely? -Jan-Willem Maessen

On Thu, Mar 13, 2008 at 1:35 AM, Jacob Schwartz
I have two questions about using the Double data type and the operations in the Floating typeclass on a computer that uses IEEE floating point numbers.
I notice that the Floating class only provides "log" (presumably log base 'e') and "logBase" (which, in the latest source that I see for GHC is defined as "log y / log x"). However, in C, the "math.h" library provides specific "log2" and "log10" functions, for extra precision. A test on IEEE computers (x86 and x86-64), shows that for a range of 64-bit "double" values, the answers in C do differ (in the last bit) if you use "log2(x)" and "log10(x)" versus "log (x) / log(2)" and "log(x) / log(10)".
I am under the restriction that I need to write Haskell programs using Double which mimic existing C/C++ programs or generated data sets, and get the same answers. (It's silly, but take it as a given requirement.) If the C programs are using "log2", then I need "log2" in the Haskell, or else I run the risk of not producing the same answers. My first thought is to import "log2" and "log10" through the FFI. I was wondering if anyone on Haskell-Cafe has already done this and/or has a better suggestion about how to get specialized "log2" and "log10" (among the many specialized functions that the "math.h" library provides, for better precision -- for now, I'm just concerned with "log2" and "log10").
The RULES pragma + FFI solution (as suggested by Henning) seems to be the most sensible one so far.
My second question is how to get at the IEEE bit representation for a Double. I am already checking "isIEEE n" in my source code (and "floatRadix n == 2"). So I know that I am operating on hardware that implements floating point numbers by the IEEE standard. I would like to get at the 64 bits of a Double. Again, I can convert to a CDouble and use the FFI to wrap a C function which casts the "double" to a 64-bit number and returns it. But I'm wondering if there's not a better way to do this natively in Haskell/GHC (perhaps some crazy use of the Storable typeclass?).
Posted in a a previous haskell-cafe message [1] {-# LANGUAGE MagicHash #-} import Data.Int import GHC.Exts foo :: Double -> Int64 --ghc only foo (D# x) = I64# (unsafeCoerce# x) [1] http://www.haskell.org/pipermail/haskell-cafe/2008-February/040000.html
participants (14)
-
Aaron Denney
-
Alfonso Acosta
-
Brandon S. Allbery KF8NH
-
Bulat Ziganshin
-
David Roundy
-
Don Stewart
-
Henning Thielemann
-
Jacob Schwartz
-
Jan-Willem Maessen
-
Jed Brown
-
John Meacham
-
Ketil Malde
-
Lennart Augustsson
-
Wilhelm B. Kloke