Folding constants for floats

Hi,

I'm cutting my teeth on some constant folding for floats in the Cmm. I have a question regarding the ticket I'm tackling: should floats be folded with infinite precision (and later truncated to the platform float size) -- most useful/accurate -- or folded with the platform precision, i.e. double, losing accuracy but keeping behaviour consistent with -O0 -- most "correct"?

I would prefer the first case because it's *much* easier to implement than the second, and it'll probably rot less.

Regards.
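A minimal sketch of the trade-off being asked about (the names are invented; this is not GHC code): fold a chain of additions exactly as Rationals and round once at the end, versus round after every intermediate operation the way -O0 code would at runtime. The two can disagree in the last ulp:

    -- Strategy 1: infinite precision, truncated to Double at the very end.
    foldExact :: [Double] -> Double
    foldExact = fromRational . sum . map toRational

    -- Strategy 2: platform precision, rounding after every addition,
    -- matching what unoptimised code computes at runtime.
    foldNative :: [Double] -> Double
    foldNative = sum

    main :: IO ()
    main = do
      print (foldExact  [0.1, 0.2, 0.3])  -- 0.6
      print (foldNative [0.1, 0.2, 0.3])  -- 0.6000000000000001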

This is actually a bit more subtle than you'd think. Are those constants precise and exact? (There's certainly floating point code that exploits the cancellations in the floating point model.) Many floating point computations can't be done with exact rational operations. There are also aspects that are target-dependent, like operations having 80-bit vs 64-bit precision (i.e. using the old Intel FP registers vs SSE2 and newer).

What's the ticket you're working on?

Please be very cautious with floating point: any changes to the meaning that aren't communicated by the program's author could leave a Haskeller numerical analyst scratching their head. For example, when doing these floating point computations, what rounding modes will you use?

On Monday, January 13, 2014, Kyle Van Berendonck wrote:
[...]

Oh, I see the ticket. Are you focusing on adding hex support to Double# and Float#? That would be splendid. We currently don't have a decent way of writing NaN and the infinities.

On Monday, January 13, 2014, Carter Schonwald wrote:
[...]
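The C99-style syntax Carter is asking about looks like this; GHC did not support it at the time (it later arrived as the HexFloatLiterals extension), and even hex literals do not spell NaN or the infinities directly:

    {-# LANGUAGE HexFloatLiterals #-}  -- modern GHC only; shown for illustration

    main :: IO ()
    main = do
      print (0x1.8p1   :: Double)  -- 1.5 * 2^1 = 3.0, written exactly in binary
      print (0x1p-1074 :: Double)  -- smallest positive denormal Double
      -- NaN and the infinities still require expressions such as 0/0 and 1/0.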

Hi,

I'd like to work on the primitives first. They are relatively easy to implement. Here's how I figure it:

The internal representation of the floats in the Cmm is as a Rational (ratio of Integers), so they have "infinite precision". I can implement all the constant folding by just writing my own operations on these rationals; i.e., ** takes the power of the top/bottom and reconstructs a new Rational, log takes the difference between the logs of the top/bottom, etc. This is all very easy to fold. I can encode errors in the Rational, where infinity is n :% 0 with n > 0 and NaN is 0 :% 0.

Since the size of floating point constants is more of an architecture-specific thing, and floats don't wrap around like integers do, it would make more sense (in my opinion) to only reduce the value to the architecture-specific precision (or clip it to a NaN or such) in the **final** stage, as opposed to trying to emulate the behavior of a double native to the architecture (which is a very hard thing to do, and results in precision errors). The real question is: do people want precision errors when they write literals in code, or are they really looking for the compiler to do a better job than them at making sure they stay precise?

On Tue, Jan 14, 2014 at 3:27 AM, Carter Schonwald <carter.schonwald@gmail.com> wrote:
[...]
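A hypothetical sketch of the encoding Kyle describes (names and propagation rules are invented for illustration; this is not GHC's folding code). Folded constants stay exact Rationals; zero denominators are reserved for the IEEE specials; rounding happens only at the final truncation to the target width. Note that -0 has no representation in this scheme, which is exactly the kind of corner raised later in the thread:

    import Data.Ratio ((%))
    import GHC.Real (Ratio ((:%)))

    -- n :% 0 encodes the specials: the sign of n picks the infinity,
    -- and 0 :% 0 stands for NaN. Everything else is an exact constant.
    nan, posInf, negInf :: Rational
    nan    = 0 :% 0
    posInf = 1 :% 0
    negInf = (-1) :% 0

    -- One fold step, division: exact where possible, specials otherwise.
    foldDiv :: Rational -> Rational -> Rational
    foldDiv (n1 :% d1) (n2 :% d2)
      | d1 == 0 || d2 == 0 = nan                    -- crude: specials just taint
      | n1 == 0 && n2 == 0 = nan                    -- 0/0
      | n2 == 0            = signum n1 :% 0         -- x/0, x /= 0
      | otherwise          = (n1 * d2) % (d1 * n2)  -- exact, renormalised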

On 01/13/2014 05:21 PM, Kyle Van Berendonck wrote:
> Hi,
> I'd like to work on the primitives first. They are relatively easy to implement. Here's how I figure it:
> The internal representation of the floats in the Cmm is as a Rational (ratio of Integers), so they have "infinite precision". I can implement all the constant folding by just writing my own operations on these rationals; i.e., ** takes the power of the top/bottom and reconstructs a new Rational, log takes the difference between the logs of the top/bottom, etc. This is all very easy to fold.

What about sin(), etc.? I don't think identities will get you out of computing at least some irrational numbers. (Maybe I'm missing your point?)

> Since the size of floating point constants is more of an architecture specific thing

IEEE 754 is becoming more and more ubiquitous. As far as I know, Haskell Float is always IEEE 754 32-bit binary floating point and Double is IEEE 754 64-bit binary floating point, on machines that support this (including x86_64, ARM, and sometimes x86). Let's not undermine this progress.

> and floats don't wrap around like integers do, it would make more sense (in my opinion) to only reduce the value to the architecture specific precision (or clip it to a NaN or such) in the **final** stage as opposed to trying to emulate the behavior of a double native to the architecture (which is a very hard thing to do, and results in precision errors

GCC uses MPFR to exactly emulate the target machine's rounding behaviour. (For the basic arithmetic operations this is even possible with exact rationals; see the sketch after this message.)

> the real question is, do people want precision errors when they write literals in code,

Yes. Look at GCC. If you don't pass -ffast-math (which says you don't care whether floating-point rounding behaves as specified), you get the same floating-point behaviour with and without optimizations. This is IMHO even more important for Haskell, where we tend to believe in deterministic pure code.

-Isaac
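On the MPFR point above: for the four basic arithmetic operations, the target's round-to-nearest behaviour can be emulated without MPFR, because the exact result of a single operation on two Doubles is a Rational, and fromRational rounds it exactly once. A sketch, assuming GHC's fromRat is correctly rounded (which it aims to be); the function name is invented:

    -- Compute exactly on Rationals, then round once, as the hardware would.
    exactThenRound :: (Rational -> Rational -> Rational)
                   -> Double -> Double -> Double
    exactThenRound op x y = fromRational (toRational x `op` toRational y)

    -- exactThenRound (+) 0.1 0.2 == 0.1 + 0.2   (both round the exact sum once)
    -- Zero denominators (the IEEE specials) and transcendental functions such
    -- as sin have no exact Rational intermediate; that is the objection above.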

... and let's not forget about such fun stuff as IEEE's -0, e.g.:

   1/(-1 * 0)       => -Infinity
   1/(0 + (-1 * 0)) => Infinity

If we take the standpoint that Haskell's Float and Double types correspond to IEEE 754 floating point numbers, there is almost no mathematical equivalence which holds, and consequently almost all folding or other optimizations will be wrong. One can do all these things behind a flag (trading IEEE compatibility for better code), but this shouldn't be done by default IMHO.

Maybe so, but having a semantics by default is huge, and honestly I'm not super interested in optimizations that merely change one infinity for another. What would the alternative semantics be? Whatever it is, how will we communicate it to our users? GHC has generally been (by accident) IEEE compliant; changing that will possibly break someone's code (perhaps). Also, who's going to specify this alternative semantics and educate everyone about it?

The thing is, floating point doesn't act like most other models of numbers: floats have a very, very non-linear grid of precision across a HUGE dynamic range. Pretending they're something they're not is the root of most problems with them.

Either way, it's a complex problem that needs to be carefully sorted out.

On Tue, Jan 14, 2014 at 3:03 AM, Sven Panne wrote:
[...]

2014/1/14 Carter Schonwald:
> maybe so, but having a semantics by default is huge, and honestly i'm not super interested in optimizations that merely change one infinity for another. What would the alternative semantics be?

I'm not sure that I understood your reply: my example regarding -0 was only demonstrating the status quo of GHCi and is IEEE-754-conformant. The 1/foo is only used to distinguish between 0 and -0; it is not about infinities per se.

My point was: as much as I propose to keep these current semantics, there might be users who care more about performance than IEEE-754 conformance. For those, relatively simple semantics could be: regarding optimizations, numbers are considered "mathematical" numbers, ignoring any rounding and precision issues, and everything involving -0, NaN, and infinities is undefined. This would open up optimizations like easy constant folding, transforming 0 + x to x, x - x to 0, x `op` y to y `op` x for mathematically commutative operators, associativity, etc.

I'm not 100% sure how useful this would really be, but I think we agree that this shouldn't be the default.

Cheers,
   S.
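For the record, each of those rewrites is unsound under the current IEEE semantics, which a few lines of plain GHCi material demonstrate (nothing here is GHC-internal):

    main :: IO ()
    main = do
      let nan  = 0 / 0 :: Double
          inf  = 1 / 0 :: Double
          negz = -0.0  :: Double
      print (nan - nan)                 -- NaN, not 0: "x - x -> 0" fails
      print (inf - inf)                 -- NaN, not 0
      print (1 / negz, 1 / (0 + negz))  -- (-Infinity, Infinity):
                                        -- "0 + x -> x" flips the sign of zero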

Sven, I'm one of those people who cares about numerical performance :-). It's kinda been my obsession :-). My near-term stopgap is writing some very high quality FFI bindings, but I'm very keen on Haskell giving Fortran a run for its money. Glad we agree that the version that's easier to debug (IEEE, i.e. current GHC semantics) should be the default.

There are much more meaningful ways we can improve floating point performance, like adding SIMD support more systematically to GHC (I'm just now getting the ball rolling on that hacking, though there are a lot of things I need to do before adding it, mind you). Better constant propagation will help in a few cases, and should be explored. But deciding what the right relaxed rules should be isn't something we should do off the cuff. We should write down the space of possible relaxed rules, add engineering support to GHC to better experiment with benchmarking the various approaches, and then, if something has a good perf impact, see about exposing it through a flag.

On Tuesday, January 14, 2014, Sven Panne wrote:
[...]

On 01/14/2014 11:48 AM, Sven Panne wrote:
> My point was: As much as I propose to keep these current semantics, there might be users who care more about performance than IEEE-754 conformance.

Adding a -ffast-math flag could be fine IMHO.

> For those, relatively simple semantics could be: Regarding optimizations, numbers are considered "mathematical" numbers, ignoring any rounding and precision issues,

How do you plan to constant-fold things like log(cos(pi**pi)) without rounding?

I checked C, and apparently the optimizer is entitled to assume the default floating-point control modes (e.g. rounding mode, quiet/signaling NaN) are in effect, except in scopes where "#pragma STDC FENV_ACCESS ON" is given. However, the standard does not entitle the optimizer to change rounding in any other way. This is sufficient for constant-folding in regions where FENV_ACCESS is off. GCC also has flags to control floating-point optimization: http://gcc.gnu.org/wiki/FloatingPointMath

Probably it's best not to touch floating point optimization without understanding all these issues.

Hmm, I can't see how a non-default floating point control mode is compatible with Haskell's purity... Even without optimizations, (1/3 :: Double) could evaluate to two different values in the same program if the floating-point rounding mode changes during execution (e.g. by C fesetenv()).

-Isaac
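That purity worry can be made concrete with a small FFI experiment. This is a sketch under several assumptions: 0x800 is FE_UPWARD's value on x86/glibc, the NOINLINE stops GHC folding the division at compile time, and whether the second print actually differs depends on the platform honouring the mode for its floating point arithmetic:

    import Foreign.C.Types (CInt (..))

    foreign import ccall unsafe "fenv.h fesetround"
      c_fesetround :: CInt -> IO CInt

    -- Kept out of line so the division happens at runtime, under
    -- whatever rounding mode is in force at that moment.
    oneOver :: Double -> Double
    oneOver d = 1 / d
    {-# NOINLINE oneOver #-}

    main :: IO ()
    main = do
      print (oneOver 3)        -- round-to-nearest: 0.3333333333333333
      _ <- c_fesetround 0x800  -- FE_UPWARD on x86/glibc (assumption)
      print (oneOver 3)        -- may now print 0.33333333333333337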

I emphatically and forcefully agree with Isaac. Thanks for articulating these issues much better than I could.

On Tue, Jan 14, 2014 at 2:54 PM, Isaac Dupree <ml@isaac.cedarswampstudios.org> wrote:
[...]

2014/1/14 Carter Schonwald:
> I emphatically and forcefully agree with Isaac. [...]

Yup, I would prefer not to touch FP optimization in a rush, too. I am not sure if this is still the case today, but I remember breaking some FP stuff in GHC when doing cross-compilation + bootstrapping with it ages ago. Can this still happen today? I don't know how GHC is ported to a brand new platform nowadays... All the rounding magic etc. has to happen as if it were executed on the target platform, not as on the platform GHC is running on. More fun stuff to consider, I guess. :-)
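That cross-compilation hazard in miniature (function names invented; host-side Haskell standing in for the compiler's folding code): a constant destined for a target Float has to be rounded to the target width in one step, because going through the host's Double first can double-round to a different Float:

    -- One rounding, straight to the target width: correct.
    narrowDirect :: Rational -> Float
    narrowDirect = fromRational

    -- Two roundings, host Double first, then Float: can land on a different
    -- Float for values that sit near a Float rounding boundary.
    narrowViaDouble :: Rational -> Float
    narrowViaDouble r = realToFrac (fromRational r :: Double)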

Some of those issues come up even more forcefully when cross compiling from a 64-bit to a 32-bit architecture :), but you're absolutely right, and it sounds like there's a clear near-term consensus.

Even more fun is the case of ghc-ios, where ideally a single build would create the right object code for both 64-bit and 32-bit ARM! I think there's some subtle fun there! :)

On Tue, Jan 14, 2014 at 4:12 PM, Sven Panne wrote:
[...]
participants (4): Carter Schonwald, Isaac Dupree, Kyle Van Berendonck, Sven Panne