Proposal: Add "fma" to the RealFloat class

This proposal is very much in the spirit of the earlier proposal on adding new float/double functions; for instance, see here: https://mail.haskell.org/pipermail/libraries/2014-April/022667.html

"fma" (a.k.a. fused multiply-add) is one of those functions; it is the workhorse in many HPC applications. The idea is to multiply two floats and add a third with just one rounding, thus preserving more precision. There are a multitude of applications for this operation in engineering and data analysis, and modern processors come with custom implementations and a lot of hardware to support it natively.

I created a ticket along these lines already: https://ghc.haskell.org/trac/ghc/ticket/10364 Edward suggested that the matter be discussed further here.

I think the proposal is rather straightforward and should be uncontroversial. To wit, we shall add a new method to the RealFloat class:

    class (RealFrac a, Floating a) => RealFloat a where
      ...
      fma :: a -> a -> a -> a

The intention is that

    fma x y z = x * y + z

except that the multiplication and addition are done infinitely precisely and then rounded only once, as opposed to the two roundings one would get with the above implementation. Most modern architectures support this operation directly, so we can map to it easily; and in case the architecture does not have it available, we can get it via the C math library, where it appears under the names fma (the double version) and fmaf (the float version).

There should be no default definition, as an incorrect (two-rounding) version would essentially defeat the purpose of having fma in the first place.

While the name "fma" is well established in the arithmetic/hardware community and in the C library, we could also go with "fusedMultiplyAdd" if that is deemed clearer.

Discussion period: 2 weeks.
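For concreteness, here is a minimal sketch of what the proposed method and its C-library fallback could look like. The class name RealFloat' is used only so the snippet stands alone (the actual proposal extends the existing RealFloat), and the FFI plumbing is just one possible implementation:

    {-# LANGUAGE ForeignFunctionInterface #-}

    -- Bindings to C99's fma/fmaf, which round exactly once.
    foreign import ccall unsafe "math.h fma"
      c_fma :: Double -> Double -> Double -> Double

    foreign import ccall unsafe "math.h fmaf"
      c_fmaf :: Float -> Float -> Float -> Float

    -- Stand-in for the proposed extension of RealFloat.
    class (RealFrac a, Floating a) => RealFloat' a where
      -- fma x y z computes x * y + z with a single rounding step.
      fma :: a -> a -> a -> a

    instance RealFloat' Double where fma = c_fma
    instance RealFloat' Float  where fma = c_fmaf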

I think there *should* be a default definition in terms of (+) and (*). A person who defines their own instance for their own purposes should be free to ignore this function if it's not needed for their specific application.

On 29 April 2015 at 17:21, Levent Erkok wrote:

> I think the proposal is rather straightforward and should be uncontroversial. To wit, we shall add a new method to the RealFloat class:
>
> class (RealFrac a, Floating a) => RealFloat a where
>   ...
>   fma :: a -> a -> a -> a

Can we please have a better name (even if it's just "fusedMultiplyAdd")? I have no real opinion on adding this function or not, but just seeing that function (in other code, doing ":info RealFloat" in ghci, etc.) tells me nothing about what it is. Ideally we wouldn't have to rely on reading Haddock to understand random TLAs in code.

--
Ivan Lazar Miljenovic
Ivan.Miljenovic@gmail.com
http://IvanMiljenovic.wordpress.com

On Wed, 29 Apr 2015, Levent Erkok wrote:
> This proposal is very much in the spirit of the earlier proposal on adding new float/double functions; for instance, see here: https://mail.haskell.org/pipermail/libraries/2014-April/022667.html

Btw. what was the final decision with respect to log1p and expm1? I suggest that the decision for 'fma' be made consistently with 'log1p' and 'expm1'.

> "fma" (a.k.a. fused multiply-add) is one of those functions; it is the workhorse in many HPC applications. The idea is to multiply two floats and add a third with just one rounding, thus preserving more precision. There are a multitude of applications for this operation in engineering and data analysis, and modern processors come with custom implementations and a lot of hardware to support it natively.

Ok, the proposal is about increasing precision. One could also hope that a single fma operation is faster than separate addition and multiplication, but as far as I know, fma can even be slower, since it has more data dependencies.

> I think the proposal is rather straightforward and should be uncontroversial. To wit, we shall add a new method to the RealFloat class:
>
> class (RealFrac a, Floating a) => RealFloat a where
>   ...
>   fma :: a -> a -> a -> a

RealFloat excludes Complex.

> There should be no default definition, as an incorrect (two-rounding) version would essentially defeat the purpose of having fma in the first place.

I just read again the whole expm1 thread, and default implementations with possible loss of precision seem to be the best option. This way, one can mechanically replace all occurrences of (x*y+z) by (fma x y z) and will not make anything worse. Types with a guaranteed high precision should be put in a Fused class.

> While the name "fma" is well established in the arithmetic/hardware community and in the C library, we could also go with "fusedMultiplyAdd" if that is deemed clearer.

Although I like descriptive names, the numeric classes already contain mostly abbreviations (abs, exp, sin, tanh, ...). Thus I would prefer the abbreviation, for consistency. Btw., on the DSP56002 the same operation is called MAC (multiply-accumulate).
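Henning's two-level idea, sketched; the class names MulAdd and Fused are illustrative only:

    -- The method gets a possibly-lossy default, so replacing (x*y+z)
    -- by (fma x y z) mechanically is never worse than before.
    class Num a => MulAdd a where
      fma :: a -> a -> a -> a
      fma x y z = x * y + z  -- default: two roundings

    -- Instances additionally guarantee single-rounding (fused) semantics.
    class MulAdd a => Fused a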

On Wed, Apr 29, 2015 at 5:19 AM, Henning Thielemann <lemming@henning-thielemann.de> wrote:

> Btw. what was the final decision with respect to log1p and expm1? I suggest that the decision for 'fma' be made consistently with 'log1p' and 'expm1'.

We decided to add them. Then we didn't do it in 7.10. I'll talk to Herbert about how to proceed to get them into 7.12, though we may wait until we know the outcome of this proposal and fuse the two together into one patch.

> RealFloat excludes Complex.

Good point. If we wanted to, we could push this all the way up to Num given the operations involved, and I could see that you could benefit from it there for types that have nothing to do with floating point; e.g., modular arithmetic could get away with using a single 'mod'.

>> There should be no default definition, as an incorrect (two-rounding) version would essentially defeat the purpose of having fma in the first place.
>
> I just read again the whole expm1 thread, and default implementations with possible loss of precision seem to be the best option. This way, one can mechanically replace all occurrences of (x*y+z) by (fma x y z) and will not make anything worse. Types with a guaranteed high precision should be put in a Fused class.

I argued rather strenuously for this in the expm1/log1p case, but wasn't able to win folks over.

> Although I like descriptive names, the numeric classes already contain mostly abbreviations (abs, exp, sin, tanh, ...). Thus I would prefer the abbreviation, for consistency.

I have no strong preference on the name. fusedMultiplyAdd has the benefit that a non-domain-expert can figure it out; fma is traditional.

-Edward
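The modular-arithmetic saving Edward mentions is easy to see in a toy type (Mod7 is hypothetical, purely for illustration):

    newtype Mod7 = Mod7 Integer deriving (Eq, Show)

    -- One reduction for the whole multiply-add...
    fmaMod7 :: Mod7 -> Mod7 -> Mod7 -> Mod7
    fmaMod7 (Mod7 x) (Mod7 y) (Mod7 z) = Mod7 ((x * y + z) `mod` 7)

    -- ...versus two reductions when (*) and (+) each normalize:
    mulThenAdd :: Mod7 -> Mod7 -> Mod7 -> Mod7
    mulThenAdd (Mod7 x) (Mod7 y) (Mod7 z) =
      Mod7 (((x * y) `mod` 7 + z) `mod` 7)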

On Wed, 29 Apr 2015, Edward Kmett wrote:
> Good point. If we wanted to, we could push this all the way up to Num given the operations involved, and I could see that you could benefit from it there for types that have nothing to do with floating point; e.g., modular arithmetic could get away with using a single 'mod'.

I too advocate that this go in Num. The place I anticipate seeing fma used is in some polymorphic linear-algebra library, and it is not uncommon (having recently done this myself) to do linear algebra on things that aren't RealFloat, e.g., Rational, Complex, or number-theoretic fields.

--ken

I agree that Num is the place to put this function, with a default implementation. In my mind it is a special combination of (+) and (*), which both live in Num as well.

I dislike the name fma, as that is a three-letter acronym with no meaning to people who don't do numeric programming. And by putting the function in Num, the name would end up in the Prelude.

For further bikeshedding: my proposal for a name would be mulAdd. But fusedMulAdd or fusedMultiplyAdd would also be fine.

Twan

+1 for "mulAdd". The "fused" would be a misnomer if there's a default implementation.
Tom

The Num class is defined in GHC.Num, so the Prelude could import GHC.Num hiding (fma) to avoid having another round of Prelude changes breaking code.

I'm somewhat opposed to the Num class in general, and very much opposed to calling floating-point representations "numbers" in particular. How are they numbers when they don't obey associative or distributive laws, let alone cancellation, commutativity, ...? I know Carter disagrees with me, but I'll stand my ground, resolute! I suppose adding some more nonsense into the trash heap won't do too much more harm, but I'd much rather see some deeper thought about how we want to deal with floating point.

Would it make sense to create a new class for operations like fma that has accuracy guarantees as part of its typeclass laws? Or would managing a bunch of typeclasses like that create too much syntactic, conceptual, or performance overhead for actual use?

To me, that seems like it could be better than polluting Num (which, after all, features prominently in the Prelude), but it might make for worse discoverability.

If we do add it to Num, I strongly support having a default implementation. We don't want to make implementing a custom numeric type any more difficult than it has to be, and somebody unfamiliar with fma would just implement it manually without any optimizations anyhow, or leave it out, incomplete-instantiation warnings notwithstanding. Num is already a bit too big for casual use (I rarely care about signum and abs myself), so making it *bigger* is not appealing.

Personally, I'm a bit torn on the naming. Something like mulAdd or fusedMultiplyAdd is great for non-experts, but it feels like fma is something that we only expect experts to care about, so perhaps it's better to name it in line with their expectations.

The main problem that I find in practice with the "just exile it to another class" argument is that it creates a pain point. Do you implement against the worse implementation of exp, or do you use the specialized class that provides harder guarantees for expm1, to avoid destroying all precision very near 1? It means that anything that builds on top of the abstraction you provide gets built at least two ways.

I wound up with a lot of code that was written against Monad and Functor separately, and spent much of my time dealing with nonsensical "made up" organization issues like "is this version the liftM-like one or the fmap-like one?" If it is in the class, then folks can just reach out and use it. (<$) being directly in Functor means you can just reach for it and get better sharing when you 'refill' a functor with a constant. If it were exiled to some other place, there'd always be the worry about whether you should implement for portability or for precision, and you'd never get to stop thinking about it.

-Edward

On Fri, May 1, 2015 at 5:52 PM, David Feuer wrote:

> I'm somewhat opposed to the Num class in general, and very much opposed to calling floating-point representations "numbers" in particular. How are they numbers when they don't obey associative or distributive laws, let alone cancellation, commutativity, ...? I know Carter disagrees with me, but I'll stand my ground, resolute!

TBH I think Num is a lost cause. If you want mathematical numbers, set up a parallel class instead of trying to force a class designed for numbers "in the wild" to be a pure theory class.

This operation in particular is *all about* numbers in the wild: it has no place in theory; it's an optimization for hardware implementations.

--
brandon s allbery kf8nh | sine nomine associates
allbery.b@gmail.com | ballbery@sinenomine.net
unix, openafs, kerberos, infrastructure, xmonad | http://sinenomine.net

Well said, Brandon. FMA support is absolutely a mathematical-accuracy and performance-engineering thing (except when it hinders performance). It is worth noting that most modern CPUs support several *different* versions of the FMA operation, but that's beyond the scope/goal of this proposal, I think.

But yeah, for all of Num's warts, it is probably the right place to put it, with a default implementation in terms of (*) and (+) (and compiler-supplied primops for the applicable Prelude types).

On Wed, Apr 29, 2015 at 11:48 AM, Edward Kmett wrote:

> Good point. If we wanted to, we could push this all the way up to Num given the operations involved, and I could see that you could benefit from it there for types that have nothing to do with floating point; e.g., modular arithmetic could get away with using a single 'mod'.

I'm strongly in favor of adding fma *somewhere*, even if just as a family of primops; though, of course, it'd be nicer to put it in a type class so we don't have to pull in GHC.Exts. And as far as type classes go, I'm strongly in favor of pushing it all the way up to Num (or rather, to Semiring, if only we had such a thing). There's no conceptual reason for it to live in RealFloat.

--
Live well,
~wren

Hi,

A little information:

General-purpose CPUs use the term "FMA" for the fused "multiply + add" operation and implement special instructions for it:

* x86 (AMD64/Intel 64) has FMA instructions: VFMADD132PD, ...
* ARM has FMA instructions: VMLA, ...

In DSP culture, the same operation is called "MAC" (multiply-accumulate), and traditional DSPs have MAC instructions; for example, TI's C67 has MAC, ...

If you map the "fma" function to a CPU's raw instruction, be careful about rounding and saturation modes.

BTW, the "FMA" operation is defined in the IEEE 754-2008 standard.

Regards,
Takenobu

We have (almost) no tradition of using CPU instruction names for our own functions, and I don't see why now is the time to start. To take a recent example, we have countLeadingZeros and countTrailingZeros rather than clz, ctz, ctlz, cttz, bsf, bsr, etc. We also have popCount instead of popcnt, and use shiftR and shiftL instead of things like shl, shr, sla, sal, sra, sar, etc. Thus I am -1 on calling this thing fma. multiplyAdd seems more reasonable to me.

Thank you for all the feedback on this proposal. Based on the feedback, I came to conclude that the original idea did not really capture what I was after, and hence I think this proposal needs to be shelved for the time being.

I want to summarize the points made so far:

* Almost everyone agrees that we should have this functionality available. (But see below for the direction I want to take it in.)
* There's some disagreement on the name chosen, but I think this is less important for the time being.
* The biggest gripe is where "fma" really belongs. The original suggestion was 'RealFloat', but people pointed out that 'Num' is just as good a place.
* Most folks want a default definition, and see "fma" as an optimization.
It is these last two points, actually, that convinced me this proposal is not really what I want. I do not see "fma" as an optimization. In particular, I'd be very concerned if the compiler substituted "fma x y z" for "x*y+z". The entire reason IEEE754 has an fma operation is that those two expressions have different values in general. By the same token, I'm also against providing a default implementation. I see this not as an increased-precision issue but as a semantic one: "x*y+z" and "fma x y z" *should* produce two different values, per the IEEE754 spec. It's not really an optimization; it is simply how floating-point values work. In that sense "fma" is a separate operation that is related to multiplication and addition, but is not definable in those terms alone.
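The difference is easy to exhibit concretely. Assuming an fma bound to C's correctly rounded fma (as in the FFI sketch near the top of the thread):

    {-# LANGUAGE ForeignFunctionInterface #-}
    foreign import ccall unsafe "math.h fma"
      fmaD :: Double -> Double -> Double -> Double

    -- 0.1 :: Double is really 0.1000000000000000055511151231257827...,
    -- so the exact product 0.1 * 10 is 1 + 2^-54, which rounds to 1.0.
    twoRoundings, oneRounding :: Double
    twoRoundings = 0.1 * 10 - 1      -- 0.0: the residual is lost
    oneRounding  = fmaD 0.1 10 (-1)  -- 5.551115123125783e-17 (exactly 2^-54)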
Having said that, it was also pointed out that for non-float values this can act as an optimization. (Modular arithmetic was given as an example.) I'd think that functionality is quite different from the original proposal, and perhaps should be tackled separately. My original proposal was not aiming for that particular use case.
My original motivation was to give Haskell access to the floating-point circuitry that hardware manufacturers are putting a lot of effort and energy into. It's a shame that modern processors provide a ton of instructions around floating-point operations, yet such operations are simply very hard to use from many high-level languages, including Haskell.
Two other points were raised that also convinced me to seek an alternative solution:

* Tikhon Jelvis suggested these functions should be put in a different class, which would make it clear that we're following IEEE754 and not some idealized model of numbers. I think this suggestion is spot on, and is very much in line with what I wanted to have.
* Takenobu Tani kindly pointed out that a discussion of floats in the absence of rounding modes is a moot one, as the entire semantics is based on rounding. Haskell simply picks "RoundNearestTiesToEven," but there are four other rounding modes defined by IEEE754, and I think we need a way to access those from Haskell in a convenient way.
Based on this analysis, I'm withdrawing the original proposal. I think fma and other floating-point arithmetic operations are very important to support properly, but it should not be done by tacking them onto Num or RealFloat; rather, they belong in a new class that also treats rounding modes properly.
The advantage of the separate-class approach is, of course, that I (or someone else) can create such a class and push it to Hackage, using the FFI to delegate the task of implementation to the land of C, supporting rounding modes and other floating-point weirdness appropriately. Once that class stabilizes and its details are ironed out, we can imagine cooperating with the GHC folks to bypass the FFI and generate native code directly whenever possible.
This is the direction I intend to pursue. Please drop me a line if you'd like to help out and/or have any feedback.
Thanks!
-Levent.

Thanks for taking the time to write this, Levent. Now that you explain it in such detail, it's clear why implementing fma in terms of add and multiply is wrong.

I also have to admit that upon first reading your proposal, I confused RealFloat with RealFrac. Since RealFloat should only be implemented by actual floating-point types, I retract my earlier objection.

The idea of putting the IEEE754-specific functions in a separate class (or even module) sounds reasonable, too.


How would you have an implementation of finite-precision floating point that has the "expected" exact algebraic laws for (*) and (+)?

I would argue that Float and Double do satisfy a form of the standard algebraic laws where equality is approximate: e.g., (a+(b+c)) - ((a+b)+c) <= epsilon, where epsilon is some constant multiple of max(ulp(a), ulp(b), ulp(c)). (A similar idea applies to pretty much any other algebraic law you can state, such as distributivity.)

I do think that it'd be useful if the RealFloat class provided an ulp function (unit of least precision), which is available as part of any IEEE-compliant C float library.

There are MANY computable number representations where the *exact* algebraic laws don't hold but this *approximate* form, which provides some notion of bounded forwards/backwards relative/absolute error, holds in a particularly strong way.

I think we should figure out how to articulate laws that play nicely in both the *exact* and the *approximate* universes.
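For what it's worth, an ulp function can already be sketched from existing RealFloat methods, and the approximate law above is then easy to test (this definition ignores edge cases such as zero, infinities, and NaN):

    -- Spacing of representable values at a finite, nonzero x.
    ulp :: RealFloat a => a -> a
    ulp x = encodeFloat 1 (snd (decodeFloat x))

    -- The associativity defect, bounded by a small multiple of the ulp:
    assocDefect :: Double -> Double -> Double -> Double
    assocDefect a b c = abs ((a + (b + c)) - ((a + b) + c))

    -- e.g. assocDefect 0.1 0.2 0.3 == 1.1102230246251565e-16
    --      ulp (0.6 :: Double)     == 1.1102230246251565e-16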
On Sun, May 3, 2015 at 7:05 PM, Mike Meyer wrote:

> On Sun, May 3, 2015 at 4:11 PM, Levent Erkok wrote:
>
>> * Tikhon Jelvis suggested these functions should be put in a different class, which would make it clear that we're following IEEE754 and not some idealized model of numbers. I think this suggestion is spot on, and is very much in line with what I wanted to have.
>
> This is very much in line with a suggestion I've been toying with for a long time. Basically, we have three different ideas of how floats should behave, and the current implementation isn't any of them. So I've been thinking that we ought to deal with this by moving Float out of the Prelude, or at least large chunks of it.
>
> The three different models are:
>
> 1) Real numbers. We aren't going to get those.
>
> 2) IEEE floats. This is what we've got, except, as noted, there are lots of things that come with this that we don't provide.
>
> 3) Floats that obey the laws of Num. We don't get that, mostly because getting #2 breaks things.
>
> The breakage of #3 creates behavior that's surprising, at least to people who aren't familiar with IEEE floats.
>
> So the proposal I've been toying with is something along the lines of breaking RealFloat up along class lines. Those classes whose laws RealFloat can obey with IEEE float behavior would stay in RealFloat. The rest would move out, and could be gotten by importing either Data.Float.IEEE or Data.Float.Num (or some such).
>
> Ideally, this would leave enough floating-point behavior in the Prelude that simple calculations would just work, at least as well as they ever did. When you start doing things that can currently generate surprising results, you will need to import one of the two options. Figuring out which one means there's a chance you'll also figure out why you sometimes get those surprising results.

On Sun, May 3, 2015 at 6:50 PM, Carter Schonwald wrote:

> How would you have an implementation of finite-precision floating point that has the "expected" exact algebraic laws for (*) and (+)?

That's model #1, which we can't have. So you don't.

> I would argue that Float and Double do satisfy a form of the standard algebraic laws where equality is approximate: e.g., (a+(b+c)) - ((a+b)+c) <= epsilon, where epsilon is some constant multiple of max(ulp(a), ulp(b), ulp(c)). (A similar idea applies to pretty much any other algebraic law you can state, such as distributivity.)

So how do you fix the fact that any comparison between a NaN and a non-NaN is false? Among other IEEE oddities.

> I do think that it'd be useful if the RealFloat class provided an ulp function (unit of least precision), which is available as part of any IEEE-compliant C float library.
>
> There are MANY computable number representations where the *exact* algebraic laws don't hold but this *approximate* form, which provides some notion of bounded forwards/backwards relative/absolute error, holds in a particularly strong way.

True. That's the root of the problem the proposal is trying to solve.

> I think we should figure out how to articulate laws that play nicely in both the *exact* and the *approximate* universes.

We also need laws that play nice for the IEEE universe, because people doing serious numerical work want that one. I believe you will wind up with two different sets of laws, which is why I proposed taking the parts that don't agree out of the Prelude, and letting users import the ones they want to use.

Hi,

On Sunday, 3 May 2015, at 14:11 -0700, Levent Erkok wrote:

> Based on this analysis, I'm withdrawing the original proposal. I think fma and other floating-point arithmetic operations are very important to support properly, but it should not be done by tacking them onto Num or RealFloat; rather, they belong in a new class that also treats rounding modes properly.

Does it really have to be a class? How much genuinely polymorphic code is out there that requires this precise handling of precision?

Have you considered adding these as monomorphic functions fmaDouble, fmaFloat, etc. on Hackage, using the FFI? Then those who need these functions can start to use them.

Furthermore, you can start getting the necessary primops supported in GHC, and have your library transparently use them when available.

And only then, when we have the implementation in place and actual users, can we evaluate whether we need an abstract class for this.

Greetings,
Joachim

--
Joachim “nomeata” Breitner
mail@joachim-breitner.de • http://www.joachim-breitner.de/
Jabber: nomeata@joachim-breitner.de • GPG-Key: 0xF0FBF51F
Debian Developer: nomeata@debian.org
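The monomorphic version Joachim describes is only a few lines with the FFI; the module name here is illustrative:

    {-# LANGUAGE ForeignFunctionInterface #-}
    module Numeric.Fma (fmaDouble, fmaFloat) where

    -- C99 guarantees that fma/fmaf round exactly once.
    foreign import ccall unsafe "math.h fma"
      fmaDouble :: Double -> Double -> Double -> Double

    foreign import ccall unsafe "math.h fmaf"
      fmaFloat :: Float -> Float -> Float -> Float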

Quite a bit actually.

Consider something like:

http://hackage.haskell.org/package/ad-4.2.1.1/docs/src/Numeric-AD-Rank1-Newt...

The step function in there could be trivially adapted to use fused multiply-add, and precision would just improve. If such a member _were_ in Num, I'd use it in a heartbeat there. If it were in an extra class? I'd have to make a second copy of the function to even try to see the precision win.

Most of my numeric code is generic in some fashion, working over vector spaces or simpler number types just as easily.

As this proposal has been withdrawn, the point is more or less moot for now.

-Edward
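
For a flavor of the generic code in question, here is a sketch of Horner polynomial evaluation written against a hypothetical fma method in Num. The local definition is the two-rounding stand-in; the point is that only the instance, never this code, would change.

    -- Horner's rule: one multiply-add per coefficient.  The local fma is
    -- a stand-in with the default two-rounding semantics; a fused
    -- Double/Float instance would sharpen every step of this fold
    -- without this function changing.
    horner :: Num a => [a] -> a -> a        -- coefficients, lowest degree first
    horner coeffs x = foldr (\c acc -> fma acc x c) 0 coeffs
      where fma a b c = a * b + c           -- stand-in for the proposed method

    -- horner [c0, c1, c2] x == (c2*x + c1)*x + c0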

Levent Erkok wrote:
...I think this proposal needs to be shelved for the time being.
I wrote:
Nevertheless, I vote for doing it now.
Edward Kmett wrote:
As this proposal has been withdrawn, the point is more or less moot for now.
OK, let me make myself clearer.

I hereby propose the exact same proposal that Levent originally proposed in this thread and then withdrew, with the caveat that the scope of the proposal is explicitly orthogonal to any large-scale change to the way we do floating point.

Discussion period: 2 weeks, minus time spent so far in this thread since Levent's original proposal.

Thanks, Yitz

Yitz: Thanks for taking over. I do agree that "fma" can just be added to the Num class, with all the ramifications, and treated as an "optimization." But that's a different proposal than what I had in mind, so I'm perfectly happy to see you pursue this version.

Just one comment: The name "FMA" is quite overloaded, and perhaps it should be reserved for the true IEEE754 version. I think someone suggested 'mulAccum' as an alternative, which does make sense if one thinks about the dot-product operation. Please be absolutely clear in the documentation that this is not the IEEE754 fma, but rather a fused multiply-add operation for the Num class, following some idealized notion of numbers. In particular, the compiler should be free to substitute "a*b+c" with "mulAccum a b c".

The latter (i.e., the IEEE754 variant) should be addressed in a different proposal that I intend to work on separately.

-Levent.
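
A sketch of the split being described, with stand-in names rather than a concrete API: mulAccum gets a lawful default that an instance (or the compiler) may replace with a fused version, since idealized-number semantics never promised two roundings in the first place.

    -- A stand-in Num-like class, just to show the shape of the default.
    class MyNum a where
      myAdd, myMul :: a -> a -> a
      mulAccum     :: a -> a -> a -> a
      -- Default: identical semantics to the expanded form, two roundings.
      -- An instance (or the compiler) may substitute a fused version,
      -- because the idealized semantics do not distinguish the two.
      mulAccum a b c = (a `myMul` b) `myAdd` c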

On 05/04/2015 08:36 PM, Levent Erkok wrote:

> In particular, the compiler should be free to substitute "a*b+c" with "mulAccum a b c".

But isn't it unacceptable in some cases? For instance, in this case (taken from Wikipedia):

If x^2 − y^2 is evaluated as (x×x) − y×y using fused multiply-add, then the result may be negative even when x = y, due to the first multiplication discarding low significance bits. This could then lead to an error if, for instance, the square root of the result is then evaluated.
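
The pitfall is easy to reproduce once a fused primitive is in reach; a sketch follows, repeating the hypothetical FFI import from earlier so it stands alone. The fused form computes x·x exactly and subtracts the rounded y·y, so with x = y it returns the negated rounding error of y·y.

    {-# LANGUAGE ForeignFunctionInterface #-}

    foreign import ccall unsafe "math.h fma"
      fmaDouble :: Double -> Double -> Double -> Double

    -- Fused evaluation of x*x - y*y.
    squareDiff :: Double -> Double -> Double
    squareDiff x y = fmaDouble x x (negate (y * y))

    -- With x == y and x*x not exactly representable, squareDiff x x can
    -- come out as a small negative number, so sqrt (squareDiff x x) is
    -- NaN, whereas the unfused x*x - y*y is exactly 0.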

Artyom: That's precisely the point. The true IEEE754 variants where precision does matter should be part of a different class. What Edward and Yitz want is an "optimized" multiply-add where the semantics is the same, but one that goes faster.

On 05/04/2015 08:49 PM, Levent Erkok wrote:

> Artyom: That's precisely the point. The true IEEE754 variants where precision does matter should be part of a different class. What Edward and Yitz want is an "optimized" multiply-add where the semantics is the same but one that goes faster.

No, it looks to me that Edward wants to have a more precise operation in Num:

> I'd have to make a second copy of the function to even try to see the precision win.

Unless I'm wrong, you can't have the following things simultaneously:

1. the compiler is free to substitute a+b*c with mulAdd a b c
2. mulAdd a b c is implemented as fma for Doubles (and is more precise)
3. Num operations for Double (addition and multiplication) always conform to IEEE754

> The true IEEE754 variants where precision does matter should be part of a different class.

So, does it mean that you're fine with not having point #3, because people who need it would be able to use a separate class for IEEE754 floats?

I think `mulAdd a b c` should be implemented as `a*b+c` even for Double/Float. It should only be an "optimization" (as in modular arithmetic), not a semantics-changing operation; that is what justifies the substitution.

"fma" should be the "more precise" version, available for Float/Double. I don't think it makes sense to have "fma" for other types. That's why I'm advocating for "mulAdd" to be part of "Num" for optimization purposes, and "fma" reserved for true IEEE754 types and semantics.

I understand that Edward doesn't like this, as it requires a different class; but really, that's the price to pay if we claim Haskell has proper support for IEEE754 semantics. (Which I think it should.) The operation is just different. It also should account for the rounding modes properly.

I think we can pull this off just fine, and Haskell can really lead the pack here. The situation with floats is even worse in other languages. This is our chance to make a proper implementation, and we have the right tools to do so.
-Levent.
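
One way to read that split, as a sketch in which every name is hypothetical: Num keeps the semantics-preserving mulAdd, while the single-rounding operation lives in an IEEE-only class that is explicit about its rounding mode, anticipating the design questions discussed below.

    -- Hypothetical IEEE-only class, separate from Num.
    data RoundingMode
      = RoundNearestTiesToEven
      | RoundTowardPositive
      | RoundTowardNegative
      | RoundTowardZero

    class IEEEFloat a where
      -- True fused multiply-add: infinitely precise product and sum,
      -- rounded once, in the given mode.
      fmaRM :: RoundingMode -> a -> a -> a -> a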

pardon the wall of text everyone, but I really want some FMA tooling :)

I am going to spend some time later this week and next adding FMA primops to GHC and playing around with different ways to add it to Num (which seems pretty straightforward, though I think we'd all agree it shouldn't be exported by Prelude). And then, depending on how Yitzchak's reproposal (or some iteration thereof) goes, we can get something useful/usable into 7.12.

i have codes (i.e. *dot products*!!!!!) that want a faster direct FMA for *exact numbers* and a higher-precision FMA for *approximate numbers* (i.e. *floating point*), and where I can't sanely use FMA if it lives anywhere but Num, unless I rub Typeable everywhere and do runtime type checks for applicable floating point types, which kinda destroys parametricity in engineering nice things.

@levent: ghc doesn't do any optimization for floating point arithmetic (aside from 1-2 very simple things that are possibly questionable), and until ghc has support for precisely emulating high-precision floating point computation in a portable way, it probably won't do any interesting floating point optimization. Mandating that fma a b c === a*b+c for inexact number datatypes doesn't quite make sense to me. Relatedly, it's a GOOD thing ghc is conservative about optimizing floating point, because it makes doing correct stability analyses tractable! I look forward to the day that GHC gets a bit more sophisticated about optimizing floating point computation, but that day is still a ways off.

relatedly: FMA for float and double is not generally going to be faster than the individual primitive operations, merely more accurate when used carefully.

point being, *i'm +1 on adding some manner of FMA operations to Num* (the only sane place to put it where i can actually use it in a general-use library), and i don't really care if we name it fusedMultiplyAdd, multiplyAndAdd, accursedFusionOfSemiRingOperations, or fma. i'd favor "fusedMultiplyAdd" if we want a descriptive name that will be familiar to experts yet easy to google for the curious.

to repeat: i'm going to do some legwork so that the double and float prims are portably exposed by ghc-prim (i've spoken with several ghc devs about that, and they agree to its value, and that's a decision outside the scope of the libraries purview), and I do hope we can come to a consensus about putting it in Num, so that expert library authors can upgrade the guarantees they provide end users without imposing any breaking changes on end users.

A number of folks have brought up "but Num is broken" as a counterargument to adding FMA support to Num. I emphatically agree Num is broken :), BUT! I also believe that fixing up the Num prelude carries the burden of providing a whole-cloth design for an alternative that we can get broad consensus/adoption on. That will happen by dint of actual experimentation and usage.

Point being, adding FMA doesn't further entrench current Num any more than it already is; it just provides expert library authors with a transparent way of improving the experience of their users, with a free upgrade in answer accuracy if used carefully. Additionally, when Num's "semiring-ish equational laws" are framed with respect to approximate forwards/backwards stability, there is a perfectly reasonable law for FMA. I am happy to spend some time trying to write that up more precisely IFF that will tilt those in opposition to being in favor.

I don't need FMA to be exposed by *prelude/base*, merely by *GHC.Num* as a method therein for Num. If that constitutes a different and *more palatable proposal* than what people have articulated so far (by discouraging casual use by dint of hiding), then I am happy to kick off a new thread with that concrete design choice.

If there's a counterargument that's a bit more substantive than "Num is for exact arithmetic" or "Num is wrong" that will sway me to the other side, i'm all ears, but i'm skeptical of that.

I emphatically support those who are displeased with Num prototyping some alternative designs in userland; I do think it'd be great to figure out a new Num prelude we can migrate Haskell / GHC to over the next 2-5 years, but again, any such proposal really needs to be realized whole cloth before it makes its way to being a libraries list proposal.

again, pardon the wall of text, i just really want to have nice things :)

-Carter
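
Carter's motivating example, sketched against the same hypothetical Num method; the local fma is the two-rounding stand-in, and only the instances would change if the method lands.

    import Data.List (foldl')

    -- One multiply-add per element: the shape where a Num-level fma pays
    -- off for exact types (speed) and floating point (accuracy) alike.
    dot :: Num a => [a] -> [a] -> a
    dot xs ys = foldl' (\acc (x, y) -> fma x y acc) 0 (zip xs ys)
      where fma x y acc = x * y + acc  -- stand-in for the proposed method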

Carter: Wall of text is just fine!

I'm personally happy to see the results of your experiment. In particular, the better the "code-generation" facilities you add around floats/doubles that map to the underlying hardware's native instructions, the better. When we do have proper IEEE floats, we shall surely need all that functionality.

While you're working on this, if you can also watch out for how rounding modes can be integrated into the operations, that would be useful as well. I can see at least two designs:

* One where the rounding mode goes with the operation: `fpAdd RoundNearestTiesToEven 2.5 6.4`. This is the "cleanest" and the functional solution, but it could get quite verbose, and it might be costly if the implementation changes the rounding mode for every operation issued.

* The other is where the operations simply assume RoundNearestTiesToEven, but we have lifted IO versions that can be modified with a "with"-like construct: `withRoundingMode RoundTowardsPositive $ fpAddRM 2.5 6.4`. Note that `fpAddRM` (*not* `fpAdd` as before) will have to return some sort of a monadic value (probably in the IO monad), since it'll need to access the rounding mode currently active.

Neither choice jumps out at me as the best one, and a hybrid might also be possible. I'd love to hear any insight you gain regarding rounding modes during your experiment.
-Levent.
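
The second design can be prototyped today over the C99 <fenv.h> interface. A minimal sketch, assuming nothing beyond base: the combinator name follows the message above, the rest is illustrative, and it deliberately ignores the thread-migration problem raised just below.

    {-# LANGUAGE ForeignFunctionInterface #-}

    import Control.Exception (bracket)
    import Foreign.C.Types (CInt (..))

    -- C99 <fenv.h>: query and set the current rounding mode.
    foreign import ccall unsafe "fenv.h fegetround" fegetround :: IO CInt
    foreign import ccall unsafe "fenv.h fesetround" fesetround :: CInt -> IO CInt

    -- Run an IO action under a given (C-encoded) rounding mode, restoring
    -- the previous mode afterwards, even on exceptions.
    withRoundingMode :: CInt -> IO a -> IO a
    withRoundingMode mode act =
      bracket (fegetround <* fesetround mode)  -- save old mode, set new one
              (\old -> () <$ fesetround old)   -- restore on the way out
              (\_ -> act)

    -- NB: per the discussion below, pure Double arithmetic inside the
    -- action cannot be trusted to observe the mode; real operations
    -- would need IO-modeled primops.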

On 2015-05-05 00:54, Levent Erkok wrote:
The monadic alternative is more readily extensible to handle IEEE 754's sticky flags: inexact, overflow, underflow, divide-by-zero, and invalid.

On Tue, May 5, 2015 at 7:22 AM, Scott Turner <2haskell@pkturner.org> wrote:
This gets messier than you'd think. Keep in mind we switch contexts within our own green threads constantly on shared system threads / capabilities, so the current rounding mode, sticky flags, etc. would become something you'd have to hold per thread, and then change proactively as threads migrate between CPUs / capabilities, which we're basically completely unaware of right now.

This was what I learned when I tried my own hand at it and failed:

http://hackage.haskell.org/package/rounding

In the end I gave up, and moved setting the rounding mode into the custom primitives themselves. But even then you find other problems! The libm versions of almost every combinator don't just give slightly wrong answers when you switch rounding modes, they give _completely_ wrong answers when you switch rounding modes. cos basically starts looking like a random number generator. This is rather amusing given that libm is the library that specified how to change the damn rounding mode; fixing it was blocked by Ulrich Drepper when I last looked.

Workarounds such as using crlibm http://lipforge.ens-lyon.fr/www/crlibm/ exist, but crlibm isn't installed on most platforms, and incurring the dependency would rather dramatically complicate the installation of ghc.

This is why I've switched to using MPFR for anything with known rounding modes and just paying a pretty big performance tax for correctness. (That, and I'm working to release a library that does exact real arithmetic using trees of nested linear fractional transformations -- assuming I can figure out how to keep performance high enough.)

-Edward

Irk. If libm is busted when changing rounding modes, that puts a nasty twist on things.

I do agree that even if that hurdle is jumped, setting the local rounding mode will have to be part of every green-thread context switch. But if libm is hosed, that kinda makes adding that machinery a smidge pointless until there's a good story for that.

Hmm, minefield ahead.. But surely there must be a "correct" compromise? (Even with a huge performance penalty.)

I'll just add that rwbarton had this comment earlier:

"Be aware (if you aren't already) that GHC does not do any management of floating-point control registers, so functions called through FFI should take care to clean up their floating-point state, otherwise the rounding mode can change unpredictably at the level of Haskell code."

So, there are some FFI-related issues even if we just leave the work to C.

I'll also note that the current implementation of arithmetic on Double/Float already has rounding-mode issues: if someone does an FFI call to change the rounding mode via C (the fegetround/fesetround functions) inside some IO block, then the arithmetic in that block cannot be "lifted" out, even though it appears pure to GHC. Perhaps that should be filed as a bug too.
-Levent.

Hey Levent,

I actually looked into how to do rounding-mode setting a while ago, and the conclusion I came to is that those can simply be FFI calls at the top level that do a sort of with-mode bracketing. Or at least, I'm not sure if setting the mode in an inner loop is a good idea.

That said, you are making a valid point, and I will investigate to what extent compiler support is useful for the latter. If bracketed mode setting and unsetting has a small enough performance overhead, adding support in ghc primops would be worthwhile. Note that those primops would have to be modeled as doing something that's like IO or ST, so that when mode switches happen is predictable. Otherwise CSE and related optimizations could result in evaluating the same code in the wrong mode. I'll think through how that can be avoided, as I do have some ideas.

I suspect mode-switching code will wind up using newtype-wrapped floats and doubles that have a phantom index for the mode, and something like `runWithModeFoo :: Num a => Mode m -> (forall s. Moded s a) -> a` to make sure mode choices happen predictably. That said, there might be a better approach that we'll come to after some experimenting.
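
A minimal sketch of that phantom-index idea, with every name hypothetical and the actual mode plumbing elided; the point is only the runST-style type discipline.

    {-# LANGUAGE RankNTypes #-}

    -- A value tagged with a phantom region s, tying it to one mode scope.
    newtype Moded s a = Moded a

    -- Stand-in for a (possibly type-level) rounding-mode descriptor.
    data Mode m = Mode

    -- runST-style escape: the rank-2 type stops Moded values from
    -- leaking between scopes with different rounding modes.  A real
    -- version would set and restore the hardware mode around forcing a.
    runWithMode :: Mode m -> (forall s. Moded s a) -> a
    runWithMode _ m = case m of Moded a -> a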

To clarify: I think there's a bit of an open design question as to how the explicitly moded API would look. I'd suspect it'll look somewhat like Ed's AD lib, and it should live in a userland library, I think.

Hrm, now that I've thought about it a wee bit more, perhaps the rounding-mode info needs to be attached to ghc threads, otherwise there will be some fun bugs in multithreaded code that uses multiple rounding modes. I'll do some investigation.
On May 5, 2015 8:16 AM, "Carter Schonwald"
To clarify: I think theres a bit of an open design question how the explicitly moded api would look. I'd suspect it'll look somewhat like Ed's AD lib, and should be in a userland library I think. On May 5, 2015 7:40 AM, "Carter Schonwald"
wrote: Hey Levent, I actually looked into how to do rounding mode setting a while ago, and the conclusion I came to is that those can simply be ffi calls at the top level that do a sort of with mode bracketing. Or at least I'm not sure if setting the mode in an inner loop is a good idea.
That said, you are making a valid point, and I will investigate to what extent compiler support is useful for the latter. If bracketed mode setting and unsetting has a small enough performance overhead, adding support in ghc primops would be worth while. Note that those primops would have to be modeled as doing something thats like io or st, so that when mode switches happen can be predictable. Otherwise CSE and related optimizations could result in evaluating the same code in the wrong mode. I'll think through how that can be avoided, as I do have some ideas.
I suspect mode switching code will wind up using new type wrapped floats and doubles that have a phantom index for the mode, and something like "runWithModeFoo:: Num a => Mode m->(forall s . Moded s a ) -> a" to make sure mode choices happen predictably. That said, there might be a better approach that we'll come to after some experimenting On May 5, 2015 12:54 AM, "Levent Erkok"
wrote: Carter: Wall of text is just fine!
I'm personally happy to see the results of your experiment. In particular, the better "code-generation" facilities you add around floats/doubles that map to the underlying hardware's native instructions, the better. When we do have proper IEEE floats, we shall surely need all that functionality.
While you're working on this, if you can also watch out for how rounding modes can be integrated into the operations, that would be useful as well. I can see at least two designs:
* One where the rounding mode goes with the operation: `fpAdd RoundNearestTiesToEven 2.5 6.4`. This is the "cleanest" and the functional solution, but could get quite verbose; and might be costly if the implementation changes the rounding-mode at every issue.
* The other is where the operations simply assume the RoundNearestTiesToEven, but we have lifted IO versions that can be modified with a "with" like construct: `withRoundingMode RoundTowardsPositive $ fpAddRM 2.5 6.4`. Note that `fpAddRM` (*not* `fpAdd` as before) will have to return some sort of a monadic value (probably in the IO monad) since it'll need to access the rounding mode currently active.
Neither choice jumps out at me as the best one, and a hybrid might also be possible. I'd love to hear any insight you gain regarding rounding modes during your experiment.
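[For concreteness, a hedged sketch of both shapes; fpAdd, fpAddRM, withRoundingMode, and RoundingMode are hypothetical names taken from the two designs above, and the arithmetic stubs ignore the mode entirely:]

  import Data.IORef
  import System.IO.Unsafe (unsafePerformIO)

  data RoundingMode
    = RoundNearestTiesToEven
    | RoundTowardsPositive
    | RoundTowardsNegative
    | RoundTowardsZero

  -- Design 1: the mode travels with each operation (pure, but verbose).
  fpAdd :: RoundingMode -> Double -> Double -> Double
  fpAdd _mode x y = x + y  -- a real version would round per _mode

  -- Design 2: an ambient mode, consulted by IO-lifted operations.
  -- An IORef stands in for the hardware rounding-mode register.
  currentMode :: IORef RoundingMode
  currentMode = unsafePerformIO (newIORef RoundNearestTiesToEven)
  {-# NOINLINE currentMode #-}

  withRoundingMode :: RoundingMode -> IO a -> IO a
  withRoundingMode m act = do
    old <- readIORef currentMode
    writeIORef currentMode m
    result <- act
    writeIORef currentMode old
    return result

  fpAddRM :: Double -> Double -> IO Double
  fpAddRM x y = do
    _mode <- readIORef currentMode  -- a real version would honor this
    return (x + y)

  -- usage, per the two designs:
  --   fpAdd RoundNearestTiesToEven 2.5 6.4
  --   withRoundingMode RoundTowardsPositive (fpAddRM 2.5 6.4)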
-Levent.
On Mon, May 4, 2015 at 7:54 PM, Carter Schonwald <carter.schonwald@gmail.com> wrote:
pardon the wall of text everyone, but I really want some FMA tooling :)
I am going to spend some time later this week and next adding FMA primops to GHC and playing around with different ways to add it to Num (which seems pretty straightforward, though I think we'd all agree it shouldn't be exported by Prelude). And then, depending on how exactly Yitzchak's reproposal goes (or some iteration thereof), we can get something useful/usable into 7.12.
I have codes (i.e. *dot products*!!!!!) that want a faster direct FMA for *exact numbers* and a higher-precision FMA for *approximate numbers* (*i.e. floating point*), and where I can't sanely use FMA if it lives anywhere but Num, unless I rub Typeable everywhere and do runtime type checks for applicable floating-point types, which kinda destroys parametricity in engineering nice things.
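[To make that parametricity point concrete, a hedged sketch of the kind of code Carter means; no Num-level fma exists today, so the local stand-in below is the whole point of the proposal:]

  -- A generic dot product: if fma were a Num method, this one definition
  -- would get fused/fast behavior for exact types and one-rounding
  -- precision for Float/Double, with no Typeable tricks.
  dot :: Num a => [a] -> [a] -> a
  dot xs ys = foldr step 0 (zip xs ys)
    where
      step (x, y) acc = fma x y acc
      fma x y z = x * y + z  -- stand-in so the sketch runs today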
@levent: GHC doesn't do any optimization for floating-point arithmetic (aside from 1-2 very simple things that are possibly questionable), and until GHC has support for precisely emulating high-precision floating-point computation in a portable way, it probably won't have any interesting floating-point optimizations. Mandating that fma a b c === a*b+c for inexact number datatypes doesn't quite make sense to me. Relatedly, it's a GOOD thing GHC is conservative about optimizing floating point, because it makes doing correct stability analyses tractable! I look forward to the day that GHC gets a bit more sophisticated about optimizing floating-point computation, but that day is still a ways off.
Relatedly: FMA for Float and Double is not generally going to be faster than the individual primitive operations, merely more accurate when used carefully.
Point being, *I'm +1 on adding some manner of FMA operations to Num* (the only sane place to put it where I can actually use it for a general-use library), and I don't really care if we name it fusedMultiplyAdd, multiplyAndAdd, accursedFusionOfSemiRingOperations, or fma. I'd favor "fusedMultiplyAdd" if we want a descriptive name that will be familiar to experts yet easy to google for the curious.
To repeat: I'm going to do some legwork so that the double and float prims are portably exposed by ghc-prim (I've spoken with several GHC devs about that, and they agree to its value; that's a decision outside the scope of the libraries purview), and I do hope we can come to a consensus about putting it in Num, so that expert library authors can upgrade the guarantees that they can provide end users without imposing any breaking changes on end users.
A number of folks have brought up "but Num is broken" as a counter-argument to adding FMA support to Num. I emphatically agree num is borken :), BUT! I do also believe that fixing up the Num prelude carries the burden of providing a whole-cloth design for an alternative that we can get broad consensus/adoption with. That will happen by dint of actual experimentation and usage.
Point being, adding FMA doesn't further entrench current Num any more than it already is; it just provides expert library authors with a transparent way of improving the experience of their users, with a free upgrade in answer accuracy if used carefully. Additionally, when Num's "semiring-ish" equational laws are framed with respect to approximate forwards/backwards stability, there is a perfectly reasonable law for FMA. I am happy to spend some time trying to write that up more precisely IFF that will tilt those in opposition to being in favor.
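[One plausible way to state such a law, as a hedged gloss of the IEEE754 definition rather than Carter's promised write-up:

\[ \mathrm{fma}(x, y, z) \;=\; \circ\,(x \cdot y + z) \]

where \(\circ\) rounds the infinitely-precise product-sum exactly once, so the result is within half an ulp of the exact value under round-to-nearest, whereas the naive x*y + z rounds twice.]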
I don't need FMA to be exposed by *prelude/base*, merely by *GHC.Num* as a method therein for Num. If that constitutes a different and *more palatable proposal* than what people have articulated so far (by discouraging casual use by dint of hiding), then I am happy to kick off a new thread with that concrete design choice.
If there's a counter-argument that's a bit more substantive than "Num is for exact arithmetic" or "Num is wrong" that will sway me to the other side, I'm all ears; but I'm skeptical of that.
I emphatically support those who are displeased with Num prototyping some alternative designs in userland. I do think it'd be great to figure out a new Num prelude we can migrate Haskell/GHC to over the next 2-5 years, but again, any such proposal really needs to be realized whole cloth before it makes its way to being a libraries list proposal.
Again, pardon the wall of text, I just really want to have nice things :) -Carter
On Mon, May 4, 2015 at 2:22 PM, Levent Erkok wrote:
I think `mulAdd a b c` should be implemented as `a*b+c` even for Double/Float. It should only be an "optimization" (as in modular arithmetic), not a semantics-changing operation. Thus justifying the optimization.
"fma" should be the "more-precise" version available for Float/Double. I don't think it makes sense to have "fma" for other types. That's why I'm advocating "mulAdd" to be part of "Num" for optimization purposes; and "fma" reserved for true IEEE754 types and semantics.
I understand that Edward doesn't like this as this requires a different class; but really, that's the price to pay if we claim Haskell has proper support for IEEE754 semantics. (Which I think it should.) The operation is just different. It also should account for the rounding-modes properly.
I think we can pull this off just fine; and Haskell can really lead the pack here. The situation with floats is even worse in other languages. This is our chance to make a proper implementation, and we have the right tools to do so.
-Levent.
On Mon, May 4, 2015 at 10:58 AM, Artyom wrote:
On 05/04/2015 08:49 PM, Levent Erkok wrote:
Artyom: That's precisely the point. The true IEEE754 variants where precision does matter should be part of a different class. What Edward and Yitz want is an "optimized" multiply-add where the semantics is the same but one that goes faster.
No, it looks to me that Edward wants to have a more precise operation in Num:
I'd have to make a second copy of the function to even try to see the precision win.
Unless I'm wrong, you can't have the following things simultaneously:
1. the compiler is free to substitute *a+b*c* with *mulAdd a b c*
2. *mulAdd a b c* is implemented as *fma* for Doubles (and is more precise)
3. Num operations for Double (addition and multiplication) always conform to IEEE754
The true IEEE754 variants where precision does matter should be part of a different class.
So, does it mean that you're fine with not having point #3 because people who need it would be able to use a separate class for IEEE754 floats?

On Tue, May 5, 2015 at 8:16 AM, Carter Schonwald wrote:
To clarify: I think there's a bit of an open design question of how the explicitly moded API would look. I'd suspect it'll look somewhat like Ed's AD lib, and should be in a userland library, I think.
Another concern here is laziness. What happens when you force a thunk of type Double inside a "withRoundingMode" kind of construct?
-Jan

Hi,
Related information. Intel's FMA information (hardware dependent) is here:
Chapter 11, Intel 64 and IA-32 Architectures Optimization Reference Manual
http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32...
Of course, it is information that depends on the particular processor, and the abstraction level is too low.
PS: I like Haskell's abstract naming convention more than "fma" :-)
Regards,
Takenobu

Hi,
Is this useful?
BLAS (Basic Linear Algebra Subprograms)
http://www.netlib.org/blas/
http://en.wikipedia.org/wiki/Basic_Linear_Algebra_Subprograms
Regards,
Takenobu

Hey Takenobu,
Yes, both are super useful! I've certainly used the Intel architecture manual a few times, and I wrote/maintain (in my biased opinion) one of the nicer BLAS FFI bindings on Hackage.
It's worth mentioning that for Haskellers interested in either mathematical computation or performance engineering, the #numerical-haskell channel on freenode is pretty good. Though again, I'm a bit biased about the nice community there.

Hi Carter,
Uh, excuse me, you are a BLAS master [1] ;-)
And thank you for teaching me about #numerical-haskell. I'll learn it. I like effective performance and abstraction.
[1] http://hackage.haskell.org/package/linear-algebra-cblas
Thank you,
Takenobu

Hblas is what I recommend:
https://hackage.haskell.org/package/hblas
It doesn't have everything yet, but the design is a little better.

Hi Carter,
Thank you for teaching me again. I'll learn from it.
Well-established :-)
Thank you,
Takenobu

Joachim:
I do think that a class is needed. IEEE754 is actually quite agnostic
about floating-point types. What IEEE754 specifies for a float is the sizes of
the exponent and the mantissa; let's call them E and M for short. Then, one
can define a floating-point type for each combination of E and M, both of
which are at least 2. The resulting type fits into E+M+1 bits.
We have:
- "Float" is E=8, M=23 (and thus fits into a 32-bit machine word with the sign bit).
- "Double" is E=11, M=52 (and thus fits into a 64-bit machine word with the sign bit).
(In fact IEEE754 defines single/double precision to have at least those E/M
values, but allows for larger. But let's ignore that for a moment.)
You can see that the next thing in line is going to be something that fits
into 128 bits, also known as quad-precision. (Where E=15, M=112, plus 1 for
the sign-bit.)
If we get type-literals into Haskell proper, then these types can all be
nicely represented as "FP e m" for numbers e, m >= 2.
It just happens that Float/Double are what most hardware implementations
support "naturally," but all IEEE-754 operations are defined for all
precisions, and I think it would make sense to capture this nicely in
Haskell, much like we have Int8, Int16, Int32 etc, and have them instances
of this new class.
So, I'm quite against creating "fmaFloat"/"fmaDouble" etc.; rather, we should
collect all these in a true IEEE754 arithmetic class. Float and Double will
be the two instances for today, but one can easily see the extension to
other variants in the future. (C already supports long double to an extent;
its absence from Haskell is one sticking point.)
This class should also address rounding modes, as almost all
float operations only make sense in the context of a rounding mode. The
design space there is also large, but that's a different discussion.
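A minimal sketch of what that could look like with today's type literals (all names here, FP, RoundingMode, and IEEEArith, are hypothetical):

    {-# LANGUAGE DataKinds, KindSignatures #-}
    import GHC.TypeLits (Nat)

    -- A format parameterised by exponent and mantissa widths.
    data FP (e :: Nat) (m :: Nat)

    type Float'  = FP 8 23    -- 32 bits with the sign bit
    type Double' = FP 11 52   -- 64 bits with the sign bit
    type Quad    = FP 15 112  -- 128 bits with the sign bit

    -- The IEEE754-2008 rounding directions.
    data RoundingMode
      = NearestTiesToEven
      | NearestTiesToAway
      | TowardPositive
      | TowardNegative
      | TowardZero

    -- A class of true IEEE754 types; every operation takes a mode.
    class IEEEArith a where
      fmaRM :: RoundingMode -> a -> a -> a -> a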
-Levent.
On Mon, May 4, 2015 at 1:14 AM, Joachim Breitner wrote:
Hi,
On Sunday, 03.05.2015, at 14:11 -0700, Levent Erkok wrote:
Based on this analysis, I'm withdrawing the original proposal. I think fma and other floating-point arithmetic operations are very important to support properly, but that should not be done by tacking them onto Num or RealFloat; rather, it belongs in a new class that also considers rounding modes properly.
Does it really have to be a class? How much genuinely polymorphic code is out there that also requires this precise handling of precision?
Have you considered adding it as monomorphic functions fmaDouble, fmaFloat, etc. on Hackage, using the FFI? Then those who need these functions can start to use them.
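Such bindings are only a few lines of FFI; a minimal sketch (C99 guarantees fma and fmaf in math.h):

    {-# LANGUAGE ForeignFunctionInterface #-}

    -- Direct bindings to the C math library's fused multiply-add.
    -- fma/fmaf are pure, so the results can be imported without IO.
    foreign import ccall unsafe "math.h fma"
      fmaDouble :: Double -> Double -> Double -> Double

    foreign import ccall unsafe "math.h fmaf"
      fmaFloat :: Float -> Float -> Float -> Float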
Furthermore you can start getting the necessary primops supported in GHC, and have your library transparently use them when available.
Only then, when we have the implementation in place and actual users, can we evaluate whether we need an abstract class for this.
Greetings, Joachim
--
Joachim “nomeata” Breitner
mail@joachim-breitner.de • http://www.joachim-breitner.de/
Jabber: nomeata@joachim-breitner.de • GPG-Key: 0xF0FBF51F
Debian Developer: nomeata@debian.org

Levent Erkok wrote:
...I think this proposal needs to be shelved for the time being.
Nevertheless, I vote for doing it now.

A better, more featureful, and more principled approach to FP is definitely needed. It would be great if we could tackle that and finally solve it - and I think we can. But that's a huge issue which has been discussed extensively in the past, and it is orthogonal to Levent's proposal.

In the meantime, new functions that provide access to more FP functionality without adding any significant new weirdness are welcome, and will naturally flow into whatever future solution to the broader FP issue we implement.

It makes little difference whether or not we provide a bad but working default implementation; my vote is to provide it. It will prevent breakage in case someone happens to have implemented a manual RealFloat instance out there somewhere, and it won't affect the standard instances because we'll provide implementations for those. Obviously a clear explanatory Haddock comment would be required. Even better, trigger a warning if an instance does not provide an explicit implementation, but I'm not sure if that's possible. I'm still in favor of doing Levent's proposal now even if the consensus is to omit the default.

I vote for the usual practice of a human-readable name, but don't let bikeshedding hold this back.

Thanks, Yitz

I would suggest adding the relevant high-precision versions as direct functions on Float/Double, and then adding the "better" versions as part of Num, as was suggested. Anyone who *needs* the precision can then get it by using those functions directly and forcing a specific type (since I don't think polymorphic code and this sort of precision demand fit well together). This way it's *possible* to write code with the required precision for Float/Double, and anyone using Num gets an optional precision boost; see the sketch below. Cheers, Merijn
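A sketch of how the two layers would be used (mulAdd and fmaDouble are hypothetical stand-ins, defined here only so the example is self-contained):

    -- Stand-ins for the two proposed operations:
    mulAdd :: Num a => a -> a -> a -> a
    mulAdd x y z = x * y + z          -- semantics-preserving; may go faster

    fmaDouble :: Double -> Double -> Double -> Double
    fmaDouble x y z = x * y + z       -- placeholder; the real one rounds once

    -- Generic code keeps Num and gets at most an optional speed boost:
    step :: Num a => a -> a -> a -> a
    step x y acc = mulAdd x y acc

    -- Precision-critical code pins the type and calls the precise version:
    residual :: Double -> Double -> Double -> Double
    residual x y p = fmaDouble x y (negate p)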
On 4 May 2015, at 12:00, Yitzchak Gale wrote:
Levent Erkok wrote:
...I think this proposal needs to be shelved for the time being.
Nevertheless, I vote for doing it now.
A better, more featureful, and more principled approach to FP is definitely needed. It would be great if we could tackle that and finally solve it - and I think we can. But that's a huge issue which has been discussed extensively in the past, and orthogonal to Levent's proposal.
In the meantime, new functions that provide access to more FP functionality without adding any significant new weirdness are welcome, and will naturally flow into whatever future solution to the broader FP issue we implement.
It makes little difference whether or not we provide a bad but working default implementation; my vote is to provide it. It will prevent breakage in case someone happens to have implemented a manual RealFloat instance out there somewhere, and it won't affect the standard instances because we'll provide implementations for those. Obviously a clear explanatory Haddock comment would be required. Even better, trigger a warning if an instance does not provide an explicit implementation, but I'm not sure if that's possible. I'm still in favor of doing Levent's proposal now even if the consensus is to omit the default.
I vote for the usual practice of a human-readable name, but don't let bikeshedding hold this back.
Thanks, Yitz

Agreed. It will be a boon for dot-product-powered algorithms everywhere.
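For illustration, a dot product folded through fma (a sketch only; fma here is a placeholder with the proposed signature, not the single-rounding primitive):

    import Data.List (foldl')

    -- Placeholder with the proposed signature; a real fma rounds once.
    fma :: Double -> Double -> Double -> Double
    fma x y z = x * y + z

    -- Each accumulation step incurs one rounding instead of two.
    dotFMA :: [Double] -> [Double] -> Double
    dotFMA xs ys = foldl' (\acc (x, y) -> fma x y acc) 0 (zip xs ys)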
There's a valid argument for exploring systematically better abstractions for the future in parallel, but that shouldn't preclude making core tooling and primops a bit better in time for 7.12.
I'll start investigating adding the applicable primops to GHC on all supported platforms. Most of the widely used ones have direct instruction support, but some may have to call out to the C fma, e.g. unregisterised builds and perhaps x86_32, unless I'm mistaken on the latter.
On Monday, May 4, 2015, Merijn Verstraaten wrote:
I would suggest adding the relevant high-precision versions as direct functions on Float/Double, and then adding the "better" versions as part of Num, as was suggested. Anyone who *needs* the precision can then get it by using those functions directly and forcing a specific type (since I don't think polymorphic code and this sort of precision demand fit well together). This way it's *possible* to write code with the required precision for Float/Double, and anyone using Num gets an optional precision boost.
Cheers, Merijn
On 4 May 2015, at 12:00, Yitzchak Gale wrote:
Levent Erkok wrote:
...I think this proposal needs to be shelved for the time being.
Nevertheless, I vote for doing it now.
A better, more featureful, and more principled approach to FP is definitely needed. It would be great if we could tackle that and finally solve it - and I think we can. But that's a huge issue which has been discussed extensively in the past, and orthogonal to Levent's proposal.
In the meantime, new functions that provide access to more FP functionality without adding any significant new weirdness are welcome, and will naturally flow into whatever future solution to the broader FP issue we implement.
It makes little difference whether or not we provide a bad but working default implementation; my vote is to provide it. It will prevent breakage in case someone happens to have implemented a manual RealFloat instance out there somewhere, and it won't affect the standard instances because we'll provide implementations for those. Obviously a clear explanatory Haddock comment would be required. Even better, trigger a warning if an instance does not provide an explicit implementation, but I'm not sure if that's possible. I'm still in favor of doing Levent's proposal now even if the consensus is to omit the default.
I vote for the usual practice of a human-readable name, but don't let bikeshedding hold this back.
Thanks, Yitz
participants (22)
- adam vogt
- amindfv@gmail.com
- Artyom
- Brandon Allbery
- Carter Schonwald
- David Feuer
- Edward Kmett
- Henning Thielemann
- Ivan Lazar Miljenovic
- Jan-Willem Maessen
- Joachim Breitner
- Ken T Takusagawa
- Levent Erkok
- Merijn Verstraaten
- Mike Meyer
- Roman Cheplyaka
- Scott Turner
- Takenobu Tani
- Tikhon Jelvis
- Twan van Laarhoven
- wren romano
- Yitzchak Gale