
Richard A. O'Keefe wrote:
On 14 Feb 2008, at 2:28 pm, Roman Leshchinskiy wrote:
Richard A. O'Keefe wrote:
On 12 Feb 2008, at 5:14 pm, jerzy.karczmarczuk@info.unicaen.fr wrote:
Would you say that *no* typical floating-point software is reliable? With lots of hedging and clutching of protective amulets around the word "reliable", of course not. What I *am* saying is that (a) it's exceptionally HARD to make reliable, because although the operations are well defined and arguably reasonable, they do NOT obey the laws that school and university mathematics teach us to expect them to obey
Ints do not obey those laws, either.
They obey a heck of a lot more of them. Any combination of Ints using (+), (-), (*), and negate is going to be congruent to the mathematically correct answer modulo 2**n for some n. In particular, (+) is associative for Ints.
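To make the claim concrete, here is a small sketch (assuming a 64-bit Int, as on GHC on common platforms): fixed-width Int arithmetic is congruent to the true result modulo 2^64, so (+) stays associative even when an intermediate sum overflows.

```haskell
-- Sketch: Int (+) wraps modulo 2^n, so associativity survives overflow.
-- Assumes a 64-bit Int (true for GHC on typical 64-bit platforms).
main :: IO ()
main = do
  let a = maxBound :: Int
      b = 1        :: Int
      c = -1       :: Int
  print (a + 1 == minBound)           -- wraparound: True
  print (a + (b + c) == (a + b) + c)  -- associativity holds despite overflow: True
```

Note that (a + b) overflows here while (b + c) does not, yet both groupings agree.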
Yes, but neither school nor, for the most part, university mathematics teaches us to expect modulo arithmetic. Good programmers learn about it at some point in their career, though, and write their programs accordingly. If they intend to use floating point, they should learn about it, too. I do agree that most programmers don't know how to use floats properly and aren't even aware that they can be used improperly. But that's an educational problem, not a problem with floating point.
This would be my top priority request for Haskell': require that the default Int type check for overflow on all operations where overflow is possible, provide Int32, Int64 for people who actually *want* wraparound.
I don't understand this. Why use a type which can overflow in the first place? Why not use Integer?
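The contrast is easy to demonstrate (again assuming a 64-bit Int): the same sum wraps silently at Int's width but is exact at Integer's arbitrary precision.

```haskell
-- Sketch: identical arithmetic with fixed-width Int (wraps silently)
-- versus arbitrary-precision Integer (never overflows).
-- Assumes a 64-bit Int.
main :: IO ()
main = do
  let big  = 2 ^ 62 :: Int
      bigI = 2 ^ 62 :: Integer
  print (big + big)    -- wraps to a negative number: -9223372036854775808
  print (bigI + bigI)  -- exact: 9223372036854775808
```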
You just have to check for exceptional conditions.
Why should it be *MY* job to check for exceptional conditions?
It shouldn't unless you use a type whose contract specifies that it's your job to check for them. Which is the case for Int, Float and Double. It's not the case for Integer and Rational.
If you think that, you do not understand floating point. x+(y+z) == (x+y)+z fails even though there is nothing exceptional about any of the operands or any of the results.
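A minimal example of the non-associativity being described, with no NaNs, infinities, or other exceptional values anywhere in sight:

```haskell
-- Double addition is not associative: regrouping changes the result.
main :: IO ()
main = do
  let x =  1.0e16 :: Double
      y = -1.0e16 :: Double
      z =  1.0    :: Double
  print ((x + y) + z)  -- 1.0
  print (x + (y + z))  -- 0.0, because -1.0e16 + 1.0 rounds back to -1.0e16
```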
For all practical purposes, the semantics of (==) is not well defined for floating point numbers. That's one of the first things I used to teach my students about floats: *never* compare them for equality. So in my view, your example doesn't fail, it's undefined. That Haskell provides (==) for floats is unfortunate.
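The usual alternative to (==) on floats is comparison within a tolerance. A sketch, where the epsilon and the helper name approxEq are arbitrary illustrative choices, not a standard library function:

```haskell
-- Sketch of tolerance-based comparison; the epsilon here is an
-- arbitrary choice for illustration, not a one-size-fits-all answer.
approxEq :: Double -> Double -> Bool
approxEq a b = abs (a - b) <= 1e-9 * max 1 (max (abs a) (abs b))

main :: IO ()
main = do
  print (0.1 + 0.2 == (0.3 :: Double))  -- False: exact comparison fails
  print (approxEq (0.1 + 0.2) 0.3)      -- True
```

Choosing a sensible epsilon is itself problem-specific, which is part of the point being made here: floating-point equality is not something to use casually.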
I have known a *commercial* program blithely invert a singular matrix because of this kind of thing, on hardware where every kind of arithmetic exception was reported. There were no "exceptional conditions", the answer was just 100% wrong.
If they used (==) for floats, then they simply didn't know what they were doing. The fact that a program is commercial doesn't mean it's any good.
I guess it trapped on creating denormals. But again, presumably the reason the student used doubles here was because he wanted his program to be fast. Had he read just a little bit about floating point, he would have known that it is *not* fast under certain conditions.
Well, no. Close, but no cigar. (a) It wasn't denormals, it was underflow.
"Creating denormals" and underflow are equivalent. Denormals are created as a result of underflow. A denormalised number is smaller than any representable normal number. When the result of an operation is too small to be represented by a normal number, IEEE arithmetic will either trap or return a denormal, depending on whether underflow is masked or not.
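With underflow masked (the usual default), the behaviour is easy to observe from Haskell, using the standard isDenormalized method of RealFloat:

```haskell
-- Sketch: gradual underflow in IEEE Double arithmetic. With underflow
-- masked (the usual default), results shade through denormals to 0.0.
main :: IO ()
main = do
  let tiny = 1.0e-200 :: Double
  print (tiny * tiny)       -- 0.0: 1.0e-400 is below even the denormal range
  let d = 5.0e-324 :: Double  -- smallest positive denormal Double
  print (isDenormalized d)  -- True
  print (d / 2)             -- 0.0: underflows all the way
```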
(b) The fact that underflow was handled by trapping to the operating system, which then completed the operation by writing a 0.0 to the appropriate register, is *NOT* a universal property of floating point, and is *NOT* a universal property of IEEE floating point. It's a fact about that particular architecture, and I happened to have the manual and he didn't.
IIRC, underflow is a standard IEEE exception.
(c) x*x => 0 when x is small enough *is* fast on a lot of machines.
Only if underflow is masked (which it probably is by default). Although I vaguely recall that denormals were/are slower on some architectures.
As it was, he seems to have applied what he thought was an optimisation (using floating point) without knowing anything about it. A professional programmer would get (almost) no sympathy in such a situation.
You must be joking. Almost everybody working with neural nets uses floating point.
[...]
If you are aware of any neural net software for general purpose hardware done by programmers you consider competent that *doesn't* use floating point, I would be interested to hear about it.
I'm not. But programmers I consider competent for this particular task know how to use floating point. Your student didn't, but that's OK for a student. He had someone he could ask, so hopefully he'll know next time. To be clear, I do not mean to imply that programmers who do not know about floating point are incompetent. I'm only somewhat sceptical of programmers who do not know about it but still write software that relies on it. Roman