
Tutorials about monads mention the "monad axioms" or "monad laws". The tutorial "All About Monads" says that "It is up to the programmer to ensure that any Monad instance he creates satisfies the monad laws". The following is one of the laws:

(x >>= f) >>= g == x >>= (\v -> f v >>= g)

However, this seems to me a kind of mathematical identity. If it is a mathematical identity, a programmer need not care about this law to implement a monad. Can anyone give me an example implementation of a monad that violates this law?

Hi
The following is one of the laws.
(x >>= f) >>= g == x >>= (\v -> f v >>= g)
Or stated another way:

(x >>= f) >>= g == x >>= (f >>= g)

Now it should be easier to see that this is simply associativity. It's easy enough to violate, if you want to - but I don't have any nice simple examples to hand.

Thanks

Neil

Now it should be easier to see that this is simply associativity. It's easy enough to violate, if you want to - but I don't have any nice simple examples to hand.
I have recently been reading a tutorial or paper where a Monad that violated this law was presented. The authors shrugged it off as not important, that the notation gained by implementing the operation as a Monad was worth it, but what is not clear is what the consequences of violating such associativity are. Does violating this law introduce the potential that your program will not do what you think it should? /mike.

On Monday, 11 February 2008 14:57, Michael Reid wrote:
Now it should be easier to see that this is simply associativity. It's easy enough to violate, if you want to - but I don't have any nice simple examples to hand.
I have recently been reading a tutorial or paper where a Monad that violated this law was presented. The authors shrugged it off as not important, that the notation gained by implementing the operation as a Monad was worth it, but what is not clear is what the consequences of violating such associativity are.
Does violating this law introduce the potential that your program will not do what you think it should?
/mike.
Other libraries might (and probably will) expect Monad instances to satisfy the monad laws and will not work as intended, or even make sense, if the monad laws aren't satisfied.

Sometimes it looks as if people think that monads are special in that they have to satisfy certain laws. But this isn't the case. Practically every Haskell type class has some laws (informally) attached to it which instances should satisfy. For example, the following should hold for instances of the Ord class:

a < b    =  (compare a b == LT)
a == b   =  (compare a b == EQ)
a > b    =  (compare a b == GT)
a <= b   =  a < b || a == b
a >= b   =  a > b || a == b
a < b    =  b > a
a < b && b < c  =>  a < c

If an Ord instance doesn't obey these laws then it's likely to make Set and Map behave strangely.

Best wishes,
Wolfgang
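To make the Set point concrete, here is a minimal sketch of the kind of breakage, assuming GHC's Data.Set (the Weird type and the expected output are my own illustration; with a lawful Ord the set would keep all three elements):

import qualified Data.Set as Set

-- A deliberately law-breaking Ord instance: compare looks only at the
-- absolute value, so two values that (==) distinguishes can still
-- compare EQ, violating the law  a == b  =  (compare a b == EQ).
newtype Weird = Weird Int deriving (Eq, Show)

instance Ord Weird where
  compare (Weird a) (Weird b) = compare (abs a) (abs b)

main :: IO ()
main = do
  let s = Set.fromList [Weird 1, Weird (-1), Weird 2]
  -- Set trusts compare, so it conflates Weird 1 and Weird (-1):
  -- I'd expect the size to come out as 2 even though the three inputs
  -- are pairwise different according to (==).
  print (Set.size s)
  print (Set.member (Weird (-1)) s, Set.member (Weird 1) s)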

On Feb 11, 2008 11:24 AM, Wolfgang Jeltsch
a < b && b < c => a < c
If an Ord instance doesn't obey these laws then it's likely to make Set and Map behave strangely.
Some months ago I pointed out that Ratio Int (which is an Ord instance) doesn't satisfy this property. I provided a patch to fix the problem, but my bug report was closed as wontfix: http://hackage.haskell.org/trac/ghc/ticket/1517. -- I'm doing Science and I'm still alive.
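As an aside for readers wondering what the failure actually looks like: the Ord instance for Ratio compares fractions by cross-multiplying, and with Int those products can silently wrap around. A sketch of the kind of wrong answer this produces (the particular numbers are my own and the behaviour is word-size dependent; I'd expect the comparison below to come out backwards on a 64-bit GHC):

import Data.Ratio

-- Ord for Ratio is essentially:  (x :% y) <= (x' :% y')  =  x * y' <= x' * y
-- With Int, the products x * y' and x' * y can overflow and change sign.
main :: IO ()
main = do
  let d = maxBound - 2 :: Int        -- a large odd denominator
      a = 2 % d :: Ratio Int
      b = 3 % d :: Ratio Int
  -- Mathematically a < b, but 2*d and 3*d both overflow Int here,
  -- so the comparison can report the opposite.
  print (a < b)
  print (compare a b)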

On 12 Feb 2008, at 10:35 am, David Benbennick wrote:
Some months ago I pointed out that Ratio Int (which is an Ord instance) doesn't satisfy this property. I provided a patch to fix the problem, but my bug report was closed as wontfix: http://hackage.haskell.org/trac/ghc/ticket/1517.
I'm not happy about that response. Basically, if the inputs to a mathematical operation are representable, and the mathematically correct result is representable, I expect a system to deliver it or die trying. What the intermediate calculations get up to is the implementor's problem, not the user's.

On the other hand, if I knew in advance whether a particular + or * was going to overflow, I probably wouldn't need the computer to actually do it. But if I give the computer some numbers that are clearly representable and just ask it to *sort* them, it had better d--- well get that RIGHT.

I am extremely grateful for this report, because now I know "NEVER USE Ratio Int, it's too broken".

Sad aside: back in the 70s I had my own Ratio Int written in Burroughs Algol. I was not smart enough to use double precision numbers for anything, but because of one hardware feature, it didn't matter a lot. That hardware feature was that integer overflows were TRAPPED and REPORTED. I have since used precisely one C compiler on precisely one Unix system that took advantage of the fact that the C standard (both C89 and C99) was very carefully written to ALLOW TRAPPING of signed integer overflows. (Contrary to mythology, C only requires wrapping for unsigned integers.) I found that a surprisingly valuable debugging aid.

This all supports the general point, of course: data types whose operations are so implemented as to break sensible laws can and WILL land you in great piles of fresh steaming hot fertiliser.

David Benbennick wrote:
Some months ago I pointed out that Ratio Int (which is an Ord instance) doesn't satisfy this property. I provided a patch to fix the problem, but my bug report was closed as wontfix: http://hackage.haskell.org/trac/ghc/ticket/1517.
Richard A. O'Keefe wrote:
I'm not happy about that response... I am extremely grateful for this report, because now I know "NEVER USE Ratio Int, it's too broken".
Ian wrote, in the Trac ticket:
Thanks for the patch, but I see a couple of problems: ...If you still think that this change should be made then I think that it should go through the library submissions process: http://www.haskell.org/haskellwiki/Library_submissions
The "wontfix" resolution does not mean that the patch was rejected. It just means that there are reasons for and against it, so GHC HQ is not unilaterally implementing it without giving the community a chance to discuss the issues. I think that is an admirable attitude. If you feel that this issue is important, why not go ahead and start the process to get it adopted? Regards, Yitz

Ratio Integer may possibly have the same trouble, or maybe something related. I was messing around with various operators on Rationals and found that positive and negative infinity don't compare right. Here's a small program which shows this; if I'm doing something wrong, I'd most appreciate it being pointed out to me. If I fire up ghci, import Data.Ratio and GHC.Real, and then ask about the type of "infinity", it tells me Rational, which as far as I can tell is Ratio Integer...? So far I have only found these wrong results when I compare the two infinities. Uwe
module Main where

import Prelude
import Data.Ratio
import GHC.Real

pinf = infinity
ninf = -infinity
zero = 0

main = do
  putStrLn ("pinf = " ++ show pinf)
  putStrLn ("ninf = " ++ show ninf)
  putStrLn ("zero = " ++ show zero)
  putStrLn ("min pinf zero =\t" ++ show (min pinf zero))
  putStrLn ("min ninf zero =\t" ++ show (min ninf zero))
  putStrLn ("min ninf pinf =\t" ++ show (min ninf pinf))
  putStrLn ("min pinf ninf =\t" ++ show (min pinf ninf) ++ "\twrong")
  putStrLn ("max pinf zero =\t" ++ show (max pinf zero))
  putStrLn ("max ninf zero =\t" ++ show (max ninf zero))
  putStrLn ("max ninf pinf =\t" ++ show (max ninf pinf))
  putStrLn ("max pinf ninf =\t" ++ show (max pinf ninf) ++ "\twrong")
  putStrLn ("(<) pinf zero =\t" ++ show ((<) pinf zero))
  putStrLn ("(<) ninf zero =\t" ++ show ((<) ninf zero))
  putStrLn ("(<) ninf pinf =\t" ++ show ((<) ninf pinf) ++ "\twrong")
  putStrLn ("(<) pinf ninf =\t" ++ show ((<) pinf ninf))
  putStrLn ("(>) pinf zero =\t" ++ show ((>) pinf zero))
  putStrLn ("(>) ninf zero =\t" ++ show ((>) ninf zero))
  putStrLn ("(>) ninf pinf =\t" ++ show ((>) ninf pinf))
  putStrLn ("(>) pinf ninf =\t" ++ show ((>) pinf ninf) ++ "\twrong")
  putStrLn ("(<=) pinf zero =\t" ++ show ((<=) pinf zero))
  putStrLn ("(<=) ninf zero =\t" ++ show ((<=) ninf zero))
  putStrLn ("(<=) ninf pinf =\t" ++ show ((<=) ninf pinf))
  putStrLn ("(<=) pinf ninf =\t" ++ show ((<=) pinf ninf) ++ "\twrong")
  putStrLn ("(>=) pinf zero =\t" ++ show ((>=) pinf zero))
  putStrLn ("(>=) ninf zero =\t" ++ show ((>=) ninf zero))
  putStrLn ("(>=) ninf pinf =\t" ++ show ((>=) ninf pinf))
  putStrLn ("(>=) pinf ninf =\t" ++ show ((>=) pinf ninf) ++ "\twrong")

On Feb 11, 2008 10:18 PM, Uwe Hollerbach
If I fire up ghci, import Data.Ratio and GHC.Real, and then ask about the type of "infinity", it tells me Rational, which as far as I can tell is Ratio Integer...?
Yes, Rational is Ratio Integer. It might not be a good idea to import GHC.Real, since it doesn't seem to be documented at http://www.haskell.org/ghc/docs/latest/html/libraries/. If you just import Data.Ratio, and define
pinf :: Rational
pinf = 1 % 0

ninf :: Rational
ninf = (-1) % 0
Then things fail the way you expect (basically, Data.Ratio isn't written to support infinity). But it's really odd the way the infinity from GHC.Real works. Anyone have an explanation? -- I'm doing Science and I'm still alive.

On Feb 12, 2008, at 1:50 AM, David Benbennick wrote:
On Feb 11, 2008 10:18 PM, Uwe Hollerbach
wrote: If I fire up ghci, import Data.Ratio and GHC.Real, and then ask about the type of "infinity", it tells me Rational, which as far as I can tell is Ratio Integer...?
Yes, Rational is Ratio Integer. It might not be a good idea to import GHC.Real, since it doesn't seem to be documented at http://www.haskell.org/ghc/docs/latest/html/libraries/. If you just import Data.Ratio, and define
pinf :: Rational
pinf = 1 % 0

ninf :: Rational
ninf = (-1) % 0
Then things fail the way you expect (basically, Data.Ratio isn't written to support infinity). But it's really odd the way the infinity from GHC.Real works. Anyone have an explanation?
An educated guess here: the value in GHC.Real is designed to permit fromRational to yield the appropriate high-precision floating value for infinity (exploiting IEEE arithmetic in a simple, easily-understood way). If I'm right, it probably wasn't intended to be used as a Rational at all, nor to be exploited by user code.

-Jan-Willem Maessen

On Feb 12, 2008 6:12 AM, Jan-Willem Maessen
On Feb 12, 2008, at 1:50 AM, David Benbennick wrote:
On Feb 11, 2008 10:18 PM, Uwe Hollerbach
wrote: If I fire up ghci, import Data.Ratio and GHC.Real, and then ask about the type of "infinity", it tells me Rational, which as far as I can tell is Ratio Integer...?
Yes, Rational is Ratio Integer. It might not be a good idea to import GHC.Real, since it doesn't seem to be documented at http://www.haskell.org/ghc/docs/latest/html/libraries/. If you just import Data.Ratio, and define
pinf :: Rational
pinf = 1 % 0

ninf :: Rational
ninf = (-1) % 0
Then things fail the way you expect (basically, Data.Ratio isn't written to support infinity). But it's really odd the way the infinity from GHC.Real works. Anyone have an explanation?
An educated guess here: the value in GHC.Real is designed to permit fromRational to yield the appropriate high-precision floating value for infinity (exploiting IEEE arithmetic in a simple, easily-understood way). If I'm right, it probably wasn't intended to be used as a Rational at all, nor to be exploited by user code.
-Jan-Willem Maessen
Well... I dunno. Looking at the source to GHC.Real, I see

infinity, notANumber :: Rational
infinity = 1 :% 0
notANumber = 0 :% 0

This is actually the reason I imported GHC.Real, because just plain % normalizes the rational number it creates, and that barfs very quickly when the denominator is 0. But the values themselves look perfectly reasonable... no?

Uwe

2008/2/12 Uwe Hollerbach
Well... I dunno. Looking at the source to GHC.Real, I see
infinity, notANumber :: Rational
infinity = 1 :% 0
notANumber = 0 :% 0
This is actually the reason I imported GHC.Real, because just plain % normalizes the rational number it creates, and that barfs very quickly when the denominator is 0. But the values themselves look perfectly reasonable... no?
Ummm... I'm going to have to go with no. In particular, we can't have signed infinity represented like this and maintain reasonable numeric laws:

1/0 = 1/(-0) = (-1)/0

Rationals are defined not to have a zero denominator, so I'll bet that assumption is made in more than one place in Data.Ratio.

Luke
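For readers following along, the difference between the two constructors can be seen directly; a small sketch of my own (the exact error message for the zero denominator depends on the library version):

import Data.Ratio
import GHC.Real (infinity)

main :: IO ()
main = do
  -- The public constructor (%) reduces to lowest terms and rejects a
  -- zero denominator, so evaluating 1 % 0 throws an exception.
  print (2 % 4 :: Rational)          -- prints 1 % 2
  -- print (1 % 0 :: Rational)       -- would throw a zero-denominator error
  -- GHC.Real.infinity is built with the raw constructor (:%), skipping
  -- that check, so it breaks the invariant that Ord and (==) rely on.
  print infinity                     -- prints 1 % 0
  print (min infinity (-infinity))   -- may pick the "wrong" infinity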

Hi
(x >>= f) >>= g == x >>= (\v -> f v >>= g)
Or stated another way:
(x >>= f) >>= g == x >>= (f >>= g)
Which is totally wrong, woops. See this page for lots of details about the Monad Laws and quite a nice explanation of where you use them: http://www.haskell.org/haskellwiki/Monad_Laws Thanks Neil

On Mon, Feb 11, 2008 at 01:59:09PM +0000, Neil Mitchell wrote:
Hi
(x >>= f) >>= g == x >>= (\v -> f v >>= g)
Or stated another way:
(x >>= f) >>= g == x >>= (f >>= g)
Which is totally wrong, woops.
See this page for lots of details about the Monad Laws and quite a nice explanation of where you use them: http://www.haskell.org/haskellwiki/Monad_Laws
My favorite presentation of the monad laws is associativity of Kleisli composition:

(a1 >=> a2) x = a1 x >>= a2    -- predefined in Control.Monad as of GHC 6.8

-- The laws
return >=> a = a
a >=> return = a
a >=> (b >=> c) = (a >=> b) >=> c

Stefan

On Mon, 2008-02-11 at 13:34 -0800, Stefan O'Rear wrote:
On Mon, Feb 11, 2008 at 01:59:09PM +0000, Neil Mitchell wrote:
Hi
(x >>= f) >>= g == x >>= (\v -> f v >>= g)
Or stated another way:
(x >>= f) >>= g == x >>= (f >>= g)
Which is totally wrong, woops.
See this page for lots of details about the Monad Laws and quite a nice explanation of where you use them: http://www.haskell.org/haskellwiki/Monad_Laws
My favorite presentation of the monad laws is associativity of Kleisli composition:

(a1 >=> a2) x = a1 x >>= a2    -- predefined in Control.Monad as of GHC 6.8

-- The laws
return >=> a = a
a >=> return = a
a >=> (b >=> c) = (a >=> b) >=> c
Indeed. The monad laws are just that the Kleisli category is actually a category.
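A tiny, self-contained illustration of how naturally the laws read in this form (my own example, using Maybe and the (>=>) from Control.Monad):

import Control.Monad ((>=>))

-- Two Kleisli arrows for Maybe.
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv n d = Just (n `div` d)

safeSqrt :: Int -> Maybe Int
safeSqrt n
  | n < 0     = Nothing
  | otherwise = Just (floor (sqrt (fromIntegral n :: Double)))

-- Associativity says these two pipelines denote the same function,
-- so we can drop the parentheses and just read it as a pipeline.
pipeline1, pipeline2 :: Int -> Maybe Int
pipeline1 = (safeDiv 100 >=> safeSqrt) >=> safeDiv 100
pipeline2 = safeDiv 100 >=> (safeSqrt >=> safeDiv 100)

main :: IO ()
main = print (map pipeline1 [0, 4, 25], map pipeline2 [0, 4, 25])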

Deokjae Lee cites:
The tutorial "All About Monads" says that "It is up to the programmer to ensure that any Monad instance he creates satisfies the monad laws".
The following is one of the laws.
(x >>= f) >>= g == x >>= (\v -> f v >>= g)
However, this seems to me a kind of mathematical identity. If it is a mathematical identity, a programmer need not care about this law to implement a monad. Can anyone give me an example implementation of a monad that violates this law?
After three or five reactions to this posting, I think it is time to generalize.

Haskell is not math. Or rather, there is no way to be sure that the *implementation* of some mathematical domains and the operations thereupon is fool-proof. Sometimes you break "en passant" some sacred laws. For example, the transitivity of ordering. 5>2, right? And 8>5 as well. But imagine a little esoteric example: cyclic arithmetic modulo 10, where the shortest distance gives you the order, so 2>8. A mathematician will shrug, saying that calling +that+ an order relation is nonsense, and he/she will be absolutely right. But people do that...

There is a small obscure religious sect of people who want to implement several mathematical entities as functional operators, where multiplication is function composition. You do it too generically, too optimistically, and then some octonions come and break your teeth.

So, people *should care*.

Jerzy Karczmarczuk
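The modulo-10 "order" can be written down directly; a sketch of my own (not Jerzy's code) showing the transitivity failure:

-- "Order by shortest distance around a clock of size 10": 5 > 2 and
-- 8 > 5 hold, but so does 2 > 8, so transitivity fails.
clockCompare :: Int -> Int -> Ordering
clockCompare a b
  | a == b                 = EQ
  | (a - b) `mod` 10 <= 4  = GT   -- a is a short step "ahead" of b
  | otherwise              = LT

main :: IO ()
main = do
  print (clockCompare 5 2)   -- GT: 5 > 2
  print (clockCompare 8 5)   -- GT: 8 > 5
  print (clockCompare 2 8)   -- GT again: 2 > 8, breaking transitivity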

Deokjae Lee wrote:
Tutorials about monads mention the "monad axioms" or "monad laws". The tutorial "All About Monads" says that "It is up to the programmer to ensure that any Monad instance he creates satisfies the monad laws".
The following is one of the laws.
(x >>= f) >>= g == x >>= (\v -> f v >>= g)
However, this seems to me a kind of mathematical identity. If it is a mathematical identity, a programmer need not care about this law to implement a monad. Can anyone give me an example implementation of a monad that violates this law?
I will be mean by asking the following counter question:

x + (y + z) = (x + y) + z

is a mathematical identity. If it is a mathematical identity, a programmer need not care about this law to implement addition (+). Can anyone give me an example implementation of addition that violates this law?

The only difference here is that the associative law for addition is "obvious" to you, whereas the associative law for monads is not "obvious" to you (yet). As Neil mentioned, maybe http://www.haskell.org/haskellwiki/Monad_Laws can help to convince yourself that the associative law for monads should be obvious, too.

In short, the reason for its obviousness is the interpretation of >>= in terms of sequencing actions with side effects. The law is probably best demonstrated with its special case

x >> (y >> z) = (x >> y) >> z

In other words, it signifies that only the sequence of x, y, z matters, not the nesting.

Regards,
apfelmus
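The special case is easy to watch in action; a minimal example of my own in IO:

-- Both nestings run the same actions in the same order; only the
-- grouping differs, which is exactly what the law promises.
main :: IO ()
main = do
  let x = putStrLn "x"
      y = putStrLn "y"
      z = putStrLn "z"
  x >> (y >> z)
  (x >> y) >> z    -- prints the same three lines in the same order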

apfelmus wrote:
Deokjae Lee wrote:
Tutorials about monads mention the "monad axioms" or "monad laws". The tutorial "All About Monads" says that "It is up to the programmer to ensure that any Monad instance he creates satisfies the monad laws".
The following is one of the laws.
(x >>= f) >>= g == x >>= (\v -> f v >>= g)
However, this seems to me a kind of mathematical identity. If it is a mathematical identity, a programmer need not care about this law to implement a monad. Can anyone give me an example implementation of a monad that violates this law?
I will be mean by asking the following counter question:
x + (y + z) = (x + y) + z
is a mathematical identity. If it is a mathematical identity, a programmer need not care about this law to implement addition (+). Can anyone give me an example implementation of addition that violates this law?

Hugs> 1.0 + (2.5e-15 + 2.5e-15)
1.00000000000001 :: Double
Hugs> (1.0 + 2.5e-15) + 2.5e-15
1.0 :: Double
Hugs, on a Pentium 4 machine running Windows XP SP2 (all of which is largely irrelevant!)

This is precisely Jerzy's point - you can have as many mathematical laws as you like, but there is no guarantee that a programming language's implementation will satisfy them. The above example is due to rounding errors and arises because the Double type in Haskell (or C, C++, whatever) is a finite (rational) approximation to the real numbers, which are infinite (platonic) entities. Associativity of addition applies to platonic reals, but not to their widely used IEEE-standard approximate implementation on standard hardware.

For monads, the situation is slightly different. Haskell describes the signature of the monadic operators

return :: x -> m x
(>>=) :: m a -> (a -> m b) -> m b

but cannot restrict how you actually instantiate these. It admonishes you by stating that you should obey the relevant laws, but cannot enforce this. (Of course, technically, if you implement a dodgy monad it's not really a monad at all, but something different with operators of the same name and types - also, technically, values of type Double are not real numbers (or true rationals either!).)

Let m denote the "list monad" (hypothetically). Let's instantiate:

return :: x -> [x]
return x = [x,x]

(>>=) :: [x] -> (x -> [y]) -> [y]
xs >>= f = concat ((map f) xs)

Let g n = [show n]

Here (return 1 >>= g) [1,2,3] = ["1","1","1","1","1","1"] but g [1,2,3] = ["1","2","3"], thus violating the first monad law (return a >>= f = f a).

--------------------------------------------------------------------
Andrew Butterfield     Tel: +353-1-896-2517     Fax: +353-1-677-2204
Foundations and Methods Research Group Director.
Course Director, B.A. (Mod.) in CS and ICT degrees, Year 4.
Department of Computer Science, Room F.13, O'Reilly Institute,
Trinity College, University of Dublin, Ireland.
http://www.cs.tcd.ie/Andrew.Butterfield/
--------------------------------------------------------------------

On Feb 11, 2008 1:35 PM, Andrew Butterfield
Hugs> 1.0 + (2.5e-15 + 2.5e-15) 1.00000000000001 :: Double Hugs> (1.0 + 2.5e-15) + 2.5e-15 1.0 :: Double
Prelude> (1e30 + (-1e30)) + 1
1.0
Prelude> 1e30 + ((-1e30) + 1)
0.0

I love this example from David Goldberg (http://docs.sun.com/source/806-3568/ncg_goldberg.html).

-- Felipe.

Andrew Butterfield wrote:
let m denote the "list monad" (hypothetically). Let's instantiate:
return :: x -> [x]
return x = [x,x]

(>>=) :: [x] -> (x -> [y]) -> [y]
xs >>= f = concat ((map f) xs)

Let g n = [show n]

Here (return 1 >>= g) [1,2,3] = ["1","1","1","1","1","1"] but g [1,2,3] = ["1","2","3"], thus violating the first monad law (return a >>= f = f a)
I messed this up - I was trying for the simplest example I could get! Apologies. Start over:

Program ----------------------------------------------------

module BadMonad where

import Monad

newtype MyList t = MyList [t]

instance Show t => Show (MyList t) where
  show (MyList xs) = show xs

unmylist (MyList xs) = xs

myconcat xs = MyList (concat (map unmylist xs))

instance Monad MyList where
  return x = MyList [x,x]
  (MyList xs) >>= f = myconcat ((map f) xs)

i2s :: Int -> MyList Char
i2s x = MyList (show x)

m = i2s 9

Hugs transcript ----------------------------------------

BadMonad> m
"9" :: MyList Char
BadMonad> m >>= return
"99" :: MyList Char

We violate the second law (Right Identity, m = m >>= return).

--
--------------------------------------------------------------------
Andrew Butterfield     Tel: +353-1-896-2517     Fax: +353-1-677-2204
Foundations and Methods Research Group Director.
Course Director, B.A. (Mod.) in CS and ICT degrees, Year 4.
Department of Computer Science, Room F.13, O'Reilly Institute,
Trinity College, University of Dublin, Ireland.
http://www.cs.tcd.ie/Andrew.Butterfield/
--------------------------------------------------------------------
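A slip like this is easy to catch mechanically by stating the laws as QuickCheck properties. A sketch, assuming the BadMonad module above is saved as BadMonad.hs (with a modern GHC you would also need Functor and Applicative instances for MyList; the property names here are my own):

import Test.QuickCheck
import BadMonad

-- Right identity and associativity, specialised to MyList Int and
-- observed through the underlying list.
prop_rightId :: [Int] -> Bool
prop_rightId xs = unmylist (m >>= return) == unmylist m
  where m = MyList xs

prop_assoc :: [Int] -> Bool
prop_assoc xs =
    unmylist ((m >>= f) >>= g) == unmylist (m >>= (\v -> f v >>= g))
  where
    m = MyList xs
    f n = MyList [n, n + 1]
    g n = MyList [n * 2]

main :: IO ()
main = do
  quickCheck prop_rightId   -- fails: return duplicates every element
  quickCheck prop_assoc     -- passes: (>>=) here is just concatMap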

On Monday, 11 February 2008 16:35, Andrew Butterfield wrote:
This is precisely Jerzy's point - you can have many mathematical laws as you like but there is no guarantee that a programming languages implementation will satisfy them.
But people writing instances of type classes should take care of satisfying the laws since other libraries will most likely expect this. Best wishes, Wolfgang

On 12 Feb 2008, at 4:35 am, Andrew Butterfield wrote: [floating point addition is not associative]

And this is an excellent example of why violating expected laws is BAD. The failure of floating point addition to be associative means that there are umpteen ways of computing polynomials, for example, and doing it different ways will give you different answers. This is *not* a good way to write reliable software. I did enough Numerical Analysis papers in my pre-PhD years to get quite scared sometimes.

Oh, here's a good one:

dot1 [] [] = 0
dot1 (x:xs) (y:ys) = x*y + dot1 xs ys

Obvious naive code for dot product. Switch over to tail recursion:

dot2 xs ys = aux xs ys 0
  where aux [] [] s = s
        aux (x:xs) (y:ys) s = aux xs ys (s + x*y)

The problem is that (a) in floating point arithmetic these two functions give DIFFERENT answers, and (b) NEITHER of them is wrong (arguably, neither of them is right either). For integers, of course, they must agree (if I haven't made any silly mistakes). This kind of thing makes it incredibly hard to think about numerical calculations.

Basically, laws are stated so that implementors will make stuff that clients don't have to think about until their brains melt.
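To see the two dot products disagree, here is a small runnable version (the input is my own pick; the exact outputs depend on IEEE double rounding, but I'd expect the huge and tiny terms to cancel differently in the two groupings):

-- The two dot products from the post above, specialised to Double.
dot1 :: [Double] -> [Double] -> Double
dot1 []     []     = 0
dot1 (x:xs) (y:ys) = x * y + dot1 xs ys

dot2 :: [Double] -> [Double] -> Double
dot2 xs0 ys0 = aux xs0 ys0 0
  where
    aux []     []     s = s
    aux (x:xs) (y:ys) s = aux xs ys (s + x * y)

main :: IO ()
main = do
  let xs = [1, 1e16, -1e16]
      ys = [1, 1, 1]
  print (dot1 xs ys)   -- groups as 1 + (1e16 + (-1e16)): I'd expect 1.0
  print (dot2 xs ys)   -- groups as (1 + 1e16) + (-1e16): I'd expect 0.0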

Richard A. O'Keefe comments:
[floating point addition is not associative]
And this is an excellent example of why violating expected laws is BAD. The failure of floating point addition to be associative means that there are umpteen ways of computing polynomials, for example, and doing it different ways will give you different answers. This is *not* a good way to write reliable software.
[Then we see the scalar product whose value *may* depend on the evaluation order] I wonder... Would you say that *no* typical floating-point software is reliable? Jerzy Karczmarczuk

Jerzy Karczmarczuk wrote:
Would you say that *no* typical floating-point software is reliable?
It depends on how you define "reliable". Floating point intentionally trades accuracy for speed, leaving it to the user to worry about round-off errors. It is usually not too hard to get the probability of failure somewhat low in practice, if you don't require a proof. It used to be true - and may still be - that the engineers who implement floating point in the hardware of our CPUs would never fly on commercial airliners. Would you? Would you entrust your country's nuclear arsenal to an automated system that depends on floating point arithmetic? Regards, Yitz

Yitzchak Gale writes:
Jerzy Karczmarczuk wrote:
Would you say that *no* typical floating-point software is reliable?
It depends on how you define "reliable".
Floating point intentionally trades accuracy for speed, ...
It used to be true - and may still be - that the engineers who implement floating point in the hardware of our CPUs would never fly on commercial airliners. Would you?
Would you entrust your country's nuclear arsenal to an automated system that depends on floating point arithmetic?
1. This is not a question of a possible "trade-off" or not. In scientific/engineering computation there is really no choice, since you have to compute logarithms, trigonometric functions, etc., and some inaccuracy is unavoidable. Of course, one may use intervals and other extremely costly stuff, but if the stability of the algorithms is well controlled, and in the normal case it is (especially if the basic arithmetic has some extra control bits to do the rounding), the issue is far from being mortal.

2. The story about engineers not flying commercial planes is largely anecdotal, and you know that. Repeating it here doesn't change much.

3. A nuclear arsenal is never really "entrusted to an automated system", for reasons far beyond floating-point inaccuracies. On the other hand, in all that software one has to deal with probabilities and with imprecise experimental data, so even if, for God knows which purpose, everything used exact algebraic numbers or controlled transcendental extensions, the input imprecision would kill all the sense of infinitely precise computations thereupon.

4. The non-reliability of engineering software has many more important causes, sometimes incredibly stupid ones, such as the confusion between metric and English units in the Mars Climate Orbiter crash... The Ariane 5 crash was the result not of floating-point computation but of the conversion of a 64-bit double to a signed 16-bit number.

5. Of course, in the original posting's case, the underlying math/logic is discrete and has no similar inaccuracies, so the two worlds should not be confounded... Here, if some laws get broken, it is the result of bad conventions, which usually can be easily avoided.

Jerzy Karczmarczuk

I wrote:
Floating point intentionally trades accuracy for speed,
Jerzy Karczmarczuk wrote:
1. This is not a question of a possible "trade-off" or not. In scientific/engineering computation there is really no choice, since you have to compute logarithms, trigonometric functions, etc., and some inaccuracy is unavoidable. Of course, one may use intervals and other extremely costly stuff, but if the stability of the algorithms is well controlled, and in the normal case it is (especially if the basic arithmetic has some extra control bits to do the rounding), the issue is far from being mortal.
Agreed. That is what I meant by "trade-off". And I am not trying to say at all that it is wrong to use it. Life is full of trade-offs.
It used to be true - and may still be - that the engineers who implement floating point in the hardware of our CPUs would never fly on commercial airliners. Would you?
2. The story about engineers not flying commercial planes is largely anecdotal, and you know that. Repeating it here doesn't change much.
I heard it from someone who personally worked with one such team of engineers about ten years ago.
Would you entrust your country's nuclear arsenal to an automated system that depends on floating point arithmetic?
3. Nuclear arsenal is never really "entrusted to an automated system", because of reasons much beyond the fl.point inaccuracies.
Yes, of course, no one is really that stupid. Are they?
On the other hand, in all that software one has to deal with probabilities and with imprecise experimental data, so even if, for God knows which purpose, everything used exact algebraic numbers or controlled transcendental extensions, the input imprecision would kill all the sense of infinitely precise computations thereupon.

4. The non-reliability of engineering software has many more important causes, sometimes incredibly stupid ones, such as the confusion between metric and English units in the Mars Climate Orbiter crash... The Ariane 5 crash was the result not of floating-point computation but of the conversion of a 64-bit double to a signed 16-bit number.
Yes, high reliability is very hard. There are many factors that make it hard; floating point is undeniably one of them. Again - that doesn't mean that floating point should not be used. It is a powerful and important tool, as you say. I was once part of a software project in which people's lives might depend on there not being any bugs in my code. It was an experience that changed my life forever, and that is probably one of the factors that contributes to my interest in Haskell. Regards, Yitz

On 12 Feb 2008, at 5:14 pm, jerzy.karczmarczuk@info.unicaen.fr wrote:
Would you say that *no* typical floating-point software is reliable?
With lots of hedging and clutching of protective amulets around the word "reliable", of course not. What I *am* saying is that

(a) it's exceptionally HARD to make reliable, because although the operations are well defined and arguably reasonable, they do NOT obey the laws that school and university mathematics teach us to expect them to obey

(b) "reliable" in each case needs to be defined with some care; it will almost never mean "gives answers accurate to machine precision for any legal input" and probably won't mean "gives sensible answers for any legal input" either. With luck, it will mean "gives answers accurate to a specified tolerance for an input that differs from the input you actually provided by no more than a specified tolerance, for inputs in a stated range that are neither too big nor too small". (Note that the problem that gets almost solved may be only almost your problem.)

(c) practical advice is to use reputable packaged software whenever you possibly can rather than writing your own, and always check that the answers make physical sense before putting any trust in them; if (or should I say when) things go weird, seek the help of an expert.

(d) if you trust the output of a certain popular spreadsheet program, I have a bridge you might be interested in buying...

This is leaving aside all sorts of machine strangeness, like the student whose neural net program started running hundreds of times slower than he expected. I advised him to replace

s = 0;
for (i = 0; i < n; i++) s += x[i]*x[i];

by

s = 0;
for (i = 0; i < n; i++) if (fabs(x[i]) > 1e-19) s += x[i]*x[i];

and the problem went away. Dear reader: do you know why I expected this problem, what it was, and why this is NOT a general solution?

Richard A. O'Keefe wrote:
On 12 Feb 2008, at 5:14 pm, jerzy.karczmarczuk@info.unicaen.fr wrote:
Would you say that *no* typical floating-point software is reliable?
With lots of hedging and clutching of protective amulets around the word "reliable", of course not. What I *am* saying is that (a) it's exceptionally HARD to make reliable because although the operations are well defined and arguably reasonable they do NOT obey the laws that school and university mathematics teach us to expect them to obey
Ints do not obey those laws, either. It is not exceptionally hard to write reliable software using ints. You just have to check for exceptional conditions. That's also the case for floating point. That said, I suspect that 90% of programs that use float and double would be much better off using something else. The only reason to use floating point is performance.
This is leaving aside all sorts of machine strangeness, like the student whose neural net program started running hundreds of times slower than he expected. I advised him to replace

s = 0;
for (i = 0; i < n; i++) s += x[i]*x[i];

by

s = 0;
for (i = 0; i < n; i++) if (fabs(x[i]) > 1e-19) s += x[i]*x[i];

and the problem went away. Dear reader: do you know why I expected this problem, what it was, and why this is NOT a general solution?
I guess it trapped on creating denormals. But again, presumably the reason the student used doubles here was because he wanted his program to be fast. Had he read just a little bit about floating point, he would have known that it is *not* fast under certain conditions. As it were, he seems to have applied what he thought was an optimisation (using floating point) without knowing anything about it. A professional programmer would get (almost) no sympathy in such a situation.

Roman

Trialog: Roman Leshchinskiy writes:
Richard A. O'Keefe wrote:
jerzy.karczmarczuk@info.unicaen.fr wrote:
Would you say that *no* typical floating-point software is reliable?
With lots of hedging and clutching of protective amulets around the word "reliable", of course not. What I *am* saying is that (a) it's exceptionally HARD to make reliable because although the operations are well defined and arguably reasonable they do NOT obey the laws that school and university mathematics teach us to expect them to obey
Ints do not obey those laws, either. It is not exceptionally hard to write reliable software using ints. You just have to check for exceptional conditions. That's also the case for floating point.
That said, I suspect that 90% of programs that use float and double would be much better off using something else. The only reason to use floating point is performance.
I have a bit different perspective...

First, when I see the advice "use something else", I always ask "what", and I get an answer very, very rarely... Well? What do you propose?

Then, the problem is not always pathological, in the sense of "exceptional conditions". There are touchy points related to the stability of the algorithms for the solution of differential equations. There are doubtful random number generators in the Monte-Carlo business. There are ill-conditioned matrices and screwed-up iterative definitions. Algorithms work, work, and ultimately explode or produce rubbish. The "laws" which get broken are "almost" respected for a long time, and then we have the Bald Man (Sorites) paradox...

RAO'K very wisely says that people should avoid reinventing wheels, and they should use established packages, written by people who know. The problem *here* is that we would like to have something fabulous in Haskell - for example... And there aren't too many experts who would convert to the Functional Religion just for fun. What is *much worse*, some potential users who could encourage building such packages in the numerical domain typically don't believe that FP gives anything interesting. At least, this is the opinion of physicists I spoke to recently.

Never mind. We shall dance over their cadavers, unless they dance over ours. In both cases we shall be happy.

Jerzy Karczmarczuk

jerzy.karczmarczuk@info.unicaen.fr wrote:
Trialog: Roman Leshchinskiy writes:
Richard A. O'Keefe wrote:
jerzy.karczmarczuk@info.unicaen.fr wrote:
Would you say that *no* typical floating-point software is reliable?
With lots of hedging and clutching of protective amulets around the word "reliable", of course not. What I *am* saying is that (a) it's exceptionally HARD to make reliable because although the operations are well defined and arguably reasonable they do NOT obey the laws that school and university mathematics teach us to expect them to obey
Ints do not obey those laws, either. It is not exceptionally hard to write reliable software using ints. You just have to check for exceptional conditions. That's also the case for floating point. That said, I suspect that 90% of programs that use float and double would be much better off using something else. The only reason to use floating point is performance.
I have a bit different perspective... First, when I see the advice "use something else", I always ask "what", and I get an answer very, very rarely... Well? What do you propose?
For Haskell, Rational seems like a good choice. The fact that the standard requires defaulting to Double is quite unfortunate and inconsistent, IMO; the default should be Rational. Float and Double shouldn't even be in scope without an explicit import. There really is no good reason to use them unless you are writing a binding to existing libraries or really need the performance.
Then, the problem is not always pathological, in the sense of "exceptional conditions". There are touchy points related to the stability of the algorithms for the solution of differential equations. There are doubtful random number generators in Monte-Carlo business. There are ill-conditioned matrices and screwed-up iterative definitions. Algorithms work, work, and ultimately explode or produce rubbish. The "laws" which get broken are "almost" respected for a long time, and then we have the Bald Man (Sorites) paradox... RAO'K very wisely says that people should avoid reinventing wheels, and they should use established packages, written by people who know.
Yes, I completely agree with that (even though my original email probably didn't sound as if I did). My point was that (a) most people don't need floating point and (b) those who do need it should learn how to use it.
The problem *here* is that we would like to have something fabulous in Haskell - for example...
I think we mostly have it already. Roman

On 14 Feb 2008, rl@cse.unsw.edu.au wrote:
jerzy.karczmarczuk@info.unicaen.fr wrote:
Trialog: Roman Leshchinskiy writes:
Richard A. O'Keefe wrote:
jerzy.karczmarczuk@info.unicaen.fr wrote:
Would you say that *no* typical floating-point software is reliable?
With lots of hedging and clutching of protective amulets around the word "reliable", of course not. What I *am* saying is that (a) it's exceptionally HARD to make reliable because although the operations are well defined and arguably reasonable they do NOT obey the laws that school and university mathematics teach us to expect them to obey
Ints do not obey those laws, either. It is not exceptionally hard to write reliable software using ints. You just have to check for exceptional conditions. That's also the case for floating point. That said, I suspect that 90% of programs that use float and double would be much better off using something else. The only reason to use floating point is performance.
I have a bit different perspective... First, when I see the advice "use something else", I always ask "what", and I get an answer very, very rarely... Well? What do you propose?
For Haskell, Rational seems like a good choice. The fact that the standard requires defaulting to Double is quite unfortunate and inconsistent, IMO; the default should be Rational. Float and Double shouldn't even be in scope without an explicit import. There really is no good reason to use them unless you are writing a binding to existing libraries or really need the performance.
Until you need to evaluate a transcendental function.

Floating point numbers are remarkably well-behaved in the following sense.

Fundamental Axiom of Floating Point Arithmetic (Trefethen & Bau, 1997):

For all x, y in F, there exists e with |e| <= e_machine such that

x <*> y = (x * y)(1 + e)

where F is the set of real numbers in a particular floating point representation, <*> represents any hardware arithmetic operation and * is the corresponding exact operation.

This is satisfied by nearly all floating point implementations, with somewhat stronger conditions for IEEE. It is easily extended to complex arithmetic, perhaps after scaling e_machine by a small factor. This single axiom is sufficient for all stability (with regard to rounding error) results of numerical algorithms.

Double precision has 15 significant digits. It is a very rare physical quantity that can be measured to 10 significant digits, and this is unlikely to change in the next 100 years. It is a rare algorithm for which floating point arithmetic is a problem. Occasionally we must make decisions, such as choosing which way to project during Householder QR, so that rounding error is not a problem. Unfortunately, Gaussian Elimination is an important (only because it happens to be fast) algorithm which suffers from rounding error. Since there are well-conditioned matrices for which Gaussian Elimination fails spectacularly despite pivoting, many people believe that rounding error is a major part of numerical analysis. This is absolutely not the case; it is less than 10% of the field.

Of course, if you are not trying to represent arbitrary real numbers, using floating point may be a mistake. Converting between integer or rational representations and floating point requires careful attention. As long as you stay in floating point, it is almost never a problem. Remember that it is not possible to implement exact real arithmetic so that equality is decidable. We should take this as a sign that equality for floating point arithmetic is dangerous to rely upon.

Floating point is extremely useful and I think it would be a mistake to remove it from the Prelude. One thing I would like to see is the Enum instances removed.

Jed
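A quick empirical look at the e_machine in Jed's axiom; the probe below is my own sketch, and it measures the spacing of doubles just above 1 (the axiom's bound is on the order of this value):

-- Halve a candidate epsilon until adding half of it to 1 is absorbed.
machineEps :: Double
machineEps = go 1
  where
    go e | 1 + e / 2 == 1 = e
         | otherwise      = go (e / 2)

main :: IO ()
main = do
  print machineEps                  -- about 2.2e-16 for IEEE doubles
  print (1 + machineEps == 1)       -- False: still distinguishable from 1
  print (1 + machineEps / 2 == 1)   -- True: absorbed by rounding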

Jed Brown comments on the answer of Roman Leshchinskiy, who answered my question concerning the replacement of floating-point numbers:
First, when I see the advice "use something else", I always ask "what", and I get an answer very, very rarely... Well? What do you propose?
For Haskell, Rational seems like a good choice. The fact that the standard requires defaulting to Double is quite unfortunate and inconsistent, IMO; the default should be Rational. Float and Double shouldn't even be in scope without an explicit import. There really is no good reason to use them unless you are writing a binding to existing libraries or really need the performance.
Here Jed Brown says:
Until you need to evaluate a transcendental function. ...

It would be killing, wouldn't it?...
I would say more. It is known that in randomly taken, usual formulae in physics, engineering, etc., if you start with rationals, *typically* the GCD between numerator and denominator will be small, and the reduction of fractions is not very effective. Your rationals explode very fast! If after some reasonable number of operations you get rationals whose num/den have millions of digits, the program becomes *completely useless*; this is not "just" a question of performance.

Richard O'Keefe said that people speculate about floating-point numbers without knowing them. Well, nobody is perfect... I am a bit disturbed by people who speculate without ever *needing* fl.p's, and who thus haven't the sensibility. For them this is a kind of game. Well, it isn't.

R.L. says:
For all practical purposes, the semantics of (==) is not well defined for floating point numbers. That's one of the first things I used to teach my students about floats: *never* compare them for equality. So in my view, your example doesn't fail, it's undefined. That Haskell provides (==) for floats is unfortunate.
I disagree, on a practical basis. Floating-point numbers are very well defined; we know how the mantissa is represented. If the numbers are normalized, as they should be, plenty of low-level iterative algorithms may use the equality - after some operation - to check that machine-precision convergence has been obtained. On the contrary, the verification that the absolute value of the difference between two terms is less than some threshold may be arbitrary or dubious.

Jerzy Karczmarczuk
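The sort of stopping test Jerzy has in mind looks roughly like this; a sketch of my own (the extra comparison against the iterate before last guards against the iteration ending up alternating between two neighbouring doubles, which is the main risk of the technique):

-- Heron's iteration for square roots, stopped by exact (==) on
-- successive iterates rather than by an arbitrary threshold.
heronSqrt :: Double -> Double
heronSqrt a = go a (a + 1)
  where
    go x prev
      | next == x || next == prev = next
      | otherwise                 = go next x
      where next = 0.5 * (x + a / x)

main :: IO ()
main = mapM_ (print . heronSqrt) [2, 10, 1e6]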

jerzy.karczmarczuk@info.unicaen.fr wrote:
Jed Brown comments the answer of - -- Roman Leshchinskiy who answers my question concerning the replacement of floating-point numbers:
First, when I see the advice "use something else", I always ask "what", and I get an answer very, very rarely... Well? What do you propose?
For Haskell, Rational seems like a good choice. The fact that the standard requires defaulting to Double is quite unfortunate and inconsistent, IMO; the default should be Rational. Float and Double shouldn't even be in scope without an explicit import. There really is no good reason to use them unless you are writing a binding to existing libraries or really need the performance.
Here Jed Brown says:
Until you need to evaluate a transcendental function. ... It would be killing, wouldn't it?...
Yes, it would. I was talking about average programs, though, which (I suspect) don't do numerics and really only need fractions. If you do numerics, by all means use a data type that supports numerics well. But even here, and especially in a functional language, IEEE floating point usually isn't the best choice unless you really need the performance. You seem to be after a type that can be used to represent non-integer numbers in next to all problem domains. I don't think such a type exists. So, as usual, one has to choose a data structure suited to the problem at hand. IMO, standard floating point is not a good choice for most problem domains so Float and Double shouldn't be used by default. Whether Rational is a good default is certainly debatable.
For all practical purposes, the semantics of (==) is not well defined for floating point numbers. That's one of the first things I used to teach my students about floats: *never* compare them for equality. So in my view, your example doesn't fail, it's undefined. That Haskell provides (==) for floats is unfortunate.
I disagree, on practical basis. Floating-point numbers are very well defined, we know how the mantissa is represented. If the numbers are normalized, as they should, plenty of low-level iterative algorithms may use the equality - after some operation - to check that the machine- precision convergence has been obtained.
If you are absolutely sure that for every possible precision and for every sequence of operations that compilers will generate from your code your algorithm will actually converge to a particular binary representation and not flip-flop on the last bit of the mantissa, for instance, and if you do not care about the actual precision of your algorithm (i.e., you want as much as possible of it) then yes, you might get away with using exact equality. Of course, you'll have to protect that part of your code by a sufficient number of warnings since you are using a highly unsafe operation in a very carefully controlled context. I'm not sure the trouble is really worth it. Anyway, in my view, such an unsafe operation shouldn't be in scope by default and definitely shouldn't be called (==). It's really quite like unsafePerformIO.
On the contrary, the verification that the absolute value between two terms is less than some threshold, may be arbitrary or dubious.
Only if you use an inappropriate threshold. Choosing thresholds and precision is an important part of numeric programming and should be done with great care.

Roman

On 14 Feb 2008, at 2:28 pm, Roman Leshchinskiy wrote:
Richard A. O'Keefe wrote:
On 12 Feb 2008, at 5:14 pm, jerzy.karczmarczuk@info.unicaen.fr wrote:
Would you say that *no* typical floating-point software is reliable? With lots of hedging and clutching of protective amulets around the word "reliable", of course not. What I *am* saying is that (a) it's exceptionally HARD to make reliable because although the operations are well defined and arguably reasonable they do NOT obey the laws that school and university mathematics teach us to expect them to obey
Ints do not obey those laws, either.
They obey a heck of a lot more of them. Any combination of Ints using (+), (-), (*), and negate is going to be congruent to the mathematically correct answer modulo 2**n for some n. In particular, (+) is associative for Ints.
It is not exceptionally hard to write reliable software using ints.
I did my BSc and MSc computing on a B6700, where the hardware *always* notified you in case of an integer overflow. In that case, it was perfectly easy to write reliable software. You just pretended that the type 'INTEGER' in your program meant 'mathematical integer', and if that got you into trouble, the machine was certain to tell you about it.

Using languages that do not check for integer overflow, even on hardware (like, as it happens, both of the different machines on my desk) that makes it cheap to do so, I *have* had trouble with multiplying two positive integers and getting a negative result, and also with a program that went into an infinite loop because it happened to multiply two positive numbers and get another positive number that was smaller than the one it started with. There's also the problem that dividing two negative integers can give you a negative result. And one problem I definitely ran into was a Pascal 'for' loop with positive bounds that ran forever.

When I contrast the amount of manual checking I have to do when writing C (or, for that matter, Haskell) with the amount of manual checking I have to do when using Smalltalk or SETL or Lisp, and when I remember how life was *better* for me in this respect back in the 70s, well, it doesn't make me happy.

This would be my top priority request for Haskell': require that the default Int type check for overflow on all operations where overflow is possible, and provide Int32, Int64 for people who actually *want* wraparound.

I've been told that there was a day when there was serious trouble in the US financial markets because the volume of trade exceeded the 32-bit signed integer limit, and many programs started giving nonsense results. But the Smalltalk programs just kept powering on...
You just have to check for exceptional conditions.
Why should it be *MY* job to check for exceptional conditions? That's the *MACHINE*'s job. When you bought your computer, you paid for hardware that will do this job for you!
That's also the case for floating point.
If you think that, you do not understand floating point. x+(y+z) == (x+y)+z fails even though there is nothing exceptional about any of the operands or any of the results. I have known a *commercial* program blithely invert a singular matrix because of this kind of thing, on hardware where every kind of arithmetic exception was reported. There were no "exceptional conditions", the answer was just 100% wrong.
I guess it trapped on creating denormals. But again, presumably the reason the student used doubles here was because he wanted his program to be fast. Had he read just a little bit about floating point, he would have known that it is *not* fast under certain conditions.
Well, no. Close, but no cigar.

(a) It wasn't denormals, it was underflow.

(b) The fact that underflow was handled by trapping to the operating system, which then completed the operation by writing a 0.0 to the appropriate register, is *NOT* a universal property of floating point, and is *NOT* a universal property of IEEE floating point. It's a fact about that particular architecture, and I happened to have the manual and he didn't.

(c) x*x => 0 when x is small enough *is* fast on a lot of machines.
As it were, he seems to have applied what he though was an optimisation (using floating point) without knowing anything about it. A professional programmer would get (almost) no sympathy in such a situation.
You must be joking. Almost everybody working with neural nets uses floating point. (One exception I came across was some people using special vector processor hardware that didn't *have* floating point. These days, you could probably use a programmable GPU to good effect.)

For neural net calculations, you have to do lots of dot products. When this example happened, machines were commonly 32 bit, not 64 bit. So doing the calculations in integers would

(a) have limited him to 16 bits for the coefficients, instead of double's 53. This might just have been enough of a limit to prevent learning the kinds of things he wanted his net to learn. Actually, if you want to do a dot product on n-vectors, you need enough bits for n as well. Suppose you have 100 inputs, then you'll need 7 bits for that, so you are limited to (31-7)/2 = 12 bits, which is dangerously low. (Doubles can do precise sums of up to 128 products of coefficients with up to 23 bits.)

(b) integer multiplication was very slow on that machine. On most modern machines, integer multiplication is quite slow compared with add, because architects look at C benchmarks and conclude that multiplication isn't important. So programmers learn to do multiply-heavy calculations in floating point, so the benchmarks show less integer multiplication, so the architects let integer multiply get relatively slower, and ....

(c) the sigmoid function *has* to be done in floating point arithmetic. In fact I *wanted* him to use integer operations so that he could exploit the new-at-the-time graphics instructions (think MMX), but the project foundered on this step. He couldn't get a workable approximation of the sigmoid function using integers that didn't kill performance.

(d) Much of the calculation needed for neural nets can be done using the Basic Linear Algebra Subroutines, which are available in seriously tuned form for most modern machines. If a programmer *didn't* use these (floating-point-only) libraries, I would be asking why not.

If you are aware of any neural net software for general purpose hardware done by programmers you consider competent that *doesn't* use floating point, I would be interested to hear about it.

Richard A. O'Keefe wrote:
On 14 Feb 2008, at 2:28 pm, Roman Leshchinskiy wrote:
Richard A. O'Keefe wrote:
On 12 Feb 2008, at 5:14 pm, jerzy.karczmarczuk@info.unicaen.fr wrote:
Would you say that *no* typical floating-point software is reliable? With lots of hedging and clutching of protective amulets around the word "reliable", of course not. What I *am* saying is that (a) it's exceptionally HARD to make reliable because although the operations are well defined and arguably reasonable they do NOT obey the laws that school and university mathematics teach us to expect them to obey
Ints do not obey those laws, either.
They obey a heck of a lot more of them. Any combination of Ints using (+), (-), (*), and negate is going to be congruent to the mathematically correct answer modulo 2**n for some n. In particular, (+) is associative for Ints.
Yes, but neither school nor, for the most part, university mathematics teach us to expect modulo arithmetic. Good programmers learn about it at some point in their career, though, and write their programs accordingly. If they intend to use floating point, they should learn about it, too. I do agree that most programmers don't know how to use floats properly and aren't even aware that they can be used improperly. But that's an educational problem, not a problem with floating point.
This would be my top priority request for Haskell': require that the default Int type check for overflow on all operations where overflow is possible, provide Int32, Int64 for people who actually *want* wraparound.
I don't understand this. Why use a type which can overflow in the first place? Why not use Integer?
You just have to check for exceptional conditions.
Why should it be *MY* job to check for exceptional conditions?
It shouldn't unless you use a type whose contract specifies that it's your job to check for them. Which is the case for Int, Float and Double. It's not the case for Integer and Rational.
If you think that, you do not understand floating point. x+(y+z) == (x+y)+z fails even though there is nothing exceptional about any of the operands or any of the results.
For all practical purposes, the semantics of (==) is not well defined for floating point numbers. That's one of the first things I used to teach my students about floats: *never* compare them for equality. So in my view, your example doesn't fail, it's undefined. That Haskell provides (==) for floats is unfortunate.
I have known a *commercial* program blithely invert a singular matrix because of this kind of thing, on hardware where every kind of arithmetic exception was reported. There were no "exceptional conditions", the answer was just 100% wrong.
If they used (==) for floats, then they simply didn't know what they were doing. The fact that a program is commercial doesn't mean it's any good.
I guess it trapped on creating denormals. But again, presumably the reason the student used doubles here was because he wanted his program to be fast. Had he read just a little bit about floating point, he would have known that it is *not* fast under certain conditions.
Well, no. Close, but no cigar. (a) It wasn't denormals, it was underflow.
"Creating denormals" and underflow are equivalent. Denormals are created as a result of underflow. A denormalised number is smaller than any representable normal number. When the result of an operation is too small to be represented by a normal number, IEEE arithmetic will either trap or return a denormal, depending on whether underflow is masked or not.
(b) The fact underflow was handled by trapping to the operating system, which then completed the operation by writing a 0.0 to the appropriate register, is *NOT* a universal property of floating point, and is *NOT* a universal property of IEEE floating point. It's a fact about that particular architecture, and I happened to have the manual and he didn't.
IIRC, underflow is a standard IEEE exception.
(c) x*x => 0 when x is small enough *is* fast on a lot of machines.
Only if underflow is masked (which it probably is by default). Although I vaguely recall that denormals were/are slower on some architectures.
As it were, he seems to have applied what he thought was an optimisation (using floating point) without knowing anything about it. A professional programmer would get (almost) no sympathy in such a situation.
You must be joking. Almost everybody working with neural nets uses floating point.
[...]
If you are aware of any neural net software for general purpose hardware done by programmers you consider competent that *doesn't* use floating point, I would be interested to hear about it.
I'm not. But programmers I consider competent for this particular task know how to use floating point. Your student didn't, but that's ok for a student. He had someone he could ask, so hopefully he'll know next time. To be clear, I do not mean to imply that programmers who do not know about floating point are incompetent. I'm only somewhat sceptical of programmers who do not know about it but still write software that relies on it.

Roman

On 14 Feb 2008, at 6:01 pm, Roman Leshchinskiy wrote:
I don't understand this. Why use a type which can overflow in the first place? Why not use Integer?
Why is this hard to understand? Dijkstra's classic "A Discipline of Programming" distinguishes several kinds of machine. I'm quoting from memory here.

A Sufficiently Large Machine is one which can run your program to completion giving correct answers all the way. An Insufficiently Large Machine is one which can't do that and silently goes crazy instead. A Hopefully Sufficiently Large Machine is one which *either* does what a Sufficiently Large Machine would have *or* reports that it couldn't.

The good thing about an SLM is that it always gives you right answers (assuming your program is correct). The bad thing is that you can't afford it. The good thing about an ILM is that you can afford it. The bad thing is that you can't trust it. The great thing about a HSLM is that you can both trust and afford it.

Presumably the reason for having Int in the language at all is speed. As people have pointed out several times on this list to my knowledge, Integer performance is not as good as Int performance, not hardly, and it is silly to pay that price if I don't actually need it.

The thing about using SafeInt is that I should get the *same* space and speed from SafeInt as I do from Int, or at the very least the same space and far better speed than Integer, while at the same time EITHER the results are the results I would have got using Integer *OR* the system promises to tell me about it, so that I *know* there is a problem.

SafeInt is what you should use when you *THINK* your results should all fit in a machine int but aren't perfectly sure. (And this is nearly all the time.) Int is what you should use when you don't give a damn what the results are as long as you get them fast. (But in that case, why not use C or assembler?)
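For concreteness, a SafeInt along these lines might look roughly as follows. This is only a sketch: the name, the use of error for reporting, and the Integer-based check are illustrative assumptions; a real implementation would use the hardware overflow flag rather than going through Integer.

-- SafeInt: the same representation as Int, but every operation checks that
-- the mathematically correct result still fits, instead of silently wrapping.
newtype SafeInt = SafeInt Int deriving (Eq, Ord, Show)

-- Do the operation in Integer, then verify that converting back to Int
-- (which wraps in GHC) loses nothing; complain loudly otherwise.
checked :: (Integer -> Integer -> Integer) -> SafeInt -> SafeInt -> SafeInt
checked op (SafeInt a) (SafeInt b)
  | toInteger r == x = SafeInt r
  | otherwise        = error "SafeInt: arithmetic overflow"
  where
    x = toInteger a `op` toInteger b
    r = fromInteger x :: Int

instance Num SafeInt where
  (+) = checked (+)
  (-) = checked (-)
  (*) = checked (*)
  negate (SafeInt a)
    | a == minBound = error "SafeInt: arithmetic overflow"
    | otherwise     = SafeInt (negate a)
  abs (SafeInt a)
    | a == minBound = error "SafeInt: arithmetic overflow"
    | otherwise     = SafeInt (abs a)
  signum (SafeInt a) = SafeInt (signum a)
  fromInteger n
    | toInteger r == n = SafeInt r
    | otherwise        = error "SafeInt: literal out of range"
    where r = fromInteger n :: Int

A software check like this costs a few extra instructions per operation; the point of the proposal is that the hardware can make the same guarantee essentially for free.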
You just have to check for exceptional conditions. Why should it be *MY* job to check for exceptional conditions?
It shouldn't unless you use a type whose contract specifies that it's your job to check for them. Which is the case for Int, Float and Double.
Wrong. You're confusing two things here. One is Float and Double, where we get in serious trouble WITHOUT ANY EXCEPTIONAL CONDITIONS IN SIGHT. The other is Int overflow. There may also be an equivocation on 'checking'. When was the last time you proved that a large program would not incur an integer overflow? When was the last time you proved that a library package would not incur integer overflow provided it was called in accord with its contract? When was the last time you even *found* a sufficiently informative contract in someone else's Haskell code? The checking I am talking about is done by the hardware at machine speeds and provides *certainty* that overflow did not occur.
It's not the case for Integer and Rational.
If you think that, you do not understand floating point. x+(y+z) == (x+y)+z fails even though there is nothing exceptional about any of the operands or any of the results.
For all practical purposes, the semantics of (==) is not well defined for floating point numbers.
With respect to IEEE arithmetic, wrong.
That's one of the first things I used to teach my students about floats: *never* compare them for equality.
That's one of the warning signs I watch out for. "Never compare floats for equality" is a sure sign of someone who thinks they know about floats but don't.
So in my view, your example doesn't fail, it's undefined. That Haskell provides (==) for floats is unfortunate.
The operation is well defined and required by the IEEE standard.
If they used (==) for floats, then they simply didn't know what they were doing. The fact that a program is commercial doesn't mean it's any good.
Er, we weren't talking about (==) for floats; I don't know where you got that. I never said it was any good; quite the opposite. My point is that bad software escaped into the commercial market because floating point doesn't follow the laws people expect it to.
I guess it trapped on creating denormals. But again, presumably the reason the student used doubles here was because he wanted his program to be fast. Had he read just a little bit about floating point, he would have known that it is *not* fast under certain conditions. Well, no. Close, but no cigar. (a) It wasn't denormals, it was underflow.
"Creating denormals" and underflow are equivalent.
No they are not. Underflow in this sense occurs when the result is too small to be even a denormal. (The IEEE exceptions Underflow and Inexact are not the same. Creating denormals is likely to trigger Inexact but should not trigger Underflow. I am talking only about a condition that triggered Underflow.)
Denormals are created as a result of underflow. A denormalised number is smaller than any representable normal number. When the result of an operation is too small to be represented by a normal number, IEEE arithmetic will either trap or return a denormal, depending on whether underflow is masked or not.
No, we're talking about a situation where returning a denormal is not an option because there is no suitable denormal. This is underflow.
(b) The fact underflow was handled by trapping to the operating system, which then completed the operation by writing a 0.0 to the appropriate register, is *NOT* a universal property of floating point, and is *NOT* a universal property of IEEE floating point. It's a fact about that particular architecture, and I happened to have the manual and he didn't.
IIRC, underflow is a standard IEEE exception.
Underflow is indeed a standard IEEE exception. Like other standard IEEE exceptions, it is *disabled by default*. In this case, the hardware delivered the exception *to the operating system*, but the operating system did not deliver it to the *user code*. It is the *combination* of hardware and operating system that conforms to the IEEE standard (or not). So we are talking about a situation where the only legal IEEE outcomes are "return 0.0" or "raise the Underflow exception" and where raising an exception *in the user code* wasn't allowed and didn't happen. The hardware is allowed to trap to the operating system any time it feels like, for any reason (like 'this model doesn't support the SQRTD instruction') or none (hey, it's a Friday, I think I'll generate traps). The knowledge I had, and the student lacked, was *not* knowledge about an interface (the IEEE specification) but about an implementation. There's a lot of Haskell code out there with no performance guides in the documentation...
I'm not. But programmers I consider competent for this particular task know how to use floating point. Your student didn't, but that's ok for a student.
Wrong. He *did* know "how to use floating point", and his code would have run at the expected speed on other hardware. It gave pretty good answers. What he didn't know was how one particular machine struck the balance between hardware and software. I wonder just how many programmers these days think Double.(*) is _always_ a cheap hardware instruction? Returning to our theme: the programmer expectation here is "a simple cost model." Violating that expectation led to a program with a huge unexpected cost problem. In the same way, violating other programmer expectations is likely to lead to unexpected correctness problems.

Richard A. O'Keefe wrote:
On 14 Feb 2008, at 6:01 pm, Roman Leshchinskiy wrote:
I don't understand this. Why use a type which can overflow in the first place? Why not use Integer?
[...]
Presumably the reason for having Int in the language at all is speed. As people have pointed out several times on this list to my knowledge, Integer performance is not as good as Int performance, not hardly, and it is silly to pay that price if I don't actually need it.
Do I understand correctly that you advocate using overflowing ints (even if they signal overflow) even if Integers are fast enough for a particular program? I strongly disagree with this. It's premature optimisation of the worst kind - trading correctness for unneeded performance.
SafeInt is what you should use when you *THINK* your results should all fit in a machine int but aren't perfectly sure. (And this is nearly all the time.)
Again, I strongly disagree. You should use Integer unless your program is too slow and profiling shows that Integer is the culprit. If and only if that is the case should you think about alternatives. That said, I doubt that your SafeInt would be significantly faster than Integer.
You just have to check for exceptional conditions. Why should it be *MY* job to check for exceptional conditions?
It shouldn't unless you use a type whose contract specifies that it's your job to check for them. Which is the case for Int, Float and Double.
Wrong. You're confusing two things here. One is Float and Double, where we get in serious trouble WITHOUT ANY EXCEPTIONAL CONDITIONS IN SIGHT. The other is Int overflow.
I'm not sure what I'm confusing here, my comment referred specifically to exceptional conditions which floats provide plenty of. As to getting in trouble, I don't need floats for that, I manage to do it perfectly well with any data type including (). Seriously, though, I think we agree that using floating point numbers correctly isn't trivial, people who do that should know what they are doing and should best use existing libraries. I just don't see how floats are special in this regard.
The checking I am talking about is done by the hardware at machine speeds and provides *certainty* that overflow did not occur.
So you advocate using different hardware?
If you think that, you do not understand floating point. x+(y+z) == (x+y)+z fails even though there is nothing exceptional about any of the operands or any of the results.
For all practical purposes, the semantics of (==) is not well defined for floating point numbers.
With respect to IEEE arithmetic, wrong.
Yes, IEEE does define an operation which is (wrongly, IMO) called "equality". It's not a particularly useful operation (and it is not equality), but it does have a defined semantics. However, the semantics of (==) on floats isn't really defined in Haskell or C, for that matter, even if you know that the hardware is strictly IEEE conformant. In general, floating point numbers do not really have a useful notion of equality. They are approximations, after all, and independently derived approximations can only be usefully tested for approximate equality. And yes, x+(y+z) is approximately equal to (x+y)+z for floats. How approximate depends on the particular values, of course.
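To see how far apart the two groupings can end up, here is a tiny self-contained example; the inputs are deliberately extreme to make the effect obvious, and milder inputs give milder discrepancies.

-- Demonstrates that (+) on Double is not associative.
main :: IO ()
main = do
  let x = 1.0e30 :: Double
      y = -1.0e30
      z = 1.0
  print ((x + y) + z)   -- prints 1.0
  print (x + (y + z))   -- prints 0.0: z is absorbed when added to the huge y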
That's one of the first things I used to teach my students about floats: *never* compare them for equality.
That's one of the warning signs I watch out for. "Never compare floats for equality" is a sure sign of someone who thinks they know about floats but don't.
Hmm. Personally, I've never seen an algorithm where comparing for exact equality was algorithmically necessary. Sometimes (rarely) it is acceptable, but necessary? Do you know of one? On the other hand, there are a lot of cases where comparing for exact equality is algorithmically wrong. As an aside, you might want to try googling for "Never compare floats for equality". I'm positive some of those people *do* know about floats.
"Creating denormals" and underflow are equivalent.
No they are not. Underflow in this sense occurs when the result is too small to be even a denormal.
I'm fairly sure that IEEE underflow occurs when the result cannot be represented by a *normal* number but I don't have a copy of the standard. Anyway, it's not important for this discussion, I guess.
Underflow is indeed a standard IEEE exception. Like other standard IEEE exceptions, it is *disabled by default*. In this case, the hardware delivered the exception *to the operating system*, but the operating system did not deliver it to the *user code*. It is the *combination* of hardware and operating system that conforms to the IEEE standard (or not). So we are talking about a situation where the only legal IEEE outcomes are "return 0.0" or "raise the Underflow exception" and where raising an exception *in the user code* wasn't allowed and didn't happen.
Now I'm curious. I would have guessed that it was an Alpha but that would behave differently (it would trap on underflow, but only in strict IEEE mode and only because it actually implemented flush to zero instead of gradual underflow).
I'm not. But programmers I consider competent for this particular task know how to use floating point. Your student didn't, but that's ok for a student.
Wrong. He *did* know "how to use floating point", and his code would have run at the expected speed on other hardware. It gave pretty good answers.
Wrt speed - not necessarily. For instance, x86 is really bad when it comes to denormals. Have a look at http://www.cygnus-software.com/papers/x86andinfinity.html for instance. So while he knew how to get good results with floating point, he didn't know how to get good performance. Which, as you say, is not part of the IEEE standard but which you still have to know if you use floats. Roman

G'day all. Richard A. O'Keefe wrote:
That's one of the warning signs I watch out for. "Never compare floats for equality" is a sure sign of someone who thinks they know about floats but don't.
Quoting Roman Leshchinskiy
Hmm. Personally, I've never seen an algorithm where comparing for exact equality was algorithmically necessary.
One trick I've occasionally used is to avoid the need for a discriminated union of floating point and integer types by just using a floating point number. If you ignore bitwise operations and division/remainder, any integer operation that doesn't cause overflow on 32-bit integers will work just the same if you use IEEE-754 64-bit floating point numbers instead. That includes equality. Moreover, you even get a few more significant bits of precision. In these days of fast 64 and 128 bit words, it might not be entirely moral, but it works. Cheers, Andrew Bromage
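A small sketch of the trick, checking the claim on a few boundary values. Int32 is used purely to stay within the "no overflow on 32-bit integers" precondition, and the helper name is made up for illustration.

import Data.Int (Int32)

-- Add two 32-bit integers via Double. The exact sum needs at most 33 bits,
-- well within Double's 53-bit significand, so no rounding can occur.
addViaDouble :: Int32 -> Int32 -> Integer
addViaDouble a b = round (fromIntegral a + fromIntegral b :: Double)

main :: IO ()
main = print (all exact [(maxBound, maxBound), (minBound, minBound), (minBound, maxBound)])
  where
    exact (a, b) = addViaDouble a b == toInteger a + toInteger b
    -- prints True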

ajb@spamcop.net
G'day all.
Richard A. O'Keefe wrote:
Hmm. Personally, I've never seen an algorithm where comparing for exact equality was algorithmically necessary.
One trick I've occasionally used is to avoid the need for a discriminated union of floating point and integer types by just using a floating point number.
IMHO it is a perfectly good idea to use the FP processor for integer computations. You can use the Inexact Trap as Overflow Exception, a service you don't get from i386 (and most other) hardware for int operations. Of course your integers are limited to 24bit+sign in single precision and 54bit+sign in double. In i387 extended precision you get 64bit+sign. I would consider it a good idea if ghc provided language support for this sort of integers.

--
Dipl.-Math. Wilhelm Bernhard Kloke
Institut fuer Arbeitsphysiologie an der Universitaet Dortmund
Ardeystrasse 67, D-44139 Dortmund, Tel. 0231-1084-257
PGP: http://vestein.arb-phys.uni-dortmund.de/~wb/mypublic.key

Wilhelm B. Kloke wrote:
ajb@spamcop.net
wrote: G'day all.
Richard A. O'Keefe wrote:
Hmm. Personally, I've never seen an algorithm where comparing for exact equality was algorithmically necessary.
One trick I've occasionally used is to avoid the need for a discriminated union of floating point and integer types by just using a floating point number.
IMHO it is a perfectly good idea to use the FP processor for integer computations. You can use the Inexact Trap as Overflow Exception, a service you don't get from i386 (and most other) hardware for int operations. Of course your integers are limited to 24bit+sign in single precision and 54bit+sign in double. In i387 extended precision you get 64bit+sign.
I would consider it a good idea if ghc provided language support for this sort of integers.
No need, you can do that for yourself:

{-# LANGUAGE GeneralizedNewtypeDeriving #-}

newtype DInt = DInt Double deriving (Eq, Ord, Enum, Num)

instance Show DInt where
  show (DInt x) = show (truncate x :: Integer)

You can even make it H98 by defining the instances manually...

Cheers
Ben

Ben Franksen
Wilhelm B. Kloke wrote:
ajb@spamcop.net
wrote: I would consider it a good idea if ghc provided language support for this sort of integers.
No need, you can do that for yourself:
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
newtype DInt = DInt Double deriving (Eq, Ord, Enum, Num)
instance Show DInt where
  show (DInt x) = show (truncate x :: Integer)
Probably you missed the point I wanted to make. This only works properly if you can catch the Inexact Trap which indicates the overflow. The problem with normal Ints is that the hardware does not support overflow detection. You get silent wrong results if you use an Int type which maps to C int, but not if it maps to C float or double with the Inexact trap enabled. Therefore you need to add runtime checks to be sure that you notice a possible overflow condition.

--
Dipl.-Math. Wilhelm Bernhard Kloke
Institut fuer Arbeitsphysiologie an der Universitaet Dortmund
Ardeystrasse 67, D-44139 Dortmund, Tel. 0231-1084-257
PGP: http://vestein.arb-phys.uni-dortmund.de/~wb/mypublic.key
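Since portable Haskell gives no access to the Inexact trap, such a check has to be spelled out in software. A minimal sketch along the lines of the DInt above (the 2^53 bound and the use of error are illustrative assumptions):

-- A cut-down Double-backed integer with an explicitly checked addition.
newtype DInt = DInt Double deriving (Eq, Ord, Show)

addDInt :: DInt -> DInt -> DInt
addDInt (DInt x) (DInt y)
  | abs r < 2^53 = DInt r     -- still an exactly representable integer
  | otherwise    = error "DInt: result no longer exactly representable"
  where
    r = x + y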

Wilhelm B. Kloke wrote:
Ben Franksen
wrote: Wilhelm B. Kloke wrote:
ajb@spamcop.net
wrote: I would consider it a good idea if ghc provided language support for this sort of integers.
No need, you can do that for yourself:
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
newtype DInt = DInt Double deriving (Eq, Ord, Enum, Num)
instance Show DInt where
  show (DInt x) = show (truncate x :: Integer)
Probably you missed the point I wanted to make.
Obviously ;)
This only works properly if you can catch the Inexact Trap which indicates the overflow. The problem with normal Ints is that the hardware does not support overflow detection. You get silent wrong results if you use an Int type which maps to C int, but not if it maps to C float or double with the Inexact trap enabled. Therefore you need to add runtime checks to be sure that you notice a possible overflow condition.
I agree completely. Cheers Ben

On 15 Feb 2008, at 2:03 am, Wilhelm B. Kloke wrote: IMHO it is a perfectly good idea to use the FP processor for integer
computations. You can use the Inexact Trap as Overflow Exception, a service you don't get from i386 (and most other) hardware for int operations.
A neat idea. However, the i386 has the INTO instruction, the SPARC family has the TRAPV instruction, and other processors have analogues. Some machines have two sets of add/subtract instructions, one trapping, one not.

On 2008-02-14, Roman Leshchinskiy
Richard A. O'Keefe wrote:
Presumably the reason for having Int in the language at all is speed. As people have pointed out several times on this list to my knowledge, Integer performance is not as good as Int performance, not hardly, and it is silly to pay that price if I don't actually need it.
Do I understand correctly that you advocate using overflowing ints (even if they signal overflow) even if Integers are fast enough for a particular program? I strongly disagree with this. It's premature optimisation of the worst kind - trading correctness for unneeded performance.
"Fast enough" is not absolute. It's not trading correctness, it's trading /completion/. And if you expect everything to fit in [-2^31..2^31-1] or [0..2^32-1], finding out it doesn't might be valuable information about your problem domain. For "exploratory" coding, speed and knowing when something breaks can be more important than knowing that all possible corner case are covered, even ones you don't expect to hit.
SafeInt is what you should use when you *THINK* your results should all fit in a machine int but aren't perfectly sure. (And this is nearly all the time.)
Again, I strongly disagree. You should use Integer unless your program is too slow and profiling shows that Integer is the culprit. If and only if that is the case should you think about alternatives. That said, I doubt that your SafeInt would be significantly faster than Integer.
Why? GMP is pretty good, but it's not going to be anywhere near hardware speeds.
The checking I am talking about is done by the hardware at machine speeds and provides *certainty* that overflow did not occur.
So you advocate using different hardware?
At a minimum, any usable hardware sets flags on overflow. Testing on those is pretty cheap. Much cheaper than calling a GMP routine to compare with 2^32, for instance. -- Aaron Denney -><-
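From portable Haskell one cannot test those flags directly, but the same check can be emulated fairly cheaply by exploiting wraparound (GHC's Int wraps on overflow; the Report leaves the behaviour unspecified). A sketch:

-- Two's-complement wraparound makes an overflowed sum land on the wrong side
-- of the first operand, so a comparison detects it without leaving machine words.
checkedAdd :: Int -> Int -> Maybe Int
checkedAdd a b
  | b > 0 && r < a = Nothing   -- wrapped past maxBound
  | b < 0 && r > a = Nothing   -- wrapped past minBound
  | otherwise      = Just r
  where
    r = a + b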

Aaron Denney wrote:
On 2008-02-14, Roman Leshchinskiy
wrote: Richard A. O'Keefe wrote:
SafeInt is what you should use when you *THINK* your results should all fit in a machine int but aren't perfectly sure. (And this is nearly all the time.) Again, I strongly disagree. You should use Integer unless your program is too slow and profiling shows that Integer is the culprit. If and only if that is the case should you think about alternatives. That said, I doubt that your SafeInt would be significantly faster than Integer.
Why? GMP is pretty good, but it's not going to be anywhere near hardware speeds.
This is how Integer is defined in the libraries:

data Integer
  = S# Int#             -- small integers
  | J# Int# ByteArray#  -- large integers

As long as the Int# doesn't overflow, you don't call any GMP routines.

Roman

On 14 Feb 2008, at 10:24 pm, Roman Leshchinskiy wrote:
Do I understand correctly that you advocate using overflowing ints (even if they signal overflow) even if Integers are fast enough for a particular program?
No you most certainly do NOT. There is no way to soundly, and I would have thought no way to plausibly, infer that from anything I wrote.
Again, I strongly disagree. You should use Integer unless your program is too slow and profiling shows that Integer is the culprit. If and only if that is the case should you think about alternatives. That said, I doubt that your SafeInt would be significantly faster than Integer.
SafeInt should be as fast as Int, or very nearly. The representation of SafeInt is identical to the representation of Int, so the space overheads are definitely lower.
The checking I am talking about is done by the hardware at machine speeds and provides *certainty* that overflow did not occur.
So you advocate using different hardware?
Again, this is the opposite of what I wrote. On my desk there are a Pentium machine and an UltraSPARC and a G4. They *all* support cheap integer overflow checks. I am saying that we should use the hardware we have already paid for!
It's not a particularly useful operation (and it is not equality), but it does have a defined semantics. However, the semantics of (==) on floats isn't really defined in Haskell or C, for that matter, even if you know that the hardware is strictly IEEE conformant.
The semantics of == on floats in C99 is, under certain circumstances, and on the machines I use with the compilers I use, defined to be those of the IEEE (or, as the C99 standard puts it, IEC) operation.
In general, floating point numbers do not really have a useful notion of equality. They are approximations.
The *numbers* are not approximations, the *operations* are approximate. In particular, floating point +, -, *, <, ==, abs, sign, and other related operations (but not /) are *exact* on integral values, if the results fit. AWK programs would be in terrible trouble if this were not so.

I'm not sure whether this is the right branch of the thread for this post. Maybe it belongs to a different thread altogether. But here goes:

Maybe most of our gripes with floating point arithmetic (besides broken implementations) are that we expect it to behave like Real arithmetic, and when it doesn't, we whine that it's unreliable, ill-defined, etc etc etc. If we consider FP as a mathematical entity of its own (e.g. as defined in IEEE 754), we see that it has a well-defined, reliable behaviour which happens to emulate the behaviour of the real numbers within some margins. The accuracy of this emulation can be analyzed in a rigorous manner, see http://www.validlab.com/goldberg/paper.pdf

So if floating point (==) doesn't correspond with numeric equality, it's not FP's fault, but ours for expecting it to!

By the way, IEEE754 does define another comparison operator which corresponds to our notion of 'equality'. FP (==) is just a partial equivalence relation which makes more sense than mathematical equality in an FP context. Of course, if an implementation doesn't guarantee that 'x == x' yields true whenever x is not a NaN, then the implementation is broken.

An earlier post said that "Haskell is not math" (or something like it). I beg to differ -- one of its selling points, after all, is supposed to be the ability to perform equational reasoning. We work hard to give well-defined semantics to our expressions. We rely on types to tell us which properties to expect of which values, and which manipulations are valid. But then Haskell goes and fools us (just like most other languages do) into thinking FP arithmetic is something that it's not: the behaviour of operations depends on an unseen environment (e.g. rounding mode), the order of evaluation matters, evaluation can fail... not at all what we'd call purely functional!

So if FP arithmetic is not purely functional, what do we do? If we could do Haskell all over again, I'd propose a different approach to FP arithmetic (and for stuff like Int, for that matter), which I'm actually surprised not to see discussed more often since it's after all the usual Haskell approach to things which are not purely functional. The original topic of this thread should already have hinted at it. ;)

-----
Ariel J. Birnbaum
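A minimal sketch of the idea being hinted at: make the unseen environment explicit by threading it through a monad. The RoundingMode type and fpAdd below are purely illustrative assumptions; actually switching the hardware rounding mode would need FFI support that this sketch does not attempt.

data RoundingMode = ToNearest | TowardZero | TowardPosInf | TowardNegInf
  deriving (Eq, Show)

-- A value that may depend on the ambient rounding mode: a Reader in disguise.
newtype FP a = FP { runFP :: RoundingMode -> a }

instance Functor FP where
  fmap f (FP m) = FP (f . m)

instance Applicative FP where
  pure x        = FP (const x)
  FP f <*> FP m = FP (\mode -> f mode (m mode))

instance Monad FP where
  return = pure
  FP m >>= k = FP (\mode -> runFP (k (m mode)) mode)

-- Hypothetical primitive. Haskell's (+) on Double always rounds to nearest,
-- so this is only a stand-in for an operation that really consults the mode.
fpAdd :: Double -> Double -> FP Double
fpAdd x y = FP (\_mode -> x + y)

example :: FP Double
example = do
  s <- fpAdd 0.1 0.2
  fpAdd s 0.3

main :: IO ()
main = print (runFP example ToNearest)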

On Wednesday, 12 March 2008 00:53, askyle wrote:
[…]
So if floating point (==) doesn't correspond with numeric equality, it's not FP's fault, but ours for expecting it to do!
No, I think it’s the Prelude’s fault to define (==) as “floating point equality”. We should use a different identifier like floatingPointEq. (==) is expected to be an equivalence relation. (I think I already said this.)
[…]
Best wishes, Wolfgang

Not to be picky, but where did you hear that (==) established an equivalence relation? Not, I expect, from the Haskell98 Report! The only law I can find there is that

  x /= y  iff  not (x == y)

So, the definition

  x == y = False
  x /= y = True

would be perfectly legitimate, making x /= x = True, which kind of ruins the equivalence relation thing, no?

Dan

Wolfgang Jeltsch wrote:
On Wednesday, 12 March 2008 00:53, askyle wrote:
[…]
So if floating point (==) doesn't correspond with numeric equality, it's not FP's fault, but ours for expecting it to do!
No, I think it’s the Prelude’s fault to define (==) as “floating point equality”. We should use a different identifier like floatingPointEq. (==) is expected to be an equivalence relation. (I think I already said this.)
[…]
Best wishes, Wolfgang

On Thursday, 13 March 2008 21:10, you wrote:
Not to be picky, but where did you hear that (==) established an equivalence relation?
I think that's the way it should be in the opinion of most Haskellers. It might be true that the Haskell 98 report doesn't say so, but I think that many library types and functions (Data.Set stuff, for example) rely on this. A future standard should state laws an instance has to obey for every class it introduces.
[…]
Best wishes, Wolfgang

Wolfgang Jeltsch wrote:
No, I think, it’s the Prelude’s fault to define (==) as “floating point equality”.
My bad, I meant IEEE (==) when I said it was "our" fault. I concur that the Prelude is at fault for using the (==) symbol for FP equality. Even if you don't demand that (==) be an equivalence, you're giving a pure functional type to an impure operation (e.g. because of SNaNs).

My point was that since Haskell has a known and established mechanism for delimiting impurity, it seems a shame not to use it to add some rigour to the myth-ridden, poorly understood floating point world. We need good FP for FP =)

-----
Ariel J. Birnbaum

Hi all,
On Feb 11, 2008 3:14 PM, apfelmus
I will be mean by asking the following counter question:
x + (y + z) = (x + y) + z
is a mathematical identity. If it is a mathematical identity, a programmer need not care about this law to implement addition + . Can anyone give me an example implementation of addition that violates this law?
Depends on what you mean by "addition". In general, algebraists call any associative and commutative operation on a set "addition", and nothing else. From that POV, there is by definition no "addition" that violates this law. Arnar

On 11 Feb 2008, at 7:52 AM, Arnar Birgisson wrote:
Hi all,
On Feb 11, 2008 3:14 PM, apfelmus
wrote: I will be mean by asking the following counter question:
x + (y + z) = (x + y) + z
is a mathematical identity. If it is a mathematical identity, a programmer need not care about this law to implement addition + . Can anyone give me an example implementation of addition that violates this law?
Depends on what you mean by "addition". In general, algebraists call any associative and commutative operation on a set "addition", and nothing else. From that POV, there is by definition no "addition" that violates this law.
I agree. The Num Double instance should be expelled from the Prelude immediately. jcc (What? Haskell has a Float type?)

On 11 Feb 2008, at 5:33 AM, Deokjae Lee wrote:
Tutorials about monad mention the "monad axioms" or "monad laws". The tutorial "All About Monads" says that "It is up to the programmer to ensure that any Monad instance he creates satisfies the monad laws".
The following is one of the laws.
(x >>= f) >>= g == x >>= (\v -> f v >>= g)
However, this seems to me a kind of mathematical identity.
What do you mean by identity? It's easy enough to violate:

newtype Gen a = Gen (StdGen -> a)

runGen :: Gen a -> StdGen -> a
runGen (Gen f) = f

instance Monad Gen where
  return x = Gen (const x)
  a >>= f = Gen (\ r -> let (r1, r2) = split r
                        in runGen (f (runGen a r1)) r2)

split [1] returns two generators independent of each other and of its argument; thus, this `monad' violates all *three* of the monad laws, in the sense of equality. So, for example, the program

  do x <- a
     return x

denotes a random variable independent of (and hence distinct from) a. In general, you want some kind of backstop `equality' (e.g., that the monad laws hold in the sense of identical distribution) when you violate them; otherwise, you will violate the expectations your users have from the syntax of the do notation.

jcc

[1] Test.QuickCheck

(x >>= f) >>= g == x >>= (\v -> f v >>= g)
However, this seems to me a kind of mathematical identity. If it is mathematical identity, a programmer need not care about this law to implement a monad. Can anyone give me an example implementation of monad that violate this law ?
It's well known that "ListT m" monad violates this law in general (though it satisfies it for some particular monads m). For example,

Prelude Control.Monad.List> runListT ((ListT [[(),()]] >> ListT [[()], [()]]) >> ListT [[1],[2]])
[[1,1],[1,2],[2,1],[2,2],[1,1],[1,2],[2,1],[2,2],[1,1],[1,2],[2,1],[2,2],[1,1],[1,2],[2,1],[2,2]]
Prelude Control.Monad.List> runListT (ListT [[(),()]] >> (ListT [[()], [()]] >> ListT [[1],[2]]))
[[1,1],[1,2],[1,1],[1,2],[2,1],[2,2],[2,1],[2,2],[1,1],[1,2],[1,1],[1,2],[2,1],[2,2],[2,1],[2,2]]

IOn Feb 11, 2008 9:46 AM, Miguel Mitrofanov
It's well known that "ListT m" monad violates this law in general (though it satisfies it for some particular monads m). For example,
I went through this example in quite a bit of detail a while ago and wrote it up here: http://sigfpe.blogspot.com/2006/11/why-isnt-listt-monad.html . I tried to show not just why the monad laws fails to hold for ListT [], but also show how it almost holds. -- Dan

Dan Piponi wrote:
IOn Feb 11, 2008 9:46 AM, Miguel Mitrofanov
wrote: It's well known that "ListT m" monad violates this law in general (though it satisfies it for some particular monads m). For example,
I went through this example in quite a bit of detail a while ago and wrote it up here: http://sigfpe.blogspot.com/2006/11/why-isnt-listt-monad.html . I tried to show not just why the monad laws fails to hold for ListT [], but also show how it almost holds.
...and the Unimo paper[1] explains how to easily write a 'correct' ListT. BTW, Unimo is an extreme case of the monad laws holding only w.r.t. the 'right' equality, i.e. in the laws m == m' is to be understood as observe_monad m == observe_monad m' (and even this '==' is of course /not/ the Eq class method but a semantic equality.) [1] http://web.cecs.pdx.edu/~cklin/papers/unimo-143.pdf Cheers Ben

Ben Franksen wrote:
...and the Unimo paper[1] explains how to easily write a 'correct' ListT. BTW, Unimo is an extreme case of the monad laws holding only w.r.t. the 'right' equality, i.e. in the laws m == m' is to be understood as observe_monad m == observe_monad m' (and even this '==' is of course /not/ the Eq class method but a semantic equality.) [1] http://web.cecs.pdx.edu/~cklin/papers/unimo-143.pdf
Are you sure? Maybe I am missing something, but I don't see any claim that the Unimo ListT satisfies the laws any more than the old mtl ListT. It looks to me like Unimo is just an attempt to provide an easier way to create, use, and understand monads, not a change in their semantics. ListT-Done-Right could also be defined via the Unimo framework, and then it would satisfy the monad laws. Thanks, Yitz

Yitzchak Gale wrote:
Ben Franksen wrote:
...and the Unimo paper[1] explains how to easily write a 'correct' ListT.
Are you sure? Maybe I am missing something, but I don't see any claim that the Unimo ListT satisfies the laws any more than the old mtl ListT.
ListT-Done-Right could also be defined via the Unimo framework, and then it would satisfy the monad laws.
The list monad transformer implemented with Unimo (figure 13) is different from ListT m a = m [a] (figure 11 for reference). Note that I say "the list monad transformer". I don't understand what's so special about "ListT m does not fulfill the monad laws": it just shows that naïvely using m [a] to implement the list monad transformer is incorrect for general m. In other words, there is a big bug in Control.Monad.List and that's all there is to it.

Regards,
apfelmus
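For readers who want to see the shape of the fix, here is a minimal sketch in the style of the well-known "ListT done right" construction; it is an illustration written for this purpose, not the Unimo code or the mtl code.

-- Effects are interleaved with the list structure instead of wrapping m [a].
newtype ListT' m a = ListT' { runListT' :: m (Maybe (a, ListT' m a)) }

-- Concatenation: produce the first list, then continue with the second.
appendL :: Monad m => ListT' m a -> ListT' m a -> ListT' m a
appendL (ListT' act) ys = ListT' $ do
  step <- act
  case step of
    Nothing        -> runListT' ys
    Just (x, rest) -> return (Just (x, appendL rest ys))

instance Monad m => Functor (ListT' m) where
  fmap f (ListT' act) = ListT' $ do
    step <- act
    return $ case step of
      Nothing        -> Nothing
      Just (x, rest) -> Just (f x, fmap f rest)

instance Monad m => Applicative (ListT' m) where
  pure x    = ListT' (return (Just (x, ListT' (return Nothing))))
  mf <*> mx = mf >>= \f -> fmap f mx

instance Monad m => Monad (ListT' m) where
  return = pure
  ListT' act >>= k = ListT' $ do
    step <- act
    case step of
      Nothing        -> return Nothing
      Just (x, rest) -> runListT' (k x `appendL` (rest >>= k))

Because each step of the underlying monad is performed exactly once per element produced, the duplicated effects visible in the ListT [] session above disappear, and this construction is generally stated to satisfy the monad laws whenever m does.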
participants (28):
- Aaron Denney
- ajb@spamcop.net
- Andrew Butterfield
- apfelmus
- Arnar Birgisson
- askyle
- Ben Franksen
- Dan Piponi
- Dan Weston
- David Benbennick
- Deokjae Lee
- Derek Elkins
- Felipe Lessa
- Jan-Willem Maessen
- Jed Brown
- jerzy.karczmarczuk@info.unicaen.fr
- Jonathan Cast
- Luke Palmer
- Michael Reid
- Miguel Mitrofanov
- Neil Mitchell
- Richard A. O'Keefe
- Roman Leshchinskiy
- Stefan O'Rear
- Uwe Hollerbach
- Wilhelm B. Kloke
- Wolfgang Jeltsch
- Yitzchak Gale