
On Thu, Jun 21, 2012 at 8:30 AM, George Giorgidze
Hi Derek,
Thanks for providing the executable example that demonstrates your point. It is an interesting one. See my response below. I think it takes us into the discussion as to what constitutes reasonable/law abiding instances of Eq and Ord and what client code that uses Eq and Ord instances can assume.
To give out my point in advance, Eq and Ord instances similar to yours (i.e., those that proclaim two values as equal but at the same time export or allow for function definitions that can observe that they are not equal; that is, to tell them apart) not only break useful properties of the Data.Set.Monad wrapper but they also break many useful properties of the underlaying Data.Set, and many other standard libraries and functions.
I will readily admit that the Ord instance arguably does not satisfy the laws one would want (though it is consistent with the Eq instance). It arguably is supposed to produce a total order which it does not. However, Eq is only required to be an equivalence relation (and even that is stretching the definition of "required"), and the Eq instance is certainly an equivalence relation. Furthermore, the only reason your library requires an Ord instance is due to the underlying Data.Set requiring it. You could just as well use a set implementation that didn't require Ord and the same issue would occur. I mention this to remove the "bad" Ord instance from consideration. For me saying two Haskell expressions are equal means observation equality. I.e. I can -always- substitute one for the other in any context. I tolerate "equal" meaning "equal modulo bottom", because unless you are catching asynchronous exceptions, which Haskell 2010 does not support, the only difference is between whether you get an answer or not, not what that answer is if you get it which is only so harmful. For Data.Set.Monad, fmap f . fmap g = fmap (f . g) holds (at least modulo bottom) for all f and g as required by the Functor laws. I have no problem if you want to say fromList . toList = id -given- (==) is the identity relation on defined values, but omitting the qualification is misleading and potentially dangerous. The Haskell Report neither requires nor enforces that those relations hold, which is particularly underscored by the fact, as you yourself demonstrated that even -standard- types fail to satisfy even the laws that you perhaps can interpret the Haskell Report as requiring. There have been violations of type safety due to assuming instances satisfied laws that they didn't.
On 20 June 2012 04:03, Derek Elkins
wrote: This is impressive because it's false. The whole point of my original response was to justify Dan's intuition but explain why it was misled in this case.
No, In my opinion, it is not false. The fact that you need to wrap the expression between fmap f and fmap g suggests that the problem is with mapping the functions f and g and not with toList and fromList as you suggest. See below for clarifications.
Let us concentrate on the ex4 and ex6 expressions in your code. These two most clearly demonstrate the issue.
import Data.Set.Monad
data X = X Int Int deriving (Show)
instance Eq X where X a _ == X b _ = a == b
instance Ord X where compare (X a _) (X b _) = compare a b
f (X _ b) = X b b
g (X _ b) = X 1 b
xs = Prelude.map (\x -> X x x) [1..5]
ex4 = toList $ fmap f . fmap g $ fromList xs
ex6 = toList $ fmap f . fromList . toList . fmap g $ fromList xs
print ex4 gives us [X 1 1,X 2 2,X 3 3,X 4 4,X 5 5]
and
print ex6 gives us [X 5 5]
From the first look, it looks like that (fromList . toList) is not identity. But if tested and checked separately it is. Maybe something weird is going on with (fmap f) and (fmap g) and/or their composition. Before we dive into that let us try one more example:
ex7 = toList $ fmap f . (empty `union`) . fmap g $ fromList xs
print ex7 just like ex6 gives us [X 5 5] should we assume that (empty `union`) is not identity either?
Correct. It is not. This hints that, probably something
is wrong with (fmap f), (fmap g), or their composition.
Let us check.
If one symbolically evaluates ex4 and ex6 (I did it with pen and paper and I am too lazy to type it here), one can notice that:
ex4 boils down to evaluating Data.Set.map (f . g) (Data.Set.fromList xs)
while
ex6 boils down to evaluating Data.Set.map f (Data.Set.map g (Data.Set.fromList xs))
(BTW, is not it great that Data.Set.Monad managed to fuse f and g for ex4)
So for your Eq and Ord instances and f and g functions the following does not hold for the underlaying Data.Set library:
map f . map g = map (f . g)
So putting identity functions like (fromList . toList) or (empty `union`) prevents the fusion and allows one to observe that (map f . map g) is not the same as (map (f . g)) for the underlaying Data.Set and for your particular choice of f and g.
This violates the second functor law. So does this mean that I (and few other people [1, 2]) should not have attempted to turn Set into a functor? I do not think so.
You didn't turn Data.Set.Set into a Functor. You turned Data.Set.Monad.Set into a Functor, and (modulo bottom) it indeed is one. It pretty much is one regardless of what the "underlying" type is. Rather little is required from the underlying type. That's why this technique works generally. As an exercise, you can see what laws are actually required from Data.Set.Set for this to be a monad.
(BTW, what distinguishes the approach used in Data.Set.Monad from other efforts is that it does not require changes in the definitions of standard Functor and Monad type classes).
In my opinion, the problem here lies with the Eq and Ord instances of the X data type AND with the function f that can tell apart two values that are proclaimed to be equal by the Eq instance. As I said, such instances and accompanied functions not only break useful properties of Data.Set.Monad, but also useful properties of the library that underlies it (i.e., the original Data.Set), and possibly many other standard libraries and functions (see [4]).
It certainly is the case that a lot of the laws are conditional on the satisfaction of laws on the used instances.
Putting the functor laws aside, there are even more fundamental set-oriented properties that can be broken by Ord instances that are not law abiding. See the following GHCi session taken from [3]:
Prelude> import Data.Set Prelude Data.Set> let x = fromList [0, -1, 0/0, -5, -6, -3] :: Set Float Prelude Data.Set> member 0 x True Prelude Data.Set> let x' = insert (0/0) x Prelude Data.Set> member 0 x' False
Is this because Data.Set is broken? No, in my opinion, this is because the Ord instance of Float does not satisfy the Ord laws (about total order).
Actually, it's worse than that. (==) on Float doesn't even satisfy being an equivalence relation.
To summarise, in my opinion the problem here lies with the fact that the Eq and Ord instances for the X data type proclaim two values as equal when it can be easily observed that they are not equal (in this case with the function f). Note, that the functions of Data.Set.Monad and Data.Set rely on your Eq and Ord instances.
Of course, there would be nothing wrong with the Eq and Ord instances for X as long as you would export it as abstract data type and you would not export functions that can observe two values to be different when your Eq instance says they are equal. One could also clearly mark functions that can tell two equal (in the sense of Eq) values apart as unsafe (perhaps maybe even in the SafeHaskell sense).
The opinions may differ, but I think the raised issue is more about what the specifications of Eq and Ord should be and what the client code that uses Eq and Ord instance can assume. There has been a long discussion on this topic [4].
Again the opinions may differ, but I think in this case the problem lies with the Eq and Ord instances for the X data type and not with the Data.Set.Monad and (its underlaying) Data.Set libraries.
Derek and Dan, thanks for the interesting example. I would be interested to hear whether you have an example that could potentially break set-oriented, and/or monad and functor laws for element types where Eq instance respects observational equality and Ord instance respects total order.
Cheers, George
[1] http://www.haskell.org/pipermail/haskell-cafe/2010-July/080977.html [2] http://haskell.1045720.n5.nabble.com/Functor-instance-for-Set-td5525789.html [3] http://stackoverflow.com/questions/6399648/what-happens-to-you-if-you-break-... [4] http://www.haskell.org/pipermail/haskell-cafe/2008-March/thread.html