Some clarity please! (was Re: [Haskell-cafe] Re: (flawed?) benchmark : sort)

Hello All,

I'm top posting because I'm getting bored and frustrated with this thread and I don't want to respond in detail to everything Aaron has said below.

Aaron: Where are you getting this equivalence stuff from? Searching the report for the word "equivalence", the only remotely relevant section seems to be in para. 17.6:

  "When the "By" function replaces an Eq context by a binary predicate, the predicate is assumed to define an equivalence"

which is fair enough, but this is talking about the argument of "By" functions. The Haskell wiki refers me to Wikipedia, which contains the words "In Haskell, a class Eq intended to contain types that admit equality would be declared in the following way" http://en.wikipedia.org/wiki/Type_class Not that this is necessarily authoritative, but it seems to be contradicting some people's interpretation. Also, on page 60 of the report I find the words "Even though the equality is taken at the list type..."

So I don't know if all this really is the correct reading of the report, but if so I would like to appeal to the movers and shakers in the language definition to please decide exactly what the proper interpretation of the standard Eq and Ord "class laws" is, and in the next version of the report give an explanation of these in plain English, using terms that people without a Mathematics degree are likely to understand. Aaron's interpretation may indeed be very correct and precise, but I fear in reality this is just going to be incomprehensible to many programmers and a terrible source of bugs in "real world" code. I cite previous "left biasing" bugs in Data.Map as evidence.

I would ask for any correct Eq instance something like the law:

  (x==y) = True implies x=y (and vice versa)

which implies f x = f y for all definable f, which implies (f x == f y) = True (for expression types which are instances of Eq). This pretty much requires structural equality for concrete types. For abstract types you can do something different, provided functions which can give different answers for two "equal" arguments are not exposed. Anything else is just wrong (according to the language specification, even if it can be right in some mathematical sense). Before anyone jumps down my throat, I remind you that this is a request, not an assertion! :-)

On the subject of ambiguity, bugs and maxima, I see we have in Data.List:

  -- | 'maximum' returns the maximum value from a list,
  -- which must be non-empty, finite, and of an ordered type.
  -- It is a special case of 'Data.List.maximumBy', which allows the
  -- programmer to supply their own comparison function.
  maximum :: (Ord a) => [a] -> a
  maximum [] = errorEmptyList "maximum"
  maximum xs = foldl1 max xs

  -- | The 'maximumBy' function takes a comparison function and a list
  -- and returns the greatest element of the list by the comparison function.
  -- The list must be finite and non-empty.
  maximumBy :: (a -> a -> Ordering) -> [a] -> a
  maximumBy _ [] = error "List.maximumBy: empty list"
  maximumBy cmp xs = foldl1 max xs
    where
      max x y = case cmp x y of
                  GT -> x
                  _  -> y

So I believe I'm correct in saying that maximumBy returns the last of several possible maximum elements of the list. This obviously needs specifying in the Haddock. Because the maximumBy documentation is ambiguous in this respect, so is the overloaded maximum documentation. At least you can't figure it out from the Haddock.
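(A small sketch, not from the original message, checking the "last maximum wins" reading against a local copy of the foldl1-based definition above; the pair data is made up for illustration.)

  import Data.Ord (comparing)

  maximumBy' :: (a -> a -> Ordering) -> [a] -> a
  maximumBy' cmp = foldl1 pick
    where
      pick x y = case cmp x y of
                   GT -> x
                   _  -> y    -- on EQ the later element replaces the earlier one

  lastBiased :: (Int, Char)
  lastBiased = maximumBy' (comparing fst) [(1,'a'), (2,'b'), (2,'c')]
  -- evaluates to (2,'c'): of the two maxima (2,'b') and (2,'c'),
  -- the one appearing last in the list is returned.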
Despite this ambiguity, the statement that maximum is a special case of maximumBy is true *provided* max in the Ord instance is defined the way Aaron says it should be: (x==y = True) implies max x y = y. But it could be made unconditionally true using:

  maximum :: Ord a => [a] -> a
  maximum [] = error "List.maximum: empty list"
  maximum xs = maximumBy compare xs

AFAICT, the original report authors either did not perceive an ambiguity in maximum, or just failed to notice and resolve it. If there is no ambiguity this could be for two reasons:

1 - It doesn't matter which maximum is returned, because (x==y) = True implies x=y.

2 - It does matter, and the result is guaranteed to be the last maximum in all cases, because (x==y) = True implies max x y = y.

But without either of the above, it is unsafe to assume maximum = maximumBy compare.

Regarding the alleged "max law", this too is not mentioned in the Haddock for the Ord class, nor is it a "law" AFAICT from reading the report. The report (page 83) just says that the default methods are "reasonable", but presumably not mandatory in any semantic sense. This interpretation also seems to be the intent of this, from the second para. of Chapter 8:

  "The default method definitions, given with class declarations, constitute a specification only of the default method. They do not constitute a specification of the meaning of the method in all instances."

I wouldn't dispute that the default definition is reasonable, but it's certainly not clear to me from the report that it's something I can safely assume for all Ord instances. In fact, AFAICS the report is quite clearly telling me *not* to assume this. But I have to assume *something* for maximum to make sense, so I guess that must be: (x==y) = True implies x=y. IOW, "I have no idea if it's the first or last maximum that is returned, but who cares?"

Again, the report doesn't make it clear that the (==) law above holds (at least not on page 82). But in the absence of any explicit statement to the contrary, I think most programmers would assume that it does. I think this is quite reasonable, and I have no intention of changing my programming habits to cope with weird instances for which (x == y) = True does not imply x=y, or for which max x y is not safely interchangeable with max y x.

I'm not saying some people are not right to want classes with more mathematically inspired "laws", but I see nothing in the report to indicate that Eq/Ord are those classes, and consequently that the "naive" programmer's interpretation of (==) is incorrect. Rather the contrary, in fact.

Regards
--
Adrian Hey

Aaron Denney wrote:
On 2008-03-12, Adrian Hey wrote:

Aaron Denney wrote:

On 2008-03-11, Adrian Hey wrote:

Having tried this approach myself too (with the clone) I can confirm that *this way lies madness*, so in future I will not be making any effort to define or respect "sane", unambiguous and stable behaviour for "insane" Eq/Ord instances for any lib I produce or hack on. Instead I will be aiming for correctness and optimal efficiency on the assumption that Eq and Ord instances are sensible.

Good. But sensible only means that the Eq and Ord instances agree, not that x == y => f x == f y.

So can I assume that max x y = max y x?
No. You can, however, assume that max x y == max y x. (Okay, this fails on Doubles, because of NaN. I'll agree that the Eq and Ord instances for Double are not sane.)
If not, how can I tell if I've made the correct choice of argument order.
When calling, or when defining max?
It depends on what types you're using, and which equivalence and ordering relations are being used.
When calling, and when it might matter which representative of an equivalence class comes back out (such as in sorts) you have no choice but to look at the documentation or implementation of max.
The Haskell report guarantees that x == y => max x y = y (and hence max y x = x), and the opposite choice for min. This is to ensure that (min x y, max x y) = (x,y) or (y,x). IOW, the report notices that choice of representatives for equivalence classes matters in some circumstances, and makes it easy to do the right thing. This supports the reading that Eq a is not an absolute equality relation, but an equivalence relation.
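(A minimal sketch, not from the original message, of the distinctness-preserving behaviour this default gives; the Keyed wrapper type is made up for illustration.)

  -- An Ord instance that only inspects the key, with max/min left at the
  -- report defaults, so (min x y, max x y) always hands back both original
  -- values, just possibly swapped.
  newtype Keyed = Keyed (Int, String) deriving Show

  instance Eq Keyed where
    Keyed (a, _) == Keyed (b, _) = a == b

  instance Ord Keyed where
    compare (Keyed (a, _)) (Keyed (b, _)) = compare a b
    -- max/min use the report defaults:
    --   max x y | x <= y    = y
    --           | otherwise = x
    --   min x y | x <= y    = x
    --           | otherwise = y

  pairs :: ((Keyed, Keyed), (Keyed, Keyed))
  pairs = ((min a b, max a b), (min b a, max b a))
    -- ((Keyed (1,"a"), Keyed (1,"b")), (Keyed (1,"b"), Keyed (1,"a")))
    -- neither call loses an element, even though a == b is True
    where
      a = Keyed (1, "a")
      b = Keyed (1, "b")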
If I can't tell then I guess I have no alternative but document my arbitrary choice in the Haddock, and probably for the (sake of completeness) provide 2 or more alternative definitions of the "same" function which use a different argument order.
When defining max, yes, you must take care to make sure it useable for cases when Eq is an equivalence relation, rather than equality.
If you're writing library code, then it won't generally know whether Eq means true equality rather than equivalence. If this would let you optimize things, you need some other way to communicate this.
The common typeclasses are for generic, parameterizable polymorphism. Equivalence is a more generally useful notion than equality, so that's what I want captured by the Eq typeclass.
And no, an overloaded sort doesn't belong in Ord, either. If the implementation is truly dependent on the types in non-trivial, non-substitutable ways (i.e. beyond a substitution of what <= means), then they should be /different/ algorithms.
It would be possible to write an "Equal a" typeclass, which does guarantee actual observable equality (but has no methods). Then you can write one equalSort (or whatever) of type equalSort :: (Eq a, Ord a, Equal a) => [a] -> [a] that will work on any type willing to guarantee this, but rightly fail on types that only define an equivalence relation.
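(A minimal sketch of that idea, using the names from the paragraph above; the instances and the use of Data.List.sort are illustrative only, and the Eq constraint is carried by the superclass rather than written out.)

  import Data.List (sort)

  -- Marker class: types whose (==) is genuine observational equality,
  -- not merely an equivalence, declare an empty instance.
  class Eq a => Equal a

  instance Equal Int
  instance Equal Char

  -- With that promise, which representative of an "equal" pair survives is
  -- unobservable, so any correct sorting algorithm will do.
  equalSort :: (Ord a, Equal a) => [a] -> [a]
  equalSort = sort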
A stable sort is more generally useful than an unstable one. It's composable for radix sorting on compound structures, etc. Hence we want to keep this guarantee.
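(A small sketch, not from the original message, of the composability being referred to, assuming Data.List.sortBy's documented stability.)

  import Data.List (sortBy)
  import Data.Ord (comparing)

  -- Because sortBy is stable, sorting by the secondary key first and the
  -- primary key second yields a lexicographic sort on (primary, secondary).
  lexSort :: [(Int, Int)] -> [(Int, Int)]
  lexSort = sortBy (comparing fst) . sortBy (comparing snd)

  -- lexSort [(2,1),(1,2),(2,0),(1,1)] == [(1,1),(1,2),(2,0),(2,1)]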

On 2008-03-13, Adrian Hey wrote:
Hello All,
I'm top posting because I'm getting bored and frustrated with this thread and I don't want to respond in detail to everything Aaron has said below.
Aaron: Where are you getting this equivalence stuff from?
Not from the prose in the report, but from what the code in the report seems designed to support. There are several places where the code seems to take a (small -- much usually isn't needed) bit of care to support equivalencies.
So I don't know if all this really is the correct reading of the report, but if so I would like to appeal to the movers and shakers in the language definition to please decide exactly what the proper interpretation of the standard Eq and Ord "class laws" is, and in the next version of the report give an explanation of these in plain English, using terms that people without a Mathematics degree are likely to understand.
I agree that the prose of the report should be clarified. Luke Palmer's message in haskell-cafe captures why I think that "Eq means equivalence, not strict observational equality" is a more generally useful notion. It's harder to guarantee observational equality, thus harder to use code that requires it of your types. Almost all the time (in my experience) equivalencies are all that's generically needed. My comments on this particular message are below.
Because the maximumBy documentation is ambiguous in this respect, so is the overloaded maximum documentation. At least you can't figure it out from the Haddock.
True.
Despite this ambiguity, the statement that maximum is a special case of maximumBy is true *provided* max in the Ord instance is defined the way Aaron says it should be: (x==y = True) implies max x y = y.
Well, that's the way the report specifies max's default definition. I'd actually favor making max not an instance function at all, and instead have max and min be external functions.
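(A minimal sketch of that alternative; the names maxDefault/minDefault are hypothetical, chosen only to avoid clashing with the Prelude.)

  -- Plain functions outside the class: no instance can override them,
  -- so the "biasing" behaviour is fixed once and for all.
  maxDefault, minDefault :: Ord a => a -> a -> a
  maxDefault x y = if x <= y then y else x
  minDefault x y = if x <= y then x else y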
AFAICT, the original report authors either did not perceive an ambiguity in maximum, or just failed to notice and resolve it. If there is no ambiguity this could be for 2 reasons.
1 - It doesn't matter which maximum is returned because: (x==y) = True implies x=y
2 - It does matter, and the result is guaranteed to be the last maximum in all cases because: (x==y) = True implies max x y = y
The second holds, so long as max isn't redefined. I'd be rather surprised at any redefinition of max, as it's not part of any minimal complete definition for Ord, and I can't think of an actual optimization case for it.
Regarding the alleged "max law", this too is not mentioned in the Haddock for the Ord class, nor is it a "law" AFAICT from reading the report. The report (page 83) just says that the default methods are "reasonable", but presumably not mandatory in any semantic sense. This interpretation also seems to be the intent of this, from the second para. of Chapter 8:
Agreed. Elevating this to a "law" in my previous message was a mistake on my part. I still think this default in combination with the comment is very suggestive that (min x y, max x y) should preserve distinctness of elements. If you're unwilling to count on this holding for arbitrary Ord instances, why are you willing to count on (/=) and (==) returning opposite answers for arbitrary Eq instances?
I wouldn't dispute that the default definition is reasonable, but it's certainly not clear to me from the report that it's something that I can safely assume for all Ord instances. In fact, AFAICS the report is quite clearly telling me *not* to assume this. But I have to assume *something* for maximum to make sense, so I guess that must be: (x==y) = True implies x=y. IOW, "I have no idea if it's the first or last maximum that is returned, but who cares?"
Well, you have to assume something for maximum to do what it says it does, which isn't quite the same thing as "making sense"...
I'm not saying some people are not right to want classes with more mathematically inspired "laws", but I see nothing in the report to indicate to me that Eq/Ord are those classes, and consequently that the "naive" programmer's interpretation of (==) is incorrect. Rather the contrary, in fact.
It's not a question of more or less mathematically inspired, it's a question of more or less useful. Yes, it's slightly harder to write code that can handle any equivalence correctly. But code that only handles observational equality correctly is less reusable.

The compilers cannot and do not check whether the various laws are obeyed. They are purely "social" interfaces, in that the community of code writers determines the real meaning, and what can be depended on. The community absolutely should come to a consensus on what these meanings are and document them better than they are currently.

Currently, if you write code assuming a stricter meaning of Eq a, the consequences are: (a) it's easier for you to write code; (b) it's harder for others to interoperate with your code and use it. Generally, you're the one that gets to make this trade-off, because you're writing the code. Whether someone else uses your code, or others', or writes their own is then their own trade-off. Because, indeed, many many types inhabiting Eq do obey observational equality, the consequences of (b) may be minor.

With regards to Haskell 98, my best guess is that some of the committee members thought hard about the code so that Eq a would usually work for any equivalence, and others took it to mean observational equality and wrote prose with this understanding.

--
Aaron Denney
-><-

Adrian Hey wrote:
I would ask for any correct Eq instance something like the law: (x==y) = True implies x=y (and vice-versa)
The easiest counterexamples are sets (or finite maps) provided as an abstract data type (i.e., exported without access to the implementation, in particular constructors). Different kinds of balanced trees typically do not produce the same representation for the same set constructed by different construction expressions. Therefore, (==) on sets will be expected to produce equality of sets, which will only be an equivalence on set representations.
which implies (f x == f y) = True (for expression types which are instances of Eq).
This specifies that (==) is a congruence for f, and is in my opinion the right specification: (==) should be a congruence on all datatypes with respect to all purely definable functions. But at least nowadays people occasionally do export functions that allow access to representation details; for example Set.toList is not specified to be representation independent, and Set.showTree is outright specified to be implementation dependent.

  Prelude> :m + Data.Set
  Data.Set> showTree (fromList [1,2,3,4]) == showTree (fromList [4,3,2,1])
  False
  Data.Set> fromList [1,2,3,4] == fromList [4,3,2,1]
  True

I consider this as an argument to remove showTree from the interface of Data.Set, and to either specify toList to produce an ordered list (replacing toAscList), or to remove it from the interface as well. (mapMonotonic should of course be removed, too, or specified to fail (preferably in some MonadZero) if the precondition is violated, which should still be implementable in linear time.)
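(A hedged sketch of such a checked variant, using Maybe as a stand-in for MonadZero; the name mapMonotonicChecked is made up.)

  import qualified Data.Set as Set

  -- Apply f along the ascending element list and rebuild the set only if the
  -- images are still strictly ascending; otherwise fail. Still linear time,
  -- since fromDistinctAscList is O(n).
  mapMonotonicChecked :: Ord b => (a -> b) -> Set.Set a -> Maybe (Set.Set b)
  mapMonotonicChecked f s
    | strictlyAscending ys = Just (Set.fromDistinctAscList ys)
    | otherwise            = Nothing
    where
      ys = map f (Set.toAscList s)
      strictlyAscending xs = and (zipWith (<) xs (drop 1 xs))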
but if so I would like to appeal to the movers and shakers in the language definition to please decide exactly what the proper interpretation of the standard Eq and Ord "class laws" is, and in the next version of the report give an explanation of these
Strongly seconded, inserting ``precise'' before ``explanation'' ;-) (And I'd expect equivalences and congruences to be accessible on the basis of standard first-year math...) Wolfram

Hi folks

I'm late into this thread, so apologies if I'm being dim.

On 13 Mar 2008, at 16:17, kahl@cas.mcmaster.ca wrote:
Adrian Hey wrote:

I would ask for any correct Eq instance something like the law: (x==y) = True implies x=y (and vice-versa)
I wish I knew what = meant here. Did somebody say? I don't think it's totally obvious what equational propositions should mean. Nor do I think it's desirable to consider only those propositions expressible in QuickCheck.

If = is reflexive and distinguishes undefined from True, then

  x = y implies (x == y) = True

will be tricky to satisfy for quite a lot of types. What about undefined == undefined, or repeat 'x' == repeat 'x'? For some suitable (slightly subtle) definition of "finite", you might manage

  x finite and x = y implies (x == y) = True

One rather intensional definition of x = y might be "x and y have a common n-step reduct" with respect to some suitable operational semantics. I don't think this is what Adrian had in mind, but it certainly falls foul of Wolfram's objection.
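(A tiny illustration, not from the original message, of why infinite values make that law awkward.)

  -- (==) on two infinite lists can return False early, but can never return
  -- True: it would have to reach the end of both lists first.
  looping :: Bool
  looping = repeat 'x' == repeat 'x'   -- diverges

  finiteOk :: Bool
  finiteOk = take 5 (repeat 'x') == take 5 (repeat 'x')   -- True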
The easiest counterexamples are sets (or finite maps) provided as an abstract data type (i.e., exported without access to the implementation, in particular constructors).
Different kinds of balanced trees typically do not produce the same representation for the same set constructed by different construction expressions.
This suggests that we should seek to define x = y on a type by type basis, to mean "x and y support the same observations", for some suitable notion of observation, which might depend crucially on what operations are exported from the (notional or actual) module which defines the type. If so, it's clearly crucial to allow some observations which rely on waiting for ever, in order to avoid _|_-induced collapse. Something of the sort should allow this...
Therefore, (==) on sets will be expected to produce equality of sets, which will only be an equivalence on set representations.
...but this...
which implies (f x == f y) = True (for expression types which are instances of Eq).
This specifies that (==) is a congruence for f, and is in my opinion the right specification: (==) should be a congruence on all datatypes with respect to all purely definable functions.
...is more troublesome. Take f = repeat. Define f = f. I'd hope x == y = True would give us x = y, and that x == y would be defined if at least one of x and y is finite. That implies f x = f y, which should guarantee that f x == f y is not False.
But at least nowadays people occasionally do export functions that allow access to representation details,
[..]
I consider this as an argument to remove showTree from the interface of Data.Set, and to either specify toList to produce an ordered list (replacing toAscList), or to remove it from the interface as well.
Perhaps that's a little extreme, but I agree with the sentiment. How about designating such abstraction-breaking functions "nosy", in the same way that functions which might break purity are "unsafe"?
(mapMonotonic should of course be removed, too, or specified to fail (preferably in some MonadZero) if the precondition is violated, which should still be implementable in linear time.)
What's wrong with mapMonotonic that isn't wrong with, say, sortBy, or Eq instances for parametrized types?
but if so I would like to appeal to the movers and shakers in the language definition to please decide exactly what the proper interpretation of the standard Eq and Ord "class laws" is, and in the next version of the report give an explanation of these
Strongly seconded, inserting ``precise'' before ``explanation'' ;-)
(And I'd expect equivalences and congruences to be accessible on the basis of standard first-year math...)
Before we can talk about what == should return, can we settle what we mean by = ?

I think we need to be pragmatic about breaking the rules, given suitable documentation and maybe warnings. But we should at least aspire to some principles, which means we should try to know what we're talking about and to know what we're doing, even if we don't always do what we're talking about.

I'll shut up now. Potatoes to peel.

Conor

Conor McBride wrote:
(mapMonotonic should of course be removed, too, or specified to fail (preferably in some MonadZero) if the precondition is violated, which should still be implementable in linear time.)
What's wrong with mapMonotonic that isn't wrong with, say, sortBy, or Eq instances for parametrized types?
  Prelude> :m + Data.Set
  Prelude Data.Set> toAscList $ mapMonotonic (10 -) (fromList [1 .. 5])
  [9,8,7,6,5]
  Prelude Data.Set> 5 `member` mapMonotonic (10 -) (fromList [1 .. 5])
  False

Something's certainly wrong there!
Before we can talk about what == should return, can we settle what we mean by = ?
``='' is not in the Haskell interface! ;-) Therefore, I talked only about (==). The best way to include ``='' seems to be the semantic equality of P-logic [Harrison-Kieburtz-2005], which is quite a heavy calibre, and at least in that paper, classes are not yet included.

Wolfram

--------------------
@Article{Harrison-Kieburtz-2005,
  author   = {William L. Harrison and Richard B. Kieburtz},
  title    = {The logic of demand in {Haskell}},
  journal  = JFP,
  year     = 2005,
  volume   = 15,
  number   = 6,
  pages    = {837--891},
  abstract = {Haskell is a functional programming language whose evaluation is lazy by default. However, Haskell also provides pattern matching facilities which add a modicum of eagerness to its otherwise lazy default evaluation. This mixed or ``non-strict'' semantics can be quite difficult to reason with. This paper introduces a programming logic, P-logic, which neatly formalizes the mixed evaluation in Haskell pattern-matching as a logic, thereby simplifying the task of specifying and verifying Haskell programs. In P-logic, aspects of demand are reflected or represented within both the predicate language and its model theory, allowing for expressive and comprehensible program verification.}
}

Hi

On 13 Mar 2008, at 23:42, kahl@cas.mcmaster.ca wrote:
Conor McBride responded to my comment:

(mapMonotonic should of course be removed, too, or specified to fail (preferably in some MonadZero) if the precondition is violated, which should still be implementable in linear time.)
What's wrong with mapMonotonic that isn't wrong with, say, sortBy, or Eq instances for parametrized types?
  Prelude> :m + Data.Set
  Prelude Data.Set> toAscList $ mapMonotonic (10 -) (fromList [1 .. 5])
  [9,8,7,6,5]
  Prelude Data.Set> 5 `member` mapMonotonic (10 -) (fromList [1 .. 5])
  False
Something's certainly wrong there!
But nothing out of the ordinary: garbage in, garbage out. Happens all the time, even in Haskell. Why pick on mapMonotonic?

  Prelude Data.List> sortBy (\_ _ -> GT) [1,3,2,5,4]
  [4,5,2,3,1]
  Prelude Data.List> sortBy (\_ _ -> GT) [4,5,3,2,1]
  [1,2,3,5,4]

I guess there's a question of what we might call "toxic waste" --- junk values other than undefined. I think undefined is bad enough already. So the type system can't express the spec. I don't think we should be casual about that: we should be precise in documentation about the obligations which fall on the programmer. Some dirt is pragmatically necessary: we shouldn't pretend that it ain't so; we shouldn't pretend that dirt is clean.
Before we can talk about what == should return, can we settle what we mean by = ?
``='' is not in the Haskell interface! ;-)
No, but "is" is in the human interface!
Therefore, I talked only about (==).
Ah, but you talked about things. Which things? Is one of the things you talked about the same as (==)? the same as (flip (==))?
The best way to include ``='' seems to be the semantic equality of P-logic [Harrison-Kieburtz-2005], which is quite a heavy calibre, and at least in that paper, classes are not yet included.
I expect it's hard work. It's hard work in much better behaved systems. My point is that it's worth it, in order to facilitate more meaningful discussions. All the best Conor

Adrian Hey wrote:
I would ask for any correct Eq instance something like the law: (x==y) = True implies x=y (and vice-versa) which implies f x = f y for all definable f which implies (f x == f y) = True (for expression types which are instances of Eq). This pretty much requires structural equality for concrete types. For abstract types you can do something different provided functions which can give different answers for two "equal" arguments are not exposed.
How do you propose something like this to be specified in the language definition? The report doesn't (and shouldn't) know about abstract types. It only talks about things which are exported and things which are not. The distinction between implementation modules and client modules is made by the programmer, not by the language. So you can either require your law to hold everywhere, which IMO isn't a good idea, or you don't require it to hold. From the language definition point of view, I don't see any middle ground here.

Also, when you talk about definable functions, do you include things like I/O? What if I want to store things (such as a Set) on a disk? If the same abstract value can have multiple representations, do I have to convert them all to some canonical representation before writing them to a file? This might be rather slow and is, IMO, quite unnecessary.

From a more philosophical point of view, I'd say that the appropriate concept of equality depends a lot on the problem domain. Personally, I quite strongly disagree with restricting Eq instances in the way you propose. I have never found much use for strict structural equality (as opposed to domain-specific equality which may or may not coincide with the structural one).
On the subject of ambiguity, bugs and maxima, I see we have in Data.List
[...]
So I believe I'm correct in saying that maximumBy returns the last of several possible maximum elements of the list. This obviously needs specifying in the Haddock.
Because the maximumBy documentation is ambiguous in this respect, so is the overloaded maximum documentation. At least you can't figure it out from the Haddock.
Why not simply say that maximumBy returns some maximum element from the list but it's not specified which one? That's how I always understood the spec. Code which needs a particular maximum element can't use maximumBy but such code is rare. I don't see how this is ambiguous, it just leaves certain things unspecified which is perfectly ok. Roman

Hi

On 14 Mar 2008, at 03:48, Roman Leshchinskiy wrote:
Adrian Hey wrote:
I would ask for any correct Eq instance something like the law: (x==y) = True implies x=y (and vice-versa) which implies f x = f y for all definable f which implies (f x == f y) = True (for expression types which are instances of Eq). This pretty much requires structural equality for concrete types. For abstract types you can do something different provided functions which can give different answers for two "equal" arguments are not exposed.
How do you propose something like this to be specified in the language definition? The report doesn't (and shouldn't) know about abstract types.
Why not? Why shouldn't there be at least a standard convention, if not an abstype-like feature for establishing an abstraction barrier, and hence determine the appropriate observational equality for an abstract type?
So you can either require your law to hold everywhere, which IMO isn't a good idea, or you don't require it to hold. From the language definition point of view, I don't see any middle ground here.
Why not demand it in the definition, but allow "unsafe" leaks in practice? As usual. Why are you so determined that there's nothing principled to do here? People like to say "Haskell's easy to reason about". How much of a lie would you like that not to be?
Also, when you talk about definable functions, do you include things like I/O? What if I want to store things (such as a Set) on a disk? If the same abstract value can have multiple representations, do I have to convert them all to some canonical representation before writing them to a file?
Canonical representations are not necessary for observational congruence. Representation hiding is enough.
This might be rather slow and is, IMO, quite unnecessary.
From a more philosophical point of view, I'd say that the appropriate concept of equality depends a lot on the problem domain.
It's certainly true that different predicates may respect different equivalence relations. The equivalence relation you call equality should be the finest of those, with finer representational distinctions abstracted away. What that buys you is a class of predicates which are guaranteed to respect equality without further ado...
Personally, I quite strongly disagree with restricting Eq instances in the way you propose. I have never found much use for strict structural equality (as opposed to domain-specific equality which may or may not coincide with the structural one).
...which is how we use equality when we think. I certainly don't think "strict structural equality" should be compulsory. In fact, for Haskell's lazy data structures, you rather need lazy structural simulation if you want to explain why cycle "x" = cycle "xx".

What would be so wrong with establishing a convention for saying, at each given type:

(1) this is the propositional equivalence which we expect functions on this type to respect;
(2) here is an interface which respects that equivalence;
(3) here are some unsafe functions which break that equivalence: use them at your own risk?

Why is it pragmatically necessary to make reasoning difficult? I'm sure that wise folk out there have wise answers to that question which they don't consider to be an embarrassment. When representation-hiding is bliss, 'tis folly to be wise.

All the best

Conor
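(A tiny sketch, not from the original message, of the kind of observation under which cycle "x" and cycle "xx" are indistinguishable.)

  -- For every finite prefix length, the two cycles agree; written as a
  -- QuickCheck-style property, though here it is just a plain function.
  prop_cycleEq :: Int -> Bool
  prop_cycleEq n = take m (cycle "x") == take m (cycle "xx")
    where m = abs n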

Conor McBride wrote:
Hi
On 14 Mar 2008, at 03:48, Roman Leshchinskiy wrote:
Adrian Hey wrote:
I would ask for any correct Eq instance something like the law: (x==y) = True implies x=y (and vice-versa) which implies f x = f y for all definable f which implies (f x == f y) = True (for expression types which are instances of Eq). This pretty much requires structural equality for concrete types. For abstract types you can do something different provided functions which can give different answers for two "equal" arguments are not exposed.
How do you propose something like this to be specified in the language definition? The report doesn't (and shouldn't) know about abstract types.
Why not? Why shouldn't there be at least a standard convention, if not an abstype-like feature for establishing an abstraction barrier, and hence determine the appropriate observational equality for an abstract type?
Adrian's original question/proposal was about the language report. I'm only pointing out that all other considerations aside, it's not clear how to distinguish between the implementation part of the ADT and everything else in the report.
So you can either require your law to hold everywhere, which IMO isn't a good idea, or you don't require it to hold. From the language definition point of view, I don't see any middle ground here.
Why not demand it in the definition, but allow "unsafe" leaks in practice? As usual. Why are you so determined that there's nothing principled to do here? People like to say "Haskell's easy to reason about". How much of a lie would you like that not to be?
I'm not sure what you mean here. Should the report say something like "a valid Eq instance must ensure that x == y implies f x == f y for all f"? Probably not, since this requires structural equality which is not what you want for ADTs. Should it be "for all f which are not part of the implementation of the type"? That's a non-requirement if the report doesn't specify what the "implementation" is. So what should it say? "Unsafe leaks" are ok as long as they are rarely used. If you have to resort to unsafe leaks to define an ADT, then something is wrong.
Also, when you talk about definable functions, do you include things like I/O? What if I want to store things (such as a Set) on a disk? If the same abstract value can have multiple representations, do I have to convert them all to some canonical representation before writing them to a file?
Canonical representations are not necessary for observational congruence. Representation hiding is enough.
I beg to disagree. If the representation is stored on the disk, for instance, then it becomes observable, even if it's still hidden in the sense that you can't do anything useful with it other than read it back. Actually, we don't even need the disk. What about ADTs which implement Storable, for instance?
What would be so wrong with establishing a convention for saying, at each given type
(1) this is the propositional equivalence which we expect functions on this type to respect;
(2) here is an interface which respects that equivalence;
(3) here are some unsafe functions which break that equivalence: use them at your own risk
My (probably erroneous) understanding of the above is that you propose to call (==) "propositional equivalence" and to require that, for every type, we define what that means. To be honest, I don't quite see how this is different from saying that the meaning of (==) should be documented for every type, which I wholeheartedly agree with. But the "unsafe" bit really doesn't make sense to me.

As an example, consider the following data type:

  data Expr = Var String
            | Lam String Expr
            | App Expr Expr

The most natural notion of equality here is probably equality up to alpha conversion, and that's what I usually would expect (==) to mean. In fact, I would be quite surprised if (==) meant structural equality. Should I now consider the Show instance for this type somehow unsafe? I don't see why this makes sense. Most of the time I probably don't even want to make this type abstract. Are the constructors also unsafe? Why?

To summarise my views on this: an Eq instance should define a meaningful equivalence relation and be documented. Requiring anything beyond that just doesn't make sense to me.

Roman
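(For concreteness, a hedged sketch of what such an alpha-equivalence Eq instance could look like; it is not from Roman's message, and the helper alphaEq is made up.)

  data Expr = Var String | Lam String Expr | App Expr Expr   -- the type from above

  instance Eq Expr where
    e1 == e2 = alphaEq [] e1 e2

  -- Compare two terms under an environment pairing up the bound names seen
  -- so far, innermost binder first.
  alphaEq :: [(String, String)] -> Expr -> Expr -> Bool
  alphaEq env (Var x) (Var y) = go env
    where
      go []                = x == y                  -- both free: names must agree
      go ((a, b) : rest)
        | x == a || y == b = x == a && y == b        -- bound: must refer to the same binder
        | otherwise        = go rest
  alphaEq env (Lam x b1)  (Lam y b2)  = alphaEq ((x, y) : env) b1 b2
  alphaEq env (App f1 a1) (App f2 a2) = alphaEq env f1 f2 && alphaEq env a1 a2
  alphaEq _   _           _           = False

  -- e.g. Lam "a" (Var "a") == Lam "b" (Var "b")  evaluates to True
  --      Lam "a" (Var "c") == Lam "b" (Var "d")  evaluates to False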