RE: the MPTC Dilemma (please solve)

I would not say that it's well-specified, no. What we do know is this: GHC may loop if you use -fallow-undecidable-instances -- but if it terminates, the program is well typed and should not "go wrong" at runtime. S | -----Original Message----- | From: haskell-prime-bounces@haskell.org [mailto:haskell-prime-bounces@haskell.org] On Behalf Of | Ashley Yakeley | Sent: 21 February 2006 20:13 | To: haskell-prime@haskell.org | Subject: Re: the MPTC Dilemma (please solve) | | Simon Peyton-Jones wrote: | | > Of course -fallow-undecidable-instances still lifts all restrictions, | > and then all bets are off. | | Is the behaviour of GHC with -fallow-undecidable-instances (and | -fcontext-stack) well-understood and specifiable? It is a very useful | option, as you can join (meet?) classes like this: | | class P a | class Q a | | class (P a,Q a) => PQ a | instance (P a,Q a) => PQ a | | -- | Ashley Yakeley | | _______________________________________________ | Haskell-prime mailing list | Haskell-prime@haskell.org | http://haskell.org/mailman/listinfo/haskell-prime

| Is the behaviour of GHC with -fallow-undecidable-instances (and | -fcontext-stack) well-understood and specifiable?
I would not say that it's well-specified, no.
to start improving that situation, could we please have a task ticket for "document differences in unconstrained instance handling", then ask everyone to attach source examples showing such differences? [can guests attach code to task tickets?] the hope being, of course, that implementations nominally providing the same feature will eventually converge on providing the same interpretation of all programs using that feature. an example of the current oddities (at least they seem odd to me;): both hugs and ghc claim to resolve overlapping instances in favour of the most specific instance declaration. both claim that functional dependencies uniquely determine the range types from the domain types. but they do not agree on which programs to accept when we try to combine best-match with FDs. I've already given an example where ghc allows me to define record selection, while hugs complains that the overlap violates the FDs. I reported that as a hugs bug, because I think the best-match resolution of overlaps should ensure that the FD violation cannot happen, so the code should be valid. there are different ways to interpret FDs (something to check, or something to use), but it seemed that ghc was doing the right thing there. thread start: http://www.haskell.org//pipermail/hugs-bugs/2006-February/001560.html but after further experimentation, I'm not longer sure that ghc is doing the right thing for the right reasons. here is a tiny example of one of the disagreements: {- ghc ok hugs "Instance is more general than a dependency allows" -} class C a b | a -> b instance C a b so what is ghc doing there? is it going to guarantee that b will always be uniquely determined? {- ghc ok hugs "Instance is more general than a dependency allows" -} class C b | -> b where c :: b instance C b where c = error "b" safely m = m `CE.catch` print main = do safely $ print $ (c::Int) safely $ print $ (c::Bool) safely $ print [id,c] oh, apparently not. unless b is uniquely determined to be universally quantified, and the instantiations happen "after" instance selection. {- ghc ok hugs "Instance is more general than a dependency allows" -} class C b | -> b where c :: b instance C b where c = error "b" class D a where d :: a -> String instance C a => D a where d a = "a" instance C Int => D Int where d a = "Int" -- try at ghci prompt: (d 1,d (1::Int)) -- gives: ("a","Int") so that parameter of C isn't all that unique. at least not long enough to influence instance selection in D. comments? cheers, claus

continuing the list of odd cases in type class handling, here is a small example where overlap resolution is not taken into account when looking at FDs. context: both hugs and ghc resolve instance overlaps in favour of the most specific declaration. so the following works in both ghc (darcs, 25022006) and hugs (minhugs 20051031): {- both ghc and hugs accept without 3rd par and FD neither accepts with 3rd par and FD -} data T = T deriving Show data F = F deriving Show class TEQ a b {- tBool | a b -> tBool -} where teq :: a -> b -> Bool instance TEQ a a {- T -} where teq _ _ = True instance TEQ a b {- F -} where teq _ _ = False test = print (teq True 'c', teq True False) and both print "(False,True)", so best-fit overlap resolution entirely determines which instance to choose! now, if we uncomment the third class parameter, they both complain (as expected, though about different things). however, if we also uncomment the functional dependency, to fix the ambiguity wrt the 3rd parameter, both complain that this FD is in conflict/inconsistent with the instances! as far as i understand it, the potential inconsistency should have been eliminated by the best-fit overlap resolution, so I submit this is a bug (and unlike the earlier example I submitted to hugs-bugs, this fails with both hugs and ghc; and it is less open to alternative interpretations, I hope). cheers, claus

[I suggest to keep follow-on discussions to the haskell prime list, to avoid further copies]
continuing the list of odd cases in type class handling, here is a small example where overlap resolution is not taken into account when looking at FDs.
actually, one needs to take both into account to arrive at the interpretation I favour: - variables in the range of an FD can never influence instance selection if the variables in the domain of that FD are given (for, if they did, there'd be two instances with different range types for the same domain types -> FD violation) - in other words, FDs not only tell us that some type relations are functional, they can be seen as roughly similar to what is called mode declarations in logic programming: they tell us with which input/output combinations a type relation may be used - for each FD a given constraint is subject to, the range types should be ignored during instance selection: the domain types of the FD constitute the inputs and are sufficient to compute unique FD range types as outputs of the type relation. the computed range types may then be compared with the range types given in the original constraint to determine whether the constraint can be fulfilled or not if we apply these ideas to the example I gave, instance resolution for the "3 parameter, with FD"-version proceeds exactly as it would for the "2 parameter"-version, using best-fit overlap resolution to determine a unique 3rd parameter (range of FD) from the first two (domain of FD). this would seem similar to what we do at the function level: f a b | let res = True, a==b = res f a b | let res = False, otherwise = res here, the implementation does not complain that f isn't functional because we could instantiate {a=1,b=1,res=False} as well as {a=1,b=1,res=True} - instead it treats res as output only, a and b as input, and lets first-fit pattern matching resolve the overlap in the patterns. these rules describe a function because we say it does. whereas, at the type class level, the implementations say "okay, you said this is a type function, with two input and one output parameters, but if I ignore that for the moment, then overlap resolution doesn't kick in because of the different 3rd input parameter, and now there are two instances where there should only be one {TEQ a a T, TEQ a a F}; and if I now recall that this should be a type function, I have to shout 'foul!'". am I the only one who thinks this does not makes sense?-) cheers, claus
{- both ghc and hugs accept without 3rd par and FD neither accepts with 3rd par and FD -}
data T = T deriving Show data F = F deriving Show
class TEQ a b {- tBool | a b -> tBool -} where teq :: a -> b -> Bool instance TEQ a a {- T -} where teq _ _ = True instance TEQ a b {- F -} where teq _ _ = False
test = print (teq True 'c', teq True False)

It's very hard to follow you here. Can you formalize your proposal below such that we can verify some formal results (e.g. tractable inference, coherence etc). Why not be "macho" and use a formal framework in which this all can be expressed? In one of your previous emails you said:
...make type classes a syntactic representation of their semantic domain.
What do you mean? Explaining type classes in terms of type classes? This can't work. In your previous email, you mentioned the Theory of Qualified Types (QT) and CHRs as formal type class frameworks but you seem to be reluctant to use either of these frameworks. Why? To a large extent, CHRs will do the job. See FDs and my reply to your other email regarding disequality constraints. In case, people don't know CHRs. Here's the type class to CHR translation. 1) For each class C => TC t we generate a propagation CHR rule TC t ==> C Ex: class Eq a => Ord a translates to rule Ord a ==> Eq a 2) For each instance C => TC t we generate a simplification CHR rule TC t <==> C Ex: instance Eq a => Eq [a] translates to rule Eq [a] <==> Eq a Roughly, the CHR semantics is as follows. Propagation rules add (propagate) the right-hand side if we find a match to the left-hand side. Simplification rules reduce (simplifify) the left-hand side by the right-hand side. Example: rule Ord a ==> Eq a -- (R1) rule Eq [a] <==> Eq a -- (R2) Ord a -->R1 Ord a, Eq a Eq [Int] -->R2 Eq Int This shows that CHRs are *very* close in terms of syntax and semantics of type classes. BTW, 1) shows that the superclass arrow should indeed be the other way around and 2) shows that instances do NOT correspond to Prolog-style Horn clauses. In fact, I don't really care if you're using CHRs, QT or whatever. As long as there's a (at least semi-) formal description. Otherwise, we can't start a discussion because we don't know what we're talking about. Martin Claus Reinke writes:
[I suggest to keep follow-on discussions to the haskell prime list, to avoid further copies]
continuing the list of odd cases in type class handling, here is a small example where overlap resolution is not taken into account when looking at FDs.
actually, one needs to take both into account to arrive at the interpretation I favour:
- variables in the range of an FD can never influence instance selection if the variables in the domain of that FD are given (for, if they did, there'd be two instances with different range types for the same domain types -> FD violation)
- in other words, FDs not only tell us that some type relations are functional, they can be seen as roughly similar to what is called mode declarations in logic programming: they tell us with which input/output combinations a type relation may be used
- for each FD a given constraint is subject to, the range types should be ignored during instance selection: the domain types of the FD constitute the inputs and are sufficient to compute unique FD range types as outputs of the type relation. the computed range types may then be compared with the range types given in the original constraint to determine whether the constraint can be fulfilled or not
if we apply these ideas to the example I gave, instance resolution for the "3 parameter, with FD"-version proceeds exactly as it would for the "2 parameter"-version, using best-fit overlap resolution to determine a unique 3rd parameter (range of FD) from the first two (domain of FD).
this would seem similar to what we do at the function level:
f a b | let res = True, a==b = res f a b | let res = False, otherwise = res
here, the implementation does not complain that f isn't functional because we could instantiate {a=1,b=1,res=False} as well as {a=1,b=1,res=True} - instead it treats res as output only, a and b as input, and lets first-fit pattern matching resolve the overlap in the patterns. these rules describe a function because we say it does.
whereas, at the type class level, the implementations say "okay, you said this is a type function, with two input and one output parameters, but if I ignore that for the moment, then overlap resolution doesn't kick in because of the different 3rd input parameter, and now there are two instances where there should only be one {TEQ a a T, TEQ a a F}; and if I now recall that this should be a type function, I have to shout 'foul!'".
am I the only one who thinks this does not makes sense?-)
cheers, claus
{- both ghc and hugs accept without 3rd par and FD neither accepts with 3rd par and FD -}
data T = T deriving Show data F = F deriving Show
class TEQ a b {- tBool | a b -> tBool -} where teq :: a -> b -> Bool instance TEQ a a {- T -} where teq _ _ = True instance TEQ a b {- F -} where teq _ _ = False
test = print (teq True 'c', teq True False)
_______________________________________________ Haskell-prime mailing list Haskell-prime@haskell.org http://haskell.org/mailman/listinfo/haskell-prime

It's very hard to follow you here.
is it really? perhaps you're not trying hard enough:
- for each FD a given constraint is subject to, the range types should be ignored during instance selection: the domain types of the FD constitute the inputs and are sufficient to compute unique FD range types as outputs of the type relation. the computed range types may then be compared with the range types given in the original constraint to determine whether the constraint can be fulfilled or not
is there a simpler way to put the idea? whatever way you formalize (n+1)-parameter classes "C t1..tn t(n+1)" with an FD "t1..tn -> t(n+1)", my suggestion is to formalize it in almost the same way, but treating parameter (n+1) differently. instead of checking checking for validity of instances wrt constraints of the form "C r1..rn r(n+1)" (where ri are the concrete types arising), check that any constraint "C r1..rn x" (where x is a free, but instantiable variable) will yield zero or one instantiations for x. then, when you have a constraint "C r1..rn r(n+1)", use something like this instead: "exists x. (C r1..rn x && x==r(n+1)" there, that's just as long, and not much clearer than saying that t1.. tn should be treated as inputs to and t(n+1) as an output of the inference process.
Can you formalize your proposal below such that we can verify some formal results (e.g. tractable inference, coherence etc).
I'd like to, but you're not giving me anything to work with. for instance, your "the TC to CHR translation" carefully omits all the features currently under discussion here.
Why not be "macho" and use a formal framework in which this all can be expressed?
because I'm the humble programmer and you're the mighty formalist?-) I don't find this style of discussion constructive. I'm interested in language design, so actually I do want to look at both theory and practice, but I'm most interested in using a language I can think in and thinking in a language I can use. for the present context, that is Haskell. if you prefer to think in CHRs, that's fine, but why do you expect me to translate for you?
...make type classes a syntactic representation of their semantic domain.
What do you mean? Explaining type classes in terms of type classes? This can't work.
that is neither what I meant, not what I said. semantics maps from some existing syntactic domain (here Haskell type classes) to some semantic domain (eg, QTs or CHRs), in order to help understand the former in terms of the latter and its nice theory. semantics-based language design maps from some semantic domain to some to be constructed syntactic domain, in order to help expressing the former in terms of the latter, preferably inheriting the elegance of the formers nice theory. QTs explain type classes, but QTs are more elegant, simple and expressive than current type class practice. most of Mark's extentions to type classes were based on this difference, as were most of his advanced programming techniques. I try to read current Haskell type classes as an immature, to be improved, representation of type predicate entailment, and where there are discrepancies, I prefer not to complicate the semantics, but to simplify the syntax. so, when I talk about "=>" in haskell, I think about predicate entailment in QT. and usually, I talk about cases where I can't map current practice to what predicate entailment would lead me to expect.
In your previous email, you mentioned the Theory of Qualified Types (QT) and CHRs as formal type class frameworks but you seem to be reluctant to use either of these frameworks. Why?
am I? give me a version of CHRs that accurately covers current type class practice, and I'll try to express my modifications. but you won't be able to do that until we have pinned down what current practice is. which is what I'm trying to do here..
In case, people don't know CHRs. Here's the type class to CHR translation.
there is no "the" translation. and this one omits all the interesting features under discussion here (MPTCs with flexible instances, FDs, overlap resolution), and worse, includes optimizations based on those omissions.
1) For each class C => TC t we generate a propagation CHR rule TC t ==> C Ex: class Eq a => Ord a translates to rule Ord a ==> Eq a
note that this is a case of encoding current syntax-level practice (of early, eager checking, then using) in the semantics. as I've argued in previous email discussing this, it seems quite possible to interpret implication as implication, if you are willing to accept more programs as valid. I think that would simplify the language.
2) For each instance C => TC t we generate a simplification CHR rule TC t <==> C Ex: instance Eq a => Eq [a] translates to rule Eq [a] <==> Eq a
this should be: for each instance C => TC t rule TC t <= C you can combine those to for all known instances Ci => TC ti rule TC x <= .. || (x==ti && Ci) || ... but to get the reverse implication, you need a closed-world assumption that you don't have for current Haskell (unless you abandon separate compilation, perhaps, or restrict instance syntax to something simple that doesn't have much to do with the issues we're talking about here).
Roughly, the CHR semantics is as follows. Propagation rules add (propagate) the right-hand side if we find a match to the left-hand side. Simplification rules reduce (simplifify) the left-hand side by the right-hand side.
so you're only actually using simplification rules one way.
This shows that CHRs are *very* close in terms of syntax and semantics of type classes.
no, it shows that (a simple form of) type classes can be encoded naturally (without complex rewriting) in a subset of CHR. the same can be said about QT. but CHR/=QT, so if we tried to devise Haskell variants that would allow us to represent all of QT, or all of CHR, those two variants would probably look different, wouldn't they?
BTW, 1) shows that the superclass arrow should indeed be the other way around and 2) shows that instances do NOT correspond to Prolog-style Horn clauses.
nonsense. 1) shows that current Haskell does not interpret the superclass arrow as the implication it should be. turning the implication around would still leave you with two interpretations of the same symbol in the same language (implication in classes being checked eagerly, implication in instances being checked lazily). 2) shows nothing of the sort. encoding a closed world assumption by collecting the rhss is inappropriate here, but standard practice in logic programming interpretation.
In fact, I don't really care if you're using CHRs, QT or whatever. As long as there's a (at least semi-) formal description. Otherwise, we can't start a discussion because we don't know what we're talking about.
being formal alone is neither necessary nor sufficient to know what we are talking about. the above translation is useless for the present problem. and if I look at "Theory of Overloading", it claims that FDs "just extend the proof requirements for an instance". which seems to miss the point that we want to draw conclusions from FDs: they are not just something to check, but something to use, and the present problem is all about when and how to use them. but let's look at the rules there: for each class SC scs => C cs | ds -> rs add rules rule C cs, C cs', ds==ds' => rs==rs' where scs: superclass constraint variables cs: class variables ds: variables in domain of FD rs: variables in range of FD the paper then goes on to introduce per-instance rules for FDs, which to me look like consequences of (a) the simplification CHR, which introduces a fact "C ts" for each "instance C ts", combined with (b) the class FD-CHR, which should match those facts with any constraints "C ts'" arising, to add equality constraints implied by the FD. so I don't see why you need those extra per-instance rules? what am I missing? and how would you like to reflect the best-fit overlap resolution in CHRs, without simply restating it? once we can agree on some form of TC2CHR translation that covers the current practice in unconstrained type class programming, with FDs and with best-fit overlap resolution, we can start to think about the modifications. cheers, claus
participants (3)
-
Claus Reinke
-
Martin Sulzmann
-
Simon Peyton-Jones