Proposal to solve Haskell's MPTC dilemma

This message, sent a few days ago to haskell-cafe, presents, informally, a proposal to solve Haskell's MPTC (multi-parameter type class) dilemma. If this informal proposal turns out to be acceptable, we (I am a volunteer) can proceed and make a concrete proposal. The discussion in haskell-cafe started well, involving a discussion related to module fragility (changes to type-correctness after a change in imported modules) but it is now quiet. Since there may be members of this list not in haskel-cafe, and because I'd like to hear your opinions and discuss the proposal here, as well as encourage the proposal's implementation, I am sending the message to this list also. The proposal has been published in the SBLP'2009 proceedings and is available at www.dcc.ufmg.br/~camarao/CT/solution-to-mptc-dilemma.pdf The dilemma is well-known (see e.g. hackage.haskell.org/trac/haskell-prime/wiki/MultiParamTypeClassesDilemma). www.haskell.org/ghc/dist/current/docs/html/users_guide/type-class-extensions.html presents a solution to the termination problem related to the introduction of MPTCs in Haskell. Neither FDs (functional dependencies) nor any other mechanism like ATs (associated types) are needed in our proposal, in order to introduce MPTCs in Haskell; the only change is in the ambiguity rule. This is explained below. The termination problem is essentially ortogonal and can be dealt with, with minor changes, as described in the web page above. Let us review the ambiguity rules used in Haskell-98 and in GHC. Haskell-98 ambiguity rule, adequate for single parameter type classes, is: a type C => T is ambiguous iff there is a type variable that occurs in C (the "context", or constraint set) but not in T (the simple, unconstrained type). For example: forall a.(Show a, Read a)=>String is ambiguous, because "a" occurs in the constraints (Show a,Read a) but not in the simple type (String). In the context of MPTCs, this rule alone is not enough. Consider, for example (Example 1): class F a b where f:: a->b class O a where o:: a and k = f o:: (F a b,O a) => b Type forall a b. (F a b,O a) => b can be considered to be not ambiguos, since overloading resolution can be defined so that instantiation of "b" can "determine" that "a" should also be instantiated (as FD b|->a does), thus resolving the overloading. GHC, since at least version 6.8, makes significant progress towards a definition of ambiguity in the context of MPTCs; GHC 6.10.3 User’s Guide says (section 7.8.1.1): "GHC imposes the following restrictions on the constraints in a type signature. Consider the type: forall tv1..tvn (c1, ...,cn) => type. ... Each universally quantified type variable tvi must be reachable from type. A type variable a is reachable if it appears in the same constraint as either a type variable free in the type, or another reachable type variable." For example, type variable "a" in constraint (O a) in (F a b, O a)=>b above is reachable, because it appears in (F a b) (the same constraint as type variable "b", which occurs in the simple type). Our proposal is: consider unreachability not as indication of ambiguity but as a condition to trigger overloading resolution (in a similar way that FDs trigger overloading resolution): when there is at least one unreachable variable and overloading is found not to be resolved, then we have ambiguity. Overloading is resolved iff there is a unique substitution that can be used to specialize the constraint set to one, available in the current context, such that the specialized constraint does not contain unreachable type variables. (A formal definition, with full details, is in the cited SBLP'09 paper.) Consider, in Example 1, that we have a single instance of F and O, say: instance F Bool Bool where f = not instance O Bool where o = True and consider also that "k" is used as in e.g.: kb = not k According to our proposal, kb is well-typed. Its type is Bool. This occurs because (F a b, O a)=>Bool can be simplified to Bool, since there exists a single substitution that unifies the constraints with instances (F Bool Bool) and (O Bool) available in the current context, namely S = (a|->Bool,b|->Bool). As another example (Example 2), consider: {-# NoMonomorphismRestriction, MultiParameterTypeClasses #-} data Matrix = ... data Vector = ... class Mult a b c where (*):: a ! b ! c instance Mult Matrix Vector Matrix where (*) = ... instance Mult Matrix Vector Vector where (*) = ... m1:: Matrix = ... m2:: Vector = ... m3:: Vector = ... m = (m1 * m2) * m3 GHC gives the following type to m: m :: forall a, b. (Mult Matrix Vector a, Mult a Vector b) => b However, "m" cannot be used effectively in a program compiled with GHC: if we annotate m::Matrix, a type error is reported. This occurs because the type inferred for "m" includes constraints (Mult Matrix Vector a, Mult a Vector Matrix) and, with b|->Matrix, type variable "a" appears only in the constraint set, and this type is considered ambiguous. Similarly, if "m" is annotated with type Vector, the type is also considered ambiguous. There is no means of specializing type variable "a" occurring in the type of "m" to Matrix. And no FD may be used in order to achieve such specialization, because we do not have here a FD: "a" and "b" do not determine "c" in class Mult: for (a,b = Matrix,Vector), we can have "c" equal to either Matrix or Vector. According to our proposal, type m:: forall a. (Mult Matrix Vector a, Mult a Vector Matrix)) => Matrix is not ambiguous, in the context of the class and instance declarations of Example 2 (since "a" can be instantiated to Matrix), whereas type m:: forall a.(Mult Matrix Vector a, Mult a Vector Matrix)) => Vector is ambiguous in this context. What do you think? Cheers, Carlos

On 05/26/10 15:42, Carlos Camarao wrote:
What do you think?
I think you are proposing using the current set of instances in scope in order to remove ambiguity. Am I right? ..I read the haskell-cafe thread so far, and it looks like I'm right. This is what I'll add to what's been said so far: Your proposal appears to allow /incoherent/ instance selection. This means that an expression can be well-typed in one module, and well-typed in another module, but have different semantics in the two modules. For example (drawing from above discussion) : module C where class F a b where f :: a -> b class O a where o :: a module P where import C instance F Bool Bool where f = not instance O Bool where o = True k :: Bool k = f o module Q where import C instance F Int Bool where f = even instance O Int where o = 0 k :: Bool k = f o module Main where import P import Q -- (here, all four instances are in scope) main = do { print P.k ; print Q.k } -- should result, according to your proposal, in -- False -- True -- , am I correct? Also, in your paper, example 2 includes
m = (m1 * m2) * m3 and you state In Example 2, there is no means of specializing type variable c0 occurring in the type of m to Matrix.
I suggest that there is an appropriate such means, namely, to write m = (m1 * m2 :: Matrix) * m3 . (Could the paper address how that solution falls short? Are there other cases in which there is more than just a little syntactical convenience at stake?, or is even that much added code too much for some use-case?) -Isaac

On 05/26/10 15:42, Carlos Camarao wrote:
I think you are proposing using the current set of instances in scope in order to remove ambiguity. Am I right?
I think that an important point is that it is not exactly "to remove ambiguity", because the proposal tries to solve the problem exactly when there is really no ambiguity. There would be ambiguity if there were two or more conflicting definitions, and the proposal is to use, upon occurrence of an unreachable type variable in a constraint, an instance in the current context that is the single instance that could be used to instantiate that type variable. This was perhaps made unclear when I incorrectly wrote in haskell-cafe that the proposal eliminates the need for defaulting. I should have said that defaulting is not necessary to force instantiation (of unreachable variables) when there are no conflicting definitions in the current context. That is, defaulting should be used exactly to remove ambiguity, i.e. when there exist conflicting definitions and when unreachable type variables appear that can be instantiated to such conflicting definitions.
Your proposal appears to allow /incoherent/ instance selection. This means that an expression can be well-typed in one module, and well-typed in another module, but have different semantics in the two modules. For example (drawing from above discussion) :
module C where
class F a b where f :: a -> b class O a where o :: a
module P where import C
instance F Bool Bool where f = not instance O Bool where o = True k :: Bool k = f o
module Q where import C instance F Int Bool where f = even instance O Int where o = 0 k :: Bool k = f o
module Main where import P import Q -- (here, all four instances are in scope) main = do { print P.k ; print Q.k } -- should result, according to your proposal, in -- False -- True -- , am I correct?
If qualified importation of k from both P and from Q was specified, we would have two *distinct* terms, P.k and Q.k. Unqualified importation must cause an error (two distinct definitions for the same non-overloaded name). For overloaded names with distinct instances, *ambiguity* would be reported, as e.g. in: module C where class O a where o::a module P where instance O Bool where o = True module Q where instance O Bool where o = False moduke Main where import P import Q main = do { print o::Bool }
Also, in your paper, example 2 includes m = (m1 * m2) * m3 and you state
In Example 2, there is no means of specializing type variable c0 occurring in the type of m to Matrix. In Example 2, there is no means of specializing type variable c0 occurring in the type of m to Matrix. I suggest that there is an appropriate such means, namely, to write m = (m1 * m2 :: Matrix) * m3
Yes, that solution is fine. Sorry, I should have written "There is no means of specializing type variable c0 occurring in the type of m to Matrix, without an explicit type annotation in the expression defining m."
(Could the paper address how that solution falls short? Are there other cases in which there is more than just a little syntactical convenience at stake? , or is even that much added code too much for some use-case?)
The point should have been that you cannot make a class with FDs eliminate the need for having to make explicit type annotations, in cases such as these. The example should also be such that it would need more type annotations, instead of just one... Cheers, Carlos

On Thu, May 27, 2010 at 10:39 AM, Carlos Camarao
Isaac Dupree:
Your proposal appears to allow /incoherent/ instance selection. This means that an expression can be well-typed in one module, and well-typed in another module, but have different semantics in the two modules. For example (drawing from above discussion) :
module C where
class F a b where f :: a -> b class O a where o :: a
module P where import C
instance F Bool Bool where f = not instance O Bool where o = True k :: Bool k = f o
module Q where import C instance F Int Bool where f = even instance O Int where o = 0 k :: Bool k = f o
module Main where import P import Q -- (here, all four instances are in scope) main = do { print P.k ; print Q.k } -- should result, according to your proposal, in -- False -- True -- , am I correct?
If qualified importation of k from both P and from Q was specified, we would have two *distinct* terms, P.k and Q.k.
I think Isaac's point is that P.k and Q.k have the same definition (f
o). If they don't produce the same value, then referential
transparency is lost.
--
Dave Menendez

On Thu, May 27, 2010 at 5:43 PM, David Menendez
On Thu, May 27, 2010 at 10:39 AM, Carlos Camarao
wrote: Isaac Dupree:
Your proposal appears to allow /incoherent/ instance selection. This means that an expression can be well-typed in one module, and well-typed in another module, but have different semantics in the two modules. For example (drawing from above discussion) :
module C where
class F a b where f :: a -> b class O a where o :: a
module P where import C
instance F Bool Bool where f = not instance O Bool where o = True k :: Bool k = f o
module Q where import C instance F Int Bool where f = even instance O Int where o = 0 k :: Bool k = f o
module Main where import P import Q -- (here, all four instances are in scope) main = do { print P.k ; print Q.k } -- should result, according to your proposal, in -- False -- True -- , am I correct?
If qualified importation of k from both P and from Q was specified, we would have two *distinct* terms, P.k and Q.k.
I think Isaac's point is that P.k and Q.k have the same definition (f o). If they don't produce the same value, then referential transparency is lost.
-- Dave Menendez
<http://www.eyrie.org/~zednenem/ http://www.eyrie.org/%7Ezednenem/>
The definitions of P.k and Q.k are textually the same but the contexts are different. "f" and "o" denote distinct values in P and Q. Thus, P.k and Q.k don't have the same definition. Thanks for the clarification. I thought the point was about coherence. Cheers, Carlos

On 05/27/10 17:42, Carlos Camarao wrote:
On Thu, May 27, 2010 at 5:43 PM, David Menendez
wrote: On Thu, May 27, 2010 at 10:39 AM, Carlos Camarao
wrote: Isaac Dupree:
Your proposal appears to allow /incoherent/ instance selection. This means that an expression can be well-typed in one module, and well-typed in another module, but have different semantics in the two modules. For example (drawing from above discussion) :
module C where
class F a b where f :: a -> b class O a where o :: a
module P where import C
instance F Bool Bool where f = not instance O Bool where o = True k :: Bool k = f o
module Q where import C instance F Int Bool where f = even instance O Int where o = 0 k :: Bool k = f o
module Main where import P import Q -- (here, all four instances are in scope) main = do { print P.k ; print Q.k } -- should result, according to your proposal, in -- False -- True -- , am I correct?
If qualified importation of k from both P and from Q was specified, we would have two *distinct* terms, P.k and Q.k.
I think Isaac's point is that P.k and Q.k have the same definition (f o). If they don't produce the same value, then referential transparency is lost.
-- Dave Menendez
<http://www.eyrie.org/~zednenem/http://www.eyrie.org/%7Ezednenem/> The definitions of P.k and Q.k are textually the same but the contexts are different. "f" and "o" denote distinct values in P and Q. Thus, P.k and Q.k don't have the same definition.
Oh, I guess you are correct: it is like defaulting: it is a similar effect where the same expression means different things in two different modules as if you had default (Int) in one, and default (Bool) in the other. Except: Defaulting according to the standard only works in combination with the 8 (or however many it is) standard classes; and defaulting in Haskell is already a bit poorly designed / frowned upon / annoying that it's specified per-module when nothing else in the language is*.(that's a rather surmountable argument) It may be worth reading the GHC user's guide which attempts to explain the difference between incoherent and non-incoherent instance selection, http://www.haskell.org/ghc/docs/6.12.2/html/users_guide/type-class-extension... I didn't read both it and your paper closely enough that I'm sure anymore whether GHC devs would think your extension would require or imply -XIncoherentInstances ... my intuition was that IncoherentInstances would be implied... *(it's nice when you can substitute any use of a variable, such as P.k, with the expression that it is defined as -- i.e. the expression written so that it refer to the same identifiers, not a purely textual substitution -- but in main above, you can't write [assuming you imported C] "print (f o)" because it will be rejected for ambiguity. (Now, there is already an instance-related situation like this where Main imports two different modules that define instances that overlap in an incompatible way, such as two different instances for Functor (Either e) -- not everyone is happy about how GHC handles this, but at least those overlaps are totally useless and could perhaps legitimately result in a compile error if they're even imported into the same module.))

On Fri, May 28, 2010 at 2:36 AM, Isaac Dupree < ml@isaac.cedarswampstudios.org> wrote:
On 05/27/10 17:42, Carlos Camarao wrote:
On Thu, May 27, 2010 at 5:43 PM, David Menendez
wrote: On Thu, May 27, 2010 at 10:39 AM, Carlos Camarao
wrote: Isaac Dupree:
Your proposal appears to allow /incoherent/ instance selection. This means that an expression can be well-typed in one module, and well-typed in another module, but have different semantics in the two modules. For example (drawing from above discussion) :
module C where
class F a b where f :: a -> b class O a where o :: a
module P where import C
instance F Bool Bool where f = not instance O Bool where o = True k :: Bool k = f o
module Q where import C instance F Int Bool where f = even instance O Int where o = 0 k :: Bool k = f o
module Main where import P import Q -- (here, all four instances are in scope) main = do { print P.k ; print Q.k } -- should result, according to your proposal, in -- False -- True -- , am I correct?
If qualified importation of k from both P and from Q was specified, we would have two *distinct* terms, P.k and Q.k.
I think Isaac's point is that P.k and Q.k have the same definition (f o). If they don't produce the same value, then referential transparency is lost.
-- Dave Menendez
<http://www.eyrie.org/~zednenem/ http://www.eyrie.org/%7Ezednenem/< http://www.eyrie.org/%7Ezednenem/>> The definitions of P.k and Q.k are textually the same but the contexts are different. "f" and "o" denote distinct values in P and Q. Thus, P.k and Q.k don't have the same definition.
Oh, I guess you are correct: it is like defaulting: it is a similar effect where the same expression means different things in two different modules as if you had default (Int) in one, and default (Bool) in the other. Except: Defaulting according to the standard only works in combination with the 8 (or however many it is) standard classes; and defaulting in Haskell is already a bit poorly designed / frowned upon / annoying that it's specified per-module when nothing else in the language is*.(that's a rather surmountable argument)
It may be worth reading the GHC user's guide which attempts to explain the difference between incoherent and non-incoherent instance selection,
http://www.haskell.org/ghc/docs/6.12.2/html/users_guide/type-class-extension... I didn't read both it and your paper closely enough that I'm sure anymore whether GHC devs would think your extension would require or imply -XIncoherentInstances ... my intuition was that IncoherentInstances would be implied...
*(it's nice when you can substitute any use of a variable, such as P.k, with the expression that it is defined as -- i.e. the expression written so that it refer to the same identifiers, not a purely textual substitution -- but in main above, you can't write [assuming you imported C] "print (f o)" because it will be rejected for ambiguity. (Now, there is already an instance-related situation like this where Main imports two different modules that define instances that overlap in an incompatible way, such as two different instances for Functor (Either e) -- not everyone is happy about how GHC handles this, but at least those overlaps are totally useless and could perhaps legitimately result in a compile error if they're even imported into the same module.)) _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
I have no idea why you think IncoherentInstances would be implied. The proposal says: do not issue ambiguity error if, using whatever means permitted and specified (OverlappingInstances, IncoherentInstances, whatever), you can select a single instance. The situation is as if we a FD: module C where class F a b | a->b where f :: a -> b class O a where o :: a module P where import C; instance F Bool Bool where f = not instance O Bool where o = True g:: Bool -> Bool g = f k::Bool k = g o module Q where import C instance F Int Bool where f = even instance O Int where o = 0 g::Int->Bool g = f k :: Bool k = g o module Main where import P import Q main = do { print P.k ; print Q.k } Cheers, Carlos

On 05/29/10 21:24, Carlos Camarao wrote:
The situation is as if we a FD:
Well, that is indeed equivalent here in the second argument of class F, but I constructed the example to show an issue in the class's *first* argument. Notice you needed to add type-signatures, on the functions you named "g" -- in particular their first arguments -- to make the example work with only FDs?
module C where class F a b | a->b where f :: a -> b class O a where o :: a
module P where import C; instance F Bool Bool where f = not instance O Bool where o = True g:: Bool -> Bool g = f k::Bool k = g o
module Q where import C instance F Int Bool where f = even instance O Int where o = 0 g::Int->Bool g = f k :: Bool k = g o
you can inline these "k"-definitions into module Main and it will work (modulo importing C). module Main where import C import P import Q main = do { print (((f :: Bool -> Bool) o) :: Bool); print (((f :: Int -> Bool) o) :: Bool) } These are two different expressions that are being printed, because " :: Bool -> Bool" is different from " :: Int -> Bool". In my example of using your proposal, one cannot inline in the same way, if I understand correctly (the inlining would cause ambiguity errors -- unless of course the above distinct type-signatures are added). If your proposal was able to require those -- and only those -- bits of type signatures that were essential to resolve the above ambiguity; for example, the ( :: Int) below, module Q where import C instance F Int Bool where f = even instance O Int where o = 0 k = f (o :: Int) , then I would be fine with your proposal (but then I suspect it would have to be equivalent to FDs -- or in other words, that it's not really practical to change your proposal to have that effect). I stand by my assertion that "the same expression means different things in two different modules" is undesirable, (and that I suspect but am unsure that this undesirability is named "incoherent instances"). I'm trying to work out whether it's possible to violate the invariants of a Map by using your extension (having it select a different instance in two different places, given the same type).. I think, no it is not possible for Ord or any single-parameter typeclass, though there might be some kind of issues with multi-parameter typeclasses, if the library relies on a FD-style relationship between two class type-parameters and then two someones each add an instance that together violate that implied FD-relationship (which is allowed under your scheme, unlike if there was an actual FD). Er, odd, I need to play with some actual FD code to think about this, but I'm too sleepy / busy packing for a trip. Did any of the above make sense to you? It's fine if some didn't, type systems are complicated... and please point out if something I said was outright wrong. -Isaac

The situation is as if we [had] a FD: Well, that is indeed equivalent here in the second argument of class F, but I constructed the example to show an issue in the class's *first* argument.
The example should be equivalent in all respects (at least that was my motivation when I wrote it).
Notice you needed to add type-signatures, on the functions you named "g" -- in particular their first arguments -- to make the example work with only FDs?
I wrote g, with type signatures, just to avoid writing the type signatures (f:: Bool->Bool and f::Int->Bool) inside (f o). I wanted to use, textually, the same expression.
These are two different expressions that are being printed, because " :: Bool -> Bool" is different from " :: Int -> Bool". In my example of using your proposal, one cannot inline in the same way,
I wrote the example just to show that one can obtain the same effect with FDs: one not "inline", i.e. use "g o", in the same way.
If your proposal was able to require those -- and only those -- bits of type signatures that were essential to resolve the above ambiguity; for example, the ( :: Int) below, module Q where import C instance F Int Bool where f = even instance O Int where o = 0 k = f (o :: Int) , then I would be fine with your proposal (but then I suspect it would have to be equivalent to FDs -- or in other words, that it's not really practical to change your proposal to have that effect). I stand by my assertion that "the same expression means different things in two different modules" is undesirable, (and that I suspect but am unsure that this undesirability is named "incoherent instances").
"k::Bool; k=f o" in Q has exactly the same effect as "k=f(o::Int)", under our proposal. "(f o)::Bool" in P and in Q are not "the same expression", they are distinct expressions, because they occur in distinct contexts which make "f" and "o" in (f o)::Bool denote distinct values, just as "(g o)::Bool" are distinct expressions in P and Q in the example with a FD (because "g" and "o" in (g o)::Bool denote distinct values in P and in Q). Cheers, Carlos
participants (3)
-
Carlos Camarao
-
David Menendez
-
Isaac Dupree