TypeFamilies vs. FunctionalDependencies & type-level recursion

newer
Re: Please apply the comparison...

David Mazieres

29 May 2011 29 May '11

6:59 p.m.

I'm not sure if these extensions are still under discussion or if the topic is dead (wiki pages 5 years old). However, the Haskell' wiki page for FunctionalDependencies suggests AssociatedTypes is a more promising approach, yet the AssociatedTypes page misses a major Con compared to FunctionalDependencies that I think should be there. One of the more disgusting but also powerful implications of FunctionalDependencies is that, in conjunction with OverlappingInstances and UndecidableInstances, you can do recursive programming at the type level. This has been (ab)used to do some cool things (e.g., HList, HaskellDB, OOHaskell). As an example, consider the following simple (key, value) lookup function that operates at the type level (inspired by HList): {-# LANGUAGE MultiParamTypeClasses #-} {-# LANGUAGE FlexibleInstances #-} {-# LANGUAGE OverlappingInstances #-} {-# LANGUAGE FunctionalDependencies #-} {-# LANGUAGE UndecidableInstances #-} data HNil = HNil deriving (Show) data HCons h t = h :* t deriving (Show) infixr 2 :* data A = A deriving (Show) data B = B deriving (Show) data C = C deriving (Show) abc :: HCons (A, Int) (HCons (B, Int) (HCons (C, Int) HNil)) abc = (A, 1) :* (B, 2) :* (C, 3) :* HNil class HLookup k l v | k l -> v where hLookup :: k -> l -> v instance HLookup k (HCons (k, v) l) v where hLookup _ (h :* t) = snd h instance (HLookup k l v) => HLookup k (HCons h l) v where hLookup k (h :* t) = hLookup k t b :: Int b = hLookup B abc -- Returns 2 The key to hLookup is that OverlappingInstances selects the most specific match, thereby breaking the recursion when the requested key type matches the key of the pair at the head of the list. By contrast, TypeFamilies cannot permit a function such as hLookup. One may try something along the lines of: class HLookup k l where type HLookupResult k l hLookup :: k -> l -> HLookupResult k l instance HLookup k (HCons (k, v) l) where type HLookupResult k (HCons (k, v) l) = v hLookup _ (h :* t) = snd h instance (HLookup k l) => HLookup k (HCons h l) where type HLookupResult k (HCons h t) = HLookupResult k t hLookup k (h :* t) = hLookup k t But now you have overlapping type synonyms, which pose a safety threat where the more-specific instance is not in scope. Therefore, Haskell likely cannot be extended to accept code such as the above. One possible extension that *would* enable type-level recursive programming is the ability to constrain by inequality of types. I noticed GHC is on its way to accepting "a ~ b" as a constraint that types a and b are equal. If there were a corresponding "a /~ b", then one might be able to say something like: instance HLookup k (HCons (k, v) l) where ... instance (h /~ (k, v), HLookup k l) => HLookup k (HCons h l) where ... Of course, I have no idea how the compiler could know such constraints are sufficient to avoid overlapping types. I suspect figuring it out would be hard given that instance matching happens before context verification and that the overlapping errors must happen at instance matching time. My bigger point is that if the Haskell' committee ever considers adopting one of these proposals or one intended to supersede them, I hope there is a way to do recursive type-level programming. A likely corollary is the need for a mechanism to avoid instance overlap between recursive and base cases by asserting type inequality in the recursive case. David

Show replies by date

Alexey Khudyakov

29 May 29 May

9:29 p.m.

On 29.05.2011 22:59, David Mazieres wrote:

...

But now you have overlapping type synonyms, which pose a safety threat where the more-specific instance is not in scope. Therefore, Haskell likely cannot be extended to accept code such as the above.

No it could. Inconsistency arise from fact that it's not guaranted that all instances are known. Such guaranties are possible with closed type families. In such families instances could be added only in the same module with family declaration. Here is simplistic example:

...

data HTrue data HFalse

closed type family Equal a b :: * closed type instance a a = HTrue closed type instance a b = HFalse

No more instances could be added. So type could be determined unambigously. In type level programming type families often meant to be closed but there is no explicit way to say that and it limit their expressiveness. Also closed type families could help with lack of injectivity. It could be untracktable though.

...

One possible extension that *would* enable type-level recursive programming is the ability to constrain by inequality of types. I noticed GHC is on its way to accepting "a ~ b" as a constraint that types a and b are equal. If there were a corresponding "a /~ b", then one might be able to say something like:

instance HLookup k (HCons (k, v) l) where ... instance (h /~ (k, v), HLookup k l) => HLookup k (HCons h l) where ...

I can't see how it change things. AFAIR instances selected only by their heads, contexts are not taken into account. Besides type inequality could be easily implemented with closed type families. See code above. Here is contrived example of usage:

...

instance (Equal Thing Int ~ HFalse) => Whatever Thing

P.S. Also I have suspicion that version with fundeps will behave badly if more instances are added in another module.

Ben Millwood

10:45 p.m.

On Sun, May 29, 2011 at 7:59 PM, David Mazieres wrote:

...

One of the more disgusting but also powerful implications of FunctionalDependencies is that, in conjunction with OverlappingInstances and UndecidableInstances, you can do recursive programming at the type level. This has been (ab)used to do some cool things (e.g., HList, HaskellDB, OOHaskell).

It would seem very strange to me if haskell-prime made the choice of fundeps/type families based on the behaviour with OverlappingInstances. I'm under the impression that Overlapping is generally considered one of the more controversial extensions, and on the LanguageQualities wiki page [1] it's explicitly given as an example of something which violates them. So not only is Overlapping not in the language, but I imagine there are many people (myself included) who would like to ensure it stays out. My personal opinion is that if Haskell wants a more complete facility for type-level programming, that should be addressed directly, instead of via creative abuse of the class system and related machinery. Ben [1]: http://hackage.haskell.org/trac/haskell-prime/wiki/LanguageQualities

Dan Doel

11:35 p.m.

On Sun, May 29, 2011 at 6:45 PM, Ben Millwood wrote:

...

It would seem very strange to me if haskell-prime made the choice of fundeps/type families based on the behaviour with OverlappingInstances. I'm under the impression that Overlapping is generally considered one of the more controversial extensions, and on the LanguageQualities wiki page [1] it's explicitly given as an example of something which violates them. So not only is Overlapping not in the language, but I imagine there are many people (myself included) who would like to ensure it stays out.

My personal opinion is that if Haskell wants a more complete facility for type-level programming, that should be addressed directly, instead of via creative abuse of the class system and related machinery.

It should also be noted: the fact that functional dependencies work with overlapping instances, while type families don't is not really inherent in functional dependencies, but is instead related to choices about how functional dependencies work that differ from how type families do. And mainly, this is because functional dependencies fail to incorporate local information, meaning they fail to work appropriately in various situations (for instance, matching on a GADT may refine a type, but that new information may not propagate through a fundep). In my experience, you can construct examples that should lead to type soundness issues with fundeps, and only fail because of peculiarities in fundep handling. But fundeps could (and arguably should, to interact with GADTs and the like) be reworked to behave 'properly'. It's just that type families already do. I can't really recall what example I used in the past, but here's one off the cuff: module A where class C a b | a -> b where instance C a a where data T a where CT :: C a b => b -> T a module B where import A instance C Int Char where c :: Char c = case t of { CT x -> x } So, the question is: what should happen here? We've created a T Int in a context in which C Int Int, so it wraps an Int. Then we match in a context in which C Int Char. But the fundep tells us that there can only be one S such that C Int S. So we have some choices: 1) Disallow the overlapping instance C Int Char, because it is incompatible with the C Int Int from the other module. This is what GHC 7 seems to do. 2) Pretend that there may in fact be more than one instance C Int a, and so we can't infer what a is in the body of c. I think this is what GHC used to do, but it means that whether a fundep "a -> b" actually allows us to determine what b is from knowing a is kind of ad-hoc and inconsistent. 3) Allow c to type check. This is unsound. 1 is, I think, in line with type families. But it rules out the computation that fundeps + overlapping can do and type families cannot. Also, in an unrelated direction: there are conditions on type families that can allow some overlapping to be permitted. For instance, if you simply want a closed type function, like, taking the above as an example: type family F a :: * where instance F Int = Char instance F a = a Then this is a sufficient condition for overlapping to be permissible. And it covers a lot of the use cases for overlapping instances, I think. -- Dan

dm-list-haskell-prime＠scs.stanford.edu

30 May 30 May

12:16 a.m.

At Sun, 29 May 2011 19:35:15 -0400, Dan Doel wrote:

...

On Sun, May 29, 2011 at 6:45 PM, Ben Millwood wrote: 1) Disallow the overlapping instance C Int Char, because it is incompatible with the C Int Int from the other module. This is what GHC 7 seems to do.

This seems like the only reasonable option given the meaning of functional dependencies.

...

Also, in an unrelated direction: there are conditions on type families that can allow some overlapping to be permitted. For instance, if you simply want a closed type function, like, taking the above as an example:

type family F a :: * where instance F Int = Char instance F a = a

Something like this would be good. Though you'd need a corresponding value-level mechanism. Is this part of any pending proposal? I don't suppose there's any way to get GHC to accept such code? I only found one cryptic mention of "closed synonym families" under a speculative ideas list for type functions. David

dm-list-haskell-prime＠scs.stanford.edu

29 May 29 May

11:55 p.m.

Thanks for the responses. I realized after sending the message that it wasn't clear exactly what I was advocating, which is probably more modest that what people are thinking. Mostly I was hoping the AssociatedTypes wiki page could be updated to reflect that AssociatedTypes can't replace FunctionalDependencies. (After reading the FunctionalDependencies page, I converted a bunch of code over to TypeFamilies, thinking this would be more future-proof, only to realize it couldn't work.) I'm not sure what the process is for updating the wiki, as the page is locked down, but mailing haskell-prime seemed like a reasonable start. I'm absolutely not advocating making overlapping instances (or, worse, overlapping types) part of Haskell', nor under the impression that the committee would ever consider doing so. I'm just pointing out that right now OverlappingInstances are the only way to do recursive programming at the type level, for the specific reasons I outlined. I hope that before FunctionalDependencies or TypeFamilies or any other type-level programming becomes part of Haskell', there is a way to differentiate base and recursive cases *without* overlapping instances. The fact that TypeFamilies made it somewhat far into the process without a way to do recursion and that the limitation is not even documented on the wiki suggests that the Haskell' committee either thinks people don't care or thinks about the question differently. Either way, the point seemed worth noting somewhere. I don't have any great ideas on supporting recursion, so I suggested a not so great idea in my last email that abused the context. Here's another not so great idea that doesn't abuse the context... The point is just that it's possible to support recursion without overlapping instances: Add an annotation like "| x /~ y, ..." to instances denoting that the instance cannot be selected unless types x and y are known to be different. So the code from my previous message becomes: data HNil = HNil deriving (Show) data HCons h t = h :* t deriving (Show) infixr 2 :* class HLookup k l where type HLookupResult k l hLookup :: k -> l -> HLookupResult k l instance HLookup k (HCons (k, v) l) where ... type HLookupResult k (HCons (k, v) l) = v hLookup _ (h :* t) = snd h instance (HLookup k l) => HLookup k (HCons h l) | h /~ (k, v) where -- This is how we avoid overlap ------------> ^^^^^^^^^^^ type HLookupResult k (HCons h t) = HLookupResult k t hLookup k (h :* t) = hLookup k t I'm under no illusions that this specific syntax would fly, but possibly some other proposal will allow the equivalent. David

Roman Leshchinskiy

31 May 31 May

9:45 p.m.

On 30/05/2011, at 00:55, dm-list-haskell-prime@scs.stanford.edu wrote:

...

I'm absolutely not advocating making overlapping instances (or, worse, overlapping types) part of Haskell', nor under the impression that the committee would ever consider doing so. I'm just pointing out that right now OverlappingInstances are the only way to do recursive programming at the type level, for the specific reasons I outlined. I hope that before FunctionalDependencies or TypeFamilies or any other type-level programming becomes part of Haskell', there is a way to differentiate base and recursive cases *without* overlapping instances.

FWIW, I don't think this is really about type-level recursion. You can do recursive programming with type families: data Z data S n type family Plus m n type instance Plus Z n = n type instance Plus (S m) n = S (Plus m n) It's deciding type equality via overlapping instances that is problematic here. But, as others have pointed out, this is somewhat dodgy anyway. I suppose what you really want is something like this: data True data False type family Equal a b Where Equal a b ~ True if and only if a and b are known to be the same type and Equal a b ~ False if and only if they are known to be different types. You could, in theory, get this by defining appropriate instances for all type constructors in a program: type instance Equal Int Int = True type instance Equal Int [a] = False type instance Equal [a] Int = False type instance Equal [a] [b] = Equal a b ... But that's infeasible, of course. However, nothing prevents a compiler from providing this as a built-in. Arguably, this would be much cleaner than the solution based on fundeps and overlapping instances. Roman

Simon Peyton-Jones

14 Jun 14 Jun

9:36 a.m.

There was an interesting thread on haskell-prime [1], about the relationship between functional dependencies and type families. This message is my attempt to summarise the conclusions of that thread. I'm copying other interested parties (eg Oleg, Dimitrios) [1] http://www.haskell.org/pipermail/haskell-prime/2011-May/003416.html 1. As things stand in GHC you can do some things with functional dependencies that you can't do with type families. The archetypical example is type equality. We cannot write type family Eq a b :: * type instance Eq k k = True type instance Eq j k = False because the two instances overlap. But you can with fundeps class Eq a b c | a b -> c instance Eq k k True instance Eq j k False As Alexey mentioned, fundeps have other disadvantages, so it's reasonable to ask whether type families could extend to handle cases like this. 2. In a hypothetical extension to type families, namely *closed* type families, you could support overlap: closed type family Eq a b :: * where type instance Eq k k = True type instance Eq j k = False The idea is that you give all the equations together; matching is top-to-bottom; and the type inference only picks an equation when all the earlier equations are guaranteed not to match. So there is no danger of making different choices in different parts of the program (which is what gives rise to unsoundness). 3. Closed type families are not implemented yet. The only tricky point I see would be how to express in System FC the proof that "earlier equations don't match". Moreover, to support abstraction one might well want David's /~ predicate, so that you could say f :: forall a b. (a /~ b) => ..blah.. This f can be applied to any types a,b provided they don't match. I'm frankly unsure about all the consequences here. 4. As Roman points out, type families certainly do support recursion; it's just they don't support overlap. 5. David wants a wiki page fixed. But which one? And how is it "locked down"? 6. There is a very old Trac wiki page about "total" type families here http://hackage.haskell.org/trac/ghc/wiki/TypeFunctions/TotalFamilies Written by Manuel, I think it needs updating. That's my summary! Simon

Dan Doel

2:40 p.m.

On Tue, Jun 14, 2011 at 5:36 AM, Simon Peyton-Jones wrote:

...

There was an interesting thread on haskell-prime [1], about the relationship between functional dependencies and type families. This message is my attempt to summarise the conclusions of that thread. I'm copying other interested parties (eg Oleg, Dimitrios) [1] http://www.haskell.org/pipermail/haskell-prime/2011-May/003416.html

1. As things stand in GHC you can do some things with functional dependencies that you can't do with type families. The archetypical example is type equality. We cannot write type family Eq a b :: * type instance Eq k k = True type instance Eq j k = False because the two instances overlap. But you can with fundeps class Eq a b c | a b -> c instance Eq k k True instance Eq j k False As Alexey mentioned, fundeps have other disadvantages, so it's reasonable to ask whether type families could extend to handle cases like this.

Fundeps no longer allow this as of GHC 7, it seems. It causes the same fundep violation as the case: class C a b | a -> b instance C a R instance C T U for R /~ U. Which I find somewhat sensible, because the definitions together actually declare two relations for any type: Eq T T False Eq T T True and we are supposed to accept that because one is in scope, and has a particular form, the other doesn't exist. But they needn't always be in scope at the same time. Essentially, GHC 7 seems to have separated the issue of type-function overlapping, which is unsound unless you have specific conditions that make it safe---conditions which fundeps don't actually ensure (fundeps made it 'safe' in the past by not actually doing all the computation that type families do)---from the issue of instance overlapping, which is always sound at least. But if I'm not mistaken, type families can handle these cases. I'd personally say it's a step in the right direction, but it probably breaks a lot of HList-related stuff. -- Dan

Andrea Vezzosi

3:27 p.m.

On Tue, Jun 14, 2011 at 4:40 PM, Dan Doel wrote:

...

On Tue, Jun 14, 2011 at 5:36 AM, Simon Peyton-Jones wrote:

...
There was an interesting thread on haskell-prime [1], about the relationship between functional dependencies and type families. This message is my attempt to summarise the conclusions of that thread. I'm copying other interested parties (eg Oleg, Dimitrios) [1] http://www.haskell.org/pipermail/haskell-prime/2011-May/003416.html

1. As things stand in GHC you can do some things with functional dependencies that you can't do with type families. The archetypical example is type equality. We cannot write type family Eq a b :: * type instance Eq k k = True type instance Eq j k = False because the two instances overlap. But you can with fundeps class Eq a b c | a b -> c instance Eq k k True instance Eq j k False As Alexey mentioned, fundeps have other disadvantages, so it's reasonable to ask whether type families could extend to handle cases like this.

Fundeps no longer allow this as of GHC 7, it seems. It causes the same fundep violation as the case:

class C a b | a -> b instance C a R instance C T U

Are you sure that worked before? The following still does anyhow: {-# LANGUAGE MultiParamTypeClasses, OverlappingInstances, FunctionalDependencies , EmptyDataDecls, FlexibleInstances, UndecidableInstances #-} class TypeCast a b | a -> b, b->a where typeCast :: a -> b class TypeCast' t a b | t a -> b, t b -> a where typeCast' :: t->a->b class TypeCast'' t a b | t a -> b, t b -> a where typeCast'' :: t->a->b instance TypeCast' () a b => TypeCast a b where typeCast x = typeCast' () x instance TypeCast'' t a b => TypeCast' t a b where typeCast' = typeCast'' instance TypeCast'' () a a where typeCast'' _ x = x data R data T data U class C a b | a -> b instance TypeCast R b => C a b instance TypeCast U b => C T b In fact this is how IsFunction was implemented in 2005[1], and the same transformation appears to work for the Eq class too. If we allow TypeFamilies we can use (~) instead of the TypeCast hack, fortunately. [1] http://okmij.org/ftp/Haskell/isFunction.lhs -- Andrea

...

for R /~ U. Which I find somewhat sensible, because the definitions together actually declare two relations for any type:

Eq T T False Eq T T True

and we are supposed to accept that because one is in scope, and has a particular form, the other doesn't exist. But they needn't always be in scope at the same time.

Essentially, GHC 7 seems to have separated the issue of type-function overlapping, which is unsound unless you have specific conditions that make it safe---conditions which fundeps don't actually ensure (fundeps made it 'safe' in the past by not actually doing all the computation that type families do)---from the issue of instance overlapping, which is always sound at least. But if I'm not mistaken, type families can handle these cases.

I'd personally say it's a step in the right direction, but it probably breaks a lot of HList-related stuff.

-- Dan

_______________________________________________ Haskell-prime mailing list Haskell-prime@haskell.org http://www.haskell.org/mailman/listinfo/haskell-prime

Dan Doel

3:49 p.m.

On Tue, Jun 14, 2011 at 11:27 AM, Andrea Vezzosi wrote:

...

...
class C a b | a -> b instance C a R instance C T U

Are you sure that worked before?

80%

...

The following still does anyhow:

data R data T data U class C a b | a -> b instance TypeCast R b => C a b instance TypeCast U b => C T b

In fact this is how IsFunction was implemented in 2005[1], and the same transformation appears to work for the Eq class too. If we allow TypeFamilies we can use (~) instead of the TypeCast hack, fortunately.

So it does. instance (b ~ R) => C a b instance (b ~ U) => C T b Which is odd. I don't think there's a way to justify this working. Either the preconditions are being taken into account, in which case there is no reason for this to behave differently from: instance C a R instance C T U or the preconditions are not being taken into account (which is the type class way), in which case both of the qualified instances are illegal, because they declare instances C T b for all b (plus a constraint on the use), which violates the fundep. I've seen examples like this before, and I think what GHC ends up doing (now) is fixing the 'b' to whatever instance it needs first. But I don't think that's very good behavior. In this case it happens that it's impossible to use at more than one instance, but if you accept the instances, you're back to the questions of type soundness that are only evaded because fundeps don't actually use all the information they allegedly express. -- Dan

dm-list-haskell-prime＠scs.stanford.edu

3:49 p.m.

At Tue, 14 Jun 2011 10:40:48 -0400, Dan Doel wrote:

...

...
1. As things stand in GHC you can do some things with functional dependencies that you can't do with type families. The archetypical example is type equality. We cannot write type family Eq a b :: * type instance Eq k k = True type instance Eq j k = False because the two instances overlap. But you can with fundeps class Eq a b c | a b -> c instance Eq k k True instance Eq j k False As Alexey mentioned, fundeps have other disadvantages, so it's reasonable to ask whether type families could extend to handle cases like this.

Fundeps no longer allow this as of GHC 7, it seems. It causes the same fundep violation as the case:

class C a b | a -> b instance C a R instance C T U

You absolutely still can use FunctionalDependencies to determine type equality in GHC 7. For example, I just verified the code below with GHC 7.02: *Main> typeEq True False HTrue *Main> typeEq (1 :: Int) (2 :: Int) HTrue *Main> typeEq (1 :: Int) False HFalse As always, you have to make sure one of the overlapping instances is more specific than the other, which you can do by substituting a parameter c for HFalse in the false case and fixing c to HFalse using another class like TypeCast in the context. (As contexts play no role in instance selection, they don't make the instance any more specific.) While I don't have convenient access to GHC 6 right this second, I'm pretty sure there has been no change for a while, as the HList paper discussed this topic in 2004. David {-# LANGUAGE MultiParamTypeClasses #-} {-# LANGUAGE FlexibleInstances #-} {-# LANGUAGE FlexibleContexts #-} {-# LANGUAGE FunctionalDependencies #-} {-# LANGUAGE OverlappingInstances #-} {-# LANGUAGE UndecidableInstances #-} class TypeCast a b | a -> b where typeCast :: a -> b instance TypeCast a a where typeCast = id data HTrue = HTrue deriving (Show) data HFalse = HFalse deriving (Show) class TypeEq a b c | a b -> c where typeEq :: a -> b -> c instance TypeEq a a HTrue where typeEq _ _ = HTrue instance (TypeCast HFalse c) => TypeEq a b c where typeEq _ _ = typeCast HFalse

dm-list-haskell-prime＠scs.stanford.edu

3:27 p.m.

At Tue, 14 Jun 2011 09:36:41 +0000, Simon Peyton-Jones wrote:

...

5. David wants a wiki page fixed. But which one? And how is it "locked down"?

This page: http://hackage.haskell.org/trac/haskell-prime/wiki/FunctionalDependencies Currently under cons for FunctionalDependencies, it says: AssociatedTypes seem to be more promising. I proposed the following fix: AssociatedTypes seem to be more promising, but in their current form are not as general as FunctionalDependencies [link]. where the link points to this or another email thread. Is there a policy that only a proposal's "owner" can modify the wiki page? Or that you have to be a member of the Haskell' committee? At any rate, when I log into the Haskell' wiki, I don't get an Edit button. That's all I meant by locked down. David

Simon Peyton-Jones

9:52 p.m.

| http://hackage.haskell.org/trac/haskell-prime/wiki/FunctionalDependencies | | Currently under cons for FunctionalDependencies, it says: | | AssociatedTypes seem to be more promising. | | I proposed the following fix: | | AssociatedTypes seem to be more promising, but in their | current form are not as general as FunctionalDependencies | [link]. OK, that one. I've made that change. | Is there a policy that only a proposal's "owner" can modify the wiki | page? Or that you have to be a member of the Haskell' committee? I'm not sure. Malcolm Wallace is chair at the moment; I'm ccing him. Simon

Malcolm Wallace

16 Jun 16 Jun

1:42 p.m.

...

| Is there a policy that only a proposal's "owner" can modify the wiki | page? Or that you have to be a member of the Haskell' committee?

I'm not sure. Malcolm Wallace is chair at the moment; I'm ccing him.

I have no idea: I neither set up the wiki, nor do I have any interesting admin rights for it. Regards, Malcolm

oleg＠okmij.org

15 Jun 15 Jun

1:59 a.m.

I'd like to summarize the relationship between functional dependencies and type functions, and propose a solution that should get rid of overlapping instances. The solution does not require messing with System-FC. In fact, it can be implemented today (although ungainly). A small bit of GHC support will be appreciated. I believe that functional dependencies and type functions are not comparable. There are examples that can't be done (in principle or in practice) with type functions, and there are examples that cannot be implemented with functional dependencies. The example of the latter is http://okmij.org/ftp/Computation/extra-polymorphism.html#injectivity Functional dependencies have the advantage in the following two cases 1. mutual dependencies: class Add x y z | x y -> z, x z -> y, y z -> x I think this example can be emulated with type functions; the emulation didn't work with GHC 6.10, at least. It may work now. 2. combination with overlapping instances The type equality predicate is only one example, there are several of that kind. Just for the record, the following code class Eq a b c | a b -> c instance Eq k k True instance Eq j k False never worked. (GHC may have accepted the above declarations; that doesn't mean much since overlapping was detected lazily). The key insight of HList is the realization why the code above did not work and how to hack around it (TypeCast or ~ is needed). If GHC somehow provided Eq as a built-in, that would not have been enough. We need something like Eq at higher kinds. For example, http://okmij.org/ftp/Haskell/typecast.html#is-function-type That web page gives example of IsList, IsArray and other such predicates. The most practical application of overlapping instances is documented in http://www.haskell.org/pipermail/haskell-cafe/2010-June/078590.html http://www.haskell.org/pipermail/haskell-cafe/2010-June/078739.html The solution has been described already in the HList paper: provide something the typeOf at the type level. That is, assume a type function TypeOf :: * -> TYPEREP where TYPEREP is a family of types that describe the type of TypeOf's argument in the way TypeRep value described the type. The HList paper (Sec 9) has outlined an ugly solution: associate with each type constructor a type-level numeral; TYPEREP is a type-level list then, which contains the code for the head constructor and TYPEREPs of the arguments. TYPEREPs are easily comparable. The HList solution is certainly ugly and non-scalable. The biggest problem is assigning opaque integers to type constructors and ensuring the uniqueness. Once I toyed with the idea of defining a family of types to be the HList of Dot and Dash, so we would spell the name of the constructor in Morse code. I must stress that my proposal is rather timid: Matthias Blume, when designing a new FFI for SML/NJ has lifted the whole latin alphabet to the type level: http://people.cs.uchicago.edu/~blume/papers/nlffi-entcs.pdf so he could spell the names of C struct in SML types. This NLFFI has been used in SML/NJ, in production, for about 10 years. If SML developers could do that, certainly GHC developers could. Persistent rumors hold that someone is working on adding more kinds to GHC. A Kind NAME and a Kind TYPEREP seem good kinds to have (along with naturals, of course). GHC could then provide the type function TypeOf. Once we have a TYPEREP, we can compare and deconstruct it, removing the need for overlapping instances.

Simon Peyton-Jones

10:36 a.m.

| 1. mutual dependencies: | class Add x y z | x y -> z, x z -> y, y z -> x | | I think this example can be emulated with type functions; the | emulation didn't work with GHC 6.10, at least. It may work now. On this point, it doesn't *quite* work yet. The emulation you mention is something like this this: type family Arg1 a2 r :: * type family Arg2 a1 r :: * class (Arg2 x (R x y) ~ y, Arg1 y (R x y) ~ x) => Add x y where add :: x -> y -> R x y type R x y :: * type instance Arg1 Double Double= Int type instance Arg2 Int Double = Doulbe instance Add Int Double Double where add x y = toDouble x + y type R Int Double = Double (The example you give is a bit odd because you specified that the types of any two arguments determine the third, which probably isn't what you want for Add. But no matter.) What we need for this is equality superclasses. I have done all the heavy lifting for this (see our paper "Practical aspects of compiling Haskell with System FC"), but I just need a few hours to get the payoff by switching them on. Coming soon. Simon

oleg＠okmij.org

16 Jun 16 Jun

7:07 a.m.

...

| 1. mutual dependencies: | class Add x y z | x y -> z, x z -> y, y z -> x | I think this example can be emulated with type functions; the | emulation didn't work with GHC 6.10, at least. It may work now.

...

(The example you give is a bit odd because you specified that the types of any two arguments determine the third, which probably isn't what you want for Add. But no matter.)

Actually I meant the triple dependencies, when any two arguments determine the third. That class was implemented in the following http://okmij.org/ftp/Haskell/PeanoArithm.lhs (it was called Sum). The file also demonstrates exponentiation, as a ternary relation where any two arguments determine the third. (Unlike Sum, that relation is partial).

Simon Peyton-Jones

17 Jun 17 Jun

1:21 p.m.

| I'd like to summarize the relationship between functional dependencies | and type functions, and propose a solution that should get rid of | overlapping instances. The solution does not require messing with | System-FC. In fact, it can be implemented today (although | ungainly). A small bit of GHC support will be appreciated. This sounds good. I am keen to identify features that provide the expressiveness that you and David and others want. However my brain is small. Concerning "1. mutual dependencies" I believe that equality superclasses provide the desired expressiveness. The code may not look quite as nice, but equality superclasses (unlike fundeps) will play nicely with GADTs, type families, etc. Do you agree? Concerning "2. combination with overlapping instances", you say "The solution has been described already in the HList paper: provide something the typeOf at the type level. That is, assume a type function "TypeOf :: * -> TYPEREP". By "the HList paper" do you mean http://homepages.cwi.nl/~ralf/HList/? I see no TYPEREP in that paper. Do you think you might perhaps elaborate your proposed solution explicitly? Perhaps saying explicitly: - This is the support we need from GHC - This is how you can then code examples X,Y,Z I know that all of this is embedded variously in your past writings but I'm not very good at finding exactly the right bits and turning them into proposed features! Incidentally Pedro's new "deriving Generic" stuff does derive a kind of type-level type representation for types, I think. It's more or less as described in their paper. http://www.dreixel.net/research/pdf/gdmh_nocolor.pdf Thanks Simon

dm-list-haskell-prime＠scs.stanford.edu

3:17 p.m.

At Fri, 17 Jun 2011 13:21:41 +0000, Simon Peyton-Jones wrote:

...

Concerning "1. mutual dependencies" I believe that equality superclasses provide the desired expressiveness. The code may not look quite as nice, but equality superclasses (unlike fundeps) will play nicely with GADTs, type families, etc. Do you agree?

By equality superclasses, do you just mean being able to say "a ~ b" in a class context? If so, that is basically what you get by enabling fundeps and defining a class such as: class AssertEq a b | a -> b, b -> a instance AssertEq a a Unless I'm missing something, that is not sufficient to do a lot of things I would like to do, as those things require both OverlappingInstances and FunctionalDependencies (as well as UndecidableInstances). A good test would be whether superclasses let you eliminate N^2 mtl instances and do something equivalent to: instance (Monad m) => MonadState s (StateT s m) where get = StateT $ \s -> return (s, s) put s = StateT $ \_ -> return ((), s) instance (Monad (t m), MonadTrans t, MonadState s m) => MonadState s (t m) where get = lift get put = lift . put Superclass equality is crucial for type families, since you can't mention a class's own type function in its head, but are there other things they allow you do to?

...

Concerning "2. combination with overlapping instances", you say "The solution has been described already in the HList paper: provide something the typeOf at the type level. That is, assume a type function "TypeOf :: * -> TYPEREP".

...

Incidentally Pedro's new "deriving Generic" stuff does derive a kind of type-level type representation for types, I think. It's more or less as described in their paper. http://www.dreixel.net/research/pdf/gdmh_nocolor.pdf

I'm very excited about this new Representable class and have just started playing with it, so it is too early for me to say for sure. However, my initial impression is that this is going to make me want OverlappingInstances even more because of all the cool things you can do with recursive algorithms over the Rep types... An initial issue I'm running into (and again, this is preliminary, maybe I'll figure out a way without OverlappingInstances) is in implementing a function to serialize Generic data types to JSON. If the Haskell data type has selectors (i.e., is a record), then I would like to serialize/unserialize the type as a JSON object, where the names of the selectors are the JSON field names. If the Haskell data type does not have selectors, then I would like to serialize it as a JSON array. Generic deriving gives me the conIsRecord function, but this is at the value level, and I want to do something different at the type level. For a record, I need a list of (String, Value) pairs, while for a non-record, I need a list of Values. This leads me to write code such as the following: class JSON1 a r | a -> r toJSON1 :: a -> r instance (JSON a) => JSON1 (S1 NoSelector (K1 c a)) [Value] where toJSON1 (M1 (K1 a)) = [toJSON a] instance (Selector x, JSON a) => JSON1 (S1 x (K1 c a)) [(String, Value)] where toJSON1 s@(M1 (K1 a)) = [(nameOf s undefined, toJSON a)] where nameOf :: S1 c f p -> c -> String nameOf _ c -> selName c The key piece of magic I need here (and in so many other places) is to be able to do one thing at the type level if x is a particular type (e.g., NoSelector) or sometimes one of a small number of types, and to do something else if x is any other type. Closed type families might do it. One question I should think about is whether a combination of open type families with safe dynamic loading (if this is possible) and closed type families would cover everything I need. You'd need to be able to do things like: -- Type-level if-then-else closed type family IfEq :: * -> * -> * -> * -> * type instance IfEq a b c d = d type instance IfEq a a c d = c You would also need to be able to dangle closed type families off of open ones. (I.e., "type instance Foo ProxyType = MyClosedTypeFamily") I also can't quite figure out how to pass around ad-hoc polymorphic functions the way you can with proxy types and a class such as: class Function f a b | f a -> b where funcall :: f -> a -> b David

Simon Peyton-Jones

3:28 p.m.

| By equality superclasses, do you just mean being able to say "a ~ b" | in a class context? Yes. Or (F a ~ b). | Unless I'm missing something, that is not sufficient to do a lot of | things I would like to do, as those things require both | OverlappingInstances and FunctionalDependencies (as well as | UndecidableInstances). Correct. Hence Oleg's second point "2. combination with overlapping instances". Oleg claims to have a good story here. I'd like to see how he uses it to solve your problem. Simon

José Pedro Magalhães

20 Jun 20 Jun

7:57 a.m.

Hi all,

...

...
Concerning "2. combination with overlapping instances", you say "The solution has been described already in the HList paper: provide something the typeOf at the type level. That is, assume a type function "TypeOf :: * -> TYPEREP".

...

Incidentally Pedro's new "deriving Generic" stuff does derive a kind of type-level type representation for types, I think. It's more or less as described in their paper. http://www.dreixel.net/research/pdf/gdmh_nocolor.pdf

I'm very excited about this new Representable class and have just started playing with it, so it is too early for me to say for sure. However, my initial impression is that this is going to make me want OverlappingInstances even more because of all the cool things you can do with recursive algorithms over the Rep types...

Yes, most likely. I have a package defining a few generic functions and showing some example uses ( http://hackage.haskell.org/package/generic-deriving). There I use OverlappingInstances (and even UndecidableInstances).

...

An initial issue I'm running into (and again, this is preliminary, maybe I'll figure out a way without OverlappingInstances) is in implementing a function to serialize Generic data types to JSON. If the Haskell data type has selectors (i.e., is a record), then I would like to serialize/unserialize the type as a JSON object, where the names of the selectors are the JSON field names. If the Haskell data type does not have selectors, then I would like to serialize it as a JSON array.

Generic deriving gives me the conIsRecord function, but this is at the value level, and I want to do something different at the type level. For a record, I need a list of (String, Value) pairs, while for a non-record, I need a list of Values. This leads me to write code such as the following:

class JSON1 a r | a -> r toJSON1 :: a -> r

instance (JSON a) => JSON1 (S1 NoSelector (K1 c a)) [Value] where toJSON1 (M1 (K1 a)) = [toJSON a]

instance (Selector x, JSON a) => JSON1 (S1 x (K1 c a)) [(String, Value)] where toJSON1 s@(M1 (K1 a)) = [(nameOf s undefined, toJSON a)] where nameOf :: S1 c f p -> c -> String nameOf _ c -> selName c

The key piece of magic I need here (and in so many other places) is to be able to do one thing at the type level if x is a particular type (e.g., NoSelector) or sometimes one of a small number of types, and to do something else if x is any other type.

Right. I think this is often essential. Cheers, Pedro

dm-list-haskell-prime＠scs.stanford.edu

5:28 p.m.

At Mon, 20 Jun 2011 09:57:38 +0200, José Pedro Magalhães wrote:

...

class JSON1 a r | a -> r toJSON1 :: a -> r

instance (JSON a) => JSON1 (S1 NoSelector (K1 c a)) [Value] where toJSON1 (M1 (K1 a)) = [toJSON a]

instance (Selector x, JSON a) => JSON1 (S1 x (K1 c a)) [(String, Value)] where toJSON1 s@(M1 (K1 a)) = [(nameOf s undefined, toJSON a)] where nameOf :: S1 c f p -> c -> String nameOf _ c -> selName c

The key piece of magic I need here (and in so many other places) is to be able to do one thing at the type level if x is a particular type (e.g., NoSelector) or sometimes one of a small number of types, and to do something else if x is any other type.

Right. I think this is often essential.

One thing you could do to help in this specific case would be to use a different M1 tag--e.g., M1 S ... for selectors and M1 NS ... for fields without selectors (or K1 NS). I presume you've already considered this and/or it's too late to make such a change. (Or to move the distinction up to the constructor with two different constructor tags, CR and CN for record and no-record.) Anyway, as you mention, there are other situations where you still need Overlapping/UndecidableInstances. David

Simon Peyton-Jones

21 Jun 21 Jun

7:34 a.m.

| One thing you could do to help in this specific case would be to use a | different M1 tag--e.g., M1 S ... for selectors and M1 NS ... for | fields without selectors (or K1 NS). I presume you've already | considered this and/or it's too late to make such a change. (Or to | move the distinction up to the constructor with two different | constructor tags, CR and CN for record and no-record.) I don't think it's too late to make a change. The stuff has only just gone in, so it's still very malleable. There may be other considerations, but legacy code isn't one of them! Simon

José Pedro Magalhães

8:01 a.m.

Hi, 2011/6/21 Simon Peyton-Jones

...

| One thing you could do to help in this specific case would be to use a | different M1 tag--e.g., M1 S ... for selectors and M1 NS ... for | fields without selectors (or K1 NS). I presume you've already | considered this and/or it's too late to make such a change. (Or to | move the distinction up to the constructor with two different | constructor tags, CR and CN for record and no-record.)

I don't think it's too late to make a change. The stuff has only just gone in, so it's still very malleable. There may be other considerations, but legacy code isn't one of them!

Simon Peyton-Jones

8:08 a.m.

I suppose that could be changed, yes, but what exactly are we trying to solve here? One can already specify different behavior for constructors with/without named fields. Are we trying to avoid OverlappingInstances? Then yes, this might help, but I'm not sure this change alone would make all generic programming possible without OverlappingInstances. To be clear, I wasn't advocating a change, just saying that there's no GHC-HQ imperative to avoid them. Simon From: José Pedro Magalhães [mailto:jpm@cs.uu.nl] Sent: 21 June 2011 09:01 To: Simon Peyton-Jones Cc: David Mazieres expires 2011-07-21 PDT; oleg@okmij.org; ccshan@rutgers.edu; Dimitrios Vytiniotis; haskell-prime@haskell.org Subject: Re: TypeFamilies vs. FunctionalDependencies & type-level recursion Hi, 2011/6/21 Simon Peyton-Jones mailto:simonpj@microsoft.com> | One thing you could do to help in this specific case would be to use a | different M1 tag--e.g., M1 S ... for selectors and M1 NS ... for | fields without selectors (or K1 NS). I presume you've already | considered this and/or it's too late to make such a change. (Or to | move the distinction up to the constructor with two different | constructor tags, CR and CN for record and no-record.) I don't think it's too late to make a change. The stuff has only just gone in, so it's still very malleable. There may be other considerations, but legacy code isn't one of them! I suppose that could be changed, yes, but what exactly are we trying to solve here? One can already specify different behavior for constructors with/without named fields. Are we trying to avoid OverlappingInstances? Then yes, this might help, but I'm not sure this change alone would make all generic programming possible without OverlappingInstances. (Also, I always thought UndecidableInstances were "more evil", in some sense, and this change does nothing to remove the use of UndecidableInstances for generic programming.) Cheers, Pedro

José Pedro Magalhães

10:18 a.m.

Hi, 2011/6/21 Simon Peyton-Jones

...

I suppose that could be changed, yes, but what exactly are we trying to solve here? One can already specify different behavior for constructors with/without named fields. Are we trying to avoid OverlappingInstances? Then yes, this might help, but I'm not sure this change alone would make all generic programming possible without OverlappingInstances.****

** **

To be clear, I wasn’t advocating a change, just saying that there’s no GHC-HQ imperative to avoid them.****

Ah, right. I would also be happy to help improving the current generics mechanism. Also, reading up on OverlappingInstances in the User's Guidehttp://www.haskell.org/ghc/docs/latest/html/users_guide/type-class-extension...(namely the incoherence example without IncoherentInstances) made me see the evilness of the current implementation. But I would only favor deprecating OverlappingInstances when there is a clear alternative supporting (and potentially improving) the current uses for OverlappingInstances. Pedro

...

Simon****

** **

*From:* José Pedro Magalhães [mailto:jpm@cs.uu.nl] *Sent:* 21 June 2011 09:01 *To:* Simon Peyton-Jones *Cc:* David Mazieres expires 2011-07-21 PDT; oleg@okmij.org; ccshan@rutgers.edu; Dimitrios Vytiniotis; haskell-prime@haskell.org

*Subject:* Re: TypeFamilies vs. FunctionalDependencies & type-level recursion****

** **

Hi,****

2011/6/21 Simon Peyton-Jones ****

| One thing you could do to help in this specific case would be to use a | different M1 tag--e.g., M1 S ... for selectors and M1 NS ... for | fields without selectors (or K1 NS). I presume you've already | considered this and/or it's too late to make such a change. (Or to | move the distinction up to the constructor with two different | constructor tags, CR and CN for record and no-record.)****

I don't think it's too late to make a change. The stuff has only just gone in, so it's still very malleable. There may be other considerations, but legacy code isn't one of them!****

I suppose that could be changed, yes, but what exactly are we trying to solve here? One can already specify different behavior for constructors with/without named fields. Are we trying to avoid OverlappingInstances? Then yes, this might help, but I'm not sure this change alone would make all generic programming possible without OverlappingInstances.

(Also, I always thought UndecidableInstances were "more evil", in some sense, and this change does nothing to remove the use of UndecidableInstances for generic programming.)

Cheers, Pedro****

** **

dm-list-haskell-prime＠scs.stanford.edu

4:41 p.m.

At Tue, 21 Jun 2011 10:01:24 +0200, José Pedro Magalhães wrote:

...

| One thing you could do to help in this specific case would be to use a | different M1 tag--e.g., M1 S ... for selectors and M1 NS ... for | fields without selectors (or K1 NS). I presume you've already | considered this and/or it's too late to make such a change. (Or to | move the distinction up to the constructor with two different | constructor tags, CR and CN for record and no-record.)

I don't think it's too late to make a change. The stuff has only just gone in, so it's still very malleable. There may be other considerations, but legacy code isn't one of them!

I suppose that could be changed, yes, but what exactly are we trying to solve here? One can already specify different behavior for constructors with/without named fields. Are we trying to avoid OverlappingInstances? Then yes, this might help, but I'm not sure this change alone would make all generic programming possible without OverlappingInstances.

Sorry, I wasn't necessarily advocating the change. This came out of a long discussion of UndecidableInstances. Oleg's TYPEOF approach is one way to avoid them (or at least to avoid them in conjunction with OverlappingInstances). Before Oleg posted his examples, Simon asked if it might be possible to use the new Generic class in a similar way. I was pointing out that no you can't, though the specific example I gave could be accommodated by a small change to the Generic interface. David

José Pedro Magalhães

22 Jun 22 Jun

9:08 a.m.

Hi David, 2011/6/21

...

At Tue, 21 Jun 2011 10:01:24 +0200, José Pedro Magalhães wrote:

...
| One thing you could do to help in this specific case would be to

use a

...
| different M1 tag--e.g., M1 S ... for selectors and M1 NS ... for | fields without selectors (or K1 NS). I presume you've already | considered this and/or it's too late to make such a change. (Or to | move the distinction up to the constructor with two different | constructor tags, CR and CN for record and no-record.)

I don't think it's too late to make a change. The stuff has only

just

...
gone in, so it's still very malleable. There may be other

considerations,

...
but legacy code isn't one of them!

I suppose that could be changed, yes, but what exactly are we trying to

solve

...
here? One can already specify different behavior for constructors with/without named fields. Are we trying to avoid OverlappingInstances? Then yes, this might help, but I'm not sure this change alone would make all generic programming possible without OverlappingInstances.

Sorry, I wasn't necessarily advocating the change. This came out of a long discussion of UndecidableInstances. Oleg's TYPEOF approach is one way to avoid them (or at least to avoid them in conjunction with OverlappingInstances). Before Oleg posted his examples, Simon asked if it might be possible to use the new Generic class in a similar way. I was pointing out that no you can't, though the specific example I gave could be accommodated by a small change to the Generic interface.

Ah, I see. Oleg's TYPEOF encodes only the type arguments to a datatype, whereas Rep encodes the structure of the datatype. As far as I can tell, TYPEOF never depends on the right-hand side of the datatype. A simple example is some datatype with a phantom type, say `data P a = P`. TYPEOF will encode the `a`, whereas Rep will not. So now it is very clear to me that you cannot get the TYPEOF if you have the Rep, in general. Cheers, Pedro

...

David

oleg＠okmij.org

21 Jun 21 Jun

7:35 a.m.

I have implemented type-level TYPEREP (along with a small library for higher-order functional programming at the type level). Overlapping instances may indeed be avoided. The library does not use functional dependencies either. http://okmij.org/ftp/Haskell/TTypeable/ David Mazi`eres suggested a good test

...

A good test would be whether superclasses let you eliminate N^2 mtl instances and do something equivalent to:

instance (Monad m) => MonadState s (StateT s m) where get = StateT $ \s -> return (s, s) put s = StateT $ \_ -> return ((), s)

instance (Monad (t m), MonadTrans t, MonadState s m) => MonadState s (t m) where get = lift get put = lift . put

Exactly this code is implemented in Example.hs. Here's an excerpt:

...

-- Default instance

...

instance (Monad (t m), MonadState m, MonadTrans t) => MonadState' (t m) HFalse where type MState' (t m) HFalse = MState m get' _ = trace "Default get" $ lift get put' _ = lift . put

-- Special instances

instance (Monad m) => MonadState' (StateT s m) HTrue where type MState' (StateT s m) HTrue = s get' _ = trace "Special get" . StateT $ \s -> return (s, s) put' _ s = StateT $ \_ -> return ((), s)

-- add more special instances if needed ...

plus one more general dispatching instance. Because of the additional flag, HTrue vs HFalse, the above instances do not overlap.

...

The key piece of magic I need here (and in so many other places) is to be able to do one thing at the type level if x is a particular type (e.g., NoSelector) or sometimes one of a small number of types, and to do something else if x is any other type.

TTypeable.hs does exactly that.

...

You'd need to be able to do things like:

-- Type-level if-then-else closed type family IfEq :: * -> * -> * -> * -> * type instance IfEq a b c d = d type instance IfEq a a c d = c

The code does that.

...

I also can't quite figure out how to pass around ad-hoc polymorphic functions the way you can with proxy types and a class such as:

class Function f a b | f a -> b where funcall :: f -> a -> b

And that too. Although the code is usable already, a bit of sugaring from GHC would greatly help. At the very least, automatic deriving of TYPEOF. A longer wish is for a a built-in equality comparison on TyCon_name. Perhaps we might wish to begin to consider if overlapping instances could be deprecated?

José Pedro Magalhães

8:01 a.m.

Hi Oleg, On Tue, Jun 21, 2011 at 09:35, wrote:

...

I have implemented type-level TYPEREP (along with a small library for higher-order functional programming at the type level). Overlapping instances may indeed be avoided. The library does not use functional dependencies either.

http://okmij.org/ftp/Haskell/TTypeable/

At a first glance, I think your TYPEOF encodes less information about types than the new generics Rep. By "less" I mean that one could get that information from Rep (which is derived automatically). Our generics do not, however, encode a type-level natural per type. This greatly simplifies type-level type equality. But I am still doubtful of the runtime performance of your code; also, the need to define a duplicate of the class (MonadState' in your example) bloats the code significantly. Cheers, Pedro

oleg＠okmij.org

22 Jun 22 Jun

7:36 a.m.

...

the need to define a duplicate of the class (MonadState' in your example) bloats the code significantly.

I'm quite puzzled at the statement. Is there really significant bloat? Let us count. As the first example, let's take class C a with N special instances, mutually non-overlapping instances, and one catch-all instance "instance C a". We have N+1 instances total. With the technique in the earlier message, we define an auxiliary class C' a flag with N special and one catch-all instance, all non-overlapping. We then define one instance of the original class C, performing the dispatch. The additional cost: one class declaration and one instance. Is one extra class and one instance considered a bloat? The extra class is systematically derived from the original one (so, one could get TH to do the deriving). Let's take a more elaborate example, when all instances overlap, as in class C a instance C [Int] instance C [a] instance C a the flag is not a simple boolean then. We may introduce a special TYPEREP ANY, and a special TYPEREP comparison function that makes ANY equal to anything. We will introduce a type family Find that returns the found element in a list. We will use TYPEREP themselves (with ANY, as appropriate) to dis-overlap the instances. Although details differ slightly, the approach applies. The extra cost: one instance and one class declaration, C'.

...

But I am still doubtful of the runtime performance of your code

I believe discussing performance may be a bit premature; keeping in mind that GHC's support for some of the comparisons may notably affect the performance. There have been already plans to add Nat kinds to GHC, presumably with an efficient comparison function.

José Pedro Magalhães

9:08 a.m.

Hi Oleg, On Wed, Jun 22, 2011 at 09:36, wrote:

...

...
the need to define a duplicate of the class (MonadState' in your example) bloats the code significantly.

I'm quite puzzled at the statement. Is there really significant bloat? Let us count.

As the first example, let's take class C a with N special instances, mutually non-overlapping instances, and one catch-all instance "instance C a". We have N+1 instances total.

With the technique in the earlier message, we define an auxiliary class C' a flag with N special and one catch-all instance, all non-overlapping. We then define one instance of the original class C, performing the dispatch. The additional cost: one class declaration and one instance. Is one extra class and one instance considered a bloat? The extra class is systematically derived from the original one (so, one could get TH to do the deriving).

My problem is exactly with this unnecessary repetition. If the extra class can be systematically derived from the original code, then it shouldn't have to be specified. All I'm saying is that if we would want to have such a mechanism for replacing OverlappingInstances, then we should also come up with a way (possibly even involving new syntax) to make it intuitive to use. (We've done something similar for the new generics stuff, by introducing DefaultSignatures.)

...

...
But I am still doubtful of the runtime performance of your code

I believe discussing performance may be a bit premature;

Perhaps, but this was prompted by your suggestion of considering deprecating OverlappingInstances. I have lots of code that relies on it, and I wouldn't want it to start triggering warnings just yet :-) Cheers, Pedro

dm-list-haskell-prime＠scs.stanford.edu

21 Jun 21 Jun

5:37 p.m.

At Tue, 21 Jun 2011 00:35:46 -0700 (PDT), oleg@okmij.org wrote:

...

I have implemented type-level TYPEREP (along with a small library for higher-order functional programming at the type level). Overlapping instances may indeed be avoided. The library does not use functional dependencies either.

http://okmij.org/ftp/Haskell/TTypeable/

This is pretty cool. One question I have is why you need UndecidableInstances. If we got rid of the coverage condition, would your code be able to work without relying on contexts for instance selection? Now I understand the reference to the ML paper. If you were to implement this in GHC you would encode the TC_code as the packge, module, and type name, letter by letter? (Or bit by bit since symbols can contain unicode?) Or could you use interface hashes (or whatever those hex numbers are when you run ghc-pkg -v)? How would you make this safe for dynamic loading? Would you have to keep track of all the package/module names that have been linked into the program and/or loaded, and not allow duplicates? Is dynamic unloading of code ever possible? Or at least is re-loading possible, since that appears to be key for debugging web servers and such? Finally, how do you express constraints in contexts using type families? For example, say I wanted to implement folds over tuples. With fundeps (and undecidable instances, etc.), I would use a proxy type of class Function2 to represent the function: {-# LANGUAGE MultiParamTypeClasses #-} {-# LANGUAGE FunctionalDependencies #-} {-# LANGUAGE FlexibleInstances #-} {-# LANGUAGE UndecidableInstances #-} class Function2 f a b r | f a b -> r where funcall2 :: f -> a -> b -> r -- An example, to show something of class Show and put the result in -- a list endo: data PolyShows = PolyShows instance (Show a) => Function2 PolyShows ([[Char]] -> [[Char]]) a ([[Char]] -> [[Char]]) where funcall2 _ start a = start . (show a :) -- Define a class for folds, as well as an instance for each tuple size: class TupleFoldl f z t r | f z t -> r where tupleFoldl :: f -> z -> t -> r instance TupleFoldl f z () z where tupleFoldl _ z _ = z instance (Function2 f z v0 r1, Function2 f r1 v1 r2) => TupleFoldl f z (v0,v1) r2 where tupleFoldl f z (v0,v1) = funcall2 f (funcall2 f z v0) v1 instance (Function2 f z v0 r1, Function2 f r1 v1 r2, Function2 f r2 v2 r3) => TupleFoldl f z (v0,v1,v2) r3 where tupleFoldl f z (v0,v1,v2) = funcall2 f (funcall2 f (funcall2 f z v0) v1) v2 -- -- ... and so on ... Now I can run:

...

...
...
tupleFoldl PolyShows (id::[String]->[String]) (1,2) [] ["1","2"]

In your case, how do I define an Apply instance equivalent to PolyShows? Moreover, the fundep code above depends on the Function2 constraints to make things work out correctly. It also has the nice property that ghc can figure out many of the types automatically. You have to specify id's type, but ghc figures out the type of [], figures out the return type, and deals with 1 and 2 being (Num a => a). Now is it the case that to do this with TYPEREP, you have to add an internal class to express the constraints, and just pass extra arguments to the class in a wrapper function? Presumably you then have a bunch of ~ constraints. Does the type inference work as well? It's a bit more typing, but if you can do it without undecidable instances, or even with undecidable instances but keeping a bounded context stack depth, then it would definitely be worth it. With fundeps and undecidable instances, if I use the default context stack depth of 21, my left and right folds crap out at 10 and 13 element tuples, respectively. David

oleg＠okmij.org

23 Jun 23 Jun

7:40 a.m.

...

One question I have is why you need UndecidableInstances. If we got rid of the coverage condition, would your code be able to work without relying on contexts for instance selection?

UndecidableInstances is the name for a conglomerate of various relaxations -- the fact that causes discord since we may agree about some relaxations and disagree about the others. One can see that all type functions (except the general Apply) are in fact terminating. For example, the termination of TREPEQ and TREPEQL can be seen by structural induction. However, since these two functions are mutually recursive, a termination checker, in my experience, will likely to have hard time seeing that. Let alone GHC (which isn't a termination checker to start with). In short, I need UndecidableInstances since GHC will not accept code without the flag since it can't see type functions terminating.

...

If you were to implement this in GHC you would encode the TC_code as the packge, module, and type name, letter by letter? (Or bit by bit since symbols can contain unicode?)

Indeed. It crossed my mind to make a joke that GHC ought to best SML by lifting the whole Unicode to the type level.

...

How would you make this safe for dynamic loading? Would you have to keep track of all the package/module names that have been linked into the program and/or loaded, and not allow duplicates? Is dynamic unloading of code ever possible? Or at least is re-loading possible, since that appears to be key for debugging web servers and such?

I must read up on Andreas Rossberg's work and check AliceML (which is, alas, no longer developed) to answer the questions on safe dynamic loading. AliceML is probably the most ambitious system to support type-safe loading. See, for example http://www.mpi-sws.org/~rossberg/papers/Rossberg%20-%20The%20Missing%20Link.... and other papers on his web page. He now works at Google, btw. One comment about unloading: we've `dealt' with that issue in MetaOCaml by not doing anything about it, letting the garbage code accumulate. It is hard to know when all the code pointers to a DLL are gone. GC do not usually track code pointers. One has to modify GC, which is quite an undertaking. I suppose one may hack around by registering each (potential) entry point as a ForeignPointer, and do reference counting in a finalizer. Since a DLL may produce lots of closures, each code pointer in those closures should be counted as an entry point.

...

Finally, how do you express constraints in contexts using type families? For example, say I wanted to implement folds over tuples. With fundeps (and undecidable instances, etc.), I would use a proxy type of class Function2 to represent the function:

Your code can be re-written to use type functions almost mechanically (the method definitions stay the same -- modulo uncurrying, which I used for generality). I admit there is a bit of repetition (repeating the instance head). Perhaps a better syntax could be found. {-# LANGUAGE MultiParamTypeClasses #-} {-# LANGUAGE FlexibleInstances #-} {-# LANGUAGE TypeFamilies #-} {-# LANGUAGE UndecidableInstances #-} module PolyShow where -- Higher-order type-level functions class Apply f a where type ApplyR f a :: * apply :: f -> a -> ApplyR f a -- An example, to show something of class Show and put the result in -- a list endo: data PolyShows = PolyShows instance (Show a) => Apply PolyShows ([[Char]] -> [[Char]], a) where type ApplyR PolyShows ([[Char]] -> [[Char]], a) = [[Char]] -> [[Char]] apply _ (start,a) = start . (show a :) -- Define a class for folds, as well as an instance for each tuple size: class TupleFoldl f z t where type TupleFoldlR f z t tupleFoldl :: f -> z -> t -> TupleFoldlR f z t instance TupleFoldl f z () where type TupleFoldlR f z () = z tupleFoldl _ z _ = z instance (Apply f (ApplyR f (z, v0), v1), Apply f (z, v0)) => TupleFoldl f z (v0,v1) where type TupleFoldlR f z (v0,v1) = ApplyR f (ApplyR f (z,v0),v1) tupleFoldl f z (v0,v1) = apply f ((apply f (z,v0)), v1) test = tupleFoldl PolyShows (id::[String]->[String]) (1,2) [] -- ["1","2"] And here is a bonus data PolyShows' = PolyShows' instance (Show a, t ~ ([String] -> [String])) => Apply PolyShows' (t, a) where type ApplyR PolyShows' (t, a) = [[Char]] -> [[Char]] apply _ (start,a) = start . (show a :) test' = tupleFoldl PolyShows' id (1,2) [] -- ["1","2"] The type annotation on id is no longer needed.

...

With fundeps and undecidable instances, if I use the default context stack depth of 21, my left and right folds crap out at 10 and 13 element tuples, respectively.

When I generate my web pages with HSXML, I set the context depth to 180.

dm-list-haskell-prime＠scs.stanford.edu

25 Jun 25 Jun

5:08 p.m.

At Thu, 23 Jun 2011 00:40:38 -0700 (PDT), oleg@okmij.org wrote:

...

...
How would you make this safe for dynamic loading? Would you have to keep track of all the package/module names that have been linked into the program and/or loaded, and not allow duplicates? Is dynamic unloading of code ever possible? Or at least is re-loading possible, since that appears to be key for debugging web servers and such?

I must read up on Andreas Rossberg's work and check AliceML (which is, alas, no longer developed) to answer the questions on safe dynamic loading. AliceML is probably the most ambitious system to support type-safe loading. See, for example

http://www.mpi-sws.org/~rossberg/papers/Rossberg%20-%20The%20Missing%20Link....

and other papers on his web page. He now works at Google, btw.

I wish I knew ML better, but from looking at that paper, I can't figure out the key piece of information, which is how to detect overlapping type instance declarations. (Does ML even have something equivalent?) In the absence of type families, I'm okay using something like haskell-plugins that returns Dynamic. In the absence of type families, Safe Haskell ought to be able to guarantee that typereps are all unique, making cast safe in such a situation. (Even if somehow two different pieces of code end up with conflicting functional dependencies, this may muck with the expected behavior of the program, as I get a different method from the one I was expecting, but it won't undermine type safety.) If there were a way to use Baars & Swiestra instead of Dynamic, that would be even better. With type families, I'm at a loss to figure out how this could even work. You would need to know have a runtime representation of every type family that's ever been instantiated, and you would need to check this against all the type families in the dynamic module you are loading. That seems very complicated.

...

MetaOCaml by not doing anything about it, letting the garbage code accumulate. It is hard to know when all the code pointers to a DLL are gone. GC do not usually track code pointers. One has to modify GC, which is quite an undertaking. I suppose one may hack around by registering each (potential) entry point as a ForeignPointer, and do reference counting in a finalizer. Since a DLL may produce lots of closures, each code pointer in those closures should be counted as an entry point.

It would be very cool to GC the code, but that's more than I'm hoping for.

...

...
Finally, how do you express constraints in contexts using type families? For example, say I wanted to implement folds over tuples. With fundeps (and undecidable instances, etc.), I would use a proxy type of class Function2 to represent the function:

Your code can be re-written to use type functions almost mechanically (the method definitions stay the same -- modulo uncurrying, which I used for generality). I admit there is a bit of repetition (repeating the instance head). Perhaps a better syntax could be found.

This is great. And in fact I was almost sold on your idea, until I realized that GHC does not accept the following program: {-# LANGUAGE TypeFamilies #-} x :: () x = const () (eq a b) where a :: S(S(S(S(S(S(S(S(S(S(S(S(S(S(S(S(S(S(S(S(S(Z))))))))))))))))))))) a = undefined b :: S(S(S(S(S(S(S(S(S(S(S(S(S(S(S(S(S(S(S(S(S(Z))))))))))))))))))))) b = undefined eq :: a -> b -> NatEq a b eq _ _ = undefined Notice that there are no undecidable instances required to compile this function (or the portions of your TTypeable that this function depends on). Yet GHC is still limiting my context stack (to 21 by default). Obviously any kind of unicode encoding of type names is going to run into the same sort of problem. So we seem to have come full circle: Undecidable instances are problematic because program compilation depends on this arbitrary context stack depth parameter. One way to avoid UndecidableInstances is to assign a unique type-level Nat to each type. But now the comparison of types is depending on this context stack and likely worsening the problem. A few thoughts: First, obviously if we went base 2 instead of unary for the encoding, types would get a lot shorter. If we took a SHA-256 hash of the type name and encoded it, we could then pick some default context depth (e.g., 21 + 256) that would make this work most of the time. Second, I believe it is kind of unfair of GHC to reject the above program. Without UndecidableInstances, GHC ought to be able to figure out that the type checker will terminate. Moreover, in the real world, GHC only has a finite amount of time and memory, so there is no difference between divergence and really big types. I can already crash GHC by feeding it a doubly-exponential-sized type such as: a = let f0 x = (x, x) f1 x = f0 (f0 x) f2 x = f1 (f1 x) f3 x = f2 (f2 x) f4 x = f3 (f3 x) f5 x = f4 (f4 x) in f5 () So given that decidable inputs already exist on which GHC wedges my machine for a while and then crashes, what's the point of artificially limiting compilation of other types of programs whose inputs aren't known to be decidable, since often they are in fact decidable? Finally, I still think most of the "magic" in everything we've been talking about boils down to being able to have a type variable that can take on any type *except* a certain handful. This would allow us to avoid overlapping instances of both classes and type families, would allow us to define TypeEq, etc. Moreover, it would be a more direct expression of what programmers intend, and so make code easier to read. Such a constraint cannot be part of the context, because instance selection would need to depend on it, so some new syntax would be necessary. David

oleg＠okmij.org

5 Jul 5 Jul

4:47 a.m.

Sorry for the late reply.

...

GHC does not accept the following program: {-# LANGUAGE TypeFamilies #-} x :: () x = const () (eq a b) where a :: S(S(S(S(S(S(S(S(S(S(S(S(S(S(S(S(S(S(S(S(S(Z))))))))))))))))))))) a = undefined b :: S(S(S(S(S(S(S(S(S(S(S(S(S(S(S(S(S(S(S(S(S(Z))))))))))))))))))))) b = undefined eq :: a -> b -> NatEq a b eq _ _ = undefined

Notice that there are no undecidable instances required to compile this function (or the portions of your TTypeable that this function depends on). Yet GHC is still limiting my context stack (to 21 by default). Obviously any kind of unicode encoding of type names is going to run into the same sort of problem.

I would call this a bug, a recent one. Old versions of GHC (before 6.10, I think) imposed no context stack restrictions on programs with decidable instances. Such a restriction is a recent addition (perhaps introduced as a safe-guard since it was discovered that the old versions of GHC did not properly implement the coverage condition). I would recommend to file this as a bug, including the sample code above. Your old message argues well why this new behavior should be considered as a bug.

...

But now the comparison of types is depending on this context stack and likely worsening the problem.

I agree that dependence on the arbitrary limit is unsatisfactory. That is why I was hoping that Nat kinds will eventually make their way into GHC (there are many arguments for their inclusion, and some people are reportedly have been working on them).

...

Finally, I still think most of the "magic" in everything we've been talking about boils down to being able to have a type variable that can take on any type *except* a certain handful. This would allow us to avoid overlapping instances of both classes and type families, would allow us to define TypeEq, etc. Moreover, it would be a more direct expression of what programmers intend, and so make code easier to read. Such a constraint cannot be part of the context, because

Alas, such `type variables' with inequality constraints are quite complex, and the consequences of their introduction are murky. Let me illustrate. First of all, the constraint /~ (to be used as t1 /~ Int) doesn't help to define TypeEq, etc. because constraints are not used when selecting an instance. Instances are selected only by matching a type to the instance head. Instance selection based on constraints is quite a change in the type checker, and is unlikely to be implemented. Let us see how the selection based on mismatch could be implemented. To recall, the constraint "t1 ~ t2" means that there exists a substitution on free type variables such that its application to t1 and t2 makes the types identical. In symbols, \exists\sigma. t1\sigma = t2\sigma We call the type t matches the instance whose head is th if \exists\sigma. th\sigma = t Now, we wish to define selection of an instance based on dis-equality. We may say, for example, that the instance with the head th is selected for type t if \not\exists\sigma. th\sigma = t In other words, we try to match t against th. On mismatch, we select the instance. Let us attempt to define TypeEq class TypeEq a b c | a b -> c instance TypeEq x x True instance TypeEq x (NOT x) False and resolve (TypeEq Int Bool x). We start with Bool: Bool matches x, so the second instance cannot be selected. The first instance can't be selected either. Thus, we fail. One may say: we should have started by matching Int first, which would instantiate x. Matching Bool against so instantiated x will fail, and the second instance will be selected. The dependence on the order of matching leaves a bad taste. It is not clear that there is always some order; it is not clear how difficult it is to find one. I submit that introducing disequality selection is quite a subtle matter. The TTypeable approach was designed to avoid large changes in the type checker. If you attend the Haskell implementors workshop, you might wish to consider giving a talk about overlapping instances and the ways to get around them or implement properly. About dynamic loading [perhaps this should be moved in a separate thread?]

...

I wish I knew ML better, but from looking at that paper, I can't figure out the key piece of information, which is how to detect overlapping type instance declarations. (Does ML even have something equivalent?)

ML does not have type-classes; they can easily be emulated via dictionary passing. Also, local open (recently introduced in OCaml) suffices quite frequently, as it turns out. I admit though I don't fully understand the problem:

...

In the absence of type families, I'm okay using something like haskell-plugins that returns Dynamic. ... With type families, I'm at a loss to figure out how this could even work. You would need to know have a runtime representation of every type family that's ever been instantiated, and you would need to check

Type instance selection (including type family applications) are all performed at compile time. At run-time, when a plug-in is loaded, no instance selection is performed. If you wish to adjust the interfaces of the plug-in based on the host environment, you have to program this directly (by passing dictionaries explicitly). This is not very difficult. Incidentally, several approaches to `typed compilation' (or, de-serialization of typed code) are described at http://okmij.org/ftp/tagless-final/tagless-typed.html#typed-compilation illustrating how to implement a simple type-checker, to run alongside the serializer.

sanzhiyan＠gmail.com

21 Jun 21 Jun

6:03 p.m.

On , oleg@okmij.org wrote:

...

[...] Exactly this code is implemented in Example.hs. Here's an excerpt:

...

...
-- Default instance

...

...
instance (Monad (tm), MonadState m, MonadTrans t)

...

...
=> MonadState' (tm) HFalse where

...

...
type MState' (tm) HFalse = MState m

...

...
get' _ = trace "Default get" $ lift get

...

...
put' _ = lift . put

...

...

...

...
-- Special instances

...

...

...

...
instance (Monad m)

...

...
=> MonadState' (StateT sm) HTrue where

...

...
type MState' (StateT sm) HTrue = s

...

...
get' _ = trace "Special get" . StateT $ \s -> return (s, s)

...

...
put' _ s = StateT $ \_ -> return ((), s)

...

...

...

...
-- add more special instances if needed ...

...

plus one more general dispatching instance. Because of the additional

...

flag, HTrue vs HFalse, the above instances do not overlap.

Can you add more special instances for types that match (tm) without using OverlappingInstances and without modifying the instances you have above?

oleg＠okmij.org

23 Jun 23 Jun

8:01 a.m.

...

Can you add more special instances for types that match (tm) without using OverlappingInstances and without modifying the instances you have above?

Excellent question, thank you for noticing. The problem with overlapping instances is that they are ill-founded: one could potentially add more and more special instances: a > [a] > [[a]] > [[[a]]] > .... GHC could be coerced to choose an instance early (using various quantification tricks); when a more specialized instances shows up in another module, which eventually gets linked in into the whole program, trouble could happen. With type functions, the trouble does happen, in the form of a segmentation fault. With functional dependencies, it is not clear what sort of trouble could happen. Still, an intuitive behavior could be demonstrated. The two approaches to tame overlapping instances I know of (closed type families, instance chains) `close the world'. The programmer must declare the limit of the specialization chain. The compiler will see the limit when compiling a module, certain that the limit won't change in another module. The same approach is implemented in TTypeable. The type alias type Special = StateT_cde :/ NIL was introduce deliberately. It defines the list of specializations; if we use the wildcards, the set of types to specialize among could be infinite, but their lower specialization bound is certain. To add a new special instance, one does not need to modify any existing instance in my code. But one has to modify the type alias Special, adding the TC_code of those special types, or the appropriate wildcard. I imagine the type alias Special is in a module of its own, placed in a directory named like Config/. A hackage library in general may have tunable parameters, specific to a particular installation. These parameters could be collected in Config/*.hs files.

5115

Age (days ago)

5152

Last active (days ago)

List overview

Download

38 comments

12 participants

participants (12)

Alexey Khudyakov
Andrea Vezzosi
Ben Millwood
Dan Doel
David Mazieres
dm-list-haskell-prime＠scs.stanford.edu
José Pedro Magalhães
Malcolm Wallace
oleg＠okmij.org
Roman Leshchinskiy
sanzhiyan＠gmail.com
Simon Peyton-Jones