Injective type families for GHC - syntax proposals

Haskellers, there is some discussion going on at the moment about implementing injective type families in GHC [1]. I'd like to hear your opinions on proposed syntax for this new feature. == Problem == Currently GHC assumes that type families are not injective. This is of course conservative and the idea behind implementing injective type families is to allow the programmer to explicitly declare that a type family has the injectivity property (which GHC will of course verify). == Forms of injectivity == Injectivity can take forms more complicated than just determining the LHS from RHS: A. RHS determines all arguments on the LHS. Example: type family Id a where Id a = a Here the RHS uniquely determines LHS. B. RHS determines some but not all arguments on the LHS. Example (contrieved): type family G a b c where G Int Char Bool = Bool G Int Char Int = Bool G Bool Int Int = Int Here RHS uniquely determines `a` and `b` but not `c`. C. RHS and some LHS arguments uniquely determine some LHS arguments. Example: type family Plus a b where Plus Z m = m Plus (S n) m = S (Plus n m) Here knowing the RHS and one of LHS arguments uniqely determines the other LHS argument. At the moment we've only seen use cases for A, which means that we will most likely only implement A. If someone comes up with compelling examples of either B or C we will revise and consider implementing these more involved forms of injectivity. == Syntax == So the most important decission to make is what syntax do we want for this new feature :-) One of the features we would like is extensibility, ie. if we only implement injectivity of type A we still want to be able to add injectivities B and C sometime in the future without breaking A. When thinking about syntax an important (and open) question is whether injectivity becomes part of TypeFamilies language extension or do we add a new InjectiveTypeFamilies extension for this feature. The problem with incorporating injectivity into existing TypeFamilies is that we would have to come up with syntax that is fully backwards compatible with what we have now. Below is a list of proposals that have come up until now in the discussion between developers. Note that all proposals are given for closed type families. For open type families things would be identical except that the `where` keyword would be skipped. === Proposal 1 === Use syntax similar to functional dependencies. The injectivity declaration begins with `|` following type family declaration head. `|` is followed by a list of comma-separated injectivity conditions. Each injectivity condition has the form: {{{ result A -> B }}} where `A` is a possibly-empty list of type variables declared in type family head and `B` is non-empty list of said type variables. Things on the left and right of `->` are called LHS and RHS of an injectivity condition, respectively. `result` becomes a restricted word that cannot be used as a type variable's identifier in a type family head. Examples: type family Id a | result -> a where type family G a b c | result -> a b where type family Plus a b | result a -> b, result b -> a where Pros: * has natural reading: `result a -> b` reads as "knowing the result and the type variable a determines b". * extensible for the future Cons: * steals `result` identifier in the type family head. This means it would be illegal to have a type variable named `result` in a type family. * the above also makes this proposal backwards incompatible with TypeFamilies extension. Further proposals are based on this one in an attempt to solve the problem of stealing syntax and backwards incompatibility. === Proposal 2 === Use `Result` instead of `result`: type family Id a | Result -> a where type family G a b c | Result -> a b where type family Plus a b | Result a -> b, Result b -> a where Pros: * has natural reading * extensible for the future * allows to have type variables named `result` (does not steal syntax) Cons: * all other keywords in Haskell are lowercase. This would be the first one that is capitalized. * confusion could arise if we have a type `Result` and use it in the type family head. Unknowns (for me): * not sure if it would be possible to avoid name clash between `Result` and type of the same name. If not then this proposal would also be backwards incompatible with TypeFamilies. === Proposal 3 === Use `type` instead of `result`: type family Id a | type -> a where type family G a b c | type -> a b where type family Plus a b | type a -> b, type b -> a where Pros: * extensible for the future * no syntax stealing Cons: * no natural reading * usage of `type` here might seem unintuitive === Proposal 4 === Use name of the type family instead of `result`: type family Id a | Id -> a where type family G a b c | G -> a b where type family Plus a b | Plus a -> b, Plus b -> a where Pros: * extensible for the future * no syntax stealing Cons: * writing something like `Plus a` might be confusing. It looks as if Plus could be partially applied. == Further discussion == I'd like to hear what you think about these proposals. Neither of them is perfect so perhaps someone can come with something better. You can also check out the GHC Trac ticket [1] and the GHC Wiki page [2] (which repeats most of this but has some more details as well). [1] https://ghc.haskell.org/trac/ghc/ticket/6018 [2] https://ghc.haskell.org/trac/ghc/wiki/InjectiveTypeFamilies Janek

On Wed, Sep 17, 2014 at 7:10 AM, Jan Stolarek
Use syntax similar to functional dependencies. The injectivity declaration begins with `|` following type family declaration head. `|` is followed by a list of comma-separated injectivity conditions. Each injectivity condition has the form:
{{{ result A -> B }}}
Was `_` considered here as the result token? -- brandon s allbery kf8nh sine nomine associates allbery.b@gmail.com ballbery@sinenomine.net unix, openafs, kerberos, infrastructure, xmonad http://sinenomine.net

Use syntax similar to functional dependencies. The injectivity declaration begins with `|` following type family declaration head. `|` is followed by a list of comma-separated injectivity conditions. Each injectivity condition has the form:
{{{ result A -> B }}}
Was `_` considered here as the result token? No, it wasn't.
Also, I think it's best to actually take a look at the wiki as new proposals are being added there. Janek

Jan,
I notice two things
1) that where notation in the style of closed type families is quite nice!
2) what about adding some annotation to the args that act injectively wrt
the result? eg
type family Foo *a *b -- or some other marking annotation
rather than
type family Foo a b | $syntaxPlaceHolder -> a b -- or
Or should we actually think of the solver process as being analagous to
fundeps?
On Wed, Sep 17, 2014 at 10:02 AM, Jan Stolarek
Use syntax similar to functional dependencies. The injectivity declaration begins with `|` following type family declaration head. `|` is followed by a list of comma-separated injectivity conditions. Each injectivity condition has the form:
{{{ result A -> B }}}
Was `_` considered here as the result token? No, it wasn't.
Also, I think it's best to actually take a look at the wiki as new proposals are being added there.
Janek
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Maybe it's worth adding an explicit type variable for the result? Something like: type family Id a -> r | r -> a where type family G a b c -> res | res -> a b where type family Plus a b -> result | result a -> b, result b -> a where It's a little verbose, but certainly is not confusing, since LHS really determines RHS.

Hi, I have some code that currently needs AllowAmbiguousTypes to compile in 7.8, but didn't need it in 7.6. It's actually an example of your form of injectivity C. The crux is that for list concatenation (like Conal's example of addition), any two of the inputs/result determines the other input/result. So it might be useful to have the following. type family (as :: [Nat]) ++ (bs :: [Nat]) :: [Nat] | result as -> bs, result bs -> as where '[] ++ bs = bs (a ': as) ++ bs = a ': (as ++ bs) I have serious doubts that GHC could check this injectivity, though. Fortunately I only use this at concrete enough types that AllowAmbiguousTypes is sufficient. But really, the types aren't ambiguous, GHC just doesn't know about the injectivity of (++). The code below is simplified from code that computes a tensor product of a tensor with an identity matrix whose size is determined from the shapes of the input and output tensors. \begin{code} {-# LANGUAGE AllowAmbiguousTypes, DataKinds, KindSignatures, TypeFamilies, TypeOperators #-} module Test where import GHC.TypeLits type family (as :: [Nat]) ++ (bs :: [Nat]) :: [Nat] type instance '[] ++ bs = bs type instance (a ': as) ++ bs = a ': (as ++ bs) data Tensor (s :: [Nat]) = Tensor -- content elided -- apparently GHC reduces (++) enough to see that n is determined leftUnit :: Tensor s -> Tensor ('[n, n] ++ s) leftUnit Tensor = Tensor -- accepted in 7.6, not accepted in 7.8 without AllowAmbiguousTypes rightUnit :: Tensor s -> Tensor (s ++ '[n, n]) rightUnit Tensor = Tensor -- also accepted without AllowAmbiguousTypes outsideUnit :: Tensor s -> Tensor ('[n] ++ s ++ '[n]) outsideUnit Tensor = Tensor useleftUnit :: Tensor '[1,1,2] useleftUnit = leftUnit Tensor -- type of Tensor is inferred userightUnit :: Tensor '[1,2,2] userightUnit = rightUnit (Tensor :: Tensor '[1]) -- type must be provided \end{code} On Wed, Sep 17, 2014 at 01:10:13PM +0200, Jan Stolarek wrote:
Haskellers,
there is some discussion going on at the moment about implementing injective type families in GHC [1]. I'd like to hear your opinions on proposed syntax for this new feature.
== Problem ==
Currently GHC assumes that type families are not injective. This is of course conservative and the idea behind implementing injective type families is to allow the programmer to explicitly declare that a type family has the injectivity property (which GHC will of course verify).
== Forms of injectivity ==
Injectivity can take forms more complicated than just determining the LHS from RHS:
A. RHS determines all arguments on the LHS. Example:
type family Id a where Id a = a
Here the RHS uniquely determines LHS.
B. RHS determines some but not all arguments on the LHS. Example (contrieved):
type family G a b c where G Int Char Bool = Bool G Int Char Int = Bool G Bool Int Int = Int
Here RHS uniquely determines `a` and `b` but not `c`.
C. RHS and some LHS arguments uniquely determine some LHS arguments. Example:
type family Plus a b where Plus Z m = m Plus (S n) m = S (Plus n m)
Here knowing the RHS and one of LHS arguments uniqely determines the other LHS argument.
At the moment we've only seen use cases for A, which means that we will most likely only implement A. If someone comes up with compelling examples of either B or C we will revise and consider implementing these more involved forms of injectivity.
== Syntax ==
So the most important decission to make is what syntax do we want for this new feature :-) One of the features we would like is extensibility, ie. if we only implement injectivity of type A we still want to be able to add injectivities B and C sometime in the future without breaking A.
When thinking about syntax an important (and open) question is whether injectivity becomes part of TypeFamilies language extension or do we add a new InjectiveTypeFamilies extension for this feature. The problem with incorporating injectivity into existing TypeFamilies is that we would have to come up with syntax that is fully backwards compatible with what we have now.
Below is a list of proposals that have come up until now in the discussion between developers. Note that all proposals are given for closed type families. For open type families things would be identical except that the `where` keyword would be skipped.
=== Proposal 1 ===
Use syntax similar to functional dependencies. The injectivity declaration begins with `|` following type family declaration head. `|` is followed by a list of comma-separated injectivity conditions. Each injectivity condition has the form:
{{{ result A -> B }}}
where `A` is a possibly-empty list of type variables declared in type family head and `B` is non-empty list of said type variables. Things on the left and right of `->` are called LHS and RHS of an injectivity condition, respectively. `result` becomes a restricted word that cannot be used as a type variable's identifier in a type family head. Examples:
type family Id a | result -> a where type family G a b c | result -> a b where type family Plus a b | result a -> b, result b -> a where
Pros:
* has natural reading: `result a -> b` reads as "knowing the result and the type variable a determines b".
* extensible for the future
Cons:
* steals `result` identifier in the type family head. This means it would be illegal to have a type variable named `result` in a type family.
* the above also makes this proposal backwards incompatible with TypeFamilies extension.
Further proposals are based on this one in an attempt to solve the problem of stealing syntax and backwards incompatibility.
=== Proposal 2 ===
Use `Result` instead of `result`:
type family Id a | Result -> a where type family G a b c | Result -> a b where type family Plus a b | Result a -> b, Result b -> a where
Pros:
* has natural reading
* extensible for the future
* allows to have type variables named `result` (does not steal syntax)
Cons:
* all other keywords in Haskell are lowercase. This would be the first one that is capitalized.
* confusion could arise if we have a type `Result` and use it in the type family head.
Unknowns (for me):
* not sure if it would be possible to avoid name clash between `Result` and type of the same name. If not then this proposal would also be backwards incompatible with TypeFamilies.
=== Proposal 3 ===
Use `type` instead of `result`:
type family Id a | type -> a where type family G a b c | type -> a b where type family Plus a b | type a -> b, type b -> a where
Pros:
* extensible for the future
* no syntax stealing
Cons:
* no natural reading
* usage of `type` here might seem unintuitive
=== Proposal 4 ===
Use name of the type family instead of `result`:
type family Id a | Id -> a where type family G a b c | G -> a b where type family Plus a b | Plus a -> b, Plus b -> a where
Pros:
* extensible for the future
* no syntax stealing
Cons:
* writing something like `Plus a` might be confusing. It looks as if Plus could be partially applied.
== Further discussion ==
I'd like to hear what you think about these proposals. Neither of them is perfect so perhaps someone can come with something better. You can also check out the GHC Trac ticket [1] and the GHC Wiki page [2] (which repeats most of this but has some more details as well).
[1] https://ghc.haskell.org/trac/ghc/ticket/6018 [2] https://ghc.haskell.org/trac/ghc/wiki/InjectiveTypeFamilies
Janek _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
--
Jason McCarty

On Wed, 17 Sep 2014, Jan Stolarek
Haskellers,
there is some discussion going on at the moment about implementing injective type families in GHC [1]. I'd like to hear your opinions on proposed syntax for this new feature.
As an old Lisper I look forward to an all sexp version of Haskell source code. I do note your ':-)' below ;) oo--JS.
== Problem ==
Currently GHC assumes that type families are not injective. This is of course conservative and the idea behind implementing injective type families is to allow the programmer to explicitly declare that a type family has the injectivity property (which GHC will of course verify).
== Forms of injectivity ==
Injectivity can take forms more complicated than just determining the LHS from RHS:
A. RHS determines all arguments on the LHS. Example:
type family Id a where Id a = a
Here the RHS uniquely determines LHS.
B. RHS determines some but not all arguments on the LHS. Example (contrieved):
type family G a b c where G Int Char Bool = Bool G Int Char Int = Bool G Bool Int Int = Int
Here RHS uniquely determines `a` and `b` but not `c`.
C. RHS and some LHS arguments uniquely determine some LHS arguments. Example:
type family Plus a b where Plus Z m = m Plus (S n) m = S (Plus n m)
Here knowing the RHS and one of LHS arguments uniqely determines the other LHS argument.
At the moment we've only seen use cases for A, which means that we will most likely only implement A. If someone comes up with compelling examples of either B or C we will revise and consider implementing these more involved forms of injectivity.
== Syntax ==
So the most important decission to make is what syntax do we want for this new feature :-) One of the features we would like is extensibility, ie. if we only implement injectivity of type A we still want to be able to add injectivities B and C sometime in the future without breaking A.
When thinking about syntax an important (and open) question is whether injectivity becomes part of TypeFamilies language extension or do we add a new InjectiveTypeFamilies extension for this feature. The problem with incorporating injectivity into existing TypeFamilies is that we would have to come up with syntax that is fully backwards compatible with what we have now.
Below is a list of proposals that have come up until now in the discussion between developers. Note that all proposals are given for closed type families. For open type families things would be identical except that the `where` keyword would be skipped.
=== Proposal 1 ===
Use syntax similar to functional dependencies. The injectivity declaration begins with `|` following type family declaration head. `|` is followed by a list of comma-separated injectivity conditions. Each injectivity condition has the form:
{{{ result A -> B }}}
where `A` is a possibly-empty list of type variables declared in type family head and `B` is non-empty list of said type variables. Things on the left and right of `->` are called LHS and RHS of an injectivity condition, respectively. `result` becomes a restricted word that cannot be used as a type variable's identifier in a type family head. Examples:
type family Id a | result -> a where type family G a b c | result -> a b where type family Plus a b | result a -> b, result b -> a where
Pros:
* has natural reading: `result a -> b` reads as "knowing the result and the type variable a determines b".
* extensible for the future
Cons:
* steals `result` identifier in the type family head. This means it would be illegal to have a type variable named `result` in a type family.
* the above also makes this proposal backwards incompatible with TypeFamilies extension.
Further proposals are based on this one in an attempt to solve the problem of stealing syntax and backwards incompatibility.
=== Proposal 2 ===
Use `Result` instead of `result`:
type family Id a | Result -> a where type family G a b c | Result -> a b where type family Plus a b | Result a -> b, Result b -> a where
Pros:
* has natural reading
* extensible for the future
* allows to have type variables named `result` (does not steal syntax)
Cons:
* all other keywords in Haskell are lowercase. This would be the first one that is capitalized.
* confusion could arise if we have a type `Result` and use it in the type family head.
Unknowns (for me):
* not sure if it would be possible to avoid name clash between `Result` and type of the same name. If not then this proposal would also be backwards incompatible with TypeFamilies.
=== Proposal 3 ===
Use `type` instead of `result`:
type family Id a | type -> a where type family G a b c | type -> a b where type family Plus a b | type a -> b, type b -> a where
Pros:
* extensible for the future
* no syntax stealing
Cons:
* no natural reading
* usage of `type` here might seem unintuitive
=== Proposal 4 ===
Use name of the type family instead of `result`:
type family Id a | Id -> a where type family G a b c | G -> a b where type family Plus a b | Plus a -> b, Plus b -> a where
Pros:
* extensible for the future
* no syntax stealing
Cons:
* writing something like `Plus a` might be confusing. It looks as if Plus could be partially applied.
== Further discussion ==
I'd like to hear what you think about these proposals. Neither of them is perfect so perhaps someone can come with something better. You can also check out the GHC Trac ticket [1] and the GHC Wiki page [2] (which repeats most of this but has some more details as well).
[1] https://ghc.haskell.org/trac/ghc/ticket/6018 [2] https://ghc.haskell.org/trac/ghc/wiki/InjectiveTypeFamilies
Janek _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Thank you all for replies. To Carter:
what about adding some annotation to the args that act injectively wrt the result? I'm afraid that's not enough. How would you annotate Plus using this syntax?
type family Plus a b | Plus a -> b, Plus b -> a where ...
Or should we actually think of the solver process as being analagous to fundeps? I think that injectivity will be very similar to fundeps: you'll have some types determining the others, except for type families not type classes.
Maybe it's worth adding an explicit type variable for the result? Oh, this sounds good. But we must remember that type families already allow providing the kind of
To flicky frans: the result like this: type family Id a :: * where type family G a b c :: * where type family Plus a b :: Nat where So we'd have to do this: type family Id a :: (r :: *) | r -> a where type family G a b c :: (res :: *) | res -> a b where type family Plus a b :: (result :: Nat) | result a -> b, result b -> a where This would effectively force the user to provide kind annotation for the result but that doesn't seem to be a high price to pay. To Jason:
I have some code that currently needs AllowAmbiguousTypes to compile in 7.8, but didn't need it in 7.6. It's actually an example of your form of injectivity C. I took the liberty of reporting this as a GHC bug: #9607. However, I don't see how injectivity would help here. I of course see that '++ is injective in the same way as Plus but I don't see how this would aid in type checking of rightUnit (especially that it used to work!). The error message mentions injectivity but that doesn't mean that injectivity would really help.
Janek

A follow up on flicky frans' proposal: we already allow introducing type variables to name a type family result, so this new syntax should be a straightforward addition.

Plus a b :: (result :: Nat) That looks quite unintuitive to me, since `result' reads like a variable of type Nat. This is similar to legal (with KindSignatures) (5 :: (Int :: *)). Maybe it would be better to use something like associated type synonyms? type family (r ~ *) => Id a :: r | r -> a where type family (res ~ *) => G a b c :: r | res -> a b where type family (result ~ Nat) => Plus a b :: result | result a -> b, result b -> a where

On Thu, Sep 18, 2014 at 11:02 AM, flicky frans
Plus a b :: (result :: Nat)
That looks quite unintuitive to me, since `result' reads like a variable of type Nat.
But isn't that exactly what it is supposed to be? -Brent

2014-09-18 21:38 GMT+04:00, Brent Yorgey
On Thu, Sep 18, 2014 at 11:02 AM, flicky frans
wrote: Plus a b :: (result :: Nat)
That looks quite unintuitive to me, since `result' reads like a variable of type Nat.
But isn't that exactly what it is supposed to be?
At least (Plus a b :: Nat) /= (Plus a b :: (result :: Nat)), since a variable of type Nat is not the type of Plus a b. In (a :: (b :: c)) `a' is a variable, `b' is a type, `c' is a kind (in System F, of course) But yes, I made a mistake, `result' should be a variable. Maybe this? type family Plus a b = r :: * | r a -> b, r b -> a where

Sorry. type family Plus a b = r :: Nat | r a -> b, r b -> a where

On Fri, 19 Sep 2014 00:17:49 +0400, flicky frans
Sorry.
type family Plus a b = r :: Nat | r a -> b, r b -> a where
+1 Simple, intuitive and backwards compatible. I don't have to introduce a variable if I don't want to, either.
participants (8)
-
Brandon Allbery
-
Brent Yorgey
-
Carter Schonwald
-
flicky frans
-
Jan Stolarek
-
Jason McCarty
-
Jay Sulzberger
-
Niklas Haas