Injective type families

Hi all, I'd like to take a stab at implementing injective type families (#6018). My plan of attack looks like this: 1. Implement injective type families that are: a) injective in all arguments b) only admit RHS that is a concrete type or a type variable introduced by the LHS or a recursive call to self. 2. Lift restriction a) ie. allow type families injective only in some arguments 3. Lift restriction b) ie. allow injective type families to call other type families. I'm not sure if 3) really gets implemented as this seems more in lines of #4259, which I don't intend to approach. Now, let's discuss the most important of all matters - the syntax! :-) Note that syntax must: 1. allow to define injectivity only for some parameters 2. work for both open and closed type families Here's my proposal (based on Richard's idea to use functional-dependencies-like syntax): 1. Standard injective type family (all parameters uniquely determined by the RHS): injective type family F a b c | a b c type family instance F Int Char Bool = Bool type family instance F Char Bool Int = Int type family instance F Bool Int Char = Char 2. Type family injective only in some parameters (ie. only some parameters uniquely determined by the RHS): injective type family G a b c | a b type family instance G Int Char Bool = Bool type family instance G Int Char Int = Bool type family instance G Bool Int Int = Int Here knowing the RHS allows us to determine a and b, but not c. 3. Type families where knowing the RHS and some parameters on the LHS makes other parameters injective: Example 1: knowing the RHS and any single parameter uniquely determines other parameters injective type family H a b c | a -> b c, b -> a c, c -> a b type family instance H Int Char Double = Int type family instance H Bool Double Char = Int Example 2: knowing the RHS and either a or b allows to uniquely determine other parameters, but knowing the RHS and c gives us no information about a or b injective type family J a b c | a -> b c, b -> a c type family instance J Int Char Double = Int type family instance J Bool Double Double = Int For closed type families this notation would be identical: type family declaration would be followed by the "where" keyword. In comment 34 of #6018 Richard proposed a notation that uses a keyword "Result". This would allow for a more uniform notation. Notice that in my proposal some families use ->, while others don't. The idea is that on the right side of the arrow we write what we can infer from the things on the left of the arrow. If the only thing left of the arrow is the right hand side of the type family equation (the result) then the arrow can be elided. In other words these would be equivalent: injective type family F a b c | a b c injective type family F a b c | result -> a b c injective type family G a b c | a b injective type family G a b c | result -> a b (note: it is also true that "injective type family G a b c | result c -> a b") injective type family H a b c | a -> b c, b -> a c, c -> a b injective type family H a b c | result a -> b c, result b -> a c, result c -> a b injective type family J a b c | a -> b c, b -> a c injective type family J a b c | result a -> b c, result b -> a c I find the notation explicitly using "result" to be easier to understand, but it is also more verbose. I also suspect that it might be a bit easier to parse. Which of the notations do you find better? What do you think about using the "injective" keyword? Are there any alternative proposals? Janek

I'm not convinced I have very good taste where it comes to concrete syntax, but I will say that the `injective` keyword below is redundant -- it can be inferred from the presence of the `|`.
Also, by restriction (b), I imagine you also allow things like `Maybe a`, where `a` is a variable bound on the LHS. That doesn't seem to be included in a tight reading of your proposal.
Thanks for taking this on!
Richard
On Jul 10, 2014, at 10:37 AM, Jan Stolarek
Hi all,
I'd like to take a stab at implementing injective type families (#6018). My plan of attack looks like this:
1. Implement injective type families that are: a) injective in all arguments b) only admit RHS that is a concrete type or a type variable introduced by the LHS or a recursive call to self. 2. Lift restriction a) ie. allow type families injective only in some arguments 3. Lift restriction b) ie. allow injective type families to call other type families.
I'm not sure if 3) really gets implemented as this seems more in lines of #4259, which I don't intend to approach.
Now, let's discuss the most important of all matters - the syntax! :-) Note that syntax must: 1. allow to define injectivity only for some parameters 2. work for both open and closed type families
Here's my proposal (based on Richard's idea to use functional-dependencies-like syntax):
1. Standard injective type family (all parameters uniquely determined by the RHS):
injective type family F a b c | a b c type family instance F Int Char Bool = Bool type family instance F Char Bool Int = Int type family instance F Bool Int Char = Char
2. Type family injective only in some parameters (ie. only some parameters uniquely determined by the RHS):
injective type family G a b c | a b type family instance G Int Char Bool = Bool type family instance G Int Char Int = Bool type family instance G Bool Int Int = Int
Here knowing the RHS allows us to determine a and b, but not c.
3. Type families where knowing the RHS and some parameters on the LHS makes other parameters injective:
Example 1: knowing the RHS and any single parameter uniquely determines other parameters
injective type family H a b c | a -> b c, b -> a c, c -> a b type family instance H Int Char Double = Int type family instance H Bool Double Char = Int
Example 2: knowing the RHS and either a or b allows to uniquely determine other parameters, but knowing the RHS and c gives us no information about a or b
injective type family J a b c | a -> b c, b -> a c type family instance J Int Char Double = Int type family instance J Bool Double Double = Int
For closed type families this notation would be identical: type family declaration would be followed by the "where" keyword.
In comment 34 of #6018 Richard proposed a notation that uses a keyword "Result". This would allow for a more uniform notation. Notice that in my proposal some families use ->, while others don't. The idea is that on the right side of the arrow we write what we can infer from the things on the left of the arrow. If the only thing left of the arrow is the right hand side of the type family equation (the result) then the arrow can be elided. In other words these would be equivalent:
injective type family F a b c | a b c injective type family F a b c | result -> a b c
injective type family G a b c | a b injective type family G a b c | result -> a b (note: it is also true that "injective type family G a b c | result c -> a b")
injective type family H a b c | a -> b c, b -> a c, c -> a b injective type family H a b c | result a -> b c, result b -> a c, result c -> a b
injective type family J a b c | a -> b c, b -> a c injective type family J a b c | result a -> b c, result b -> a c
I find the notation explicitly using "result" to be easier to understand, but it is also more verbose. I also suspect that it might be a bit easier to parse. Which of the notations do you find better? What do you think about using the "injective" keyword? Are there any alternative proposals?
Janek _______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs

On Thu, Jul 10, 2014 at 9:37 PM, Jan Stolarek
1. Standard injective type family (all parameters uniquely determined by the RHS): injective type family F a b c | a b c
<snip>
2. Type family injective only in some parameters (ie. only some parameters uniquely determined by the RHS): injective type family G a b c | a b
English is not my mother tongue. Perhaps you mean: 'the image under the type function is uniquely determined by the parameters on the right of the vertical bar'? As they stand, the sentences confuse which determines what. Such confusion has a way of squirreling into official documentation. -- Kim-Ee

* Kim-Ee Yeoh
On Thu, Jul 10, 2014 at 9:37 PM, Jan Stolarek
wrote: 1. Standard injective type family (all parameters uniquely determined by the RHS): injective type family F a b c | a b c
<snip>
2. Type family injective only in some parameters (ie. only some parameters uniquely determined by the RHS): injective type family G a b c | a b
English is not my mother tongue. Perhaps you mean: 'the image under the type function is uniquely determined by the parameters on the right of the vertical bar'?
The result of any function is always determined by that function's parameters. Injectivity means that the *parameters* are determined by the result. So I think Jan's definition is correct.
As they stand, the sentences confuse which determines what.
Such confusion has a way of squirreling into official documentation.

On Fri, Jul 11, 2014 at 3:27 AM, Roman Cheplyaka
The result of any function is always determined by that function's parameters.
Injectivity means that the *parameters* are determined by the result.
So I think Jan's definition is correct.
What's correct could still be confusing. Now that you brought it up, I no longer know what "RHS" in OP means! It could be (1) the result. Or (2) the right of the vertical bar. For easy reference, here are the statements in question: * "Standard injective type family (all parameters uniquely determined by the RHS)" * "Type family injective only in some parameters (ie. only some parameters uniquely determined by the RHS):" So when you say "arguments always determine result", one has to pause and think about what that means in the presence of multiple arguments as we have here. Because that doesn't hold when only some arguments are applied. Also, note the usage of the expression "uniquely determined" in OP. When a single-argument function is injective, we say that the argument 'uniquely determines' (as opposed to just determine) the result. But note how it appears flipped in OP! Finally, let's look at how the verb 'determine' is used. When defining a type family case-by-case, one works from arguments to the result determined by the arguments. But saying "Injectivity means that the *parameters* are determined by the result" raises the specter of the inverse function. Suddenly, the arrow of work gets flipped in reverse. Doesn't that only contribute to cognitive noise? Imprecise language not only trips up experts, it also sets up insurmountable barriers for outsiders. We can do much better here. -- Kim-Ee

Jan, this is great! Thanks for attacking this issue. Regarding "result", I do not like the idea to introduce arbitrary words with special meanings. What if somebody writes
injective type family F a result c | result -> a result c
it will be totally confusing. One could write like this:
injective type family F a b c | F a b c -> a b c -- (*)
or even shorter:
injective type family F a b c | F -> a b c -- (**)
in (*) the syntax is inconsistent because to the left of the "|"
juxtaposition is not meaning application. Also (*) would permit "... |
F x b c -> a b c" which is confusing and would require a naming rule.
(**) can be read as "F's result uniquely determines all of a b and c".
It sounds ok if you repeat it often enough :-)
Regarding "injective" I go with Richard. It is unneeded noise.
Cheers,
Gabor
On 7/10/14, Jan Stolarek
Hi all,
I'd like to take a stab at implementing injective type families (#6018). My plan of attack looks like this:
1. Implement injective type families that are: a) injective in all arguments b) only admit RHS that is a concrete type or a type variable introduced by the LHS or a recursive call to self. 2. Lift restriction a) ie. allow type families injective only in some arguments 3. Lift restriction b) ie. allow injective type families to call other type families.
I'm not sure if 3) really gets implemented as this seems more in lines of #4259, which I don't intend to approach.
Now, let's discuss the most important of all matters - the syntax! :-) Note that syntax must: 1. allow to define injectivity only for some parameters 2. work for both open and closed type families
Here's my proposal (based on Richard's idea to use functional-dependencies-like syntax):
1. Standard injective type family (all parameters uniquely determined by the RHS):
injective type family F a b c | a b c type family instance F Int Char Bool = Bool type family instance F Char Bool Int = Int type family instance F Bool Int Char = Char
2. Type family injective only in some parameters (ie. only some parameters uniquely determined by the RHS):
injective type family G a b c | a b type family instance G Int Char Bool = Bool type family instance G Int Char Int = Bool type family instance G Bool Int Int = Int
Here knowing the RHS allows us to determine a and b, but not c.
3. Type families where knowing the RHS and some parameters on the LHS makes other parameters injective:
Example 1: knowing the RHS and any single parameter uniquely determines other parameters
injective type family H a b c | a -> b c, b -> a c, c -> a b type family instance H Int Char Double = Int type family instance H Bool Double Char = Int
Example 2: knowing the RHS and either a or b allows to uniquely determine other parameters, but knowing the RHS and c gives us no information about a or b
injective type family J a b c | a -> b c, b -> a c type family instance J Int Char Double = Int type family instance J Bool Double Double = Int
For closed type families this notation would be identical: type family declaration would be followed by the "where" keyword.
In comment 34 of #6018 Richard proposed a notation that uses a keyword "Result". This would allow for a more uniform notation. Notice that in my proposal some families use ->, while others don't. The idea is that on the right side of the arrow we write what we can infer from the things on the left of the arrow. If the only thing left of the arrow is the right hand side of the type family equation (the result) then the arrow can be elided. In other words these would be equivalent:
injective type family F a b c | a b c injective type family F a b c | result -> a b c
injective type family G a b c | a b injective type family G a b c | result -> a b (note: it is also true that "injective type family G a b c | result c -> a b")
injective type family H a b c | a -> b c, b -> a c, c -> a b injective type family H a b c | result a -> b c, result b -> a c, result c -> a b
injective type family J a b c | a -> b c, b -> a c injective type family J a b c | result a -> b c, result b -> a c
I find the notation explicitly using "result" to be easier to understand, but it is also more verbose. I also suspect that it might be a bit easier to parse. Which of the notations do you find better? What do you think about using the "injective" keyword? Are there any alternative proposals?
Janek _______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs

On 07/10/2014 02:34 PM, Gabor Greif wrote:
Jan, this is great! Thanks for attacking this issue.
Regarding "result", I do not like the idea to introduce arbitrary words with special meanings. What if somebody writes
injective type family F a result c | result -> a result c
it will be totally confusing.
One could write like this:
injective type family F a b c | F a b c -> a b c -- (*)
or even shorter:
injective type family F a b c | F -> a b c -- (**)
in (*) the syntax is inconsistent because to the left of the "|" juxtaposition is not meaning application. Also (*) would permit "... | F x b c -> a b c" which is confusing and would require a naming rule.
(**) can be read as "F's result uniquely determines all of a b and c". It sounds ok if you repeat it often enough :-)
At the risk of being too clever, one could use the keyword "type" to reference the result. If F is a family of types (a "type family"), then the result is conceptually a single type in this family.
type family F a b c | type -> a b c
-Isaac

To Richard: > the `injective` keyword below is redundant -- it can be inferred from the presence of the `|`. Yes, I was wondering if someone will point it out. I agree it can be omitted. Since Gabor also raised that concern I officially drop the idea of using "injective" keyword :-) > Also, by restriction (b), I imagine you also allow things like `Maybe a`, where `a` is a > variable bound on the LHS. That doesn't seem to be included in a tight reading of your > proposal. Isn't 'Maybe a' a "concrete type"? To Kim-Ee: Roman has this right, but let me attempt to clarify further: > I no longer know what "RHS" in OP means! It could be (1) the result. Or (2) the right of the > vertical bar. RHS means (1) the result. So in this example: injective type family F a b c | a b c type family instance F Int Char Bool = Bool type family instance F Char Bool Int = Int type family instance F Bool Int Char = Char Bool is the RHS of the first equation, Int is the RHS of the second equation and Char is the RHS of the third equation. Note: I sometime use terms "equation" and "clause" interchangeably. Now that you brought it up it would be a Good Thing to give "(2) the right of the vertical bar" a concrete name, so we can refer to it concisely in a discussion. How about "injectivity conditions"? > So when you say "arguments always determine result", one has to pause and think about what that > means in the presence of multiple arguments as we have here. Because that doesn't hold when > only some arguments are applied. Type families cannot be partially applied - you alwyas have all the parameters. > When a single-argument function is injective, we say that the argument 'uniquely determines' > (as opposed to just determine) the result. No, you have the definition backwards. When an argument of a relation uniquely determines the result then we call that relation a "function". When a function is injective it means that every element of the image is assigned to at most one element of the domain. Which means you can determine the argument from the result. > When defining a type family case-by-case, one works from arguments to the result determined by > the arguments. But saying "Injectivity means that the *parameters* are determined by the > result" raises the specter of the inverse function We want to define functions as injective so that GHC can infer (a ~ b) based on (F a ~ F b). So in a sense we want inverse functions here. Why do you say about a "specter of the inverse function" as if it was something unthinkable? To Gabor: > Jan, this is great! Thanks for attacking this issue. Thank me when it's implemented. > Regarding "result", I do not like the idea to introduce arbitrary > words with special meanings Well, if we followed that path then certainly "result" would become an "arbitrary word with a special meaning" a.k.a. "keyword". I'm not pushing strongly for this (in fact I'm not pushing in any direction with the syntax), but these are the alternatives I see: 1) use the "result" keyword: syntax is more coherent and easier to understand, but we pay the price of introducing a new keyword and more verbosity 2) don't use the "result" keyword: syntax is shorter, but a bit incoherent which may make it harder to understand (it does for me, even though I wrote down these examples). > What if somebody writes > > > injective type family F a result c | result -> a result c > > it will be totally confusing. This would become an error. Janek
participants (6)
-
Gabor Greif
-
Isaac Dupree
-
Jan Stolarek
-
Kim-Ee Yeoh
-
Richard Eisenberg
-
Roman Cheplyaka