Re: We need to add role annotations for 7.8

Thank you to everyone who has been helping me understand this issue in greater depth.
*tl;dr: As long as we don't expect any libraries beyond the core to annotate, I'm cool. This presumes that the extra safety isn't, in practice, dependent on transitive adoption by libraries. It also implies that representational is the only possible default, and that there can be no migration from it.*
My approach to thinking about this is guided by supporting an ecosystem with 1000s of libraries (Hackage), a few dozen of which are heavily promoted (the Platform), and a small set that are closely tied to the compiler (the core). The availability, speed of release, motivation, and even skill of the developers vary widely over that range.
I also think about the various "stances" of different developers:
- *End developer*: makes use of libraries, but just builds apps
- *Internal developer*: makes libraries for internal use in a project
- *Casual library writer*: makes libraries, primarily for their own needs, but distributed on Hackage
- *Popular library writer*: actively maintains libraries which are widely used
- *Core library writer*: maintainer of a core package that stays in lock step with the compiler
Then I think about, for each of these, what is the effect of a new feature on them, their existing code, and future code? Does it affect them only if they are using the feature? If they aren't using the feature? For library writers, how does the feature affect clients? If a client wants to use a feature, under what conditions does the library need to do something? This last issue, the "transitivity" of the feature, is often the biggest concern.
*Given that... onto type roles:*
The default of *representational* is the only option, because a default of *nominal* would require far too many developers to update their code. I don't believe that we can ever migrate to *nominal* as default.
The feature implies that any abstract data type that uses a type parameter in certain ways needs an annotation to get the full safety now afforded. However, without annotation, the data type is still no worse off than it was before (there is added safety, but perhaps not relevant from the standpoint of the library writer). Further, this (pre-existing) non-safety isn't likely a huge concern. Making sure the docs take the tone that most developers need to do nothing, and make clear when developers do need to be concerned, seems like an important way to ensure the right outcome.
A key question here is transitivity: Is it possible for module A to not annotate a type, and then have module B, by a different author, use the type from A in another abstract type that *is* annotated, and get the benefit? It seems the answer is "partially". If the answer were "no", then use of the feature would be dependent on transitive adoption, and that is where the big burden on developers comes from.
The degree to which we believe this "partially" is important: If we are willing to believe that the only library writers we care about doing this are those in the core, then fine. In this case we shouldn't feel compelled to suggest to library writers that they annotate, ever. I'm good with this. If the team here thinks otherwise, that we need to start a campaign to get every library writer to eventually annotate, then I have deep objections.
I read the paper, and understand how the authors felt the syntax options were all less than perfect, and chose what they did. But that choice carries, perhaps unwittingly, the implication that it forces -XCPP on all libraries except perhaps some of the core. This is because they all need to support previous compilers. So, a one line annotation has turned into an ugly beast, and perhaps added -XCPP where there was none, which is really unfortunate. (I, like many, consider it a defeat when one has to resort to -XCPP.)
It seems to me that the paper didn't really consider less-perfect, heuristic solutions. It might have had significantly less impact on library writers if some heuristic (no constructors exported? has any type constraint on the parameter? etc.) had allowed most data types to go without annotation, at the cost of a few (where *nominal* was incorrectly inferred) requiring immediate action. In this situation, a non-language feature (pragma or other device) might have been more palatable.
Finally, on the choice of terms, *nominal*, *representational*, and *phantom* all seem like clear, self-explanatory choices to me.
- Mark
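For readers who have not seen it, the "ugly beast" Mark describes looks roughly like the sketch below. It is only an illustration: `OrdList` and its functions are hypothetical names, and the guard assumes that GHC 7.8 reports `__GLASGOW_HASKELL__` as 708.

```haskell
{-# LANGUAGE CPP #-}
#if __GLASGOW_HASKELL__ >= 708
{-# LANGUAGE RoleAnnotations #-}
#endif

import qualified Data.List as L

-- Hypothetical abstract type. In a real library the constructor would be
-- hidden by the module's export list. Invariant: the list is kept sorted.
newtype OrdList a = OrdList [a]

#if __GLASGOW_HASKELL__ >= 708
-- Older compilers do not parse role annotations, hence the CPP guard.
type role OrdList nominal
#endif

emptyOrd :: OrdList a
emptyOrd = OrdList []

-- Insert while preserving the sortedness invariant.
insertOrd :: Ord a => a -> OrdList a -> OrdList a
insertOrd x (OrdList xs) = OrdList (L.insert x xs)

toAscList :: OrdList a -> [a]
toAscList (OrdList xs) = xs
```

The annotation itself is one line; the pragma and the two guards are the overhead needed to keep the module building on compilers older than 7.8.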

> The degree to which we believe this "partially" is important: If we are willing to believe that the only library writers we care about doing this are those in the core, then fine. In this case we shouldn't feel compelled to suggest to library writers that they annotate, ever. I'm good with this. If the team here thinks otherwise, that we need to start a campaign to get every library writer to eventually annotate, then I have deep objections.
The situation today is that
- A client of a library can use GND to do bad things to the library (e.g. change the “key” type of (Map key value)).
- Role annotations allow the library author to prevent that happening.
Would you say that means that we are “compelled to suggest to library writers that they annotate”? I would have thought that it would indeed be good to suggest to them that a new opportunity exists for them to make their library more robust to clients. They are free to do nothing, or to take advantage of the suggestion. It’s an upside-only situation.
Looking further ahead, when you say that “there can be no migration from representational-by-default”, do you have data to support that? Notably, any client not using GND could not observe a change. So simply seeing how many library modules use GND would be an upper bound on how many libraries would fail to compile if you were to ask us to change the default. Is that 1% of Hackage modules? 10%? 0.1%? I don’t know.
The awkward bit is that if a client is using GND, which fails after a change to nominal-by-default, the fix is to change the library, not the client.
Simon
From: Libraries [mailto:libraries-bounces@haskell.org] On Behalf Of Mark Lentczner
Sent: 25 March 2014 15:10
To: libraries@haskell.org Libraries; ghc-devs@haskell.org
Subject: Re: We need to add role annotations for 7.8
Thank you to everyone who has been helping me understand this issue in greater depth.
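Simon's two bullet points can be sketched in a few lines. This is a minimal illustration, not code from any real library: `OrdSet` and `Desc` are hypothetical, GHC 7.8 or later is assumed, and `coerce` stands in for GND, which relies on the same Coercible check from 7.8 onward. The set is declared with `data` so that the role annotation is what decides the outcome even inside a single file.

```haskell
{-# LANGUAGE RoleAnnotations #-}

import Data.Coerce (coerce)
import qualified Data.List as L

-- Hypothetical abstract set. Invariant: the list is sorted and
-- duplicate-free according to the Ord instance of 'a'.
data OrdSet a = OrdSet [a]

-- Without this line the role of 'a' is inferred as representational.
type role OrdSet nominal

fromList :: Ord a => [a] -> OrdSet a
fromList = OrdSet . L.nub . L.sort

-- Relies on the invariant: stops scanning once elements exceed x.
member :: Ord a => a -> OrdSet a -> Bool
member x (OrdSet xs) = x `elem` takeWhile (<= x) xs

-- A newtype over Int whose ordering is reversed.
newtype Desc = Desc Int deriving (Eq, Show)

instance Ord Desc where
  compare (Desc a) (Desc b) = compare b a

-- Accepted: the list type's parameter is representational.
descs :: [Desc]
descs = coerce [1, 2, 3 :: Int]

-- Rejected by the type checker because of the nominal role. If it were
-- accepted, the result would be sorted by Int's order, not Desc's, and
-- 'member' would give wrong answers.
-- broken :: OrdSet Desc
-- broken = coerce (fromList [1, 2, 3 :: Int])
```

Deleting the `type role` line makes the commented-out definition compile, which is the "bad thing" a client could otherwise do.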

*Apologies*
On Tue, Mar 25, 2014 at 8:47 AM, Simon Peyton Jones wrote:
> The situation today is that
> - A client of a library can use GND to do bad things to the library (e.g. change the “key” type of (Map key value)).
> - Role annotations allow the library author to prevent that happening.
> Would you say that means that we are “compelled to suggest to library writers that they annotate”?
Well... I don't think we should. The reason is that this situation is very sad, for it puts the burden upon the library writer for potential abuse of an extension to Haskell she might not even be aware of! She writes a perfectly safe, reasonable abstracted type, and bam, now has to worry about a very hard to understand situation involving the interaction of two separate Haskell extensions. And furthermore, adding that protection requires yet a third (CPP), and makes the "protection" often as long as the abstract type itself.
> Looking further ahead, when you say that “there can be no migration from representational-by-default”, do you have data to support that? Notably, any client not using GND could not observe a change. So simply seeing how many library modules use GND would be an upper bound on how many libraries would fail to compile if you were to ask us to change the default. Is that 1% of Hackage modules? 10%? 0.1%? I don’t know.
You are wrong that use of GND is the upper bound: the burden is on the type author, not the GND user. And so, while only a small percent of Hackage uses GND (though I note that more and more literature promotes GND; it is very handy in Shake, for example), in order to keep them from breaking, a potentially much larger percentage of Hackage has to get fixed.
What's more, the ability to remedy the situation is in the wrong place: If the default changes, and my GND library breaks, all my users are broken, and worse, I can't do anything about it until I compel the libraries I depend on to annotate.
This is why we can't ever change the default.
On Tue, Mar 25, 2014 at 4:23 PM, Richard Eisenberg wrote:
> The problem is, in the actual datatype definition, the constraints tend not to appear? Should we look around for other functions with constraints?
Right - we've been advocating removing them for years, and only placing the
constraints on the functions that need them, since they really present no
constraint on the data type itself. Of course, the presence of GND and
roles means that they now *would be* saying something about the type - as
they are indicating that the integrity of the type requires the constraint.
So yes, a shift to using this as the marker for nominal would require a
change in developer habit. But so does annotation.
I agree that other heuristics are pretty fragile: names of modules,
presence of constraints in functions, and even status of constructor export
are all a) far too removed from the code site in question, and b) things
that are much more fluid during development. I would be against any of
these.
On Wed, Mar 26, 2014 at 8:46 PM, Edward Kmett wrote:
> Personally, looking at it 10 years on, having a nominal default would look pretty terrible to me. I'd be stuck annotating everything I write. Nothing easy could just be easy.
Agree whole-heartedly.
Worth reiterating: Easy things should not need annotation.
On Wed, Mar 26, 2014 at 11:44 PM, Ganesh Sittampalam wrote:
> I think that in theory the basic principle should be that by default you can only write a GND if you could have written it by hand in the same scope - i.e. you can only do it if you have access to the relevant methods and datatype constructors etc.
This is much closer to the approach I wish had been taken: the burden is on the correct party, namely the client of the lib, who wishes to use it in a new way, unbeknownst to the library author. I don't know enough about the type theory, but could we have disallowed GND in the presence of type families anywhere in the class being derived?
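As a small illustration of the principle Ganesh describes, the sketch below shows the simplest case, where a GND instance and the instance one would write by hand coincide. The class `Pretty` and the newtypes `Age` and `Score` are hypothetical; the sketch does not cover the harder case under discussion, where a class method mentions an abstract type such as a Map.

```haskell
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

-- Hypothetical class with a single method.
class Pretty a where
  pretty :: a -> String

instance Pretty Int where
  pretty n = "#" ++ show n

-- Written by hand: possible only because the constructor 'Age' and the
-- method 'pretty' are both in scope here.
newtype Age = Age Int

instance Pretty Age where
  pretty (Age n) = pretty n

-- Derived with GND: the compiler reuses the Int instance directly.
newtype Score = Score Int deriving (Pretty)
```

Here both routes give the same behaviour, so a "could have been written by hand" rule would accept the derived instance.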

The first line was supposed to say: *Apologies for the delayed and multi-message response.*

I think a few clarifications might help:
- Roles, as originally conceived, were not an attempt to make Unsafe code Safe. Instead, they make unsafe things safe. Before roles, it was quite possible to write Haskell code that would cause a seg fault at runtime. Now, this is (short of unsafeCoerce & friends) impossible, as far as we know. This is independent of any concern with Safe Haskell. That is why certain code that used to work with GND can no longer do so, and why there is no easy fix -- the old code is unsafe, not just Unsafe.
- Role annotations are never necessary to ensure type safety. To reiterate: all Haskell code, with or without role annotations, is now safe from the interaction between GND and TypeFamilies.
- The whole debate here is about *abstraction* -- whether or not a user outside of a library can fiddle with that library's expected invariants.
- Edward and Mark have said that with a default of a nominal role "Nothing easy could just be easy." Yet, we accept the need for deriving Eq and Show without question. I think, if we ignore its current alienness, a role annotation is on a similar order -- a role annotation (in a world with a nominal default) would be granting new capabilities to users of a type, just like adding instances of classes.
- If you could use GND only where the constructors are available, then some valid current use of GND would break, I believe. It would mean that GND would be unable to coerce a (Map String Int) to a (Map String Age), because the constructor of Map is (rightly) not exported. This would have a direct runtime significance for some users -- their code would run slower.
Richard
On Mar 28, 2014, at 12:17 PM, Mark Lentczner wrote:
_______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries
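Richard's last clarification, about coercing a (Map String Int) to a (Map String Age), can be sketched as follows. The newtype `Age` and the function names are assumptions made for the example; it relies on Data.Map from the containers package that ships with GHC, and on GHC 7.8 or later for Data.Coerce.

```haskell
import Data.Coerce (coerce)
import qualified Data.Map as Map

-- Hypothetical newtype over Int.
newtype Age = Age Int deriving (Eq, Show)

-- Accepted: the value parameter of Map is representational, so no
-- traversal of the map is needed at runtime.
ages :: Map.Map String Int -> Map.Map String Age
ages = coerce

-- The slower alternative that walks the whole map.
agesSlow :: Map.Map String Int -> Map.Map String Age
agesSlow = Map.map Age

-- Rejected by the type checker: the key parameter of Map is nominal.
-- rekey :: Map.Map Int String -> Map.Map Age String
-- rekey = coerce
```

Note that `ages` type-checks even though Map's constructors are not exported, which is exactly the use that a constructors-in-scope rule would forbid.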

Hi Mark,
I appreciate your analysis in terms of classes of users -- I think that is helpful for framing the discussion.
About transitivity: I think we're in the clear here. Let's say package A exports types missing role annotations. If package B imports package A and wants to have the full safety afforded by roles, that is no problem whatsoever. Package B has annotations on its types (which may use package A's types) that may restrict certain parameters to be nominal, as appropriate. If package A had role annotations, it's quite possible that package B could omit some annotations (as role inference propagates nominal roles), but there is no problem inherent in this. (Indeed, if package A adds annotations in the future, package B would have redundant, but harmless, annotations.) So, I disagree with Mark's "partially" -- I think we're fully OK in this regard.
About heuristics: we briefly considered some, though there's no documentation of this anywhere. Specifically, we thought about giving nominal roles to parameters used in class constraints. The problem is, in the actual datatype definition, the constraints tend not to appear? Should we look around for other functions with constraints? That seems likely to be more confusing than helpful. Furthermore, I strongly don't like the idea of using heuristics to infer a feature such as this -- it can cause strange behavior and is hard to specify.
Richard
On Mar 25, 2014, at 11:09 AM, Mark Lentczner wrote:
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs
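Richard's transitivity argument can be sketched in a single file. The two "packages" are stand-ins: `Bag` plays the role of an unannotated type from package A, `Index` is package B's annotated type built on top of it, and all names are hypothetical. GHC 7.8 or later is assumed.

```haskell
{-# LANGUAGE RoleAnnotations #-}

import Data.Coerce (coerce)

-- Stand-in for a type exported by "package A" with no role annotation.
-- Its parameter's role is inferred as representational.
data Bag a = Bag [a]

-- "Package B": wraps A's type and restricts its own parameter, without
-- needing any change in package A.
data Index k = Index (Bag k)
type role Index nominal

newtype Name = Name String deriving (Eq, Show)

-- Accepted: Bag's parameter is representational.
relabel :: Bag String -> Bag Name
relabel = coerce

-- Rejected by the type checker: Index's parameter is nominal.
-- bad :: Index String -> Index Name
-- bad = coerce
```

If `Bag` later gained its own `type role Bag nominal`, the annotation on `Index` would become redundant but remain valid, which matches Richard's remark about harmless annotations.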
Participants (3): Mark Lentczner, Richard Eisenberg, Simon Peyton Jones