[GHC] #10723: Make declarations in signatures "weakly bound" until they are used

#10723: Make declarations in signatures "weakly bound" until they are used
-------------------------------------+-------------------------------------
        Reporter:  ezyang            |             Owner:  ezyang
            Type:  feature request   |            Status:  new
        Priority:  normal            |         Milestone:
       Component:  Package system    |           Version:  7.11
        Keywords:  backpack          |  Operating System:  Unknown/Multiple
    Architecture:  Unknown/Multiple  |   Type of failure:  None/Unknown
       Test Case:                    |        Blocked By:
        Blocking:                    |   Related Tickets:
Differential Revisions:              |
-------------------------------------+-------------------------------------

 Suppose you are the author of a library in a Backpack world, and you publish a signature package which defines the entire public-facing interface of your library. The library `foo`, which uses your library, decides to `include` the signature package for convenience, but actually only uses a small portion of the API.

 Later, you make a BC-breaking change in one part of the library and release a new signature package. The library `bar`, which uses your library, includes this NEW signature package, using a different portion of the API which was unaffected by the BC change.

 Now, a hapless user tries to use `foo` and `bar`, but Backpack complains that the requirements are not compatible.

 What's the problem here? The practice of writing reusable signature packages for people to use caused the requirements of `foo` and `bar` to become too large, since they included a lot of junk that these libraries didn't actually use. It would be far better if you could `include` a signature package, but only "require" the bits of it that you actually used!

 How can we achieve this?

 1. We augment the `ModIface` of signature merges (#10690) to record whether or not a declaration was (transitively) used by some module. Used declarations must be filled, but unused ones are treated more flexibly: if they are merged with a different, incompatible but used requirement, they disappear, and we don't check if an implementing module actually implemented the declaration. (If two unused incompatible requirements are merged, we just erase the name.)

 2. How do we compute the usage info? I think it will have to be done during shaping (which runs the renamer). We only need to annotate each declaration in a signature with the transitive set of names from other signatures that it has used; this can be incrementally computed. (It's not necessary to annotate declarations in modules, since they are always assumed to use holes.) Then whenever a declaration from a signature is used in a module, we mark its transitive set as used. This information can then be used later when constructing the merged `ModIface` which represents the "public requirement" of the package.

 So, for example, a package containing only signatures would contain all unused declarations (however, they may start being used by a package which includes them). Any unused declaration which isn't mixed with another incompatible declaration can be imported (causing it to be used), but we will complain if you try to use a name and we can't tell which declaration to use.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10723
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler
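As an illustration of the merge rule in item 1 of the description, here is a minimal Python sketch. It is not GHC code: `Decl`, `merge_decl`, `merge_signatures`, and the example declarations are hypothetical, a string stands in for a declaration's specification, and string equality stands in for compatibility. Item 1 does not spell out what happens when two used but incompatible requirements meet; the sketch assumes that case is the conflict Backpack already reports.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Decl:
    """Hypothetical model of one declaration required by a signature."""
    name: str
    spec: str    # stand-in for the declaration's type or definition
    used: bool   # (transitively) used by some module?


class IncompatibleRequirements(Exception):
    """Two used requirements for the same name disagree."""


def merge_decl(a: Decl, b: Decl) -> Optional[Decl]:
    """Merge two requirements for one name; None means the name is erased."""
    if a.spec == b.spec:
        # Compatible: keep the declaration, used if either side used it.
        return Decl(a.name, a.spec, a.used or b.used)
    if a.used and b.used:
        # Assumption: both sides really need the name, so report a conflict.
        raise IncompatibleRequirements(a.name)
    if a.used:
        return a      # the unused, incompatible side disappears
    if b.used:
        return b
    return None       # both unused and incompatible: erase the name


def merge_signatures(left: Dict[str, Decl], right: Dict[str, Decl]) -> Dict[str, Decl]:
    """Merge two requirement sets name by name."""
    merged = {}
    for name in sorted(left.keys() | right.keys()):
        if name in left and name in right:
            decl = merge_decl(left[name], right[name])
            if decl is not None:
                merged[name] = decl
        else:
            merged[name] = left[name] if name in left else right[name]
    return merged


# foo includes the old signature package and uses only `insert`;
# bar includes the new one (where `toGraph` changed) and uses only `member`.
foo_reqs = {
    "insert": Decl("insert", "a -> Set a -> Set a", used=True),
    "member": Decl("member", "a -> Set a -> Bool", used=False),
    "toGraph": Decl("toGraph", "Set a -> Graph", used=False),
}
bar_reqs = {
    "insert": Decl("insert", "a -> Set a -> Set a", used=False),
    "member": Decl("member", "a -> Set a -> Bool", used=True),
    "toGraph": Decl("toGraph", "Set a -> Graph a", used=False),
}
merged = merge_signatures(foo_reqs, bar_reqs)
```

Under these assumptions `merged` keeps `insert` and `member` as used requirements and drops `toGraph`, so a user of both `foo` and `bar` no longer sees a conflict over a declaration neither library touches.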

#10723: Make declarations in signatures "weakly bound" until they are used

Description changed by ezyang:
New description:

 Suppose you are the author of a library in a Backpack world, and you publish a signature package which defines the entire public-facing interface of your library. The library `foo`, which uses your library, decides to `include` the signature package for convenience, but actually only uses a small portion of the API.

 Later, you make a BC-breaking change in one part of the library and release a new signature package. The library `bar`, which uses your library, includes this NEW signature package, using a different portion of the API which was unaffected by the BC change.

 Now, a hapless user tries to use `foo` and `bar`, but Backpack complains that the requirements are not compatible.

 What's the problem here? The practice of writing reusable signature packages for people to use caused the requirements of `foo` and `bar` to become too large, since they included a lot of junk that these libraries didn't actually use. It would be far better if you could `include` a signature package, but only "require" the bits of it that you actually used!

 How can we achieve this?

 1. We augment the `ModIface` of signature merges (#10690) to record whether or not a declaration was (transitively) used by some module. Used declarations must be filled, but unused ones are treated more flexibly: if they are merged with a different, incompatible but used requirement, they disappear, and we don't check if an implementing module actually implemented the declaration. (If two unused incompatible requirements are merged, we just erase the name.)

 2. How do we compute the usage info? I think it will have to be done during shaping (which runs the renamer). We only need to annotate each declaration in a signature with the transitive set of names from other signatures that it has used; this can be incrementally computed. (It's not necessary to annotate declarations in modules, since they are always assumed to use holes.) Then whenever a declaration from a signature is used in a module, we mark its transitive set as used. This information can then be used later when constructing the merged `ModIface` which represents the "public requirement" of the package.

 So, for example, a package containing only signatures would contain all unused declarations (however, they may start being used by a package which includes them). Any unused declaration which isn't mixed with another incompatible declaration can be imported (causing it to be used), but we will complain if you try to use a name and we can't tell which declaration to use.

 (PS: another moral here is that `include`s are bad UNLESS you are including a signature package! Because an include for a concrete module is a dependency you can't override...)

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10723#comment:1
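The usage computation in item 2 of the description can be sketched as follows, again in Python and not as GHC's implementation. `direct_uses`, `transitive_uses`, and `mark_used` are hypothetical names, the example declarations are made up, and the closure is recomputed from scratch for brevity where the ticket suggests computing it incrementally.

```python
from typing import Dict, Iterable, List, Set


def transitive_uses(direct_uses: Dict[str, List[str]]) -> Dict[str, Set[str]]:
    """For each signature declaration, the set of signature names it
    depends on (including itself), following uses transitively."""
    closure = {}
    for root in direct_uses:
        seen = set()
        stack = [root]
        while stack:
            name = stack.pop()
            if name in seen:
                continue          # already visited; also guards against cycles
            seen.add(name)
            stack.extend(direct_uses.get(name, ()))
        closure[root] = seen
    return closure


def mark_used(closure: Dict[str, Set[str]], used_by_modules: Iterable[str]) -> Set[str]:
    """Names that become 'used' once modules mention the given declarations."""
    used = set()
    for name in used_by_modules:
        used |= closure.get(name, {name})
    return used


# Hypothetical signature: each declaration lists the signature names it mentions.
direct_uses = {
    "Set": [],
    "Graph": [],
    "insert": ["Set"],
    "toGraph": ["Set", "Graph"],
}
closure = transitive_uses(direct_uses)

# A module that only calls `insert` marks `insert` and `Set` as used.
used = mark_used(closure, ["insert"])
unused = set(direct_uses) - used
```

In this example `toGraph` and `Graph` stay unused, which is what would let them be dropped or overridden during a later merge. A package containing only signatures corresponds to calling `mark_used` with no names, leaving every declaration unused.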

#10723: Make declarations in signatures "weakly bound" until they are used

Changes (by skilpat):

 * cc: skilpat (added)

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10723#comment:2

#10723: Make declarations in signatures "weakly bound" until they are used

Comment (by skilpat):

 The use case you have in mind (which is a good one) sounds exactly like one of the motivations for thinning in Paper Backpack. The idea is that clients could include only the portion of a larger "signature package" that they actually use:

 {{{
 -- using the syntax of the POPL paper
 package containers-sig-3 where
   include prelude-sig-1
   Data.Set :: [...]
   Data.Tree :: [...]
   Data.Graph :: [...]

 package client where
   include prelude-sig-1
   include containers-sig-3(Data.Set)
   M = [import Data.Set; ...]
 }}}

 The result is that `client` has holes for the prelude stuff and for `Data.Set`, but not for `Data.Tree` or `Data.Graph` -- assuming `Data.Set` doesn't actually make use of (i.e. import) the latter two. So if there's a new version of `containers-sig-3` that makes BC-incompatible changes to `Data.Graph`, then `client` is totally unaffected and will link against an implementation of either version of containers.

 A problem with thinning is that it's highly dependent on the import graph of the signature package, which isn't a very good signal to clients what kinds of feature selection they can employ in using that signature package. But the nice thing is that it allows ex post facto selection of signature subpackages, so to speak, after the original author has already set it in stone.

 Anyway, back to the proposal at hand. It seems feasible to me so long as the usage info is used for warnings rather than for implicitly "thinning" a signature package; in other words, so long as it's employed explicitly rather than implicitly. For example, the usage info about `containers-sig-3` could enable a warning that says "you included `containers-sig-3` but didn't use some modules; do you want to annotate the include?". Instead, if the usage info enabled the client package to only (syntactically) depend on `Data.Set` despite no such annotation, that would feel kinda weird IMO.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10723#comment:3
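To make the dependence of thinning on the import graph concrete, here is a small Python sketch of module-level thinning and of the suggested warning. It is an illustration only: `thin`, `unused_include_warning`, and the import edges in `sig_imports` (in particular `Data.Graph` importing `Data.Tree`) are assumptions made for the example, not facts about any real signature package or about GHC.

```python
from typing import Dict, Iterable, List, Optional, Set


def thin(sig_imports: Dict[str, List[str]], selected: Iterable[str]) -> Set[str]:
    """Modules kept when an include is thinned to `selected`: the selection
    plus everything it imports, transitively, inside the signature package."""
    kept = set()
    stack = list(selected)
    while stack:
        module = stack.pop()
        if module not in kept:
            kept.add(module)
            stack.extend(sig_imports.get(module, ()))
    return kept


def unused_include_warning(package: str,
                           sig_imports: Dict[str, List[str]],
                           imported_by_client: Iterable[str]) -> Optional[str]:
    """Return a warning if the client included modules it does not need."""
    needed = thin(sig_imports, imported_by_client)
    unused = sorted(set(sig_imports) - needed)
    if not unused:
        return None
    return ("you included %s but didn't use some modules (%s); "
            "do you want to annotate the include?" % (package, ", ".join(unused)))


# Assumed import graph of the signature package (hypothetical edges).
sig_imports = {
    "Data.Set": [],
    "Data.Tree": [],
    "Data.Graph": ["Data.Tree"],
}

only_set = thin(sig_imports, ["Data.Set"])
only_graph = thin(sig_imports, ["Data.Graph"])
warning = unused_include_warning("containers-sig-3", sig_imports, ["Data.Set"])
```

With these assumed edges, selecting `Data.Set` keeps a single hole, while selecting `Data.Graph` silently pulls in `Data.Tree` as well. That is the sense in which what a client can thin away is decided by the signature author's imports and not by the client.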

#10723: Make declarations in signatures "weakly bound" until they are used

Comment (by ezyang):

 It is hard to say what will actually happen in practice until we have people actually using Backpack, but I suspect that thinning only at the module level won't be enough. There are two reasons:

 1. Think about the average Haskell module: it contains a lot of functions! This means that the normal signature someone will write for a module that already exists is going to have a lot of functions. Here, thinning at the expression level seems more important.

 2. How are people going to write signatures? I think that for library-level signatures, the best use of people's time is for signatures to be inferred automatically from existing implementations rather than forcing people to write signatures all the time. So a signature will in general contain way too much stuff! Thinning is definitely necessary in this case.

 The benefit of implicit thinning is that, although it is strange from a language design perspective, it makes a lot of sense intuitively: there is no dependence on the import graph. A unit's requirements are completely specified as the set of requirements that it uses, and the set of requirements it makes available to includers. Yes, it is not syntactically evident what this is, but the whole point of a compiler is that you can get it to tell you what it is (e.g., some sort of Haddock documentation). Whereas with thinning, sometimes it is not permissible to thin a requirement, depending on whether it was imported by a module.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10723#comment:4

#10723: Make declarations in signatures "weakly bound" until they are used

Changes (by ezyang):

 * priority:  normal => lowest

Comment:

 Since we no longer have a shaping pre-pass, it's a LOT more difficult to actually do this. Lowering priority accordingly.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10723#comment:5

#10723: Make declarations in signatures "weakly bound" until they are used

Changes (by ezyang):

 * status:  new => closed
 * resolution:  => wontfix

Comment:

 I've got a new proposal for something in this space.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10723#comment:6