[GHC] #10566: Move "package key" generation to GHC

#10566: Move "package key" generation to GHC
-------------------------------------+-------------------------------------
Reporter: ezyang | Owner: ezyang
Type: bug | Status: new
Priority: normal | Milestone: 7.12.1
Component: Package system | Version: 7.10.1
Keywords: | Operating System: Unknown/Multiple
Architecture: Unknown/Multiple | Type of failure: None/Unknown
Test Case: | Blocked By:
Blocking: | Related Tickets:
Differential Revisions: |
-------------------------------------+-------------------------------------
Currently in GHC 7.10, we have the following situation:

 1. Cabal computes a package key, which in practice (since no one is using Backpack in the wild) is a Merkle tree of the versions of each of the dependencies of the package.
 2. This package key is passed to GHC via `-this-package-key`.
 3. GHC handles the package key opaquely.

Now, in the recent Backpack implementation, we need GHC to be able to compute package keys. (The concrete case: you're type-checking an interface file of an indefinite package, where you want to instantiate it with some assignment of its holes: when instantiating those holes, you need to instantiate any package keys mentioned in the interface, in which case you really want to be able to compute the hash.) So I want to move package key generation to GHC.

The primary implication is this: does Cabal continue to generate package keys?

If it doesn't, we should revert from `-this-package-key` back to `-package-name` from the previous version (but maybe renamed because this name was bad). GHC then computes a package key based on `-package-name` and the explicitly mentioned `-package` dependencies, and Cabal reads it out with something akin to `--abi-hash`.

If it does, we need to ensure GHC and Cabal's package key algorithms stay synchronized.

I personally lean towards the first option.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10566
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler

Comment (by simonpj):

Sounds good in general terms. But up to now GHC has been primarily a module-at-a-time compiler. To compute a package key it needs to know all about a package. Do you intend to do this by giving it a Backpack file describing the package? Or what?

And how does Cabal get to know what key GHC computed?

More detail required!

Simon

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10566#comment:1

Comment (by ezyang):
> Sounds good in general terms. But up to now GHC has been primarily a module-at-a-time compiler. To compute a package key it needs to know all about a package. Do you intend to do this by giving it a Backpack file describing the package? Or what?
So, actually, we can do this very nicely: if you compile a module with GHC using `-hide-all-packages` and a list of `-package` flags, then GHC can infer the package key by just looking at each exposed package and including its hash in the computed package key. So, in some sense Cabal is still "calculating" many of the important parameters of the package key (the package name, version, and what dependencies are used), but GHC does the actual final calculation in the end.

Here's how it looks without Backpack:

 1. Cabal calls GHC with `-package-name foo-0.1 -hide-all-packages -package-id bar-0.1-ABCD ...`
 2. GHC computes the package state from the package flags, getting a list of exposed packages with `PkgConfig`s.
 3. GHC computes the package key by hashing the source package ID and the package key of each included exposed package.

With Backpack it's a little trickier, because we don't necessarily want to hash the package key of included packages: instead, you want to hash the "version hash" of each package, which is like a package key but minus hole instantiation.
> And how does Cabal get to know what key GHC computed?
My initial thought is a new major mode `ghc --package-key -package-name foo-0.1 -hide-all-packages -package bar-0.1-ABCD ...` which outputs the package key.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10566#comment:2
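As a rough illustration of the flow described in this comment, a build tool could assemble the invocation along these lines. This is only a sketch: `--package-key` is the major mode proposed here, not a flag any GHC is known to accept, the helper name `package_key_argv` is hypothetical, and the dependency flag follows the `-package-id` spelling from step 1 of the list above. The code builds the argument list and does not run GHC.

```python
from typing import List


def package_key_argv(package_name: str, dep_ids: List[str]) -> List[str]:
    """Build the argv for the *proposed* `ghc --package-key` major mode.

    Assumption: the flag spellings are taken from the ticket comment; nothing
    here checks that a given GHC actually accepts them.
    """
    argv = ["ghc", "--package-key", "-package-name", package_name,
            "-hide-all-packages"]
    for dep in dep_ids:
        # One -package-id flag per exposed dependency.
        argv += ["-package-id", dep]
    return argv


print(" ".join(package_key_argv("foo-0.1", ["bar-0.1-ABCD"])))
```

Keeping the argument list as data makes it easy for the caller to reuse the same package flags for the later compile step, which is what keeps the key and the build consistent.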

Comment (by ezyang):

Based on conversation with simonpj et al, we have a different plan to accomplish this:

 1. We introduce a new concept, the *version hash*, a hash of package name, package version, and the version hashes of all textual dependencies (i.e. packages which were included). A version hash is a coarse approximation of an installed package ID (IPID) that is suitable for inclusion in package keys (you don't want to put an IPID in a package key, since that means the key will change any time the source changes). Version hashes are calculated by Cabal and passed to GHC, and they are recorded in the installed package database.
 2. GHC now accepts a new flag `-version-hash`, which Cabal can use to pass in a version hash. So now we get something like `-version-hash 8TmvWUcS1U1IKHT0levwg3 -hide-all-packages -package-id ...` when we call GHC. GHC takes `-version-hash` and then computes a package key based on it.
 3. Cabal computes the version hash by looking at the recorded version hashes in the installed package database of all the external dependencies of the library portion of the package. It then calls GHC's `--package-key` major mode to get the package key that the package will end up having.

Cabal tracking bug: https://github.com/haskell/cabal/pull/2685

One minor complication: sometimes, GHC needs to know what the package name of the package currently being built is to give a good error message. Since the version hash is just a hash, this isn't enough information. There are two ways we can get the information we need:

 1. Just pass it to GHC. `-package-name` is not a bad flag name for this.
 2. Put it into an (inplace) package database and have GHC query that database for the information. This requires some Cabal changes, see https://github.com/haskell/cabal/issues/2710

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10566#comment:3
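The recursive shape of a version hash, as described in item 1 of the plan, can be sketched as follows. This is an illustration under stated assumptions, not Cabal's algorithm: SHA-256, the NUL separator, sorting of dependency hashes, the 22-character hex prefix, and the `Package` record are all choices made for the example.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List


@dataclass
class Package:
    # Hypothetical record: just the inputs the ticket says a version hash covers.
    name: str
    version: str
    deps: List["Package"] = field(default_factory=list)


def version_hash(pkg: Package) -> str:
    """Hash the name, the version, and the version hashes of the dependencies.

    Assumption: hash function, separator, ordering and output length are
    illustrative; the ticket does not specify them.
    """
    dep_hashes = sorted(version_hash(d) for d in pkg.deps)
    payload = "\0".join([pkg.name, pkg.version, *dep_hashes])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:22]


foo_with_old_bar = Package("foo", "0.1", [Package("bar", "0.1")])
foo_with_new_bar = Package("foo", "0.1", [Package("bar", "0.2")])

# Rebuilding the same description reproduces the hash; bumping a dependency
# version changes it, even though foo's own name and version are unchanged.
print(version_hash(foo_with_old_bar))
print(version_hash(foo_with_new_bar))
```

Because only names and versions feed the hash, editing the source of `foo` without changing its version leaves the hash alone, which is the property that makes it safer to embed in a package key than an IPID.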

Comment (by simonpj):

There are many wiki pages about the package system, but [wiki:Commentary/Packages/Concepts] is bang up to date and describes version hashes with much more precision; so when reading Edward's comments above, read that page too.
> sometimes, GHC needs to know what the package name of the package currently being built is to give a good error message
Examples? Even a list of all the examples. There are probably not many.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10566#comment:4

Comment (by ezyang):
> Examples? Even a list of all the examples. There are probably not many.
The main situation is when GHC decides, while printing a type, that it has to disambiguate package identity because the module name alone is not enough. This happens fairly rarely unless you have multiple versions of a package in scope, since Haskell library authors are fairly good at not having module name collisions; and in any case, one of the packages is going to be in the installed package database.

But this capability is more important for Backpack, when we instantiate a package multiple times: then we want to say `p(A -> q():A).M.T` is not equal to `p(A -> r():A).M.T`, rather than say that `1XrbHBg7VzL5pL46pujtdS` is not equal to `ISxd2jw4dc5G3vUrxkRNkV`. So I built the package name into the definition of a package key.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10566#comment:5
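To make the error-message argument concrete, here is a small sketch of a package key that carries its name and hole instantiation, so it can be printed in the `p(A -> q():A)` notation used above instead of as an opaque hash. The `PackageKey` and `Module` records and both `show_*` helpers are hypothetical and only mimic the notation in this comment; they are not GHC's data types.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Module:
    package: "PackageKey"
    name: str


@dataclass
class PackageKey:
    name: str
    # Maps each hole name to the module it is instantiated with.
    holes: Dict[str, Module] = field(default_factory=dict)


def show_module(module: Module) -> str:
    return f"{show_key(module.package)}:{module.name}"


def show_key(key: PackageKey) -> str:
    # Sort hole names so the rendering does not depend on insertion order.
    inst = ", ".join(f"{hole} -> {show_module(key.holes[hole])}"
                     for hole in sorted(key.holes))
    return f"{key.name}({inst})"


p_with_q = PackageKey("p", {"A": Module(PackageKey("q"), "A")})
p_with_r = PackageKey("p", {"A": Module(PackageKey("r"), "A")})

print(f"{show_key(p_with_q)}.M.T is not equal to {show_key(p_with_r)}.M.T")
```

A message built this way shows the reader which instantiation differs, which two unrelated-looking hashes cannot do.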

Comment (by ezyang):

Another reason it's important to support `-package-name` separately from `-version-hash` is that they can vary separately. Consider a Backpack file that looks like:

{{{
package p where
    ...
package q where
    ...
}}}

both of which come from the same Hackage distribution unit `p` (i.e. you can `cabal install p`). Since GHC is responsible for building both p and q at the same time (and Cabal knows nothing about these subpackages), the version hash of these packages has to be the same. However, we still have to distinguish package p and package q, so here we vary the package name. So it has semantic meaning too!

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10566#comment:6

Changes (by ezyang):

 * differential: => Phab:D1056

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10566#comment:7

Comment (by ezyang):

Corresponding Cabal PR: https://github.com/haskell/cabal/pull/2685

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10566#comment:8

Comment (by ezyang):

I was thinking about this situation some more, and I realized that we can avoid having Cabal query GHC for the package key altogether if we adopt a convention: the version hash of a package is the SAME as the package key, if the package is definite. I've repushed the pull request with this in mind.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10566#comment:9
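The convention in this comment amounts to a simple rule on the Cabal side, sketched below. The function name `cabal_package_key` and the representation of holes as a list of names are assumptions for illustration; the ticket does not say what Cabal should do for indefinite packages, so the sketch just refuses in that case.

```python
from typing import List


def cabal_package_key(version_hash: str, holes: List[str]) -> str:
    """Return the package key without asking GHC, when that is possible.

    Assumption: a package is 'definite' exactly when it has no holes.
    """
    if holes:
        # Indefinite package: the key depends on hole instantiation, which
        # this convention does not cover.
        raise ValueError("package key of an indefinite package is not "
                         "determined by the version hash alone")
    # Definite package: by convention the key is the version hash itself.
    return version_hash


print(cabal_package_key("8TmvWUcS1U1IKHT0levwg3", []))
```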

Comment (by Edward Z. Yang

Changes (by ezyang):

 * status: new => closed
 * resolution: => fixed

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10566#comment:11