
Friends

You may remember a recent thread on ghc-devs about GHC and Cabal https://mail.haskell.org/pipermail/ghc-devs/2024-July/021678.html. In it I say how I feel I lack the "big picture" of how GHC and Cabal interact, and that my mental model is probably faulty.

Tom Ellis took pity on me, and together we wrote this big-picture overview about how GHC and Cabal interact https://docs.google.com/document/d/1mQEpV3fYz1pHi64KTnlv8gifh9ONQ-jytk5sIHqn....

Would you like to:

- Read it as a consumer.
  - Does it tell you stuff that is useful?
  - What else would you like to know?
  - What is unclear or missing?
- Read it as an expert.
  - Is it accurate?
  - Are any bits misleading?
  - Do the links go to appropriate places?
  - What other links or resources would be helpful?

It is not intended as a replacement for the GHC user guide, nor the Cabal user guide; it is littered with links to those guides, which give much fuller details. Rather, it is intended to put you (well, me for one!) in a position where you can more easily make sense of those documents.

We'd love to have your help in improving it.

Simon

My first comment, which applies across the whole document, is:

Don't write "package (unit)". Write unit.

Leave "package" to be used solely as in "A package is the unit of distribution and versioning.", and use "unit" consistently for compilation units, and/or "component" (or more specifically "library" etc).

The naming of flags is a historical artifact.

The key observation is that "package is the unit of distribution" is nowadays only a Cabal concept. Only PackageImports and "imprecise" flags like "-package" (cf. "-package-id", which ought to be called "-unit-id") in GHC really know or care about that.

Second comment: be mindful of the difference between `cabal-install` and Cabal. The "3 Cabal" section is really "3 cabal-install", and e.g. stack does things differently.

> Suppose version 2.3.7 of package P, called P-2.3.7, depends on package Q.

This is therefore wrong. You should write "Suppose version 2.3.7 of library P, called "P-2.3.7", depends on library Q".

Also, libraries can depend on executables, e.g. happy. GHC doesn't care about those dependencies, but Cabal (the library, which does the building) does.

> Each unit has a unit-id, looking like

*may* look. The unit identifier is a random string invented by a build tool. It's informative, but it really doesn't matter much.

> Q: "installed package" means the same as "unit"

Not exactly.

> Q: "package id" means the same as "unit-id"

I think so. And I'd argue against using "package id" going forward.

> recompiling with no change could change the binary (non-determinism). Does that change the unit-id?

It doesn't. Unit-id is invented prior to compilation. Therefore at least *interface determinism* is important. Though, cabal-install v2 *never* re-installs units to the store database, so determinism is not a hard requirement.

> A package database can contain many installed versions of the same package P, or even of a particular version of P, say P-2.4.3, compiled against different dependencies.

Even against the same dependencies, even with the same flags, if for some reason the build tool changes the way it computes the unit-id.

Also s/package/library/. Recall, there exist non-main sublibraries.

> documentation for -package does not clearly specify how the name of the package is mapped to a unit-id.

An important bit to remember about "-package" is that it's a legacy flag, not used by tools anymore. -package-id looks for the unit exactly. -package scans to find a matching one; there may be many (and e.g. in the case of the same version, the choice made is probably non-deterministic).
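
To make the difference between the two flags concrete, here is a minimal sketch in Python. It is only a toy model: the `Unit` record, the two lookup functions and the sample unit-ids are all hypothetical, and GHC's real resolution logic is more involved than this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    """One installed unit, as a hypothetical, simplified database record."""
    unit_id: str   # opaque string chosen by the build tool
    name: str      # package name, e.g. "P"
    version: str   # package version, e.g. "2.4.3"


def lookup_package_id(db, unit_id):
    """Toy model of -package-id: exact match on the unit-id, at most one hit."""
    for unit in db:
        if unit.unit_id == unit_id:
            return unit
    return None


def lookup_package(db, spec):
    """Toy model of -package: every unit whose name or name-version matches."""
    return [u for u in db if spec in (u.name, f"{u.name}-{u.version}")]


# Hypothetical database contents: two builds of the same version of P.
db = [
    Unit("P-2.4.3-aaaa", "P", "2.4.3"),
    Unit("P-2.4.3-bbbb", "P", "2.4.3"),  # same version, different dependencies
    Unit("Q-1.0-cccc", "Q", "1.0"),
]

print(lookup_package_id(db, "P-2.4.3-bbbb"))  # a single, unambiguous unit
print(lookup_package(db, "P-2.4.3"))          # several candidates to choose from
```

The point of the sketch is only that a lookup by unit-id has one possible answer, while a lookup by name and version can have several, so something has to pick one.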

> This .cabal/store is not a package database.

.cabal/store/<ghc> **is** an ordinary package database.

> Rather, cabal will invoke ghc with a long list of -package-id <unit-id> flags

Yes. This is not mutually exclusive. Package database flags tell where, `-package-id` flags tell what units to use.
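
As a rough illustration of "where" versus "what", the following Python sketch assembles such an argument list. The helper `ghc_args`, the store path and the unit-ids are made up for the example; only the flag names `-package-db` and `-package-id` come from GHC itself, and a real build tool passes many more flags.

```python
def ghc_args(package_dbs, unit_ids):
    """Build an argument list: databases say where to look, unit-ids say what to use."""
    args = []
    for db in package_dbs:
        args += ["-package-db", db]
    for unit_id in unit_ids:
        args += ["-package-id", unit_id]
    return args


# Hypothetical store location and unit-ids, for illustration only.
args = ghc_args(
    ["/home/user/.cabal/store/ghc-9.10.1/package.db"],
    ["P-2.3.7-aaaa", "Q-1.0-cccc"],
)
print("ghc " + " ".join(args))
```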

> Can a package contain multiple public libraries?

Yes. public/private doesn't matter for GHC though. Cabal enforces the dependency visibility, i.e. private/public is a Cabal concept. (The visibility is written to interface files, but it's there solely for Cabal to figure out what the visibility was. GHC doesn't, or at least shouldn't, use that info.)

> Difference between unit-id and ABI hash?

As far as I remember, unit-id tries to approximate the ABI hash. In fact, there was a request to have GHC output something like an ABI hash given the set of flags. Currently Cabal has an ad-hoc implementation to filter out flags which should not affect the ABI of a package (like `-fprint-explicit-foralls`).

Side note: it would have been clearer if the flag naming convention already suggested whether a flag affects the ABI or not. E.g. `-ddump` flags, or generally `-d` flags, don't, but `-f` flags do, except e.g. `-fprint...`, which is a kind of `-ddump`-like flag.
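
The idea that a unit-id is computed from the build inputs before compilation, with ABI-irrelevant flags filtered out, can be sketched as below. This is not Cabal's or cabal-install's actual algorithm: the function names, the choice of SHA-256 and the two-prefix filter are assumptions made purely for illustration.

```python
import hashlib


def affects_abi(flag):
    """Assumed rule of thumb: -ddump-* and -fprint-* flags do not affect the ABI."""
    return not (flag.startswith("-ddump-") or flag.startswith("-fprint-"))


def sketch_unit_id(name, version, dep_unit_ids, flags):
    """Derive an identifier from the build inputs only, before any compilation."""
    relevant = sorted(f for f in flags if affects_abi(f))
    payload = "\n".join([name, version, *sorted(dep_unit_ids), *relevant])
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return f"{name}-{version}-{digest}"


# Two builds that differ only in diagnostic flags get the same identifier.
a = sketch_unit_id("P", "2.3.7", ["Q-1.0-cccc"], ["-O2", "-ddump-simpl"])
b = sketch_unit_id("P", "2.3.7", ["Q-1.0-cccc"], ["-O2", "-fprint-explicit-foralls"])
c = sketch_unit_id("P", "2.3.7", ["Q-1.0-cccc"], ["-O1"])
print(a == b, a == c)
```

Because nothing in the inputs depends on the compiler's output, recompiling without changes cannot alter an identifier computed this way.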
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

Thanks Oleg.

On Tue, Jul 16, 2024 at 02:35:50PM +0300, Oleg Grenrus wrote:
> > This .cabal/store is not a package database.
>
> .cabal/store/<ghc> **is** an ordinary package database.

The package database is the thing that contains the .conf files, isn't it? So perhaps you mean .cabal/store/<ghc>/package.db is an ordinary package database? If not it would be good if we can establish precisely what is meant by "package database" and then clarify the document.

The package.db directory has only .conf files, but those .conf files point to directories with interface and object files. I think every tool puts the interface and object files next to the .conf files.

./package.db is the "index",

    ./aeson-2.2.3.0-3b4881079a148ac154e9112c43bd71f61140bdb9e78660803e826af798de2108
    ./aeson-2.2.3.0-9bfa3c3833e85d9e62afa4f60b2820c47e8c3382991299ab624dedb682bb786a
    ...

is the data.

Also the global package db is somewhat similar, with the index at e.g.

    /opt/ghcup/.ghcup/ghc/9.10.1/lib/ghc-9.10.1/lib/package.conf.d

and the data at

    /opt/ghcup/.ghcup/ghc/9.10.1/lib/ghc-9.10.1/lib/x86_64-linux-ghc-9.10.1

- Oleg

On Tue, Jul 16, 2024 at 02:51:07PM +0300, Oleg Grenrus wrote:
> package.db directory has only .conf files, but those conf files point to directories with interface and object files. I think every tool puts the interface and object files next to .conf files.

Right, but in principle I could do something different, couldn't I? I could take some subset of the .conf files in package.db, put them in a directory called /tmp/mynewpackagedatabase, and *that* would be a perfectly valid package database (as long as the files it points to still exist), wouldn't it?

Tom

Sure. But let's not be too nitpicky. GHC indeed wants the "package.db" directory when you give it as the -package-db flag argument. The directory of .conf files is not what GHC reads, though: it reads the cache file in that directory, i.e. the package.cache file. So you can hack together a tool (like ghc-pkg) which manages the package.cache file, and then the .conf files don't need to be in that directory at all.

I don't think it ever made sense to differentiate "cache file" from "just conf files" and/or "conf files with interface and object files". IIRC "ghc-pkg check" checks that the interface files exist (or at least that the directories they should be in exist).

My point was that .cabal/store/<ghc>/package.db (or just .cabal/store/<ghc>, as it's not ambiguous here) is an ordinary package database.

- Oleg
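
The "index plus data" layout discussed in this exchange can be mimicked with a small Python sketch. Everything here is a toy: the `field: value` parsing is a simplification of the real .conf format, the helpers `read_conf`, `check_db` and `demo` are hypothetical, and the sketch reads the .conf files directly, whereas (as noted above) GHC itself reads package.cache. The check is only loosely in the spirit of what "ghc-pkg check" is described as doing.

```python
import os
import tempfile


def read_conf(path):
    """Parse a simplified 'field: value' .conf file into a dict (toy format)."""
    fields = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            if ":" in line:
                key, value = line.split(":", 1)
                fields[key.strip()] = value.strip()
    return fields


def check_db(db_dir):
    """Return the ids of units whose data directory no longer exists."""
    broken = []
    for entry in sorted(os.listdir(db_dir)):
        if entry.endswith(".conf"):
            fields = read_conf(os.path.join(db_dir, entry))
            if not os.path.isdir(fields["import-dirs"]):
                broken.append(fields["id"])
    return broken


def demo():
    """Build a throwaway store, then delete one unit's data and re-check."""
    with tempfile.TemporaryDirectory() as store:
        index = os.path.join(store, "package.db")
        os.mkdir(index)
        for unit_id in ("aeson-2.2.3.0-aaaa", "aeson-2.2.3.0-bbbb"):
            data_dir = os.path.join(store, unit_id)
            os.mkdir(data_dir)
            conf_path = os.path.join(index, unit_id + ".conf")
            with open(conf_path, "w", encoding="utf-8") as handle:
                handle.write(f"id: {unit_id}\nimport-dirs: {data_dir}\n")
        before = check_db(index)
        os.rmdir(os.path.join(store, "aeson-2.2.3.0-bbbb"))
        after = check_db(index)
        return before, after


print(demo())
```

The index directory stays valid only as long as the directories its entries point to still exist, which is the caveat raised in the question above.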

Thanks Oleg

> Don't write "package (unit)". Write unit.

OK. But:

> > Q: "installed package" means the same as "unit"
>
> Not exactly.

If a unit is not an installed library, what (precisely) is a unit?
Thanks
Simon

On 16.7.2024 17.08, Simon Peyton Jones wrote:
> Thanks Oleg
>
> > Don't write "package (unit)". Write unit.
>
> OK. But:
>
> > > Q: "installed package" means the same as "unit"
> >
> > Not exactly.
>
> If a unit is not an installed library, what (precisely) is a unit?

There are "buts". Open units (Backpack) are not really libraries. Executables, test-suites and benchmarks are not libraries, but they are still compilation units and cabal-install gives them unit ids; it probably even passes those to GHC via "-this-unit-id".

> What other links or resource would be helpful.

On almost the same topic (only the GHC side) I wrote a big-picture note "About units" a few years ago: https://gitlab.haskell.org/ghc/ghc/-/blob/12d3b66cedd3c80e7c1e030238c92d2663...

Sylvain
participants (4)
- Oleg Grenrus
- Simon Peyton Jones
- Sylvain Henry
- Tom Ellis