[GHC] #12518: Allow customizing immutable package dbs by stacking

#12518: Allow customizing immutable package dbs by stacking
-------------------------------------+-------------------------------------
 Reporter:  harendra                 |  Owner:
 Type:  feature request              |  Status:  new
 Priority:  normal                   |  Milestone:
 Component:  Package system          |  Version:  8.0.1
 Keywords:                           |  Operating System:  Unknown/Multiple
 Architecture:  Unknown/Multiple     |  Type of failure:  None/Unknown
 Test Case:                          |  Blocked By:
 Blocking:                           |  Related Tickets:
 Differential Rev(s):                |  Wiki Page:
-------------------------------------+-------------------------------------

= Package Selection Across Multiple Package DBs =

As explained by ezyang, when there are multiple package dbs, GHC chooses the packages to use in the following manner. I might have misunderstood, so please correct me if I am wrong.

 * If there are multiple packages with different versions, the latest non-broken version is chosen.
 * If there are multiple packages with the same version, the behavior is unspecified.
 * If there are multiple packages with the same package-id (shadowing), then the one which comes first in `GHC_PACKAGE_PATH` or which comes first on the command line (`-package` flag) will be used.

= The Problem =

The build tool `Stack` implements stacked package databases. It uses a base package database and then stacks another package database on top of it to customise it further without modifying the base. The behavior is such that the package db on top of the stack completely overrides the ones below. That means you choose a package from the top of the stack even if its version is older.

Stack implements this by passing explicit package-ids of the packages to GHC. This scheme works well for cabal projects where we know ALL the packages used by the project in advance. But it does not work for scripts run using runghc. In that case we do not know the packages required by the script in advance and therefore cannot pass the package-ids to GHC. That means we cannot make GHC use the packages in the right way. GHC will choose the latest version even though we want it to choose a possibly older version from the top of the db stack.

= Proposed Solution =

Implement a new CLI option, something like `--stacked-pkg-dbs`. If this option is used, GHC will use `GHC_PACKAGE_PATH` or the `-package-db` options as a stack of dbs. The first db in the path or the first CLI option will be considered the top of the stack. GHC will choose a package from the first db from the top of the stack irrespective of the version of the package. If the package is broken, it should report an error rather than silently choosing from the next db.

This will allow us to modify an immutable package db by stacking another db on top. Implementing this as a separate option will keep the existing behavior and so remain backward compatible.

This has been discussed in a Stack issue on GitHub [https://github.com/commercialhaskell/stack/issues/1957 here].

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12518
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler
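To make the difference between the two selection policies concrete, here is a minimal sketch in Python. It is only a toy model of the behavior described in the ticket, not GHC's implementation: the `Pkg` record and the functions `pick_union` and `pick_stacked` are hypothetical names, each db is modelled as a plain list, and the db list is assumed to be ordered with the top of the stack first. Taking the latest version inside a single db is also an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pkg:
    name: str
    version: tuple          # e.g. (1, 2) for version 1.2
    broken: bool = False


def pick_union(dbs, name):
    """Union policy: pool all dbs and take the latest non-broken version."""
    candidates = [p for db in dbs for p in db
                  if p.name == name and not p.broken]
    return max(candidates, key=lambda p: p.version, default=None)


def pick_stacked(dbs, name):
    """Stacked policy: dbs[0] is the top; the first db providing `name` wins."""
    for db in dbs:
        found = [p for p in db if p.name == name]
        if not found:
            continue
        chosen = max(found, key=lambda p: p.version)
        if chosen.broken:
            # Report the problem instead of silently falling through.
            raise LookupError(f"{name} is broken in the topmost db providing it")
        return chosen
    return None


# The top db carries an older `text` than the base db underneath it.
top = [Pkg("text", (1, 1))]
base = [Pkg("text", (1, 2)), Pkg("bytestring", (0, 10))]
```

With these two dbs, `pick_union([top, base], "text")` selects version 1.2 from the base db, while `pick_stacked([top, base], "text")` selects the older 1.1 from the top db, which is the behavior the ticket asks for.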

#12518: Allow customizing immutable package dbs by stacking

Changes (by pggiarrusso):

 * cc: pggiarrusso (added)

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12518#comment:1

#12518: Allow customizing immutable package dbs by stacking

Description changed by ezyang:

@@ -10,3 +10,4 @@
- * If there are multiple packages with the same package-id (shadowing) then
- the one which comes first in `GHC_PACKAGE_PATH` or which comes first on
- command line (`-package` flag) will be used.
+ * If there are multiple packages with the same installed package id
+ (shadowing) then the one which comes first in `GHC_PACKAGE_PATH` or which
+ comes last on command line (according to `-package-db` flags) will be
+ used.

New description:

= Package Selection Across Multiple Package DBs =

As explained by ezyang, when there are multiple package dbs, GHC chooses the packages to use in the following manner. I might have misunderstood, so please correct me if I am wrong.

 * If there are multiple packages with different versions, the latest non-broken version is chosen.
 * If there are multiple packages with the same version, the behavior is unspecified.
 * If there are multiple packages with the same installed package id (shadowing), then the one which comes first in `GHC_PACKAGE_PATH` or which comes last on the command line (according to `-package-db` flags) will be used.

= The Problem =

The build tool `Stack` implements stacked package databases. It uses a base package database and then stacks another package database on top of it to customise it further without modifying the base. The behavior is such that the package db on top of the stack completely overrides the ones below. That means you choose a package from the top of the stack even if its version is older.

Stack implements this by passing explicit package-ids of the packages to GHC. This scheme works well for cabal projects where we know ALL the packages used by the project in advance. But it does not work for scripts run using runghc. In that case we do not know the packages required by the script in advance and therefore cannot pass the package-ids to GHC. That means we cannot make GHC use the packages in the right way. GHC will choose the latest version even though we want it to choose a possibly older version from the top of the db stack.

= Proposed Solution =

Implement a new CLI option, something like `--stacked-pkg-dbs`. If this option is used, GHC will use `GHC_PACKAGE_PATH` or the `-package-db` options as a stack of dbs. The first db in the path or the first CLI option will be considered the top of the stack. GHC will choose a package from the first db from the top of the stack irrespective of the version of the package. If the package is broken, it should report an error rather than silently choosing from the next db.

This will allow us to modify an immutable package db by stacking another db on top. Implementing this as a separate option will keep the existing behavior and so remain backward compatible.

This has been discussed in a Stack issue on GitHub [https://github.com/commercialhaskell/stack/issues/1957 here].

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12518#comment:2

#12518: Allow customizing immutable package dbs by stacking

Comment (by ezyang):

So, what do you want to happen if a single package database has multiple exposed copies of a package; e.g. package.conf has both p-0.1 and p-0.2 (which are both exposed)? Then do you want p-0.2? How about if it has two copies of p-0.2 (with different installed package IDs)? Which one is picked is left unspecified?

It's also worth mentioning how the `exposed` flag ties into this. For the initial visibility calculation, you probably don't want not-exposed packages to be considered at all. So if we have pkg-0.1 (exposed), and then on the occluding package db, pkg-0.2 (not exposed), we should use pkg-0.1. (Maybe you don't care, because you never not expose? But the behavior needs to be ironed out.)

If so, what should happen if I say `-package pkg`? The current behavior (without occlusion) is to pick the latest version in ANY database, no matter if it's exposed or not. So I guess we would then have to pick out pkg-0.2 from the top database, hiding pkg-0.1 in the process.

I am also unsure about this comment:

> If the package is broken it should report error rather than silently choosing from the next db.

OK, I also need more clarification on this. So if I say `-package base` you still want to pick out `base` from the system package database, right? So do you want something like, "if you request a package `pkgname`, and it is occluded by a broken package, you want this to error"? Do you also want this behavior for `-package pkg-1.0`? What if there are two copies of `pkg-1.0` in the top-most package db, one is broken and one isn't? How about if there is `pkg-0.9` which is broken and `pkg-1.0` which is not?

Also, suppose there is no `-package` flag specified. If there are broken packages in the top-most database, should we immediately error? (I guess not?)

(When you answer these, please edit the ticket description so that it's up-to-date.)

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12518#comment:3

#12518: Allow customizing immutable package dbs by stacking

Comment (by harendra):

In general, this ticket is mostly about how we combine multiple databases. The existing behavior wrt a single database scope seems to be fine. Since Stack does not install multiple instances with the same package name in the same db, it does not matter what is picked in that case. In general I guess picking the latest version and any IPID should be fine.

What you described wrt `exposed` makes sense to me, though it does not matter for my use case. Also, the behavior of `-package` as you described is exactly what is required.

The point of erroring out is to make sure we always have a reliable behavior. We should control the behavior rather than GHC doing it automatically without us even knowing about it. If a broken package comes in the way, we have to report an error. If we say `-package base`, we pick it from the topmost db that has it, whether it is broken or not. If it is broken, we say so. Same for `-package pkg-1.0`. Within a given db scope, again it does not matter for my use case, so we can use the usual rule of latest non-broken package.

In general one can imagine the following package db policy flags to control the behavior more flexibly:

 * `--allow-multiple-ipid`: default is to error on detecting ipid conflict
 * `--ignore-broken-pkgs`: default is to error on encountering broken packages
 * `--allow-multiple-versions`: default is to error on detecting multiple versions

Defaults can be changed, but keeping strict defaults may be better for reliable behavior. These flags, in addition to a stacking (`--stacked-pkg-dbs`) or unioning combining policy, will create a fully flexible system.

All the above flags always apply to a single db scope. When stacking is in effect, then across dbs the stacking rules will apply (i.e. the top db prevails) and in a single db scope these flags will apply. When unioning is in effect, all the dbs are combined into one and these flags are applied to the combined db.

Though I am not sure if it is worth implementing all of this. This scheme answers some of your questions elegantly because now we do not worry about what is the correct default. For example, what do we do if a top db has a broken package but a lower one has a working one? If `--ignore-broken-pkgs` is specified, then we will ignore the top and go ahead and use the lower one.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12518#comment:4
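The policy flags proposed in this comment could be modelled roughly as follows. This is a hedged sketch in Python, not an existing GHC feature: `Policy`, `Pkg` and `check_db` are hypothetical names, the three fields mirror the proposed (not implemented) flags, and "ipid conflict" is interpreted here as several installed package ids for the same name and version, which is an assumption about what the comment means.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pkg:
    name: str
    version: str
    ipid: str               # installed package id
    broken: bool = False


@dataclass(frozen=True)
class Policy:
    # Strict defaults, as suggested in the comment: everything is an error.
    allow_multiple_ipid: bool = False
    ignore_broken_pkgs: bool = False
    allow_multiple_versions: bool = False


def check_db(db, policy):
    """Return the policy violations found in one db scope (a list of Pkg)."""
    errors = []
    if policy.ignore_broken_pkgs:
        live = [p for p in db if not p.broken]   # broken packages are skipped
    else:
        live = list(db)
        errors += [f"broken package: {p.ipid}" for p in db if p.broken]

    versions, ipids = {}, {}
    for p in live:
        versions.setdefault(p.name, set()).add(p.version)
        ipids.setdefault((p.name, p.version), set()).add(p.ipid)

    if not policy.allow_multiple_versions:
        errors += [f"multiple versions of {n}"
                   for n, vs in sorted(versions.items()) if len(vs) > 1]
    if not policy.allow_multiple_ipid:
        errors += [f"multiple ipids for {n}-{v}"
                   for (n, v), ids in sorted(ipids.items()) if len(ids) > 1]
    return errors


db = [
    Pkg("p", "0.1", "p-0.1-a"),
    Pkg("p", "0.2", "p-0.2-b"),
    Pkg("p", "0.2", "p-0.2-c", broken=True),
]
```

Under a stacking policy `check_db` would be applied to each db separately, whereas under a unioning policy it would be applied once to the concatenation of all dbs, matching the "single db scope" wording above.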

#12518: Allow customizing immutable package dbs by stacking

Description changed by harendra:

@@ -36,6 +36,19 @@
- option is used GHC will use `GHC_PACKAGE_PATH` or the -package-db options
- as a stack of dbs. The first db in the path or the first CLI option will
- be considered the top of the stack. GHC will choose a package from the
- first db from the top of the stack irrespective of the version of the
- package. If the package is broken it should report error rather than
- silently choosing from the next db.
+ option is used GHC will use `GHC_PACKAGE_PATH` or the `-package-db`
+ options to specify a stack of dbs. The first db in the path or the last
+ CLI option will be considered the top of the stack.
+
+ The default behavior of GHC is to union all databases whereas this feature
+ is about vertically stacking them. The following stacking rules will
+ apply:
+
+ * GHC will search a package from top to bottom and stop at the first db in
+ which the package exists.
+ * If GHC is not looking for a specific version then it will stop at the
+ first db in which ANY version is found.
+ * It will ignore the hidden packages when searching, but `-package <pkg>`
+ will have the effect of unhiding all versions of `<pkg>`.
+ * If there are multiple versions available in the same db then usual rules
+ of latest non-broken package will be used to select one.
+ * If the candidate package(s) found is broken it will not search further.
+ When a broken package is needed in compilation it will report an error. It
+ will not report errors in general when a broken package is encountered.

New description:

= Package Selection Across Multiple Package DBs =

As explained by ezyang, when there are multiple package dbs, GHC chooses the packages to use in the following manner. I might have misunderstood, so please correct me if I am wrong.

 * If there are multiple packages with different versions, the latest non-broken version is chosen.
 * If there are multiple packages with the same version, the behavior is unspecified.
 * If there are multiple packages with the same installed package id (shadowing), then the one which comes first in `GHC_PACKAGE_PATH` or which comes last on the command line (according to `-package-db` flags) will be used.

= The Problem =

The build tool `Stack` implements stacked package databases. It uses a base package database and then stacks another package database on top of it to customise it further without modifying the base. The behavior is such that the package db on top of the stack completely overrides the ones below. That means you choose a package from the top of the stack even if its version is older.

Stack implements this by passing explicit package-ids of the packages to GHC. This scheme works well for cabal projects where we know ALL the packages used by the project in advance. But it does not work for scripts run using runghc. In that case we do not know the packages required by the script in advance and therefore cannot pass the package-ids to GHC. That means we cannot make GHC use the packages in the right way. GHC will choose the latest version even though we want it to choose a possibly older version from the top of the db stack.

= Proposed Solution =

Implement a new CLI option, something like `--stacked-pkg-dbs`. If this option is used, GHC will use `GHC_PACKAGE_PATH` or the `-package-db` options to specify a stack of dbs. The first db in the path or the last CLI option will be considered the top of the stack.

The default behavior of GHC is to union all databases, whereas this feature is about vertically stacking them. The following stacking rules will apply:

 * GHC will search for a package from top to bottom and stop at the first db in which the package exists.
 * If GHC is not looking for a specific version, then it will stop at the first db in which ANY version is found.
 * It will ignore the hidden packages when searching, but `-package <pkg>` will have the effect of unhiding all versions of `<pkg>`.
 * If there are multiple versions available in the same db, then the usual rule of latest non-broken package will be used to select one.
 * If the candidate package(s) found are broken, it will not search further. When a broken package is needed in compilation, it will report an error. It will not report errors in general when a broken package is encountered.

This will allow us to modify an immutable package db by stacking another db on top. Implementing this as a separate option will keep the existing behavior and so remain backward compatible.

This has been discussed in a Stack issue on GitHub [https://github.com/commercialhaskell/stack/issues/1957 here].

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12518#comment:5
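The stacking rules in the updated description can be read as a small lookup procedure. The following Python sketch is one possible reading under stated assumptions, not GHC code: `Pkg`, `BrokenPackage` and `resolve` are hypothetical names, `stack[0]` is taken to be the top db, and `explicit=True` stands in for a `-package <pkg>` request that unhides all versions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pkg:
    name: str
    version: tuple           # e.g. (0, 2) for version 0.2
    exposed: bool = True
    broken: bool = False


class BrokenPackage(Exception):
    """Raised only when a broken package is actually needed."""


def resolve(stack, name, version=None, explicit=False):
    """Look `name` up in a stack of dbs; stack[0] is the top of the stack."""
    for db in stack:
        candidates = [
            p for p in db
            if p.name == name
            and (explicit or p.exposed)          # hidden pkgs ignored by default
            and (version is None or p.version == version)
        ]
        if not candidates:
            continue                             # not in this db, go one level down
        usable = [p for p in candidates if not p.broken]
        if not usable:
            # Stop here: do not silently fall through to a lower db.
            raise BrokenPackage(f"{name} is broken in the topmost db providing it")
        return max(usable, key=lambda p: p.version)   # latest non-broken
    return None


# Scenario from comment 3: pkg-0.2 (not exposed) sits above pkg-0.1 (exposed).
top = [Pkg("pkg", (0, 2), exposed=False)]
lower = [Pkg("pkg", (0, 1))]
```

In this model `resolve([top, lower], "pkg")` yields pkg-0.1 because the hidden pkg-0.2 is skipped, while `resolve([top, lower], "pkg", explicit=True)` yields pkg-0.2 from the top db. Merely encountering a broken package elsewhere in a db raises nothing; the error appears only when the broken package is the one being resolved.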

#12518: Allow customizing immutable package dbs by stacking

Comment (by harendra):

We can also add `--include-hidden-pkgs` to this list to address the handling of the `exposed` flag that you raised. That will complete the whole story. If we combine these with the `environment file`, it will become a powerful way to compose package databases and can be widely useful.

Let me know what you think: does it make sense to make it so flexible?

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12518#comment:6

#12518: Allow customizing immutable package dbs by stacking

Changes (by simonmar):

 * cc: simonmar (added)

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12518#comment:7

#12518: Allow customizing immutable package dbs by stacking

Comment (by simonmar):

I haven't thought about this very long, but can't you use environments to achieve what you want?

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12518#comment:8

#12518: Allow customizing immutable package dbs by stacking

Comment (by harendra):

No, the issue is at a lower level than environments. With existing GHC options, it is not possible to express what is required here. We want GHC to choose packages in such a way as to stack the dbs on top of each other (upper dbs in the stack overriding the lower ones) rather than union them, which is the default behavior.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12518#comment:9

#12518: Allow customizing immutable package dbs by stacking

Comment (by simonmar):

Could you give a specific example? From the description it looks like environments are exactly what you want:

> Stack implements this by passing explicit package-ids of the packages to GHC. This scheme works well for cabal projects where we know ALL the packages used by the project in advance. But it does not work for scripts run using runghc. In that case we do not know the packages required by the script in advance and therefore cannot pass the package-ids to GHC. That means we cannot make GHC use the packages in the right way. GHC will choose the latest version even though we want it to choose a possibly older version from the top of the db stack.

Environments are designed for exactly this situation: you're using `runghc` or `ghci`, and you want `GHC` to make a specific set of packages visible. In particular, you're saying you want to deliberately use older versions of some package: you could just omit the newer versions from the environment, and GHC will ignore them. Alternatively you can specifically request the older version using a `-package` flag.

One reason I'm pushing on this is that I think we should deprecate the idea of DB stacks (see #12485).

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12518#comment:10

#12518: Allow customizing immutable package dbs by stacking

Comment (by harendra):

The intent of this issue was to provide an alternative `stacked` db composing policy in addition to the default `union` composing policy. Though, as you suggested, it can be implemented in terms of existing primitives by first specifying a list of dbs and then explicitly specifying the overriding packages using `-package pkg-x.y.z` flags in an environment file.

That might mean specifying a lot of `-package` flags in the environment file, though. We will have to explicitly identify and specify all the packages which override other packages in any of the lower dbs in the db stack. I guess it should do the job if we do not want to have a built-in stacked db policy.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12518#comment:11
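The workaround described in this comment amounts to computing, for a given db stack, which packages in an upper db override a package of the same name further down. A rough Python sketch of that computation follows. It is an illustration only: `override_flags` is a hypothetical helper, each db is modelled as a dict from package name to version string, `stack[0]` is assumed to be the top db, and the exact syntax used inside an environment file is deliberately left out.

```python
def override_flags(stack):
    """List the `-package name-version` flags needed to emulate stacking.

    A flag is emitted for every package that also exists in some lower db,
    using the version from the topmost db that provides it.
    """
    flags = []
    decided = set()                       # names already fixed by an upper db
    for depth, db in enumerate(stack):
        lower_names = set()
        for lower in stack[depth + 1:]:
            lower_names.update(lower)     # dict iteration yields package names
        for name in sorted(db):
            if name not in decided and name in lower_names:
                flags.append(f"-package {name}-{db[name]}")
        decided.update(db)
    return flags


# Top db overrides `text`; `bytestring` only exists in the base db.
stack = [
    {"text": "1.1"},
    {"text": "1.2", "bytestring": "0.10"},
]
```

For this stack, `override_flags(stack)` returns `['-package text-1.1']`. Packages that appear in only one db need no flag, which is why the number of flags grows with the number of overridden packages rather than with the total size of the dbs.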

#12518: Allow customizing immutable package dbs by stacking

Comment (by ezyang):

I'm currently working on a refactor to fix #13313 which should make it easier to implement this.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12518#comment:12