Re: cabal, UHC, and different backends

Hi Atze,
btw, I'm cc-ing the cabal-devel list since this is relevant for the Cabal specification, and other people may like to chime in on the issue.
On Tue, 2010-10-19 at 10:13 +0200, Atze Dijkstra wrote:
On 18 Oct, 2010, at 23:38 , Duncan Coutts wrote:
So the approach I would like to take is to have a proper notion of "way". This term comes from GHC where profiling, normal, etc are all different "ways". NHC had several different profiling ways. I think the notion can be generalised and supported in Cabal. It should cover both things like the same backend but ABI-incompatible options (like profiling) and also entirely different backends. Cabal would want to track each package separately for each way. It would need to know which ways are compatible/incompatible so it knows if it can mix packages of different ways (backend obviously cannot, but some forms of profiling or parallel vectorisation are ABI compatible and are expected to be mixed).
So yes we have a notion of "the right thing to do" but there's no current code in Cabal for doing this. Simon has also recently been interested in the issue of tracking profiled libs separately from normal. So perhaps we can cooperate on a detailed design.
I intend to be at BelHac, we can then spend some time on it. However for kicking off immediately:
- There are (at least) 2 dimensions in which a compiler must be able to vary: target + variation, so I guess that makes type Way = (Target, Variation). Possibly a third: Platform, when cross compilation would come into play (like jhc does).
The way I'd like to handle this at the level of the Cabal spec (which is supposed to be mostly compiler-agnostic) is to say that the way field in the installed package info is just a string. This is the same as the new installedPackageID field which is also a string (and distinct from the source package id). This allows different compilers to encode different information. For example for hugs and nhc the installedPackageID is a string containing the name and source version, but for ghc it also contains an ABI hash. For the way, we can do something similar. UHC can encode its target, variation etc and other compilers can do different things. The main point is to distinguish different ways. Then interpreting which ways are compatible is compiler-dependent.
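The idea of an opaque way string whose compatibility is judged by the compiler can be sketched in Haskell roughly as follows. This is only an illustration, not Cabal code: the record `InstalledPackage`, the type `WayCompat`, the two policies and the "backend/variation" encoding are all assumptions made up for the example.

```haskell
-- | Opaque to Cabal; each compiler picks its own encoding.
type Way = String

-- | Hypothetical, cut-down installed package record (not Cabal's real type).
data InstalledPackage = InstalledPackage
  { installedPackageId :: String  -- name and version, possibly plus an ABI hash
  , pkgWay             :: Way
  } deriving (Eq, Show)

-- | Compiler-supplied judgement: may code built in these two ways be mixed?
type WayCompat = Way -> Way -> Bool

-- | Strictest policy: only identical ways can be mixed.
strictCompat :: WayCompat
strictCompat = (==)

-- | Illustrative policy for a compiler that encodes a way as
-- "backend/variation" and treats all variations of one backend as
-- ABI compatible. Both the encoding and the policy are assumptions.
sameBackendCompat :: WayCompat
sameBackendCompat a b = backend a == backend b
  where
    backend = takeWhile (/= '/')

-- | Can a package be built against all of these dependencies under a policy?
canCombine :: WayCompat -> InstalledPackage -> [InstalledPackage] -> Bool
canCombine compat pkg deps = all (compat (pkgWay pkg) . pkgWay) deps
```

The compiler-agnostic part only ever compares or passes strings around; everything that looks inside the string lives in the compiler-specific policy.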
- UHC currently organises its internal directory structure like <pkg>/<compiler-variant>/<target>/<variation>, of which <compiler-variant> is UHC specific, but something like this is needed to distinguish. This is something we could standardize on, that is, cabal assumes a fixed way of mapping Way to FilePath.
Right, I suspect the standard layout will still be compiler-dependent but for several compilers it may include the way.
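As a rough sketch of the layout discussed here, the mapping from a way to the directory `<pkg>/<compiler-variant>/<target>/<variation>` might look like this. The type `UhcWay`, the function names and the "target/variation" string encoding are hypothetical; only the directory shape and the `jscript` target come from the messages above.

```haskell
import Data.List (intercalate)

-- | Hypothetical structured view of a UHC way.
data UhcWay = UhcWay
  { uhcTarget    :: String  -- e.g. "jscript"
  , uhcVariation :: String  -- placeholder names such as "plain"
  } deriving (Eq, Show)

-- | Encode as the opaque string that Cabal would store (assumed encoding).
encodeWay :: UhcWay -> String
encodeWay (UhcWay t v) = t ++ "/" ++ v

-- | Decode again; fails unless the string is exactly "target/variation".
decodeWay :: String -> Maybe UhcWay
decodeWay s = case break (== '/') s of
  (t, '/' : v)
    | not (null t), not (null v), '/' `notElem` v -> Just (UhcWay t v)
  _ -> Nothing

-- | Directory for a package instance:
-- <pkg>/<compiler-variant>/<target>/<variation>
packageDir :: String -> String -> UhcWay -> FilePath
packageDir pkg variant (UhcWay t v) = intercalate "/" [pkg, variant, t, v]
```

Keeping the encoding and the path mapping in one compiler-specific place means the layout can change without Cabal itself having to understand the contents of a way.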
- An issue is whether a compiler knows of such a convention, or not? UHC currently knows for which target + variation it is compiling and adapts its location for generating package content. This could be factored out, or not. I am not sure what is best, although factoring out would make a compiler rely more on cabal, make compiler and cabal more intertwined. However, my guess is, that this part has the greatest impact on cabal.
Ideally, I would like the compilers to have no hard coded knowledge of the layout convention. Instead the compilers should just use the installed package info. This is what GHC does for example. In future I would like to modify the default layout that cabal-install uses for installed packages to use a Nix-style persistent package store. Also, distributions and other custom deployments typically want more control over the layout. Of course people doing this have to be aware of the constraints.
- At one end of this design space, when a compiler knows about these locations, changes to cabal can mostly be avoided. For UHC I ran cabal with a --uhc-option=-tjscript for specifying a different target, and all went well apart from the generation of a package descriptor file.
So, since the job of a package manager does involve knowing a fair bit about the layout and organisation of installed packages, I would like to see cabal know something about the "way", just so that it can allow multiple instances of the same package with different ways. The package registration info would also contain the way. In any case, cabal needs to know the way to work out which dependencies can correctly be combined together.
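To make the "multiple instances of the same package with different ways" point concrete, here is a minimal sketch of a package index keyed by package and way. All names (`PackageIndex`, `register`, `resolveDeps`) are invented for the illustration, and it assumes the strictest rule, namely that dependencies must have been built in exactly the same way.

```haskell
type Way = String
type SourcePackageId = String

-- | Hypothetical index: at most one entry per (package, way) pair, so the
-- same package can be installed once for every way.
newtype PackageIndex = PackageIndex [((SourcePackageId, Way), FilePath)]

emptyIndex :: PackageIndex
emptyIndex = PackageIndex []

-- | Register an instance; this replaces only an existing instance with the
-- same package id and the same way.
register :: SourcePackageId -> Way -> FilePath -> PackageIndex -> PackageIndex
register pkg w dir (PackageIndex es) =
  PackageIndex (((pkg, w), dir) : filter ((/= (pkg, w)) . fst) es)

lookupInstance :: SourcePackageId -> Way -> PackageIndex -> Maybe FilePath
lookupInstance pkg w (PackageIndex es) = lookup (pkg, w) es

-- | All ways in which a package is currently installed.
installedWays :: SourcePackageId -> PackageIndex -> [Way]
installedWays pkg (PackageIndex es) = [w | ((p, w), _) <- es, p == pkg]

-- | Find every dependency in the requested way, or report the packages
-- that are not installed in that way.
resolveDeps :: Way -> [SourcePackageId] -> PackageIndex
            -> Either [SourcePackageId] [FilePath]
resolveDeps w deps idx =
  case [d | d <- deps, lookupInstance d w idx == Nothing] of
    []      -> Right [dir | d <- deps, Just dir <- [lookupInstance d w idx]]
    missing -> Left missing
```

A compiler-specific compatibility predicate could replace the exact match in `lookupInstance` where ways are known to be ABI compatible.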
If this would also be done by UHC, cabal's only job (??) would be to pass the appropriate flags around and to allow a .cabal file to have constraints on target etc.
- .cabal files need a way to refer to targets etc for constraints (i.e., package only compiles for certain backend).
Hmm. I'll think about that. Can it perhaps be expressed as a language restriction?
- Should targets and variations be standardized upon, i.e. data Target = ...? Or just allow any String to act as a Target, i.e. type Target = String?
As I mentioned above, the information in the way will be different for different compilers, so I think string is appropriate.
Duncan

Hi All,
so, the design for the interface to a compiler would be more or less like this:
- type Way = String. Cabal is agnostic of any meaning of a Way.
- Supported compilers provide the meaning of a Way, similar to compiler specific 'build executable' etc functions:
    <Compiler>.wayToFilePath :: Way -> FilePath
    <Compiler>.wayToFlags :: Way -> [String]
- Compilers also specify when to use the above.
Although it may be desirable to let Cabal do all the stuff depending on a Way, I think
- it is a lot of work, actually it may be so much work that it will become a stumbling block,
- it will be tricky, because it also would include Way specific runtime system, libraries, etc,
- it will take time for compilers to adapt.
So I'd prefer to identify a set of points where Cabal does/doesn't do the Way specific stuff. This allows per compiler choice and a more gradual path of adaption. For example:
- location of .hi files, library files, etc (this is the main part), which would boil down to passing a destination location which does or does not include 'wayToFilePath'.
- location/construction of package descriptor file.
For dealing with constraints and the cabal language specification, a Way is (should be?) orthogonal to other constraints. That is, for a Way neutral package, the same dependencies would hold for all Ways. Conceptually, a constraint then is a set/map of constraints, one for each Way, only to be strictly smaller than the largest possible set when a particular Way is not supported. The cabal language would then require 'member' and 'filter' like mechanisms. I am sure I simplify matters (too much? :-)).
As for using Nix ideas, this will require a different build system inside Cabal. The basic idea is simple, every artifact is a function result, no side effects. Making it work when no tool behaves without side effects (files are always put into a file system), etc etc, is more difficult, most likely impossible without compromises.
I have thought of ways of doing this inside and/or with UHC, but maybe something along the lines of recent work of Neil Mitchell would be an easier starting point.
cheers,
- Atze -
Atze Dijkstra, Department of Information and Computing Sciences.
Utrecht University, PO Box 80089, 3508 TB Utrecht, Netherlands.
Tel.: +31-30-2534118/1454 | WWW : http://www.cs.uu.nl/~atze
Fax : +31-30-2513971 | Email: atze@cs.uu.nl
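The interface proposed in the message above, with per-compiler `wayToFilePath` and `wayToFlags` functions and constraints indexed by Way, could be sketched as follows. This is a hedged illustration rather than a design: the record `WayHooks`, the value `uhcHooks`, the "target/variation" encoding and the names `neutral`, `supports` and `restrictWays` are assumptions, and the `-t<target>` flag merely generalises the `-tjscript` option mentioned earlier in the thread.

```haskell
type Way = String

-- | Hypothetical per-compiler hooks, named after the functions proposed
-- in the message.
data WayHooks = WayHooks
  { wayToFilePath :: Way -> FilePath
  , wayToFlags    :: Way -> [String]
  }

-- | Illustrative UHC hooks, assuming a way is encoded as "target/variation"
-- and that only the target is passed on as a -t<target> flag.
uhcHooks :: WayHooks
uhcHooks = WayHooks
  { wayToFilePath = id
  , wayToFlags    = \w -> ["-t" ++ takeWhile (/= '/') w]
  }

-- | Stand-in for a real dependency constraint, e.g. "base >= 4".
type Constraint = String

-- | One constraint set per supported Way; an absent Way is unsupported.
type WayConstraints = [(Way, [Constraint])]

-- | A Way-neutral package: the same constraints hold for every Way.
neutral :: [Way] -> [Constraint] -> WayConstraints
neutral ways cs = [(w, cs) | w <- ways]

-- | The 'member'-like mechanism: is this Way supported at all?
supports :: Way -> WayConstraints -> Bool
supports w = any ((== w) . fst)

-- | The 'filter'-like mechanism: keep only Ways satisfying a predicate.
restrictWays :: (Way -> Bool) -> WayConstraints -> WayConstraints
restrictWays p = filter (p . fst)
```

For example, a package that does not build for the jscript target would start from `neutral` over all known ways and then apply `restrictWays` with a predicate rejecting that target.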