
Following discussion on and off this list, we've rewritten the proposal for changes to the packages/libraries story. I've included the proposal as plain text below so that it can be quoted easily, but if you would prefer to read it in HTML there's a version here:

  http://www.haskell.org/~simonmar/packages.html

The main differences relative to the previous proposal are:

  - Packages, with unique package names, are given more emphasis.

  - Grafting, in particular grafting a library in multiple places, is given less emphasis in this new proposal. Multiple grafting isn't essential: the two motivating examples we had previously (versioning and identifying APIs by GUID) are both served by having unique package names.

Motivation
----------

This proposal describes an implementation-independent mechanism, called "packages", that allows a library or other group of Haskell modules to be wrapped up as a single unit. Here is why we need this mechanism:

  - We want to lower the barrier to shipping a new Haskell library. At present a library author must, for each module M in her library, find a place for M in the single global module hierarchy. Either she makes the library inconvenient to use (by using deeply-nested module names) or else she risks clashing with "popular" sites in the tree.

  - An author is likely to produce multiple versions of a library. If these live in different parts of the global module name space, one has to change every importing module to switch to the new version. If they re-use the same names as the previous version, it's hard to know which version is required, and impossible to build a program that simultaneously needs two different versions. (For example, perhaps your program uses version N of an API, but you import a library which depends on version N-1 of the same API.)

  - We want to have some support for abbreviating module names in source code (to avoid very long module names), and for moving a sub-hierarchy of source modules around in the global hierarchy, without modifying the source code directly.

  - We want to be able to uniquely identify a library API, both for expressing source code dependencies and for installing dependencies automatically. This would make it possible to automate the business of installing the necessary support packages for a given package.

Packages
--------

In this proposal, a "package" is the unit of distribution. A package defines a sub-tree of modules; e.g. GTK, GTK.Window, GTK.Button, ... However, crucially, the package does not define absolute module names, but instead can be grafted into the module hierarchy at different sites, without recompilation (see "grafting" below).

Every package has a "package identifier". A package identifier is a string, e.g. "gtkhs". It is the intention that package identifiers are globally unique, but we don't intend to enforce this in any rigorous way. There will probably be a web page which maintains a list of package identifiers, and where one can register a new one. Things will go badly wrong if you try to use two packages with the same identifier.

A "package name" is defined as a pair of a package identifier and a version number. For example, "gtkhs-0.4". A package name uniquely identifies an API: that is, a set of modules, and the interfaces to those modules. The package web interface might well link to the documentation for each package API, as well as the place where the package can be obtained.

Note that because we have a way to uniquely identify an API, GUIDs are not required.

A package takes two forms.

A "source package" consists of

  Meta-data that describes the package
    - The package identifier
    - Package major and minor version
    - A default grafting location for this package
    - Dependencies, expressed as a set of triples
      (package identifier, version range, grafting location)
    - Etc (e.g. documentation, installation materials)

  Payload
    - Haskell modules (source, object, interface, analysis results)
    - Associated C header files or other support code

A "binary package" is the same, except that

  - The Haskell modules and other source materials are in compiled, object code, form.
  - The meta-data records which compiler was used, and which version of that compiler.
  - The dependencies are expressed as a set of package names only.

The existence of packages offers new opportunities for encapsulation. For example, the meta-data for a package could expose some, but not all, of the modules in the package, giving the package author the chance to securely hide internal modules.

Grafting
--------

The modules in a package form a sub-hierarchy. This sub-hierarchy can be mapped into the global module hierarchy at any point when the package is used; this operation is called "grafting". For example, if we have modules

  Gtk
  Gtk.Window
  Gtk.Button

in the package "gtkhs-0.4", and this package is grafted onto "Graphics.UI", then these modules would be available to a user of the "gtkhs-0.4" package as

  Graphics.UI.Gtk
  Graphics.UI.Gtk.Window
  Graphics.UI.Gtk.Button

Note that this provides a simple way to abbreviate module names in source code, as well as providing a way to easily move an entire sub-hierarchy of modules around in the global hierarchy without changing every source file.

Installing a package
--------------------

Installing a package is the action a client takes to make a new package known to a particular Haskell implementation.

A package comes with a default grafting location. Installing the package makes it available at that grafting location, without the need for any command-line flags.
GHC calls such packages "auto packages", and we will follow that terminology here. At most one version of any given package can be an auto package, and (by convention) it is always the latest installed version. That is, when installing a package, that package only becomes an auto package (available without flags) if its version is later than any other installed version of that package.

It should also be possible to install a package at a site different from its default grafting location. Existing package managers such as RPM don't have a way to specify a grafting location anyhow, but the Haskell library infrastructure (currently in development) would no doubt have a way to change the grafting location if used directly.

Specifying Grafting Locations at compile time
---------------------------------------------

Each Haskell implementation should provide a means for specifying packages and grafting locations when compiling Haskell source code. One possibility for GHC is to extend the command-line syntax for -package, e.g.:

  ghc -package gtkhs-0.5:Graphics.UI

In that case, the command-line choice for a particular package should override (replace) the install-time choice for that package. For example, if gtkhs-1.7 is installed so that it is available by default, then the command above would *remove* gtkhs-1.7 from the module name space, and instead graft in gtkhs-0.5. Why? Because both specify the same package identifier "gtkhs". In short, any one compilation should see at most one version of each package.

Overlapping Packages
--------------------

If two packages are grafted in such a way that they both define the same absolute module in the module hierarchy, then it is an error to import that module. (This is akin to the error that is reported if two import statements in a Haskell program bind the same name.)
For example, if "gtkhs-0.4" defines a module "GTK.Misc", and "graph-1.8" defines the module "Misc", and one says

  ghc -package gtkhs-0.4:Graphics.UI -package graph-1.8:Graphics.UI.GTK

then the import declaration

  import Graphics.UI.GTK.Misc

would be an error, because it is defined by both packages.

Shipping a new library
----------------------

Joe H. Programmer just wrote a small library and wants to share it with the world. What does he have to do? Under our proposed scheme, it would go something like this:

  - Make up a package identifier, and register it using the web interface at haskell.org, to avoid anyone else using the same identifier.

  - Decide what the default grafting location for the library should be. There will be some hierarchy layout guidelines on haskell.org for library writers to follow; these won't be set in stone, though. The worst that can happen is that your package will overlap with another common one, and will end up getting turned off by default when installed.

  - Package up your library using the Haskell library infrastructure, and share it.

Implementation
--------------

Here's what we have to do for GHC:

  1. An entity in a Haskell program was previously uniquely identified by its (module name, identifier) pair, where the module name is the module in which the entity is defined. This now becomes a triple: (package name, relative module name, identifier).

  2. Extend the package spec syntax to include grafting locations, and lists of overlapping packages.

  3. Extend the -package flag syntax to allow specifying a new grafting location.

  4. Change the searching semantics to take into account grafting locations.

  5. Implement the "version overriding" semantics, and error checking to do with visibility of overlapping packages.

(1) is quite a fundamental change, but (2-5) are all quite straightforward.

I think a similar strategy would work for Hugs & NHC, although Hugs at least will need to also acquire support for packages.
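
To make the grafting, overlap and version-overriding rules above a little more concrete, here is a minimal sketch in Haskell. It is only a model of the rules as stated in this proposal, not GHC's implementation: every type and function name in it (PackageName, Graft, graftModule, overlaps, inScope) is hypothetical, and the module lists given for gtkhs-1.7 and gtkhs-0.5 are made up for illustration.

```haskell
import Data.List (nub, sort)

-- A package name is a (package identifier, version) pair, e.g. gtkhs-0.4.
data PackageName = PackageName
  { pkgId      :: String
  , pkgVersion :: [Int]
  } deriving (Eq, Show)

-- Module names are kept as dotted strings, e.g. "GTK.Misc".
type ModuleName = String

-- A package together with the site it is grafted at (hypothetical type).
data Graft = Graft
  { pkgName   :: PackageName
  , graftSite :: ModuleName     -- e.g. "Graphics.UI"; "" means the root
  , relMods   :: [ModuleName]   -- module names relative to the package
  } deriving (Eq, Show)

-- Absolute name of a package-relative module under a grafting site.
graftModule :: ModuleName -> ModuleName -> ModuleName
graftModule ""   m = m
graftModule site m = site ++ "." ++ m

-- Absolute module names a grafted package provides, tagged with the package.
provided :: Graft -> [(ModuleName, PackageName)]
provided g = [ (graftModule (graftSite g) m, pkgName g) | m <- relMods g ]

-- Absolute module names defined by more than one grafted package.
-- Importing any of these would be an error under the proposal.
overlaps :: [Graft] -> [(ModuleName, [PackageName])]
overlaps gs =
  [ (m, ps)
  | m <- nub (sort (map fst allMods))
  , let ps = [ p | (m', p) <- allMods, m' == m ]
  , length ps > 1
  ]
  where
    allMods = concatMap provided gs

-- Version overriding: a package given on the command line replaces any
-- auto package with the same package identifier.
inScope :: [Graft] -> [Graft] -> [Graft]
inScope autoPkgs cmdLine = cmdLine ++ filter notOverridden autoPkgs
  where
    chosen          = map (pkgId . pkgName) cmdLine
    notOverridden g = pkgId (pkgName g) `notElem` chosen

-- The overlap example from the text.
gtkhs04, graph18 :: Graft
gtkhs04 = Graft (PackageName "gtkhs" [0,4]) "Graphics.UI"
                ["GTK", "GTK.Window", "GTK.Button", "GTK.Misc"]
graph18 = Graft (PackageName "graph" [1,8]) "Graphics.UI.GTK" ["Misc"]

-- The override example from the text (module lists are illustrative only).
gtkhs17, gtkhs05 :: Graft
gtkhs17 = Graft (PackageName "gtkhs" [1,7]) "Graphics.UI" ["Gtk"]
gtkhs05 = Graft (PackageName "gtkhs" [0,5]) "Graphics.UI" ["Gtk"]
```

Loaded into GHCi, `overlaps [gtkhs04, graph18]` reports Graphics.UI.GTK.Misc as defined by both packages, and `inScope [gtkhs17] [gtkhs05]` leaves only gtkhs-0.5 in scope. Keeping the grafting site separate from the package-relative module names mirrors item (1) of the implementation list, where an entity is identified by package name and relative module name rather than by an absolute module name.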

In article <3429668D0E777A499EE74A7952C382D1CF5C84@EUR-MSG-01.europe.corp.microsoft.com>, "Simon Marlow" wrote:

> Meta-data that describes the package
> ...
>   - Dependencies, expressed as a set of triples
>     (package identifier, version range, grafting location)

This raises an issue. Someone releases package foo. I release a package bar that uses foo, where foo is grafted at A.B.Foo. Someone else wants to use bar, but they have foo grafted at C.D.Foo, because they also have a completely unrelated otherfoo grafted at A.B.Foo.

Does their use of bar force foo to be grafted at the same place I grafted it at?

-- 
Ashley Yakeley, Seattle WA