
Hi lists,

(I hope my cross-posting is okay, but somehow this post seems to apply to all of you, so here goes...)

I've recently noticed that folks at GHC HQ are working on a way to resolve the problem of importing two modules with the same name from different packages. There is a proposal[1] on the GHC wiki calling for a syntax extension for 'import' statements in Haskell modules so that the package (and version) to import from can be specified explicitly.

There is a second ("extended") proposal[2], which calls for the ability to import (subtrees of) the module hierarchy exposed by a package and attach it to the global module namespace at an arbitrary point, analogous to mounting a filesystem in Unix. This proposal was apparently inspired by a post to the libraries mailing list by Frederik Eaton[3]. I agree with Frederik that it would be very nice to remove the burden of writing out long package or hierarchy prefixes in modules, and just work in some previously defined context.

I'd like to propose yet another alternative to the existing two proposals that follows [2] in trying to satisfy [3] but differs from it in the following ways:

- It doesn't require an extension of the existing Haskell syntax.
- It can be implemented by extending Cabal alone (AFAICT).

In particular, I propose to drop the assumption that Haskell modules are closed entities but rather always consider them to be seen in the context of a particular _package_. The package is now made responsible for assembling an appropriate (hierarchical) module namespace to which imports in the packaged modules are taken to refer. To this end, the current 'depends:' entry in a package description would be replaced by a more general "mounting" construct, i.e. "mount package foo at Foo.Bar in the module hierarchy". Optionally, as in [2], only a subtree of "foo" could be selected, or only a specific version of "foo".
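To make the "mounting" idea concrete, here is a minimal sketch in Haskell of how a package-wide module namespace could be assembled from mount entries. It is only an illustration under stated assumptions: the types Package and Mount and the functions mountedModules, namespace and resolve are hypothetical names invented for this sketch, not part of Cabal or of any existing proposal.

```haskell
import Data.List (stripPrefix)

-- A module name as its dot-separated components, e.g. Foo.Bar is ["Foo","Bar"].
type ModuleName = [String]

-- Hypothetical package description: a name plus the modules it exposes.
data Package = Package
  { pkgName    :: String
  , pkgModules :: [ModuleName]
  }

-- Hypothetical mount entry: "mount <package> at <mount point>",
-- optionally restricted to a subtree of the package's own hierarchy.
data Mount = Mount
  { mountPkg     :: Package
  , mountSubtree :: ModuleName  -- [] selects the whole package
  , mountPoint   :: ModuleName  -- where the selected subtree is attached
  }

-- The names one mount contributes: the name importers see, paired with
-- the providing package and the module's original name in that package.
mountedModules :: Mount -> [(ModuleName, (String, ModuleName))]
mountedModules (Mount pkg sub at) =
  [ (at ++ rest, (pkgName pkg, m))
  | m <- pkgModules pkg
  , Just rest <- [stripPrefix sub m]  -- keep only modules under the subtree
  ]

-- The namespace of the importing package is the union of all its mounts.
namespace :: [Mount] -> [(ModuleName, (String, ModuleName))]
namespace = concatMap mountedModules

-- Resolve an import written in a packaged module against that namespace.
resolve :: [Mount] -> ModuleName -> Maybe (String, ModuleName)
resolve mounts name = lookup name (namespace mounts)
```

With package "foo" exposing Util.Text and mounted at Foo.Bar, an import of Foo.Bar.Util.Text would resolve to module Util.Text of "foo". The sketch silently takes the first match when two mounts provide the same name; a real implementation would presumably report that as an error.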
While I'm at it, I had to evaluate how this proposal would interact with the "ECT" module versioning scheme I proposed earlier[4]. I'd like to adapt that scheme to this proposal.

In ECT, a library author guarantees to his users that their imports will never break by providing different (numbered) variants of the library modules whenever their interface changes. By keeping the old variants as re-exports of updated versions, the author can record compatibilities. This places on users the burden of annotating each import with a version number. If we lift the principles of ECT to the package level in accord with this proposal, that burden largely evaporates. Keeping the promise of "eternal backwards-compatibility", however, requires an (obvious) extension to the way Cabal deals with version numbers...

First of all, package "mounts" must be able to say "this version, or any compatible one". Obviously, that's actually always what one wants, so it can be taken to be the only meaning of specifying "mount foo-1.3 at Foo.Bar". Then the question is just when to consider another version of foo compatible with 1.3. Obviously, only later versions should be considered, but when does compatibility end? Only the author of foo knows! She must specify it somehow. Two possibilities come to mind:

1. Add a field to the package description of foo (v1.4, say) that says "I'm backwards-compatible with 1.3." When building, this relation would have to be inspected to see whether any currently installed version of foo satisfies the dependency specified by the mount.

2. Declare a convention for version numbers to carry compatibility information, like the OpenGL standard, for example: If the new version is backwards-compatible, only the minor version number changes. If it isn't, the major version number must be incremented.
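The version comparison behind possibility 2 can be sketched in a few lines of Haskell. This is only an illustration: the Version type, the function satisfies and its parameter n (how many leading version components count as "breaking") are assumptions of this sketch, not existing Cabal functionality. With n = 1 it corresponds to the convention as stated, where only a change of the major version number signals incompatibility.

```haskell
-- A version as a list of numeric components, e.g. 1.3 is [1,3].
type Version = [Int]

-- 'satisfies n required installed' decides whether an installed version
-- may stand in for the version named in a mount.  The first n components
-- must be identical (a change there signals an incompatible interface),
-- and the installed version must not be older than the required one.
satisfies :: Int -> Version -> Version -> Bool
satisfies n required installed =
  take n installed == take n required  -- same "breaking" prefix
    && installed >= required           -- lists compare lexicographically
```

For example, satisfies 1 [1,3] [1,4] accepts foo-1.4 for "mount foo-1.3 at Foo.Bar", while satisfies 1 [1,3] [2,0] and satisfies 1 [1,3] [1,2] both reject.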
I'd personally favour 2, as it would be easier to maintain for the author of foo (no compatibility field in the package description to update all the time) and also to implement in the build system (just a simple version comparison instead of a relation traversal to check a dependency).

As a (more conservative?) variant of alternative 2, the third (instead of second) version component could be declared as "stays compatible", i.e. if only the third (or later) version component changes, the new version is assumed backwards-compatible, otherwise it's assumed incompatible. This would allow frequent incompatible updates without quickly getting large major version numbers...

As for wasting space by keeping old versions around: Of course, if the new one is backwards-compatible, any old version can be removed. Otherwise, if it's "half-way" compatible, the author of foo can release a new backwards-compatible revision of the old version that reimplements (part of) the functionality in terms of the new operations, so the old version can be replaced by this leaner revision.

Comments?

Best regards,
Sven

[1]: http://cvs.haskell.org/trac/ghc/wiki/GhcPackages
[2]: http://cvs.haskell.org/trac/ghc/wiki/GhcPackageNamespaces
[3]: http://www.haskell.org/pipermail/libraries/2005-June/004009.html
[4]: http://www.haskell.org/tmrwiki/EternalCompatibilityInTheory

| Subject: Package "mounting" proposal
|
| Hi lists,
|
| (I hope my cross-posting is okay, but somehow this post seems to apply to
| all of you, so here goes...)

I think 'libraries' is the list where anyone interested in this will hang out, so I'll reply there alone.

Simon

| In particular, I propose to drop the assumption that Haskell modules are
| closed entities but rather always consider them to be seen in the context
| of a particular _package_. The package is now made responsible for
| assembling an appropriate (hierarchical) module namespace to which imports
| in the packaged modules are taken to refer. To this end, the current
| 'depends:' entry in a package description would be replaced by a more
| general "mounting" construct, i.e. "mount package foo at Foo.Bar in the
| module hierarchy". Optionally, as in [2], only a subtree of "foo" could be
| selected, or only a specific version of "foo".

In fact what Sven describes is very close indeed to what Simon and I propose, which is great.

Just as the imports of a module establish the name-space for entities mentioned inside the module, so the Cabal file (plus the installation setup) establishes the name-space for modules specified in import statements (both unqualified 'import M' and package-qualified 'import "gtk" M'). So the analogy is that the package dependencies of the Cabal file correspond to the import statements of a module.

We propose to use this analogy consistently:

** In a module you can mention an entity both unqualified 'f' (if that is unambiguous) and qualified 'M.f' (to disambiguate); so in an import you should be able to mention the module both unqualified 'import M' (if that is unambiguous) and qualified 'import "gtk" M' (to disambiguate).

** In a module you can import another module 'qualified', so that its entities MUST be referred to qualified (import qualified M(f);....M.f...). So, in the Cabal file, you should be able to say that the exposed modules of a package are only available by qualified import ('import "gtk" M').

** In an import statement you can give an alias ('import M as Q'). So, in a Cabal file, when giving the dependency on a package, you should be able to specify an alias (needs new syntax).
   For a start, when depending on "gtk-4.2.3" you probably want to give it an alias "gtk", so that the source code of the package mentions only "gtk", not the exact version.

** In an export statement of a module you can say what is exported. Cabal already has that, via the list of 'exposed' modules.

** In an 'import' statement you can bring into scope a subset of the entities exported by a module. Such a selective import would make perfect sense at the package level too, but it's not so clear to us that it's worth the bother.

How does this differ from what you propose? Only in the following way: instead of allowing a qualified import (i.e. one specifying a particular package, or package alias), you build a single module name space, and only allow unqualified imports. Instead of

   import "gtk" M

you would establish "GTK." as the prefix for package "gtk", and then say

   import GTK.M

This isn't a very big difference (which is good). Personally, I like it less, because it conflates *provenance* (where the module comes from) with *purpose* (what it's for) in the single name. Furthermore, the qualified/unqualified analogy lets us import modules unqualified when that's unambiguous, and qualified when it matters.

Incidentally, everywhere I say "Cabal" I also mean "GHC's command line" (or Hugs or nhc etc). All Cabal does is invoke the compiler with suitable arguments.

Simon
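The qualified/unqualified analogy described above can be sketched as a small resolution function in Haskell. This is a rough illustration under stated assumptions: the types Dep, Import and Resolution and the function resolveImport are hypothetical names for this sketch and do not describe how Cabal or GHC actually resolve imports. The two Import constructors model 'import M' and 'import "gtk" M' respectively.

```haskell
type ModuleName = String
type Alias      = String

-- Hypothetical dependency entry in a package description.
data Dep = Dep
  { depAlias         :: Alias         -- e.g. "gtk" for package gtk-4.2.3
  , depPackage       :: String        -- the exact package, e.g. "gtk-4.2.3"
  , depQualifiedOnly :: Bool          -- exposed modules need a qualified import
  , depExposed       :: [ModuleName]  -- the package's exposed modules
  }

data Import
  = Unqualified ModuleName      -- import M
  | Qualified Alias ModuleName  -- import "gtk" M

data Resolution
  = Found String        -- exactly one package provides the module
  | NotFound            -- no dependency provides it
  | Ambiguous [String]  -- several do; a qualified import is needed
  deriving (Eq, Show)

resolveImport :: [Dep] -> Import -> Resolution
resolveImport deps imp =
  case candidates of
    []  -> NotFound
    [p] -> Found p
    ps  -> Ambiguous ps
  where
    candidates = case imp of
      -- Unqualified imports only see packages not marked qualified-only.
      Unqualified m ->
        [ depPackage d | d <- deps, not (depQualifiedOnly d), m `elem` depExposed d ]
      -- Qualified imports select the package by its alias.
      Qualified a m ->
        [ depPackage d | d <- deps, depAlias d == a, m `elem` depExposed d ]
```

If two dependencies both expose M, the unqualified import is reported as ambiguous, and the package-qualified form picks one of them by alias, which mirrors the 'f' versus 'M.f' situation inside a module.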
participants (2): Simon Peyton-Jones, Sven Moritz Hallberg