
| First of all I want to thank Isaac Jones for his work on Cabal.
|
| There are a few issues I would like to raise.

Seconded -- Isaac you are doing a great job. As are lots of other contributors.

One rather vexed question is the issue of whether a single program can contain two modules with the same name. Currently that is absolutely ruled out, and as a result packages are fundamentally non-modular: every package must use a distinct space in the global namespace.

Simon Marlow and I have gradually become convinced that we have to fix this, and the only sensible way to fix it is to relax the language design so that

    a module name must be unique within its package (only)

That means that module A.B.C could exist *both* in package P1 and in P2. And both packages could be linked into the same program.

Some kind of 'grafting' or 'mounting' scheme would be needed to bring a package into the package namespace. One might say

    ghc -c Foo.hs -package gtk-2.3 Graphics.GTK

to mount the gtk-2.3 package at Graphics.GTK in the name space. Outside the package one would need to import Graphics.GTK.M, but *within* the package one just imports M. That way the entire package can be mounted elsewhere in the namespace, if desired, without needing to change or recompile the package at all.

The exact details of the mounting scheme, and whether it is done at build time, at install time, or at compilation time, or all of the above, are open to debate. We don't have a very fixed view.

However, the fundamental thing GHC needs to do is to include the package name into the names of entities the package defines. That means that when compiling a module M you must say what package it is part of:

    ghc -c M.hs -package-name my-package-2.3

Then M.o will contain symbols like "my-package-2.3.M.f" etc. In effect, the "original name" of a function f in module M of package P is
.

The purpose of this message is

a) To advertise our willingness to implement this in GHC

b) To ask whether doing so would be welcome, and if so to invite the Cabal aficionados to come up with a specific design for the programmer's interface

c) To ask what the Hugs story might be. That is, are any of the Hugs folk willing to implement the necessary?

The thing about (c) is that it'd be a Big Pity if new-style packages couldn't be used with Hugs because they don't obey the unique-module constraint. Ditto nhc.

Simon
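
As a reading aid, the two ideas in the message above (package-qualified original names, and mounting a package at a prefix of the module namespace) can be modelled in a few lines of Haskell. This is only a sketch under stated assumptions: the types and the functions `symbolFor` and `resolve` are hypothetical names invented here, not anything in GHC or Cabal, and the "longest mount point wins" rule is one possible design choice that the message itself leaves open.

```haskell
import Data.List (intercalate, maximumBy, stripPrefix)
import Data.Ord (comparing)

-- All names below are hypothetical; none of them is part of GHC or Cabal.
type PackageId  = String      -- e.g. "gtk-2.3"
type ModuleName = [String]    -- e.g. ["Graphics", "GTK", "M"]

-- The "original name" of an entity: package, module and entity name together.
data OriginalName = OriginalName
  { onPackage :: PackageId
  , onModule  :: ModuleName   -- module name as written inside the package
  , onEntity  :: String
  } deriving (Eq, Show)

-- Render an original name as a dotted object-file symbol.
symbolFor :: OriginalName -> String
symbolFor (OriginalName pkg modName entity) =
  intercalate "." (pkg : modName ++ [entity])

-- A mount table: where each package is grafted into the program's namespace.
type MountTable = [(ModuleName, PackageId)]

-- Resolve an imported module name to a package and the module's name inside
-- that package. Assumption: the longest matching mount point wins.
resolve :: MountTable -> ModuleName -> Maybe (PackageId, ModuleName)
resolve mounts name
  | null matches = Nothing
  | otherwise    = Just (snd (maximumBy (comparing fst) matches))
  where
    matches =
      [ (length prefix, (pkg, rest))
      | (prefix, pkg) <- mounts
      , Just rest <- [stripPrefix prefix name]
      , not (null rest)
      ]
```

With gtk-2.3 mounted at `["Graphics", "GTK"]`, an import of `Graphics.GTK.M` resolves to module `M` of package gtk-2.3, and `symbolFor (OriginalName "my-package-2.3" ["M"] "f")` gives the symbol quoted in the message, `"my-package-2.3.M.f"`. Two packages that both define A.B.C yield different original names because the package component differs.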

On Tue, 2005-08-16 at 16:31 +0100, Simon Peyton-Jones wrote:
| First of all I want to thank Isaac Jones for his work on Cabal. | | There are a few issues I would like to raise.
Seconded -- Isaac you are doing a great job.
Yes, indeed.
As are lots of other contributors.
One rather vexed question is the issue of whether a single program can contain two modules with the same name. Currently that is absolutely ruled out, and as a result packages are fundamentally non-modular: every package must use a distinct space in the global namespace.
Simon Marlow and I have gradually become convinced that we have to fix this, and the only sensible way to fix it is to relax the language design so that
a module name must be unique within its package (only)
That means that module A.B.C could exist *both* in package P1 and in P2. And both packages could be linked into the same program.
I think it might be possible to have something slightly weaker. So that modules which are used merely as an implementation detail may overlap with modules of the same name in another package or the top level program.

The thing that really annoys people is when some package they're using internally uses some random utility module e.g. "C2HS.hs" then they cannot use that same module name in their program. This is the case even when the functions from that little utility module are not exposed in the interface of the package.

Another nice example of this is when Cabal used to use a couple of modules from the util-1.0 package. These modules were not exposed through the Cabal interface, they were purely an implementation detail and yet it meant that no program that used any of the Cabal libs could have a GetOpt module (or one or two other modules).

On the other hand, when a module re-exports entities from other modules then those modules are in some sense exposed in the interface of the module/package. In this case it is much more reasonable that the global module namespace has been populated.

As an analogy from C, Gtk+ uses libpng as a private implementation detail but it exposes the cairo API directly to user programs. Now I believe that with the ELF linking format it is possible to set things up such that if you link with Gtk+ (which links with libpng) you can still use symbols in your top level program that are exported by libpng and you will not get linker errors. What it does is link your program to Gtk+ but not to libpng, even though Gtk+ itself links to libpng. So in effect it draws a division in the collection of symbols in the program so that the same name may appear in two places so long as they are not directly linked. This only works because Gtk+ does not expose libpng in its interface.

Now on the other hand, unlike libpng which is a private dependency of Gtk+, cairo is a public dependency of Gtk+. Programs which use Gtk+ can directly use the cairo interface.
So the ELF setup for the link between cairo and Gtk+ is different to that between Gtk+ and libpng. So you cannot reuse names from cairo without getting linker errors.

I suggest that we can use the same concept (and possibly the same tricks with ELF linkage) for ghc's library packages. So we would need to distinguish between public and private package dependencies. This can be automatically checked by making sure no symbols from the private dependency are re-exported. They must be used only in the implementation.

So the point is that we would never need to qualify names with the package they're from since it would never be possible to have both instances of the same name in scope at the same place (or if you do, it's because you directly imported it, in which case it's your fault).

So my point is that I think we can eliminate the vast majority of the annoying module name clashes without having to go for a full "qualify everything with its package" approach. We just need a concept of which module names a module or package exposes and make it possible to limit the modules which become exposed to those from which things are re-exported.

(There is a technical difficulty with this which is that things can become re-exported due to inlining. It remains to be seen how the difficulty might compare to some grafting solution, which I suspect is not likely to play well with existing linker technology.)

Duncan
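
The public/private distinction proposed above can be illustrated with a small Haskell model. It is a sketch only: the `Package` record and the functions `visibleModules` and `clashes` are hypothetical names made up for this illustration, the model ignores hidden modules within a package, and it does not address the inlining difficulty mentioned in the message.

```haskell
import Data.List (nub)

-- Hypothetical model; these names do not come from GHC or Cabal.
type ModuleName = String

data Package = Package
  { pkgName        :: String
  , pkgModules     :: [ModuleName]  -- modules the package itself exposes
  , pkgPublicDeps  :: [Package]     -- dependencies re-exported in its interface
  , pkgPrivateDeps :: [Package]     -- dependencies used only in the implementation
  }

-- Module names a package contributes to the namespace of its clients:
-- its own modules plus, transitively, those of its public dependencies.
-- Private dependencies contribute nothing.
visibleModules :: Package -> [ModuleName]
visibleModules p =
  nub (pkgModules p ++ concatMap visibleModules (pkgPublicDeps p))

-- Module names of the program itself that clash with a visible module of
-- one of the packages it depends on.
clashes :: [ModuleName] -> [Package] -> [ModuleName]
clashes own deps =
  [ m | m <- nub own, m `elem` concatMap visibleModules deps ]
```

Taking the Cabal example from the message, suppose `util = Package "util-1.0" ["GetOpt"] [] []`. If Cabal lists `util` under `pkgPrivateDeps`, then `clashes ["Main", "GetOpt"] [cabal]` is empty and the program may define its own GetOpt module. If Cabal lists it under `pkgPublicDeps` instead, the result is `["GetOpt"]`.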

"Simon Peyton-Jones"
One rather vexed question is the issue of whether a single program can contain two modules with the same name. ... the fundamental thing GHC needs to do is to include the package name into the names of entities the package defines.
I agree that, ultimately, the compiled name of an entity must be the triple ( package, module, entity name ) to enable packages to be truly modular.
Some kind of 'grafting' or 'mounting' scheme would be needed to bring a package into the package namespace.
I disagree that this necessarily mandates a grafting or mounting scheme. Whilst it might be useful, even desirable, it is a separate question.

For instance:

    package P contains a module A
    package Q contains a module A
    package R depends on package P

There is no dependency of package Q on P or R. A user wishes to use both R and Q in her program U. No problem. There is only one module called A visible to U, namely that contained in Q, and grafting is not required.

What if she wants to use packages P and Q? Again no problem, provided she does not import A at all. By analogy with the lazy resolution of imported entities that is already a feature of Haskell'98, there is no conflict, and no resolution mechanism is needed.

So what if she wants to use packages P and Q, /and/ to import a module A in program U? Well, let us split program U into two sub-programs U1 and U2. U1 imports A, and is compiled against only the single package required to provide A (either P or Q). U2 does not import A, but can safely import U1, and be compiled against both P and Q.

OK, so this manual process is slightly tedious, and maybe upsets the current model for ghc --make (which I think currently adds all packages in the dependency graph to the namespace for all modules?). But it shows that the more minimal extension is feasible and does not outlaw any combination of packages.

I do also worry a little that grafting could introduce worse namespace problems that we haven't thought of, simply because we haven't had a chance to play with it yet.
The thing about (c) is that it'd be a Big Pity if new-style packages couldn't be used with Hugs because they don't obey the unique-module constraint. Ditto nhc.
I'd be willing to implement naming triples
in nhc98, on that fabled day when I finally implement a Cabal-style package registration tool for nhc98 (which unfortunately won't be anytime soon). The grafting scheme may turn out to be straightforward too, but as I've indicated, I'm a bit less keen on it.

Regards,
Malcolm
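
The P/Q/R scenario in the message above can be checked mechanically with a toy model in Haskell. This is a sketch under assumptions: `PackageDb`, `Resolution`, `resolveImport` and `unitOk` are hypothetical names, only imports of package modules are modelled (imports of the program's own modules such as U1 are left out), and a compilation unit sees exactly the packages it is compiled against, not their dependencies.

```haskell
-- Hypothetical model; these names do not come from any compiler.
type ModuleName  = String
type PackageName = String

-- Which modules each installed package provides.
type PackageDb = [(PackageName, [ModuleName])]

data Resolution
  = NotFound
  | Unique PackageName
  | Ambiguous [PackageName]
  deriving (Eq, Show)

-- Resolve one import against the packages a unit is compiled against.
-- Resolution is lazy: a name clash matters only for modules actually imported.
resolveImport :: PackageDb -> [PackageName] -> ModuleName -> Resolution
resolveImport db against m =
  case [ p | p <- against, Just ms <- [lookup p db], m `elem` ms ] of
    []  -> NotFound
    [p] -> Unique p
    ps  -> Ambiguous ps

isUnique :: Resolution -> Bool
isUnique (Unique _) = True
isUnique _          = False

-- A unit is acceptable if every package-module import resolves uniquely.
unitOk :: PackageDb -> [PackageName] -> [ModuleName] -> Bool
unitOk db against imports =
  all (isUnique . resolveImport db against) imports
```

With `db = [("P", ["A"]), ("Q", ["A"]), ("R", ["B"])]`, where module B in R is made up for the example, a program compiled against R and Q resolves A to Q. A program compiled against P and Q that imports nothing from them is fine, while importing A there is ambiguous. Splitting it as described works: U1 is compiled against P alone and imports A, and U2 is compiled against both P and Q without importing A.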

OK, so this manual process is slightly tedious, and maybe upsets the current model for ghc --make (which I think currently adds all packages in the dependency graph to the namespace for all modules?). But it shows that the more minimal extension is feasible and does not outlaw any combination of packages.
Also, in my original (too long for even me to read) package mounting proposal I described the way that I decided to think about these things, and why I think a modules-only solution won't suffice: "Modules and packages are quite distinct constructs, modules are needed for namespace partitioning and packages are needed to delineate administrative boundaries and sources of change. Both are necessary and both deserve special consideration ..."
I do also worry a little that grafting could introduce worse namespace problems that we haven't thought of, simply because we haven't had a chance to play with it yet.
Yes, this is a concern of mine as well. If people start referring to packages with different names in their code, as grafting will allow them to do, then it will be harder for them to share code with each other. This is one reason why I think default grafting locations will be important.
The thing about (c) is that it'd be a Big Pity if new-style packages couldn't be used with Hugs because they don't obey the unique-module constraint. Ditto nhc.
I'd be willing to implement naming triples
in nhc98, on that fabled day when I finally implement a Cabal-style package registration tool for nhc98, (which unfortunately won't be anytime soon). The grafting scheme may turn out to be straightforward too, but as I've indicated, I'm a bit less keen on it.
You should totally be keen on it. Libraries will be so much more elegant if their namespaces don't have to contain their names. Imagine, for instance, not having to think of the name of something before you start writing it. Frederik
participants (4)
- Duncan Coutts
- Frederik Eaton
- Malcolm Wallace
- Simon Peyton-Jones