RE: Libraries and hierarchies

[Nothing new in this message: just a summary, with some useful terminology.]

| Interestingly, this is another way that source code can unambiguously
| refer to libraries: by package name, site, and module name (with the
| former two properties being specified out-of-band in some additional
| matter that comes with the source code - perhaps a package specification
| or similar). This is rather less ugly than using GUIDs. We can also
| provide GUIDs, of course, because doing so is cheap.

Suppose someone makes a package called glut-6.0 with modules Graphics, Graphics.GLUT, and so on. Suppose Graphics.GLUT includes a function foo. Then:

    The "original name" of function foo is
        glut-6.0:Graphics.GLUT.foo
    That is
        <package-name>:<relative-path-within-package>

Grafting the package glut-6.0 into the module hierarchy does not change the original name of anything in the package. This is important, because the binaries generated for the package use original names for cross-module references, and we don't want those to change when we graft.

All this does is change the need for *module names* to be globally unique into the requirement for *package names* to be unique. This problem is much easier because

a) There are fewer packages, so some central help-yourself global registry is feasible, as Simon M suggests.

b) Package names can be longish and clunky (e.g. including a version number), because we provide a way to avoid mentioning them in source code.

I personally think that using GUIDs for package names would be a mistake. They are essentially opaque to people, so there needs to be some separate infrastructure to describe what a GUID names, and the extra layer of indirection seems to buy little. It's not hard to have unique package names (I claim) and that'll do the job nicely. Indeed a package name, as suggested here, then *is* a globally unique ID, or GUID, only with a comprehensible name.

The "graft package into tree" mechanism can be seen as a method to implement (b). Installing a package means you can name modules in that package using A.B.C module notation, without explicitly mentioning the package itself. As Simon's message above said, source code therefore only makes sense given some set of graftings (= package -> module prefix mapping), and one might want to make that part of the package (source-code) description.

Ganesh asks

| Would the following be possible under your proposal:
|
| module M1.A imports Collections.Foo and only works with version 1
| module M2.B imports Collections.Foo and only works with version 2
| module C imports M1.A and M2.B

Yes, this is ok, as Simon indicated earlier. Presumably there are two packages coll-1.0 and coll-2.0. We graft them in at (say) Collections and Old.Collections, and away we go. Or, we invoke the compiler for M1.A with a command line flag to graft in coll-1.0 at Collections, and then compile M2.B grafting coll-2.0 at Collections.

Either way, a value constructed by package coll-1.0 will be type-incompatible with functions in coll-2.0. The former will have original names "coll-1.0:Foo.T" while the latter will have "coll-2.0:Foo.T". Trying to provide transparent type upgrade is too hard.

Manuel didn't like the fact that an import could mean "relative or absolute". But presumably we want to continue to write little programs with three modules A, B, Main, and have Main just say 'import A'. So presumably the current directory is always implicitly grafted into the module hierarchy at the root -- and that is all we need for making internal references within a package work out.

Simon
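
To make the grafting idea concrete, here is a minimal sketch in Haskell of how a module name written in source code could be resolved to an original name, given a set of graftings. Everything in it (the Graft type, originalName, resolve, and the two example grafting sets) is a hypothetical illustration, not part of any existing compiler or tool. It models Ganesh's scenario, where M1.A and M2.B are each compiled with their own grafting of a coll package at Collections.

```haskell
import Data.List (stripPrefix)
import Data.Maybe (mapMaybe)

-- Hypothetical types for illustration only.
type PackageName = String   -- e.g. "coll-1.0"
type ModuleName  = String   -- dotted, e.g. "Collections.Foo"

-- A graft says: the modules of this package appear under this prefix.
data Graft = Graft
  { graftPackage :: PackageName
  , graftPrefix  :: ModuleName
  }

-- Original name: <package-name>:<relative-path-within-package>
originalName :: PackageName -> ModuleName -> String
originalName pkg rel = pkg ++ ":" ++ rel

-- Resolve a module name as written in source code against a set of
-- graftings. Returns the original name for every graft whose prefix
-- matches. This sketch does not check that the package really
-- contains the module, and it does not handle grafting at the root.
resolve :: [Graft] -> ModuleName -> [String]
resolve grafts m = mapMaybe try grafts
  where
    try (Graft pkg prefix) =
      originalName pkg <$> stripPrefix (prefix ++ ".") m

-- Ganesh's scenario: each module is compiled with its own graftings.
graftsForM1A, graftsForM2B :: [Graft]
graftsForM1A = [Graft "coll-1.0" "Collections"]
graftsForM2B = [Graft "coll-2.0" "Collections"]
```

Under these assumptions, resolve graftsForM1A "Collections.Foo" gives ["coll-1.0:Foo"] while resolve graftsForM2B "Collections.Foo" gives ["coll-2.0:Foo"]. The same source-level import therefore denotes two different original names, which is why a type T from one is incompatible with the other.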

| a) There are fewer packages, so some central help-yourself global
|    registry is feasible, as Simon M suggests.

Note that URIs make pretty good unique package names. When used this way, URIs can take several forms:

1) The URI can be a pointer to an actual file to be downloaded, like

     http://haskell.org/greencard/gc-3.01.tar.gz

   This is a good choice because it often exists for publicly released software and following the pointer produces something useful. The pointer usually contains a version number and so will be unique.

2) The URI is of the form

     http://organization/local-package-name/version

   and does not point to an actual file you can download.

   The organization is the URI of some group that is handing out locally unique package names. Some example organizations might be:

     haskell.org
     sourceforge.org
     reid-consulting-uk.ltd.uk
     microsoft.com/~simonpj
     microsoft.com   <- an independent organization from ~simonpj

   The local package name should be unique within the organization, but package names from other organizations can overlap.

   The version number can be whatever the individual author feels like using - as long as it is unique.

Although the URI doesn't point at the actual package, it is probably good practice if it points at something which tells you about the package and where to download it. For example, tweaking the form of the URI slightly, we could have:

     http://haskell.org/packages#greencard-3.01

where http://haskell.org/packages might be a web page that tells you how to install packages for Debian, FreeBSD, Windows, etc. and the anchor #greencard-3.01 doesn't point to anything at all.

--
Alastair Reid
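
As an illustration of the second URI form, here is a minimal Haskell sketch that splits a package URI into organization, local package name, and version. The PackageId type and the functions splitOnSlash and parsePackageUri are hypothetical names invented for this example, and the rule that the last two path segments are the package name and version is an assumption read off the form http://organization/local-package-name/version above.

```haskell
import Data.List (intercalate, stripPrefix)

-- Hypothetical record for URIs of the form
--   http://organization/local-package-name/version
data PackageId = PackageId
  { organization :: String  -- may contain '/', e.g. "microsoft.com/~simonpj"
  , localName    :: String
  , version      :: String
  } deriving (Eq, Show)

-- Split a string on '/' characters.
splitOnSlash :: String -> [String]
splitOnSlash s = case break (== '/') s of
  (chunk, [])       -> [chunk]
  (chunk, _ : rest) -> chunk : splitOnSlash rest

-- Assumption: the last two segments are name and version, and
-- everything before them is the organization.
parsePackageUri :: String -> Maybe PackageId
parsePackageUri uri = do
  rest <- stripPrefix "http://" uri
  case reverse (splitOnSlash rest) of
    (ver : name : org@(_ : _))
      | all (not . null) (ver : name : org) ->
          Just (PackageId (intercalate "/" (reverse org)) name ver)
    _ -> Nothing
```

With this reading, a package published under microsoft.com/~simonpj and one published under microsoft.com get different organization fields, so two packages with the same local name and version still compare as different PackageId values.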

"Simon Peyton-Jones"
| Manuel didn't like the fact that an import could mean "relative or
| absolute". But presumably we want to continue to write little programs
| with three modules A, B, Main, and have Main just say 'import A'. So
| presumably the current directory is always implicitly grafted into the
| module hierarchy at the root -- and that is all we need for making
| internal references within a package work out.

I presume that paths specified by -i (isn't it?) also will be grafted into the root of the tree.

How do you deal with a file "Foo.lhs" declaring "module Bar.Zot.Foo", say, in the current directory? Is it legal to import it as "import Foo"? Or "import Bar.Zot.Foo"? Both?

Is it really necessary and desirable to specify the whole path as part of the module declaration?

-kzm
--
If I haven't seen further, it is by standing in the footprints of giants
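
One possible lookup rule for the question above can be sketched in Haskell as follows. It assumes the usual hierarchical-module convention in which dots in a module name map to directory separators below each search root (the roots being the current directory plus any -i paths). The functions moduleToRelPath and candidateFiles are hypothetical names for this sketch; it is not a description of what any particular compiler does under the grafting proposal.

```haskell
-- Replace the dots of a hierarchical module name with path separators.
moduleToRelPath :: String -> FilePath
moduleToRelPath = map (\c -> if c == '.' then '/' else c)

-- All files that could define the module, in search order:
-- each root is tried in turn, first with .hs and then with .lhs.
candidateFiles :: [FilePath] -> String -> [FilePath]
candidateFiles roots m =
  [ root ++ "/" ++ moduleToRelPath m ++ ext
  | root <- roots
  , ext  <- [".hs", ".lhs"]
  ]
```

Under this rule, candidateFiles ["."] "Bar.Zot.Foo" yields ["./Bar/Zot/Foo.hs", "./Bar/Zot/Foo.lhs"], so a file "Foo.lhs" sitting directly in the current directory would be found for "import Foo" but not for "import Bar.Zot.Foo". Whether the module declaration inside that file should then be allowed to say Bar.Zot.Foo is exactly the open question.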

hello,

Simon Peyton-Jones wrote:

| [Nothing new in this message: just a summary, with some useful
| terminology.]

thanks -- this was really helpful. i was kind of confused on the issue of distributing a package in source and binary format, but i think i understand the issues now.

| ... I personally think that using GUIDs for package names would be a
| mistake. They are essentially opaque to people, so there needs to be
| some separate infrastructure to describe what a GUID names, and the
| extra layer of indirection seems to buy little. It's not hard to have
| unique package names (I claim) and that'll do the job nicely. Indeed a
| package name, as suggested here, then *is* a globally unique ID, or
| GUID, only with a comprehensible name.

i completely agree with this.

| Either way, a value constructed by package coll-1.0 will be
| type-incompatible with functions in coll-2.0. The former will have
| original names "coll-1.0:Foo.T" while the latter will have
| "coll-2.0:Foo.T". Trying to provide transparent type upgrade is too
| hard.

that seems reasonable. presumably to upgrade a binary package to use a new library, i simply need to specify a new mapping in its grafting file, and recompile the package. is that correct?

| Manuel didn't like the fact that an import could mean "relative or
| absolute". But presumably we want to continue to write little programs
| with three modules A, B, Main, and have Main just say 'import A'. So
| presumably the current directory is always implicitly grafted into the
| module hierarchy at the root -- and that is all we need for making
| internal references within a package work out.

am i right in assuming that "current directory" refers to the directory in which the file being compiled is located? e.g. if module Main has a declaration "import M", then

    ghc A/B/Main.hs

will look for "A/B/M.hs"

bye
iavor

--
==================================================
| Iavor S. Diatchki, Ph.D. student               |
| Department of Computer Science and Engineering |
| School of OGI at OHSU                          |
| http://www.cse.ogi.edu/~diatchki               |
==================================================

On Tue, Aug 05, 2003 at 02:18:46PM +0100, Simon Peyton-Jones wrote:

| [...] All this does is change the need for *module names* to be
| globally unique into the requirement for *package names* to be unique.
| This problem is much easier because
|
| a) There are fewer packages, so some central help-yourself global
|    registry is feasible, as Simon M suggests.
|
| b) Package names can be longish and clunky (e.g. including a version
|    number), because we provide a way to avoid mentioning them in
|    source code.
| [...]
| The "graft package into tree" mechanism can be seen as a method to
| implement (b). Installing a package means you can name modules in that
| package using A.B.C module notation, without explicitly mentioning the
| package itself. As Simon's message above said, source code therefore
| only makes sense given some set of graftings (= package -> module
| prefix mapping), and one might want to make that part of the package
| (source-code) description.

Wouldn't one still want the module prefix + relative module name combination to produce unique module names? So presumably the prefix should include the package name (though without a version number).

BTW, I notice that the current de facto method for avoiding module name clashes is to use names of the form

    allocated prefix + package name + whatever you want

(e.g. Graphics.UI.GLUT.Window)
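
The uniqueness concern can be illustrated with a small Haskell sketch that lists the module names produced by more than one graft. The Package type, the functions graftedNames and clashes, the gtk-1.0 package, and the module lists are all hypothetical and exist only for this example; glut-6.0 and the Graphics.UI prefix are borrowed from the thread.

```haskell
import Data.List (group, sort)

-- Hypothetical description of a package and its relative module names.
data Package = Package
  { pkgName    :: String
  , pkgModules :: [String]
  }

-- Full module names obtained by grafting a package at a prefix.
graftedNames :: (String, Package) -> [String]
graftedNames (prefix, pkg) = [ prefix ++ "." ++ m | m <- pkgModules pkg ]

-- Module names that more than one graft would provide.
clashes :: [(String, Package)] -> [String]
clashes grafts =
  [ name | (name : _ : _) <- group (sort (concatMap graftedNames grafts)) ]

-- Two made-up packages that both contain a module called Window.
glut, gtk :: Package
glut = Package "glut-6.0" ["Window", "Menu"]
gtk  = Package "gtk-1.0"  ["Window"]
```

Grafting both packages at the same prefix, as in clashes [("Graphics.UI", glut), ("Graphics.UI", gtk)], reports ["Graphics.UI.Window"]. Including the unversioned package name in the prefix, as in clashes [("Graphics.UI.GLUT", glut), ("Graphics.UI.GTK", gtk)], reports no clashes.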
participants (5)
- Alastair Reid
- Iavor Diatchki
- ketil@ii.uib.no
- Ross Paterson
- Simon Peyton-Jones