RE: Libraries and hierarchies

Modules in other packages can be imported only by uttering their full path names in the global hierarchy (of the compiler that is compiling the package).

i don't quite understand this part. how is a module going to import modules from another package, if it does not know where these packages are going to be installed? if i am writing a module M, which imports A.B.X.N, which is module X.N from a package P that was installed at A.B, doesn't that force everyone who is trying to use my package to install P at the same place, A.B?
Well, this is the key issue. Standing back a moment, I see three alternatives to organising the hierarchy:

1. We have a global registry for module names. The hierarchy is always populated with the same modules on every Haskell system. We have a mechanism for naming libraries after email addresses or domain names, or GUIDs, to ensure that everyone can get a non-clashing module name when they want one. This suffers from (a) needing a global registry for ordinary names (i.e. the non-GUID ones), (b) overly long module names, and (c) no library versioning.

2. We have no global registry for module names. Everyone is free to name library modules whatever they choose. In practice, groups of Haskell programmers will get together and maintain collections of non-clashing libraries, such as the fptools/libraries collection. This suffers from (a) possible name clashes between libraries, (b) no way to unambiguously refer to a specific library in Haskell source, (c) still no way to abbreviate long module names, and (d) still no GUIDs or versioning.

3. As (2), but the full name of a module is not fixed in the source code, and can be mapped into the hierarchy at multiple places. This is the method proposed in the message that started this thread. Because modules can be placed in the hierarchy in multiple places, it gives a good way to add library versioning and GUIDs to libraries. This suffers from (a) extra complication in the implementation and specification of the language, (b) needing unique package names, and (c) referring unambiguously to a specific library must be done by GUID.

So I guess what we're saying is that we're proposing (3) as a solution to the problems of (2). It probably isn't the best solution. Indeed, I'm still worried about the implementation difficulties and the fact that we need unique package names, but perhaps solutions to these can be found.

To answer the question: to unambiguously refer to another library under this story, you must import it by GUID.
Cheers, Simon
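
A minimal sketch of alternative (3), assuming a compiler keeps a table of where each package is mounted in the local hierarchy. Every name here (`PackageId`, `MountTable`, `resolve`, the GUID string and the mount point `Lib.P`) is hypothetical and not part of any existing compiler; the point is only that two different paths in the hierarchy can lead to the same package-relative module.

```haskell
import Data.List (stripPrefix)

-- A hierarchical module name, split on dots: A.B.X.N is ["A","B","X","N"].
type ModuleName = [String]

-- Stands in for a GUID or other unique package name (hypothetical).
newtype PackageId = PackageId String
  deriving (Eq, Show)

-- Where each package is mounted in the local hierarchy.
-- The same package may appear under several prefixes.
type MountTable = [(ModuleName, PackageId)]

-- Resolve a full name to every (package, package-relative name) it can denote.
-- An empty result means "not found"; more than one result means a clash.
resolve :: MountTable -> ModuleName -> [(PackageId, ModuleName)]
resolve table name =
  [ (pkg, rest)
  | (prefix, pkg) <- table
  , Just rest <- [stripPrefix prefix name]
  ]

-- Package P mounted at A.B and, on the same system, also at Lib.P.
exampleTable :: MountTable
exampleTable =
  [ (["A", "B"], PackageId "guid-of-P")
  , (["Lib", "P"], PackageId "guid-of-P")
  ]
```

With this table, `resolve exampleTable ["A","B","X","N"]` and `resolve exampleTable ["Lib","P","X","N"]` both yield package `guid-of-P` with relative name `X.N`, so the identity of the module does not depend on where the package happens to be installed.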

On Mon, 4 Aug 2003 11:50:43 +0100, "Simon Marlow"
2. We have no global registry for module names. Everyone is free to name library modules whatever they choose.
In practice, groups of Haskell programmers will get together and maintain collections of non-clashing libraries, such as the fptools/libraries collection.
This suffers from (a) possible name clashes between libraries, (b) no way to unambiguously refer to a specific library in Haskell source, (c) still no way to abbreviate long module names, and (d) still no GUIDs or versioning.
3. As (2), but the full name of a module is not fixed in the source code, and can be mapped into the hierarchy at multiple places.
This is the method proposed in the message that started this thread. Because modules can be placed in the hierarchy in multiple places, it gives a good way to add library versioning and GUIDs to libraries.
It also gives the advantage that modules can be moved in the hierarchy (by a command-line flag or whatever) specifically to support whatever a particular program happens to require.

Would the following be possible under your proposal?

module M1.A imports Graphics.UI and only works with version 1
module M2.B imports Graphics.UI and only works with version 2
module C imports M1.A and M2.B

Could we compile/use M1.A with Graphics.UI mapped to Graphics.UI.v1 and M2.B with Graphics.UI mapped to Graphics.UI.v2 and then correctly compile module C?

Being able to do this would cause problems for the type-equivalence you mentioned elsewhere, of course.

Cheers, Ganesh
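
The type-equivalence concern can be illustrated with a small sketch, assuming each module is compiled against its own mapping of the hierarchy. The `Mapping` type and the functions `resolveFor` and `sameModule` are hypothetical names introduced only for illustration; the module names are the ones from the question.

```haskell
import qualified Data.Map as Map

type ModuleName = String

-- The view of the hierarchy used while compiling one particular module:
-- the name written in the source, and the module it is mapped to.
type Mapping = Map.Map ModuleName ModuleName

-- Resolve an import under one mapping; unmapped names stand for themselves.
resolveFor :: Mapping -> ModuleName -> ModuleName
resolveFor mapping name = Map.findWithDefault name name mapping

-- The mappings from the question: M1.A sees version 1, M2.B sees version 2.
mappingForM1A, mappingForM2B :: Mapping
mappingForM1A = Map.fromList [("Graphics.UI", "Graphics.UI.v1")]
mappingForM2B = Map.fromList [("Graphics.UI", "Graphics.UI.v2")]

-- Two importers can only share the types exported by 'name' if the name
-- resolves to the same underlying module under both mappings.
sameModule :: Mapping -> Mapping -> ModuleName -> Bool
sameModule a b name = resolveFor a name == resolveFor b name
```

Here `sameModule mappingForM1A mappingForM2B "Graphics.UI"` is `False`: module C could import both M1.A and M2.B, but a type that M1.A obtained from Graphics.UI would not be the same type as the one M2.B obtained under the same source-level name.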

3. As (2), but the full name of a module is not fixed in the source code, and can be mapped into the hierarchy at multiple places.
[...]
This suffers from (a) extra complication in the implementation and specification of the language,
I wonder if the extra complication is really that high.

Suppose that every module somehow has a unique name (128 bit hash, global registry, whatever) and that we have set up two mappings to the same thing:

Foo.Bar -> Reid.Consulting.Bar.version27
Baz.Bar -> Reid.Consulting.Bar.version27

To implement this in Hugs, it is _almost_ enough to simply run all source code through a preprocessor which replaces occurrences of Foo.Bar and Baz.Bar with Reid.Consulting.Bar.version27.

Suppose a module imports both Foo.Bar and Baz.Bar. The Haskell compiler will see this as two imports of Reid.Consulting.Bar.version27 and will recognise that any types they define are the same type.

This simple-minded lexical substitution approach would fail in the usual places that applying a preprocessor fails: inside strings, when a module name is also used as a constructor name, etc. It would also interact with qualified import. For this reason, the substitution needs to happen inside the parser or early parts of the frontend. We'd also want to generate good error messages, so there would be a small change there as well.

Overall, I think it would be quite a small change.
(b) needing unique package names,
The original proposal tied this feature strongly to packages. I wonder if this is really necessary.

Putting that aside, it seems that unique package names are fairly easy to come by. Some possibilities are:

1) Use the URI of the primary download site. From choice, the URI would be for a tarfile or whatever but all that really matters is that it is unique. This will typically include a version number but, if not, the user could add one if they care.

2) In the likely event that we have a few big repositories and many small ones, simply prefix the name of the package by the repository name or, if not available, name of author. This is effectively the same as (1) but avoids using URIs.

3) Directory name on your system. This would be a little like the current situation with Hugs where you add -P options to the command line to add each package to the searchpath.
(c) referring unambiguously to a specific library must be done by GUID.
Rephrasing this as 'can be done by GUID', I would list this as a strength.

-- Alastair Reid
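
The substitution described in this message, where Foo.Bar and Baz.Bar both map to Reid.Consulting.Bar.version27, might look roughly like the following when done on parsed import declarations rather than on raw text. This is only a sketch under assumptions: the `Import` record is a toy stand-in for a real syntax tree, and `canonicalise` and `uniqueModules` are hypothetical names. Keeping the written name as the qualifier is one possible way to handle the interaction with qualified names mentioned above.

```haskell
import qualified Data.Map as Map
import Data.List (nub)
import Data.Maybe (fromMaybe)

type ModuleName = String

-- The two mappings from the message.
aliases :: Map.Map ModuleName ModuleName
aliases = Map.fromList
  [ ("Foo.Bar", "Reid.Consulting.Bar.version27")
  , ("Baz.Bar", "Reid.Consulting.Bar.version27")
  ]

-- A toy import declaration: the imported module and an optional "as" name.
data Import = Import
  { importName :: ModuleName
  , importAs   :: Maybe ModuleName
  } deriving (Eq, Show)

-- Replace the imported name by its unique name. The name the programmer
-- wrote is kept as the qualifier, so references such as Foo.Bar.f in the
-- module body still resolve after the rewrite.
canonicalise :: Import -> Import
canonicalise imp@(Import name qual) =
  case Map.lookup name aliases of
    Nothing     -> imp
    Just unique -> Import unique (Just (fromMaybe name qual))

-- The distinct underlying modules that a list of imports refers to.
uniqueModules :: [Import] -> [ModuleName]
uniqueModules = nub . map (importName . canonicalise)
```

For a module that imports both Foo.Bar and Baz.Bar, `uniqueModules` returns a single entry, which is the property that lets the compiler treat the types from both imports as the same type.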

hello, Alastair Reid wrote:
...
3) Directory name on your system.
This would be a little like the current situation with Hugs where you add -P options to the command line to add each package to the searchpath.
i like this, as i've wanted something like this, and it also seems simple, although i may be missing something.

if i download a package, i simply put it on my system somewhere, and add this place to the compiler search path (via a flag, or a compiler configuration flag). then we can have relative and absolute imports: absolute imports would be searched for in the compiler search path, while relative imports will be searched for in the same directory as the importing module. alternatively one could only have one kind of import, that is always searched for in the compiler path.

if i want to use one version of a library or another, i simply need to adjust the path when i compile my program. thus "import A.B" would mean i want module B that is in a subdirectory A of one of the directories in the path. ambiguous imports should be reported as errors.

bye
iavor

--
==================================================
| Iavor S. Diatchki, Ph.D. student               |
| Department of Computer Science and Engineering |
| School of OGI at OHSU                          |
| http://www.cse.ogi.edu/~diatchki               |
==================================================
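
The search-path lookup described here, including the rule that ambiguous imports are errors, can be sketched as a pure function. The names `candidate`, `Lookup` and `findModule`, the `.hs` extension and the `/` separator are assumptions made for illustration; the file-existence check is passed in as a parameter so that the sketch does not touch the file system.

```haskell
import Data.List (intercalate)

-- A hierarchical module name, split on dots: A.B is ["A","B"].
type ModuleName = [String]

-- The file that would hold a module under one search-path directory:
-- "import A.B" looks for B.hs in subdirectory A.
candidate :: FilePath -> ModuleName -> FilePath
candidate dir name = dir ++ "/" ++ intercalate "/" name ++ ".hs"

data Lookup
  = NotFound
  | Found FilePath
  | Ambiguous [FilePath]   -- present under more than one path entry
  deriving (Eq, Show)

-- Look a module up in every directory of the search path.
-- 'exists' abstracts the file-system check.
findModule :: (FilePath -> Bool) -> [FilePath] -> ModuleName -> Lookup
findModule exists path name =
  case filter exists [candidate dir name | dir <- path] of
    []  -> NotFound
    [f] -> Found f
    fs  -> Ambiguous fs
```

Choosing between two installed versions of a library then amounts to passing a different path list: with only one version's directory on the path the import resolves, and with both on the path the result is `Ambiguous` and can be reported as an error.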
participants (4)
- Alastair Reid
- Ganesh Sittampalam
- Iavor Diatchki
- Simon Marlow