RE: Module Holism (was RE: exposed packages and cabal depends)

On 12 April 2005 01:28, S. Alexander Jacobson wrote:
On Mon, 11 Apr 2005, Simon Marlow wrote:
The problem is, we don't want to import two modules, only to discover that somewhere in their dependencies they each use the same module name to refer to conflicting module implementations.
This is the problem that the overlap restriction leads to, yes.
No, you have this problem even with atomic modules. It is a result simply of not allowing two modules to share the same name in the same program and has nothing to do with package overlap.
When I say "overlap restriction" I mean the restriction that prevents having two modules with the same name in the same program.
Therefore, we really want to say that no two modules we might want to import into our programs (either directly or indirectly) should share the same name. And, in particular, we don't want a packaging or versioning system that encourages it!
No, you've drawn a bogus conclusion again. We most definitely want the ability to choose between multiple instances of a particular module in programs.
My point is that the choice of instance should be made at compile/build/run times and not at packaging time.
Dependencies must be expressed at package time, otherwise they are untracked dependencies. Right?
A package must state that it needs implementation P of module M rather than implementation Q, otherwise there's a chance that the guy building the package will get the wrong one.
Build-depends lets you select an implementation from a range (currently it's just a version range, but we could make the language more expressive and include alternatives, I doubt it would be that useful in practice, though).
So dependencies are selected at compile time. Selecting dependencies at run-time (I assume this is what you mean by late binding) is another kettle of fish: at least, it would mean that GHC couldn't do its aggressive cross-module optimisation against the library that you're late-binding to. But I think it's worthwhile investigating to what extent this is possible, so that we can have upgradable shared libraries.
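[Editorial sketch] To make "select an implementation from a range" concrete, here is a minimal Haskell sketch. It is not Cabal's actual resolution code: the `VersionRange` type, the inclusive-lower/exclusive-upper convention, and the "pick the newest match" policy are all assumptions made for illustration.

```haskell
-- A version as a list of components, e.g. [1,4] for 1.4.
-- Lists compare lexicographically, so [1,4] < [2] and [1] < [1,0].
type Version = [Int]

-- Hypothetical range type: inclusive lower bound, exclusive upper bound.
data VersionRange = VersionRange
  { lowerIncl :: Version
  , upperExcl :: Version
  }

-- True when the version lies inside the range.
inRange :: VersionRange -> Version -> Bool
inRange (VersionRange lo hi) v = v >= lo && v < hi

-- Among the installed versions, pick the newest one that satisfies the
-- range. "Newest wins" is an assumed policy, not something the thread fixes.
selectVersion :: VersionRange -> [Version] -> Maybe Version
selectVersion range installed =
  case filter (inRange range) installed of
    [] -> Nothing
    vs -> Just (maximum vs)
```

With P-1.0, P-1.4 and P-2.0 installed, `selectVersion (VersionRange [1] [2]) [[1,0],[1,4],[2,0]]` evaluates to `Just [1,4]`: the choice is made when the package is built, from what the package description declared.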
For example, if I have two versions of a package installed, say P-1 and P-2, I want to be able to compile my old code that depends on P-1 while still being able to write new code against P-2. And I want to be able to use other packages that still depend, for the time being, on P-1. When P-3 comes out, I don't want to be forced to uninstall P-1 and P-2 before I can use it.
Now what happens when you want to use one package that depends on P-1 and another that depends on P-2 at the same time?
Of course, you can't do that. That's what the "overlap restriction" prevents. As I've explained.
[ the next three paragraphs, which I deleted, all complain about scenarios which we can't handle because of the overlap restriction. I don't think I need to comment any further. ]
[ more stuff deleted, this thread is too long already ]
Cheers, Simon
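[Editorial sketch] The overlap restriction as defined above (no two modules with the same name in the same program) can be phrased as a check over the packages a program pulls in. The following is a minimal Haskell sketch, not GHC's implementation; the `PackageId` and `ModuleName` aliases and the `overlaps` function are hypothetical names introduced for illustration.

```haskell
import qualified Data.Map as Map

type PackageId  = String   -- e.g. "P-1"
type ModuleName = String   -- e.g. "Data.Foo"

-- Given every package in a program's dependency closure together with the
-- modules it exposes, return the module names exposed by more than one
-- package, along with the packages that expose them (in input order).
-- An empty result means the overlap restriction is satisfied.
overlaps :: [(PackageId, [ModuleName])] -> Map.Map ModuleName [PackageId]
overlaps pkgs =
  Map.filter ((> 1) . length) $
    Map.fromListWith (flip (++)) [ (m, [p]) | (p, ms) <- pkgs, m <- ms ]
```

Using one package that depends on P-1 and another that depends on P-2 puts both P-1 and P-2 into the closure; if each exposes a module `M`, then `overlaps` reports `M` against both packages and the program is rejected.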

Simon,
My main goal here is to free the user from the hassle of manual package installation. A module written in one location should "just work" when moved to another. Right now, we have that location transparency within a single box or networked file system. I am saying we should also have it for moves across the Internet.
So, the meaning of an import statement should not depend on its context (in effect, module identifiers in import statements should be a form of URL) and compilers/interpreters should be able to resolve those URLs to module implementations without user intervention.
And, if the meaning of module names is parametrized by package then package names should be part of import statements. If packages are purely administrative, then they shouldn't.
Do we agree on at least this much?
-Alex-
PS In my last few posts I assumed you had rejected removing the overlap restriction because that is what I interpreted you to have said, e.g. here:
We're not going to support this, at least for the foreseeable future. It's a pretty big change: every entity in the program becomes parameterised by the package name as well as the module name, because module names can overlap.
http://www.haskell.org//pipermail/haskell/2005-March/015597.html
And here:
Also, the Haskell module hierarchy is supposed to reflect functionality, whereas package names are purely administrative. This is a reason for not including package names in source code.
http://www.haskell.org//pipermail/libraries/2005-April/003513.html
Did you change your mind or am I misinterpreting?
-Alex-
______________________________________________________________
S. Alexander Jacobson tel:917-770-6565 http://alexjacobson.com
Note, in my last few posts I have assumed that you had rejected removing the overlap restriction:
We're not going to support this, at least for the foreseeable future. It's a pretty big change: every entity in the program becomes parameterised by the package name as well as the module name, because module names can overlap. This means a change to the language: there might be multiple types called M.T in the program, which are not compatible (they might have different representations). You can't pass a value of type M.T that you got from version 1.0 of the package to a function expecting M.T in version 2.
And that you wanted packages to be administrative entities without functional consequence.
So, if package name is part of the module identifier, then that means a standard way to resolve package names to URLs. If not, then that means a standard way to resolve module names to URLs. Either way, my point is that the meaning of import statements should be global/universal and not local to a particular Haskell installation. Effectively, module identifiers should resolve to URLs and not locations within a local filesystem, without explicit user intervention.
Achieving this level of location transparency requires that we have a universal way to identify module functionality for use in import statements. In other words, compilers/interpreters should be able to resolve the module identifiers in import statements to module implementation URLs (or URLs of packages from which implementations can be built).
Either way, the only way to eliminate untracked dependencies is if import statements contain enough information to identify sufficiently the exact functionality required by the module.
On Tue, 12 Apr 2005, Simon Marlow wrote:
On 12 April 2005 01:28, S. Alexander Jacobson wrote:
On Mon, 11 Apr 2005, Simon Marlow wrote:
The problem is, we don't want to import two modules, only to discover that somewhere in their dependencies they each use the same module name to refer to conflicting module implementations.
This is the problem that the overlap restriction leads to, yes.
No, you have this problem even with atomic modules. It is a result simply of not allowing two modules to share the same name in the same program and has nothing to do with package overlap.
When I say "overlap restriction" I mean the restriction that prevents having two modules with the same name in the same program.
Therefore, we really want to say that no two modules we might want to import into our programs (either directly or indirectly) should share the same name. And, in particular, we don't want a packaging or versioning system that encourages it!
No, you've drawn a bogus conclusion again. We most definitely want the ability to choose between multiple instances of a particular module in programs.
My point is that the choice of instance should be made at compile/build/run times and not at packaging time.
Dependencies must be expressed at package time, otherwise they are untracked dependencies. Right?
A package must state that it needs implementation P of module M rather than implementation Q, otherwise there's a chance that the guy building the package will get the wrong one.
Build-depends lets you select an implementation from a range (currently it's just a version range, but we could make the language more expressive and include alternatives, I doubt it would be that useful in practice, though).
So dependencies are selected at compile time. Selecting dependencies at run-time (I assume this is what you mean by late binding) is another kettle of fish: at least, it would mean that GHC couldn't do its aggressive cross-module optimisation against the library that you're late-binding to. But I think it's worthwhile investigating to what extent this is possible, so that we can have upgradable shared libraries.
For example, if I have two versions of a package installed, say P-1 and P-2, I want to be able to compile my old code that depends on P-1 while still being able to write new code against P-2. And I want to be able to use other packages that still depend, for the time being, on P-1. When P-3 comes out, I don't want to be forced to uninstall P-1 and P-2 before I can use it.
Now what happens when you want to use one package that depends on P-1 and another that depends on P-2 at the same time?
Of course, you can't do that. That's what the "overlap restriction" prevents. As I've explained.
[ the next three paragraphs, which I deleted, all complain about scenarios which we can't handle because of the overlap restriction. I don't think I need to comment any further. ]
[ more stuff deleted, this thread is too long already ]
Cheers, Simon

On Tue, Apr 12, 2005 at 02:57:48PM -0400, S. Alexander Jacobson wrote:
My main goal here is to free the user from the hassle of manual package installation. A module written in one location should "just work" when moved to another. Right now, we have that location transparency within a single box or networked file system. I am saying we should also have it for moves across the Internet.
So, the meaning of an import statement should not depend on its context (in effect, module identifiers in import statements should be a form of URL) and compilers/interpreters should be able to resolve those URLs to module implementations without user intervention.
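[Editorial sketch] One way to read "module identifiers should be a form of URL" is as a fixed, context-free mapping from a module name to a location, analogous to the usual mapping from a hierarchical module name to a file path. The Haskell sketch below assumes such a mapping; the registry base URL, the `.hs` suffix and the function names are hypothetical and appear nowhere in the proposal.

```haskell
import Data.List (intercalate)

-- Split a dotted module name into its components:
-- "Data.Foo.Bar" becomes ["Data", "Foo", "Bar"].
splitDots :: String -> [String]
splitDots s =
  case break (== '.') s of
    (c, [])       -> [c]
    (c, _ : rest) -> c : splitDots rest

-- Hypothetical convention: a module lives under a registry base URL at a
-- path derived from its name. The result depends only on the two arguments,
-- never on what happens to be installed locally.
moduleUrl :: String -> String -> String
moduleUrl base modName =
  base ++ "/" ++ intercalate "/" (splitDots modName) ++ ".hs"
```

For example, `moduleUrl "http://example.org/modules" "Data.Foo.Bar"` yields `"http://example.org/modules/Data/Foo/Bar.hs"`. The sketch deliberately leaves open the question debated in this thread: whether a package name or version would also have to be part of the identifier.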
And, if the meaning of module names is parametrized by package then package names should be part of import statements. If packages are purely administrative, then they shouldn't.
Do we agree on at least this much?
What you describe sounds to me like a nightmare. It means I'd have to have a different import statement (protected by #ifdefs) for every version of each module I want to support--since in practice every version will change the interface.
If all users are always connected to the internet, and all packages work with every version of every compiler, then yes, having a global namespace for modules where every interface change results in a module name change sounds great. But in a world where some users use older compilers, and some users aren't always connected to the internet, one would like to be able to change module interfaces without having to change all code that uses that module.
-- David Roundy
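[Editorial sketch] For readers unfamiliar with the pattern, an import "protected by #ifdefs" looks roughly like the Haskell fragment below. The macro `HAVE_P2` is hypothetical (a build script would have to pass it, e.g. as `-DHAVE_P2`), and the two `Data.Map` modules from the containers library merely stand in for two versions of a module that was renamed when its interface changed.

```haskell
{-# LANGUAGE CPP #-}

-- HAVE_P2 is a hypothetical macro supplied by the build (e.g. -DHAVE_P2).
-- Each supported version of the dependency needs its own branch here.
#ifdef HAVE_P2
import qualified Data.Map.Strict as M   -- stands in for the renamed, newer module
#else
import qualified Data.Map as M          -- stands in for the older module
#endif

-- Client code is written once against the alias M, but the import block
-- above has to grow whenever another version must be supported.
lookupOr :: Int -> String -> [(String, Int)] -> Int
lookupOr def key = M.findWithDefault def key . M.fromList
```

With two versions this is tolerable; the objection above is that if every interface change forces a new module name, every client module accumulates one such branch per version it wants to support.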
participants (3)
- David Roundy
- S. Alexander Jacobson
- Simon Marlow