Re: Revamping the module hierarchy

24 Jun 2009

On 19/06/2009 14:44, Edward Kmett wrote:
...
On Fri, Jun 19, 2009 at 6:06 AM, wren ng thornton
<wren@community.haskell.org> wrote:
I agree with Maurico that what we really need is to have the tools
    to be able to rearrange the tree at will. The Haskell language has
    no business dealing with the provenance of where modules come
    from--- and forcing modules to be named after their packages would
    make it do so. Currently, ghc-pkg (or whatever) handles the
    provenance of making sure that packages are visible to have their
    modules be loaded. As it stands, this provenance mechanism
    automatically roots all packages at the same place, but there's no
    reason it needs to. We just have to come up with the right DSL for
    scripting ghc-pkg (or equivalently, the right CLI) to be able to
    play around with the module namespace in a more intelligent way.
    For instance, let's assume we have:
     > ghc-pkg describe libfoo-0.0.0
        ...
        exposed-modules: Data.Foo Control.Bar Control.Bar.Baz
        ...
    Now, if we say:
        ghc-pkg expose libfoo-0.0.0
    Then any Haskell programs can now load the modules mentioned above,
    by the names mentioned above. If instead we said something like:
        ghc-pkg expose libfoo-0.0.0 at Zot
    Then Haskell programs would be able to load the modules by the names
    Zot.Data.Foo, Zot.Control.Bar, and Zot.Control.Bar.Baz instead. And
    if we wanted to rebase subtrees then we could say something like:
        ghc-pkg expose libfoo-0.0.0:Control.Bar as Quux
    Which would make the modules Quux and Quux.Baz available for
    loading, and would effectively hide libfoo-0.0.0:Data.Foo from being
    loadable.
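The three mounting modes described above can be sketched as a pure renaming function over module names. This is only an illustration of the idea: the `at` and `as` forms are the proposal's syntax and not something ghc-pkg accepts, and the names `Mount`, `mountName`, and `visibleNames` are hypothetical.

```haskell
import Data.List (intercalate, stripPrefix)
import Data.Maybe (mapMaybe)

-- | A hierarchical module name split on dots, e.g. ["Control","Bar","Baz"].
type ModuleName = [String]

parseName :: String -> ModuleName
parseName = words . map (\c -> if c == '.' then ' ' else c)

showName :: ModuleName -> String
showName = intercalate "."

-- | How a package is mounted into the namespace (hypothetical type,
-- mirroring the three command forms in the proposal).
data Mount
  = ExposeAsIs                      -- ghc-pkg expose pkg
  | ExposeAt ModuleName             -- ghc-pkg expose pkg at Zot
  | RebaseAs ModuleName ModuleName  -- ghc-pkg expose pkg:Control.Bar as Quux

-- | The name under which one exposed module becomes visible,
-- or Nothing if the mount hides it.
mountName :: Mount -> ModuleName -> Maybe ModuleName
mountName ExposeAsIs            m = Just m
mountName (ExposeAt root)       m = Just (root ++ m)
mountName (RebaseAs subtree new) m = (new ++) <$> stripPrefix subtree m

-- | All names visible to client code for a package's exposed modules.
visibleNames :: Mount -> [String] -> [String]
visibleNames mount = map showName . mapMaybe (mountName mount . parseName)
```

With the exposed modules of libfoo-0.0.0 from the example, `visibleNames (RebaseAs ["Control","Bar"] ["Quux"])` keeps only the modules under Control.Bar and renames them to Quux and Quux.Baz, while Data.Foo drops out of the result.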
    To implement this we need to update not only ghc-pkg, but also the
    Cabal format. Rather than just specifying which dependent packages
    must be exposed, we also need to specify *where* the package expects
    them to be exposed in the module namespace. Assuming this is
    implemented sanely, then all of the renaming for changing the root
    and for rebasing subtrees can be boiled out and undone during the
    linking phase (that is, when GHC is "linking" things to follow
    imports etc; not when ld is actually linking things). An import
    declaration is a reference to an actual compiled module, the name is
    just a proxy to know where to find it, the name doesn't have any
    meaning in itself.
    Since every package gets their own local view of the module
    namespace, every package can choose their own names for things.
    Moreover, since every package must specify their local view, if one
    wants to have some crazy jumbled view then the burden is on them to
    specify how to do it. Since every package exposes a view of its
    exposed module namespace, this serves as the default view. Since it
    takes work for people to rearrange things, there will still be a
    force to give things good names in the first place. Only we would no
    longer be stuck with bad decisions.
+1
I really like this proposal.
I agree that I much prefer the current orthogonality of modules provided
to package names. It lets you refactor packages into several smaller
chunks, and this would not even be possible under the other namespacing
schemes I've seen bandied about without breaking other code.
The biggest problem that I have with the current scheme is the inability
to work with packages with conflicting namespaces (i.e. to support both
the mtl and one of its competitors that overlap it). This quite
elegantly works around that restriction.
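To make the conflicting-namespace point concrete, here is a rough sketch of a per-package view that maps the names a package imports to the compiled modules they stand for. Everything in it (`RealModule`, `View`, `defaultView`, `mountAt`, `resolve`, and the `TF` prefix) is a hypothetical model of the proposal, not an existing GHC or Cabal API.

```haskell
-- | Identity of a compiled module: the package providing it and the name
-- it was compiled under (hypothetical representation).
data RealModule = RealModule
  { rmPackage :: String
  , rmName    :: String
  } deriving (Eq, Show)

-- | One package's local view of the namespace:
-- the name used in import declarations, paired with the module it denotes.
type View = [(String, RealModule)]

-- | The default view a package offers: each exposed module under its own name.
defaultView :: String -> [String] -> View
defaultView pkg mods = [ (m, RealModule pkg m) | m <- mods ]

-- | Re-root a whole view under a prefix, as in "expose pkg at Prefix".
mountAt :: String -> View -> View
mountAt root view = [ (root ++ "." ++ n, r) | (n, r) <- view ]

-- | Resolve an import name against a view. Unknown and ambiguous names
-- are reported instead of being silently picked.
resolve :: View -> String -> Either String RealModule
resolve view name =
  case [ r | (n, r) <- view, n == name ] of
    [r] -> Right r
    []  -> Left ("no module named " ++ name)
    _   -> Left ("ambiguous module name " ++ name)
```

Under this model, combining `defaultView "mtl" ["Control.Monad.State"]` with `mountAt "TF" (defaultView "monads-tf" ["Control.Monad.State"])` lets one package import both modules, whereas combining the two default views unchanged makes `Control.Monad.State` ambiguous.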
There's a little-known extension in GHC called PackageImports that lets 
you do this:

import "monads-tf" Control.Monad.State

We use this to implement the base3-compat overlay.  I'm not claiming 
this is something we want to advertise widely or start using to resolve 
conflicts; I'm just pointing out its existence.
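For reference, a minimal self-contained use of the extension might look like the following. It qualifies an import from base only so that it compiles without extra packages installed; the function `sortedSample` is made up for the illustration.

```haskell
{-# LANGUAGE PackageImports #-}

-- The string literal names the package the module must come from.
-- With two installed packages exposing the same module name, this is
-- what selects between them.
import "base" Data.List (sort)

-- | A trivial use of the package-qualified import.
sortedSample :: [Int]
sortedSample = sort [3, 1, 2]
```

The qualifier only affects where the module is looked up; the rest of the program refers to the imported names as usual.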

wren's proposal above actually requires a good deal of effort to 
implement.  It would decouple the compile-time namespace of module names 
from the actual module names used in the compiled package, and that is a 
deep change.  However, having made that change, lots of things become 
possible.

I should point out that there have been many proposals of this kind in 
the past (search for "grafting" and "mounting" in the mailing-list 
archives).  To my mind the reason we haven't done anything like this so 
far is that no single proposal has stood out as being the right thing, 
with a good power-to-weight ratio.  In the past 
it has been hard to predict what we actually *need* in the way of module 
namespace manipulation when we start scaling up to thousands of 
packages, but that is now changing, so it might well be time to think 
about this again.

Cheers,
	Simon

Simon Marlow