Packages and .NET assemblies

I just took a deeper look at how .NET handles packages, since they have thought through many of the same issues and have a scheme that scales well. There's a great deal of similarity with the schemes we're thinking about for Haskell.

The executive summary:

- A .NET assembly fulfills the same function as a Haskell package: an assembly is the unit of versioning and distribution, and carries with it a bunch of metadata (author, copyright, etc.).
- An assembly is uniquely identified by a triple (name, version, culture).
- An assembly travels in a single file (.DLL).
- An assembly may define multiple namespaces (like A.B.C), and entities within namespaces, much like modules in Haskell.
- Assemblies are not named directly in source code. C# code (for example) refers to namespaces and entities only.
- When compiling C# code, you name the DLLs that contain assemblies to bring into scope (in fact, the transitive closure) using command-line options.
- The C# compiler brings a set of assemblies into scope by default (== exposed packages in GHC).
- The binary has baked into it the exact identity of the assemblies it was compiled against.
- At runtime, the binary might be rebound to different versions of the assemblies it depends on, via a complicated structure of configuration files.

So there's no grafting, or even assembly-qualified imports. How do they get away with this? Well, one reason is that namespaces are generally globally unique, because the convention is to use names beginning with an organisation (e.g. Microsoft.System). So when you bring a bunch of assemblies into scope, their namespaces generally don't conflict.

Interestingly, the assembly story is almost identical to what GHC supports right now, and when we change module identities to include the package name (which I'm working on now), it will be even closer. Not that I necessarily think we can get away with doing nothing to the language, but at least this seems to argue for being conservative.

Cheers,
Simon
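As a rough illustration of the scheme summarised above, here is a toy model in Python. It is only a sketch, not how the C# compiler or the .NET runtime actually work: the assembly names `Acme.Collections` and `Globex.Collections`, and the functions `providers` and `resolve`, are invented for the example. It follows the (name, version, culture) triple from the summary; strong-named assemblies also carry a public key token, which is left out here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AssemblyId:
    # The identity triple from the summary: (name, version, culture).
    name: str
    version: tuple
    culture: str = "neutral"


def providers(qualified_name, in_scope):
    """Return the in-scope assemblies that define `qualified_name`.

    `in_scope` maps each AssemblyId to the set of fully qualified names
    (namespace plus entity) it exports. Source code names only the
    qualified name, never the assembly.
    """
    return [asm for asm, names in in_scope.items() if qualified_name in names]


def resolve(qualified_name, in_scope):
    """Pick the single assembly defining the name, or raise on a clash."""
    found = providers(qualified_name, in_scope)
    if not found:
        raise LookupError(f"{qualified_name} is not in scope")
    if len(found) > 1:
        raise LookupError(
            f"{qualified_name} is ambiguous: {[a.name for a in found]}"
        )
    return found[0]


# Hypothetical assemblies, invented for illustration only.
acme = AssemblyId("Acme.Collections", (1, 2, 0, 0))
globex = AssemblyId("Globex.Collections", (3, 0, 0, 0))
in_scope = {
    acme: {"Acme.Collections.Map"},
    globex: {"Globex.Collections.Map"},
}

# The compiled binary would record this exact identity.
baked_in = resolve("Acme.Collections.Map", in_scope)
```

Because each name starts with its organisation, both `Map` entities can be in scope at once and `resolve` never has to ask which assembly was meant. A clash only appears if two in-scope assemblies export the same fully qualified name.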

Hi,

As another data-point, you may want to take a look at what maven2 does with POM files. Each project has a POM, produces one artifact, and then the artifact and POM go around together. Projects can be arranged into a hierarchy, so that building a project will cause all of its child projects to be built. An artifact could be a .jar, .war, .so or anything else you build. The POM stores meta-data about the artifact, including dependencies, where the source-code repository is, the URL of the web site and so on. Each POM has a (name, version, group), where the group will normally identify the organisation and/or project that makes the artifact.

There's a mechanism for tracking the transitive dependencies between artifacts and for matching up compatible version ranges. This is all integrated with a web-hosted repository so that when you build/run with maven2, any binary dependencies required are fetched automatically.

Lastly, there are two styles of version names. For releases, the normal dotted notation is used. During development, SNAPSHOT is appended. Snapshot versions are eagerly checked for updates, to support collaborative development of hot code. Dotted versions are lazily checked for updates, and it's assumed that if you have a copy of it, there's no point replacing it until the version changes.

Can I put in a +1 vote for things built by Haskell carrying around a digest/signature of the source code (or parse tree, or perhaps just the externally visible symbols?), so that even if the versions match, we can get a message early during compilation or run-time telling us if there has been a clearly incompatible change? With proper package management and multi-package build, this could even trigger the right dependencies to be recompiled.
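The two ideas above, the snapshot/release update policy and a digest over the externally visible symbols, can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the function names, the `name :: signature` encoding and the choice of SHA-256 are all invented for the example, and the update rule only mirrors the description given here rather than Maven's actual configuration.

```python
import hashlib


def is_snapshot(version):
    """Development versions carry a SNAPSHOT suffix, e.g. '1.2-SNAPSHOT'."""
    return version.endswith("SNAPSHOT")


def should_check_remote(version, have_local_copy):
    """Snapshots are always re-checked; releases only when missing locally."""
    return is_snapshot(version) or not have_local_copy


def interface_digest(exported_symbols):
    """Hash the externally visible symbols (name -> type signature).

    Sorting makes the digest independent of declaration order.
    """
    h = hashlib.sha256()
    for name, signature in sorted(exported_symbols.items()):
        h.update(f"{name} :: {signature}\n".encode("utf-8"))
    return h.hexdigest()


def interface_unchanged(recorded_digest, current_symbols):
    """True if the interface seen now matches the digest recorded at build time."""
    return interface_digest(current_symbols) == recorded_digest


# Hypothetical exported interface of a dependency at build time.
built_against = {"lookup": "k -> Map k v -> Maybe v", "size": "Map k v -> Int"}
recorded = interface_digest(built_against)

# Same version number, but one signature has changed incompatibly.
now = {"lookup": "k -> Map k v -> v", "size": "Map k v -> Int"}
```

Here `interface_unchanged(recorded, now)` is false even though no version number moved, which is the early warning asked for above. A build tool could use the same comparison to decide which dependent packages need recompiling.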
Sorry if this sounds like a "You don't have my favorite language feature X" rant, but I've spent nearly 2 weeks chasing module dependencies, darcs URLs, fiddling build orders and so on, most of which is stuff that could be automated if the cabal files were a little bit richer and the compiler a bit more paranoid, and if there was some global repository of cabals of the software that people wish to make available to the community.

Matthew

P.S. My experience is that prefixing modules by a dotted organisation designator à la Java/.NET seems to work very well in practice.

On Tuesday 11 July 2006 13:51, Simon Marlow wrote:
I just took a deeper look at how .NET handles packages, since they have thought through many of the same issues and have a scheme that scales well. There's a great deal of similarity with the schemes we're thinking about for Haskell.
The executive summary:
- A .NET assembly fulfills the same function as a Haskell package: an assembly is the unit of versioning and distribution, and carries with it a bunch of metadata (author, copyright, etc.).
- An assembly is uniquely identified by a triple (name,version,culture).
- An assembly travels in a single file (.DLL)
- An assembly may define multiple namespaces (like A.B.C), and entities within namespaces. Much like modules in Haskell.
- Assemblies are not named directly in source code. C# code (for example) refers to namespaces and entities only.
- When compiling C# code, you name the DLLs that contain assemblies to bring into scope (in fact, the transitive closure) using command-line options.
- The C# compiler brings a set of assemblies into scope by default (== exposed packages in GHC).
- The binary has baked into it the exact identity of the assemblies it was compiled against.
- At runtime, the binary might be rebound to different versions of the assemblies it depends on, via a complicated structure of configuration files.
So there's no grafting, or even assembly-qualified imports. How do they get away with this? Well, one reason is that namespaces are generally globally unique, because the convention is to use names beginning with an organisation (e.g. Microsoft.System). So when you bring a bunch of assemblies into scope, their namespaces generally don't conflict.
Interestingly, the assembly story is almost identical to what GHC supports right now, and when we change module identities to include the package name (which I'm working on now), it will be even closer. Not that I necessarily think we can get away with doing nothing to the language, but at least this seems to argue for being conservative.
Cheers,
Simon
_______________________________________________
Libraries mailing list
Libraries@haskell.org
http://www.haskell.org/mailman/listinfo/libraries
participants (2)
- Matthew Pocock
- Simon Marlow