
On 2 maj 2008, at 11.27, apfelmus wrote:
Duncan Coutts wrote:
Thomas Schilling wrote:
For example, suppose we write a program that uses the function 'Foo.foo' from package 'foo', and we happened to use 'foo-0.42' when testing our program. Then, given the knowledge that 'Foo.foo' was introduced in 'foo-0.23' and changed semantics in 'foo-2.0', we know that 'foo >= 0.23 && < 2.0' is the correct and complete dependency description.
I would go even further and simply use "my program 'bar' compiles with foo-0.42" as dependency description. In other words, whether the package foo-0.23 can be used to supply this dependency or not will be determined when somebody else tries to compile Bar with it.
In both cases, the basic idea is that the library user should *not* think about library versions, he just uses the one that is in scope on his system. Figuring out which other versions can be substituted is the job of the library author. In other words, the burden of proof is shifted from the user ("will my program compile with foo-1.1?") to the author ("which versions of my library are compatible?"), where it belongs.
I think we mean the same thing. If I write a program and test it against specific versions of libraries, then my program's source code, together with the knowledge of which versions I used, contains (most of the time) *all* the information necessary to determine which other library versions it can be built with. From the source code we need information about what is imported; from the library author we need a *formal* changelog, describing for each released version which parts of the interface and semantics have changed. The problem here is, of course, that this is a lot of information to provide.

Furthermore, I think we need information about imports from the library user. If we ignore this, then the PVP is *exactly* what we need: the PVP describes when things *could* break, but it does so in an extremely pessimistic way. If we have information about what exactly changed and what is used by a particular library, we can find out what the exact version range is. For example, if we build our package against foo-0.42 and bar-2.3 and both packages follow the PVP, then the following will trivially be true:

  build-depends: foo-0.42.*, bar-2.3.*

where "-X.Y.*" is a shortcut for ">= X.Y && < X.(Y+1)".

The problem is that this is extremely pessimistic, so we have to check manually whenever a new version of a dependency comes out and update the known-to-work-with range. With more information (obtained mostly by tools) we can automate this process, and, in fact, both approaches can co-exist.
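To make the pessimistic reading concrete, here is a small sketch (function names are made up for illustration) of how a tool could derive the "-X.Y.*" range from the single version a package was tested against:

```haskell
-- Sketch: deriving the pessimistic PVP range from a tested version.
-- Under the PVP, a build against X.Y.Z is assumed to work with any
-- version >= X.Y and < X.(Y+1).
import Data.List (intercalate)

type Version = [Int]

-- Lower and upper bound implied by the PVP for a tested version.
pvpRange :: Version -> (Version, Version)
pvpRange (x:y:_) = ([x, y], [x, y + 1])
pvpRange v       = (v, v)

showVersion :: Version -> String
showVersion = intercalate "." . map show

-- Render the range in build-depends syntax.
showRange :: String -> Version -> String
showRange name v =
  let (lo, hi) = pvpRange v
  in  name ++ " >= " ++ showVersion lo ++ " && < " ++ showVersion hi

main :: IO ()
main = putStrLn (showRange "foo" [0,42])
-- prints: foo >= 0.42 && < 0.43
```

A tool with access to a formal changelog could then widen this range automatically instead of leaving it at the pessimistic default.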
However it would only help for the development _history_, we still have no solution for the problem of packages being renamed (or modules moving between packages) breaking other existing packages. Though similarly we have no solution to the problem of modules being renamed. Perhaps it's just that we have not done much module renaming recently so people don't see it as an issue.
With the approach above, it's possible to handle package/module renaming. For instance, if the package 'foo' is split into 'f-0.1' and 'oo-0.1' at some point, we can still use the union of these two to fulfill the old dependency 'foo-0.42'.
This is essentially the same as using a "virtual package" that is simply a re-export of other packages. This would help a lot with our current problems with the base split (which will continue, as base will be split up even further).
In other words, the basic model is that a module/package like 'bar' with a dependency 'foo-0.42' is just a function mapping a value of the same type (= export list) as 'foo-0.42' to another value (namely the set of exports of 'bar'). So, we can compile for instance
bar (foo-0.42)
or
bar (f-0.1 `union` oo-0.1)
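The model above can be sketched directly in Haskell. This is only a toy (the export sets and the body of 'bar' are made up): a package is modelled as a function from the exports it depends on to the exports it provides, so it can be applied either to foo-0.42 or to the union of f-0.1 and oo-0.1:

```haskell
-- Toy model: a package is a function from dependency exports to its
-- own exports, so different arguments can fulfill the same dependency.
import qualified Data.Set as Set
import Data.Set (Set)

type Exports = Set String

-- Hypothetical export sets: foo-0.42 was split into f-0.1 and oo-0.1.
foo042, f01, oo01 :: Exports
foo042 = Set.fromList ["parse", "render"]
f01    = Set.fromList ["parse"]
oo01   = Set.fromList ["render"]

-- 'bar' as a function; the body stands in for actual compilation.
bar :: Exports -> Exports
bar deps = Set.map ("bar." ++) deps

main :: IO ()
main = print (bar foo042 == bar (f01 `Set.union` oo01))
-- prints: True
```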
Of course, the problems are

  1) specifying the types of the parameters,
  2) automatically choosing good parameters.
For 1), one could use a very detailed import list, but I think that this feels wrong. I mean, if I have to specify the imports myself, why did I import foo-0.42 in the first place? Put differently, when I say 'import Data.Map' I want to import both its implementation and the interface. So, I argue that the goal is to allow type specifications of the form 'same type as foo-0.42'.
Problem 2) exists because if I have foo-0.5 in scope on my system and a package lists foo-0.42 as a dependency, the compiler should somehow figure out that it can use foo-0.5 as argument. Of course, it will be tricky/impossible to figure out that f-0.1 `union` oo-0.1 is a valid argument, too.
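One crude approximation of problem 2), ignoring types and semantics entirely (all names below are hypothetical): foo-0.5 is an acceptable argument if it still exports every name that 'bar' actually imports from foo-0.42.

```haskell
-- Crude substitution check: an installed version is acceptable if its
-- export set covers what the client actually uses. Types and changed
-- semantics are deliberately ignored in this sketch.
import qualified Data.Set as Set

type Name = String

-- Names 'bar' actually imports from 'foo' (hypothetical).
barUses :: Set.Set Name
barUses = Set.fromList ["foo", "fooHelper"]

-- Hypothetical export lists of two versions of 'foo'.
foo042Exports, foo05Exports :: Set.Set Name
foo042Exports = Set.fromList ["foo", "fooHelper", "oldFoo"]
foo05Exports  = Set.fromList ["foo", "fooHelper", "newFoo"]

-- A candidate is a valid argument if it covers the client's imports.
validArgument :: Set.Set Name -> Set.Set Name -> Bool
validArgument uses exports = uses `Set.isSubsetOf` exports

main :: IO ()
main = print (validArgument barUses foo05Exports)
-- prints: True
```

A real solution would of course have to compare types and, via the formal changelog, semantics, not just names.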
So, the task would be to develop a formalism, i.e. some kind of "lambda calculus for modules" that can handle problems 1) and 2). The formalism should be simple to understand and use yet powerful, just like our beloved lambda calculus.
A potential pitfall to any solution is that name and version number don't identify a compiled package uniquely! For instance,
foo-0.3 (bytestring-1.1)
is very different from
foo-0.3 (bytestring-1.2)
if foo exports the ByteString type. That's the diamond import problem. In other words, foo-0.3 is always the same function, but the evaluated results are not.
I think a formal changelog can also help with renaming (even of exported entities), but, I agree, for all this to work we need to formalise it first, and then build tools to automate most of the work.

/ Thomas

--
My shadow
Change is coming.
Now is my time.
Listen to my muscle memory.
Contemplate what I've been clinging to.
Forty-six and two ahead of me.