Re: [Haskell] Re: Trying to install binary-0.4
Udo Stenzel wrote:
Simon Marlow wrote:
So a package that depends on 'base' (with no upper version bound) *might* be broken in GHC 6.8.1, depending on which modules from base it actually uses. Let's look at the other options:
- if we rename base, the package will *definitely* be broken
- if the package specified an upper bound on its base dependency, it will *definitely* be broken
- if you provide a 'base' configuration that pulls in the stuff that used to be in base, the package will work
I don't know of a way to do that. The name of the package is baked into the object files at compile time, so you can't use the same compiled module in more than one package.
I hate betting, but I'd like to know if...
- it is okay to give GHC 6.4/6.6 a castrated configuration of the base libraries to remove the conflict with recent ByteString?
Sure, that's what I suggested before. Moving modules of an existing package from 'exposed-modules' to 'hidden-modules' is safe (I wouldn't recommend removing them entirely).
- when GHC 6.8 comes out containing base-3, will it be possible to additionally install something calling base-2 with both being available to packages?
In theory yes - the system was designed to allow this. In practice we've never tried it, and base-2 might not compile unmodified with GHC 6.8.
- If so, will existing Cabal packages automatically pick up the compatible base-2 despite base-3 being available?
Only if they specify an upper bound on the base dependency, which most don't, but it's an easy change to make. Cheers, Simon
On Tuesday 16 October 2007 21:16, Simon Marlow wrote:
- when GHC 6.8 comes out containing base-3, will it be possible to additionally install something calling base-2 with both being available to packages?
In theory yes - the system was designed to allow this. In practice we've never tried it, and base-2 might not compile unmodified with GHC 6.8.
- If so, will existing Cabal packages automatically pick up the compatible base-2 despite base-3 being available?
Only if they specify an upper bound on the base dependency, which most don't, but it's an easy change to make.
It seems more sensible to me for dependencies to always have an upper bound of the next major version. foo-3.y.z won't satisfy foo-2.3.4. If it so happens that a package depends on the subset of foo's interface that was retained from foo-2.3.4 through to foo-3.0.0, then the dependency can be changed to foo-2.3.4,3.0.0 (modulo syntax) once it has been tested. If dependencies on foo often end up like this due to use of a distinct subset of the interface, it's probably a good sign that foo is too coarse-grained.

If a major version increment, by definition, implies a removal of functionality from a package, then having no upper bound on the dependency pushes work out to the user that would be better done by the maintainer. With an upper bound, users are still able to try to get the package going with a later version of a dependency if they want.

Dan
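In .cabal syntax, Dan's convention for a hypothetical package depending on foo might read (package name and versions are illustrative):

```
build-depends: foo >= 2.3.4 && < 3.0
```

and, once the package has been tested against foo-3.0.0, the maintainer could widen it:

```
build-depends: foo >= 2.3.4 && <= 3.0.0
```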
Several good points have been raised in this thread, and while I might not agree with everything, I think we can all agree on the goal: things shouldn't break so often. So rather than keep replying to individual points, I'd like to make some concrete proposals so we can make progress.

1. Document the version numbering policy. We should have done this earlier, but we didn't. The proposed policy, for the sake of completeness, is: versions have the form x.y, where

   x changes ==> API changed
   x constant but y changes ==> API extended only
   x and y constant ==> API is identical

   Further sub-versions may be added after the x.y; their meaning is package-defined. Ordering on versions is lexicographic; given multiple versions that satisfy a dependency, Cabal will pick the latest.

2. Precise dependencies. As suggested by various people in this thread: we change the convention so that dependencies must specify a single x.y API version, or a range of versions with an upper bound. Cabal or Hackage can refuse to accept packages that don't follow this convention (perhaps Hackage is a better place to enforce it, and Cabal should just warn; I'm not sure). Yes, earlier I argued that not specifying precise dependencies allows some packages to continue working even when dependencies change, and that having precise dependencies means that all packages are guaranteed to break when base is updated. However, I agree that specifying precise dependencies is ultimately the right thing: we'll get better errors when things break.

There's lots more to discuss, but I think the above 2 proposals are a step in the right direction, agreed?

Cheers, Simon
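Proposal 2's rule can be sketched in a few lines of Haskell. The types and names here are illustrative, not Cabal's actual representation: a dependency carries a lower and an upper bound, and a candidate version satisfies it when it falls in the half-open interval.

```haskell
-- Versions as lists of integers compared lexicographically, as in
-- proposal 1; Dependency and 'satisfies' are invented names for the
-- sake of the sketch.
type Version = [Int]

data Dependency = Dependency
  { lower :: Version   -- inclusive lower bound
  , upper :: Version   -- exclusive upper bound
  }

satisfies :: Version -> Dependency -> Bool
satisfies v (Dependency lo hi) = lo <= v && v < hi

main :: IO ()
main = do
  let dep = Dependency [2, 0] [3, 0]   -- roughly "foo >= 2.0 && < 3.0"
  print ([2, 3] `satisfies` dep)   -- in range
  print ([3, 1] `satisfies` dep)   -- major version bumped: rejected
```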
From: libraries-bounces@haskell.org [mailto:libraries-bounces@haskell.org] On Behalf Of Simon Marlow
x changes ==> API changed
x constant but y changes ==> API extended only
x and y constant ==> API is identical
Ordering on versions is lexicographic; given multiple versions that satisfy a dependency, Cabal will pick the latest.
Just a minor point, but would you mind explaining exactly what lexicographic ordering implies? It appears to me that e.g. version 9.3 of a package would be preferred over version 10.0. That strikes me as counter-intuitive.

Alistair
On 10/16/07, Bayley, Alistair
Just a minor point, but would you mind explaining exactly what lexicographic ordering implies? It appears to me that e.g. version 9.3 of a package would be preferred over version 10.0. That strikes me as counter-intuitive.
I believe the intent is "lexicographic" in the sense that a version number is a dot-separated sequence of integers. So if you interpret "9.3" as [9, 3] and "10.0" as [10, 0], then

Prelude> max [9, 3] [10, 0]
[10,0]

and

Prelude> max [1, 9] [1, 10]
[1,10]

work in the expected way.

Stuart
Bayley, Alistair wrote:
From: libraries-bounces@haskell.org [mailto:libraries-bounces@haskell.org] On Behalf Of Simon Marlow
x changes ==> API changed
x constant but y changes ==> API extended only
x and y constant ==> API is identical
Ordering on versions is lexicographic; given multiple versions that satisfy a dependency, Cabal will pick the latest.
Just a minor point, but would you mind explaining exactly what lexicographic ordering implies? It appears to me that e.g. version 9.3 of a package would be preferred over version 10.0. That strikes me as counter-intuitive.
The lexicographical ordering would make 10.0 > 9.3. In general, A.B > C.D iff A > C or A == C && B > D. When we say the "latest" version we mean "greatest", implying that version numbers increase with time. Does that help? Cheers, Simon
From: Simon Marlow [mailto:simonmarhaskell@gmail.com]
The lexicographical ordering would make 10.0 > 9.3. In general, A.B > C.D iff A > C or A == C && B > D. When we say the "latest" version we mean "greatest", implying that version numbers increase with time. Does that help?
Sort of. It's what I'd expect from a sensible version comparison. It's just not something I'd ever choose to call lexicographic ordering. IMO, lexicographic ordering is a basic string comparison, so e.g.

max "10.0" "9.3" = "9.3"

I'd call what you're doing numeric ordering. Does it have a better name, like version-number-ordering, or section-number-ordering (e.g. Section 3.2.5, Section 3.2.6)?

Alistair
Bayley, Alistair wrote:
From: Simon Marlow [mailto:simonmarhaskell@gmail.com]
The lexicographical ordering would make 10.0 > 9.3. In general, A.B > C.D iff A > C or A == C && B > D. When we say the "latest" version we mean "greatest", implying that version numbers increase with time. Does that help?
Sort of. It's what I'd expect from a sensible version comparison. It's just not something I'd ever choose to call lexicographic ordering. IMO, lexicographic ordering is a basic string comparison, so e.g.
max "10.0" "9.3" = "9.3"
I'd call what you're doing numeric ordering. Does it have a better name, like version-number-ordering, or section-number-ordering (e.g. Section 3.2.5, Section 3.2.6)?
I've heard it called lexicographical ordering before, but I'm happy to call it by whatever name induces the least confusion! Cheers, Simon
On Oct 16, 2007, at 9:01, Bayley, Alistair wrote:
From: Simon Marlow [mailto:simonmarhaskell@gmail.com]
The lexicographical ordering would make 10.0 > 9.3. In general, A.B > C.D iff A > C or A == C && B > D. When we say the "latest" version we mean "greatest", implying that version numbers increase with time. Does that help?
Sort of. It's what I'd expect from a sensible version comparison. It's just not something I'd ever choose to call lexicographic ordering. IMO, lexicographic ordering is a basic string comparison, so e.g.
max "10.0" "9.3" = "9.3"
I'd call what you're doing numeric ordering. Does it have a better name, like version-number-ordering, or section-number-ordering (e.g. Section 3.2.5, Section 3.2.6)?
"Lexicographic ordering", to me, means ordering by the collation sequence for individual characters. I'd call this multi-field numeric ordering with "." as the field separator. "Version number ordering" is a bit trickier: it's used by Linux/*BSD package systems that need to deal with versions like "1.2a3_4,1" (which in FreeBSD means package version 1.2a3 (which is defined by the package originator and usually means the alpha-3 release of version 1.2), FreeBSD package version 4 thereof, with an epoch of 1 to force higher sorting because at some point a new version was retracted (say, 1.2a4 was packaged, then turned out to have major bugs that caused a rollback to 1.2a3, so the epoch is bumped to indicate that this 1.2a3 is actually later than the 1.2a4). RPM and APT have similar mechanisms, although syntactically different. (I don't *think* we need to care about this. Unfortunately, while Cabal version numbers are fairly clearly only the upstream part of it, and defined such that we don't need to determine whether 1.2a4 sorts before or after 1.2 (a rat's nest pretty much every OS distribution packaging system needs to fight with), I can imagine Hackage needing something like an epoch to handle regressions while allowing cabal-install to do the right thing.) -- brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu electrical and computer engineering, carnegie mellon university KF8NH
On Tue, 2007-10-16 at 14:01 +0100, Bayley, Alistair wrote:
From: Simon Marlow [mailto:simonmarhaskell@gmail.com]
The lexicographical ordering would make 10.0 > 9.3. In general, A.B > C.D iff A > C or A == C && B > D. When we say the "latest" version we mean "greatest", implying that version numbers increase with time. Does that help?
Sort of. It's what I'd expect from a sensible version comparison. It's just not something I'd ever choose to call lexicographic ordering. IMO, lexicographic ordering is a basic string comparison, so e.g.
max "10.0" "9.3" = "9.3"
I'd call what you're doing numeric ordering. Does it have a better name, like version-number-ordering, or section-number-ordering (e.g. Section 3.2.5, Section 3.2.6)?
It's lexicographic ordering on the list of numbers, not on the string representation. i.e. it's

[10, 0] > [9, 3]

not

"10.0" > "9.3"

Internally we represent version numbers as lists of integers and use the default Ord instance.

Duncan
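Duncan's representation can be reproduced in a few lines (parseVersion here is a throwaway helper for the sketch, not Cabal's actual parser):

```haskell
-- Versions as lists of integers; the default Ord instance on lists is
-- exactly the lexicographic ordering being discussed.
type Version = [Int]

-- Split a dotted version string into its integer components.
parseVersion :: String -> Version
parseVersion s = case break (== '.') s of
  (n, [])       -> [read n]
  (n, _ : rest) -> read n : parseVersion rest

main :: IO ()
main = do
  print (parseVersion "10.0" > parseVersion "9.3")   -- [10,0] > [9,3]
  print (maximum (map parseVersion ["9.3", "10.0", "9.10"]))
```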
Simon Marlow wrote:
Several good points have been raised in this thread, and while I might not agree with everything, I think we can all agree on the goal: things shouldn't break so often.
I have another concrete proposal to avoid things breaking so often. Let us steal from something that works: shared library versioning on unixy systems. On Mac OS X, I note that I have, in /usr/lib:
lrwxr-xr-x  1 root  wheel      15 Jul 24  2005 libcurl.2.dylib -> libcurl.3.dylib
lrwxr-xr-x  1 root  wheel      15 Jul 24  2005 libcurl.3.0.0.dylib -> libcurl.3.dylib
-rwxr-xr-x  1 root  wheel  201156 Aug 17 17:14 libcurl.3.dylib
lrwxr-xr-x  1 root  wheel      15 Jul 24  2005 libcurl.dylib -> libcurl.3.dylib
The above declaratively expresses that libcurl-3.0.0 provides the version 3 API and the version 2 API. This is the capability that should be added to Haskell library packages.

Right now a library can only declare a single version number. So if I update hsFoo from 2.1.1 to 3.0.0 then I cannot express whether or not the version 3 API is a superset of (backward compatible with) the version 2 API. Once it is possible to have cabal register hsFoo-3.0.0 also as hsFoo-2, it will be easy to upgrade to hsFoo-3. No old programs will fail to compile.

Who here knows enough about the ghc-pkg database to say how easy or hard this would be?

-- Chris
ChrisK wrote:
Simon Marlow wrote:
Several good points have been raised in this thread, and while I might not agree with everything, I think we can all agree on the goal: things shouldn't break so often.
I have another concrete proposal to avoid things breaking so often. Let us steal from something that works: shared library versioning on unixy systems.
On Mac OS X, I note that I have, in /usr/lib:
lrwxr-xr-x  1 root  wheel      15 Jul 24  2005 libcurl.2.dylib -> libcurl.3.dylib
lrwxr-xr-x  1 root  wheel      15 Jul 24  2005 libcurl.3.0.0.dylib -> libcurl.3.dylib
-rwxr-xr-x  1 root  wheel  201156 Aug 17 17:14 libcurl.3.dylib
lrwxr-xr-x  1 root  wheel      15 Jul 24  2005 libcurl.dylib -> libcurl.3.dylib
The above declaratively expresses that libcurl-3.0.0 provides the version 3 API and the version 2 API.
This is the capability that should be added to Haskell library packages.
Right now a library can only declare a single version number. So if I update hsFoo from 2.1.1 to 3.0.0 then I cannot express whether or not the version 3 API is a superset of (backward compatible with) the version 2 API.
Certainly, this is something we want to support. However, there's an important difference between shared-library linking and Haskell: in Haskell, a superset of an API is not backwards-compatible, because it has the potential to cause new name clashes.
Once it is possible to have cabal register the hsFoo-3.0.0 also as hsFoo-2 it will be easy to upgrade to hsFoo. No old programs will fail to compile.
Who here knows enough about the ghc-pkg database to say how easy or hard this would be?
It could be done using the tricks that Claus just posted and I followed up on. You'd need a separate package for hsFoo-2 that specifies exactly which bits of hsFoo-3 are re-exported. Given some Cabal support and a little extension in GHC, this could be made relatively painless for the library maintainer. Cheers, Simon
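The two-package trick might look something like the module below. All names here are invented for illustration; the compatibility package hsFoo-2 would contain only thin modules like this one, re-exporting the version-2 subset of hsFoo-3's API (the actual mechanism for re-exporting from a same-named module would need the Cabal/GHC support Simon mentions):

```haskell
-- Hypothetical Data/Foo.hs in the compatibility package hsFoo-2.
-- It exposes only the entities that existed in the version 2 API,
-- each re-exported from the corresponding hsFoo-3 module.
module Data.Foo
  ( FooMap      -- abstract type, unchanged since version 2
  , emptyFoo
  , insertFoo
  ) where

-- Assumed to come from hsFoo-3; nothing else is brought into scope,
-- so clients of the version 2 API see exactly what they expect.
import Data.Foo.V3 (FooMap, emptyFoo, insertFoo)
```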
On Tue, Oct 16, 2007 at 01:57:01PM +0100, Simon Marlow wrote:
Certainly, this is something we want to support. However, there's an important difference between shared-library linking and Haskell: in Haskell, a superset of an API is not backwards-compatible, because it has the potential to cause new name clashes.
This is the case on Unixy .so systems too, because the namespace is flat. If libreadline suddenly starts exporting a symbol named SDL_init, programs which use both readline and sdl will break. I have not seen this happen in practice. (Which might have something to do with the aforementioned name mangling :)) Stefan
I have another concrete proposal to avoid things breaking so often. Let us steal from something that works: shared library versioning on unixy systems.
indeed!-) there are established workarounds that are needed to make that system work as it does, so it is a good idea to check whether cabal has the means to cover those situations.
The above declaratively expresses that libcurl-3.3.0 provides the version 3 API and the version 2 API.
This is the capability that should be added to Haskell library packages.
Right now a library can only declare a single version number. So if I update hsFoo from 2.1.1 to 3.0.0 then I cannot express whether or not the version 3 API is a superset of (backward compatible with) the version 2 API.
Certainly, this is something we want to support. However, there's an important difference between shared-library linking and Haskell: in Haskell, a superset of an API is not backwards-compatible, because it has the potential to cause new name clashes.
yes, one would need to define what it means for one api to be compatible with another. even so, i think that permitting a single package to act as a provider for multiple versions of an api is a necessary feature, even more so if loose dependency specs like 'base', or 'base >= 1.0' are going to be discouraged.
It could be done using the tricks that Claus just posted and I followed up on. You'd need a separate package for hsFoo-2 that specifies exactly which bits of hsFoo-3 are re-exported. Given some Cabal support and a little extension in GHC, this could be made relatively painless for the library maintainer.
are those tricks necessary in this specific case? couldn't we have a list/range of versions in the version: field, and let cabal handle the details? aside: what happens if we try to combine two modules M and N that use the same api A, but provided by two different packages P1 and P2? say, M was built when P1 was still around, but when N was built, P2 had replaced P1, still supporting A, but not necessarily with the same internal representation as used in P1. claus
Claus Reinke wrote:
It could be done using the tricks that Claus just posted and I followed up on. You'd need a separate package for hsFoo-2 that specifies exactly which bits of hsFoo-3 are re-exported. Given some Cabal support and a little extension in GHC, this could be made relatively painless for the library maintainer.
are those tricks necessary in this specific case? couldn't we have a list/range of versions in the version: field, and let cabal handle the details?
I don't understand what you're proposing here. Surely just writing

version: 1.0, 2.0

isn't enough - you need to say what the 1.0 and 2.0 APIs actually *are*, and then wouldn't that require more syntax? I don't yet see a good reason to do this in a single .cabal file instead of two separate packages. The two-package way seems to require fewer extensions to Cabal.
aside: what happens if we try to combine two modules M and N that use the same api A, but provided by two different packages P1 and P2? say, M was built when P1 was still around, but when N was built, P2 had replaced P1, still supporting A, but not necessarily with the same internal representation as used in P1.
Not sure what you mean by "try to combine". A concrete example? Cheers, Simon
are those tricks necessary in this specific case? couldn't we have a list/range of versions in the version: field, and let cabal handle the details?
I don't understand what you're proposing here. Surely just writing
version: 1.0, 2.0
isn't enough - you need to say what the 1.0 and 2.0 APIs actually *are*, and then wouldn't that require more syntax? I don't yet see a good reason to do this in a single .cabal file instead of two separate packages. The two-package way seems to require fewer extensions to Cabal.
yes, and no. cabal is currently not symmetric in this: providers specify apis (at the level of exposed modules), clients only specify api numbers as dependencies. the idea was for the cabal file to specify a single provided api, but to register that as sufficient for a list of dependency numbers. so the package would implement the latest api, but could be used by clients expecting either the old or the new api.
aside: what happens if we try to combine two modules M and N that use the same api A, but provided by two different packages P1 and P2? say, M was built when P1 was still around, but when N was built, P2 had replaced P1, still supporting A, but not necessarily with the same internal representation as used in P1.
Not sure what you mean by "try to combine". A concrete example?
lets see - how about this:

-- package P-1, Name: P, Version: 0.1
module A(L,f,g) where
newtype L a = L [a]
f a (L as) = elem a as
g as = L as

-- package P-2, Name: P, Version: 0.2
module A(L,f,g) where
newtype L a = L (a->Bool)
f a (L as) = as a
g as = L (`elem` as)

if i got this right, both P-1 and P-2 support the same api A, right down to types. but while P-1's A and P-2's A are each internally consistent, they can't be mixed. now, consider

module M where
import A
m = g [1,2,3]

module N where
import A
n :: Integer -> A.L Integer -> Bool
n = f

so, if i install P-1, then build M, then install P-2, then build N, wouldn't N pick up the "newer" P-2, while M would use the "older" P-1? and if so, what happens if we then add

module Main where
import M
import N
main = print (n 0 m)

i don't seem to be able to predict the result, without actually trying it out. can you?-) i suspect it won't be pretty, though.

claus
Claus Reinke wrote:
the idea was for the cabal file to specify a single provided api, but to register that as sufficient for a list of dependency numbers. so the package would implement the latest api, but could be used by clients expecting either the old or the new api.
I don't see how that could work. If the old API is compatible with the new API, then they might as well have the same version number, so you don't need this. The only way that two APIs can be completely compatible is if they are identical. A client of an API can be tolerant to certain changes in the API, but that is something that the client knows about, not the provider. e.g. if the client knows that they use explicit import lists everywhere, then they can be tolerant of additions to the API, and can specify that in the dependency.
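Simon's point about explicit import lists can be seen in a small, self-contained example (the module contents are invented for illustration): a client written this way cannot acquire a name clash when its dependency's API is merely extended, because only the named entities ever come into scope.

```haskell
-- Importing only what we use: additions to Data.List in some future
-- version cannot clash with anything defined below.
import Data.List (nub, sort)

-- A local name that a future Data.List might also export; with the
-- explicit import list above, no clash can arise.
frequency :: Ord a => [a] -> [(a, Int)]
frequency xs = [ (y, length [ x | x <- xs, x == y ]) | y <- nub (sort xs) ]

main :: IO ()
main = print (frequency "abracadabra")
```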
aside: what happens if we try to combine two modules M and N that use the same api A, but provided by two different packages P1 and P2? say, M was built when P1 was still around, but when N was built, P2 had replaced P1, still supporting A, but not necessarily with the same internal representation as used in P1.
Not sure what you mean by "try to combine". A concrete example?
lets see - how about this:
-- package P-1, Name: P, Version: 0.1
module A(L,f,g) where
newtype L a = L [a]
f a (L as) = elem a as
g as = L as
-- package P-2, Name: P, Version: 0.2
module A(L,f,g) where
newtype L a = L (a->Bool)
f a (L as) = as a
g as = L (`elem` as)
if i got this right, both P-1 and P-2 support the same api A, right down to types. but while P-1's A and P-2's A are each internally consistent, they can't be mixed. now, consider
module M where
import A
m = g [1,2,3]
module N where
import A
n :: Integer -> A.L Integer -> Bool
n = f
so, if i install P-1, then build M, then install P-2, then build N, wouldn't N pick up the "newer" P-2,
while M would use the "older" P-1? and if so, what happens if we then add
module Main where
import M
import N
main = print (n 0 m)
You'll get a type error - try it. The big change in GHC 6.6 was to allow this kind of construction to occur safely. P-1:A.L is not the same type as P-2:A.L, they don't unify.
i don't seem to be able to predict the result, without actually trying it out. can you?-) i suspect it won't be pretty, though.
Sure. We have a test case in our testsuite for this very eventuality; see

http://darcs.haskell.org/testsuite/tests/ghc-regress/typecheck/bug1465

That particular test case arose because someone discovered that the type error you get is a bit cryptic (it's better in 6.8.1).

Cheers, Simon
the idea was for the cabal file to specify a single provided api, but to register that as sufficient for a list of dependency numbers. so the package would implement the latest api, but could be used by clients expecting either the old or the new api.
I don't see how that could work. If the old API is compatible with the new API, then they might as well have the same version number, so you don't need this. The only way that two APIs can be completely compatible is if they are identical.
if that is your definition of compatible, you can never throw any packages away, because they can never be subsumed by newer versions of themselves. alternatively, it would require perpetual updates of dependencies in package descriptions, which we'd like to avoid, right?

a few examples, off the top of my head:

- consider the base split in reverse: if functionality is only repackaged, the merged base would also provide for the previously separate sub-package apis (that suggests a separate 'provides:' field, though, as merely listing version numbers wouldn't be sufficient)

- consider the base split itself: if there was a way for the base split authors to tell cabal that the collection of smaller packages can provide for clients of the old big base, those clients would not run into trouble when the old big base is removed

- consider adding a new monad transformer to a monad transformer library, or a new regex variant to a regex library - surely the new package version can still provide for clients of the old version

- consider various packages providing different implementations of an api, say edison's - surely any of the implementations will do for clients who depend only on the api, not on specifics

the reason this could work when updating packages is that packages written against the old api were not aware of the new features in the new api. when compiling a package against a dependency providing multiple versions, there should be a warning if the client does not refer to the newest version - but it would still build, and there'd be a clear hint as to what needs to be changed.
A client of an API can be tolerant to certain changes in the API, but that is something that the client knows about, not the provider. e.g. if the client knows that they use explicit import lists everywhere, then they can be tolerant of additions to the API, and can specify that in the dependency.
that is the very issue i'd like to see reversed. you're right the first time round: when the client is first written, it is the client's responsibility to specify a useable dependency version; but keeping the responsibility this way round causes nothing but trouble after the client has been released, if its dependencies develop faster than the client (in the current case, if a major change in base shakes the foundation all other packages were built on; a more typical example would be: someone writes a useful package P, depending on X, Y, Z, then leaves academia to hack web pages for a living, leaving users and clients of P frustrated as X, Y, and Z move on).

so, i'd like to see two stages, before and after publishing:

1. author of client package specifies precise dependencies
2. users of client packages can continue using it unchanged even if its dependencies move on

one way to assure 2 is to keep all old package versions around forever somewhere (we can't do a whole-web garbage collection, so we never know when there are no more pointers). another way is to allow authors of package dependencies to take on part of the burden, thereby helping to reduce breakage and garbage for their clients. so the authors of mtl-9.0 could note that it still provides all the modules of earlier versions, so the package manager would only have to keep the latest version around, and clients of earlier versions would not notice any breakage.
You'll get a type error - try it. The big change in GHC 6.6 was to allow this kind of construction to occur safely. P-1:A.L is not the same type as P-2:A.L, they don't unify.
i did (just before pressing send;-). the message (in 6.6.1) was of the kind 'A.L Integer' does not match 'A.L Integer'. i see you've now added both package and version to the error message - that should reduce the confusion. since package sources may not be available, this is still not ideal, but i don't see what to do about that. claus
"Claus Reinke"
if that is your definition of compatible, you can never throw any packages away
Is this a problem?
alternatively, it would require perpetual updates of dependencies in package descriptions, which we'd like to avoid, right?
I think the whole point of all this is that a package that used to work should continue to work without modification, as long as its dependencies are satisfied.
- consider [..]
All this complexity seems to arise from the perceived need to be able to fulfil dependencies from other packages than the specified ones. Isn't it just much easier to keep old versions around? Isn't this what everybody else does - e.g. using branches in the VC or similar? Branching support in the build system sounds like entirely the wrong approach to me. (Or am I just misunderstanding completely?)
A client of an API can be tolerant to certain changes in the API, but that is something that the client knows about, not the provider.
that is the very issue i'd like to see reversed.
you're right the first time round: when the client is first written, it is the client's responsibility to specify a useable dependency version; but keeping the responsibility this way round causes nothing but trouble after the client has been released, if its dependencies develop faster than the client
Yes.
so, i'd like to see two stages, before and after publishing: 1. author of client package specifies precise dependencies 2. users of client packages can continue using it unchanged even if its dependencies move on
one way to assure 2 is to keep all old package versions around forever somewhere (we can't do a whole-web garbage collection, so we never know when there are no more pointers).
Sounds good to me. If you're careful about tagging your darcs repos, any trunk version (including common bugfixes) should be extractable.
another way is to allow authors of package dependencies to take on part of the burden, thereby helping to reduce breakage and garbage for their clients. so the authors of mtl-9.0 could note that it still provides all the modules of earlier versions,
I'm still at a loss as to why they could not name it mtl-8.n+1 instead. And if there's a good answer to that, why they couldn't just have two separate cabal files in their distribution and some hackery to build and install either or both of 8.n+1 and 9.0. What happens when the branches diverge, say there's a bugfix that only applies to 8.x?
so the package manager would only have to keep the latest version around, and clients of earlier versions would not notice any breakage.
So the trade off is some disk space versus a more complicated build system and possibly more manual intervention? Not that I don't have faith in maintainers' interest and capability of precisely specifying compatibility issues in the libraries they develop, but for an old bit-rotted application, I'm fairly sure the safest thing is to build against the libraries it was written against -- or at a branch of same with as few non-bugfix modifications as possible. -k -- If I haven't seen further, it is by standing in the footprints of giants
if that is your definition of compatible, you can never throw any packages away
Is this a problem?
apparently, yes. no two versions of base with ghc, only the newest set of core packages with each ghc release. and how much time do you want to spend on finding, re-building, and re-installing old packages every time you move to a new machine? it isn't (just) about space on a disk, it is about downloads and management, not to mention sanity of mind;-)

a simpler way of putting the responsibility issue:

- every package writer is responsible for not reducing the usability of his/her package at every update; as with a function type, that works both ways: use the simplest/newest dependencies available, and keep your package useable by existing clients

so we're not just talking about packages at certain points in their lifetime, we're talking about the lifetime of packages in the context of their usage contexts/package databases. perhaps we should treat package databases as distributed revision control system repos with interlinked dependencies? then, just as darcs did, we could focus on collections of patches that create consistent new repos. instead of "this is package P-2.11, deal with it", it would be something like "add package P-2.11; replace uses of P-2.{0-10} with uses of P-2.11; remove any packages in that range; rebuild all packages that used P-2.{0-7} as internal types have changed; keep all packages P-1.* as this is not a drop-in replacement for them".

claus
Claus Reinke wrote:
a few examples, off the top of my head:

- consider the base split in reverse: if functionality is only repackaged, the merged base would also provide for the previously separate sub-package apis (that suggests a separate 'provides:' field, though, as merely listing version numbers wouldn't be sufficient)

- consider the base split itself: if there was a way for the base split authors to tell cabal that the collection of smaller packages can provide for clients of the old big base, those clients would not run into trouble when the old big base is removed
These two cases could be solved by re-exports, no extra mechanism is required.
- consider adding a new monad transformer to a monad transformer library, or a new regex variant to a regex library - surely the new package version can still provide for clients of the old version
This case doesn't work - if you add *anything* to a library, I can write a module that can tell the difference. So whether your new version is compatible in practice depends on the client.
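A minimal sketch of that breakage (the library and identifier names are invented for illustration): a client that imports unqualified compiles against the old version of a library but not against a purely extended one.

```haskell
module Client where

-- Hypothetical library: version 1.0 exports only 'subst';
-- version 1.1 additionally exports 'match'.
import Text.Regex.Hypothetical  -- unqualified: every export is in scope

-- Compiles fine against 1.0. Against 1.1, each use of 'match' below
-- becomes an "ambiguous occurrence" between this local binding and
-- the newly imported name.
match :: String -> Bool
match = not . null

main :: IO ()
main = print (match "hello")
```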
- consider various packages providing different implementations of an api, say edison's - surely any of the implementations will do for clients who depend only on the api, not on specifics
Yes, and in this case we should have another package that just re-exports one of the underlying packages. You seem to want to add another layer of granularity in addition to packages, and I think that would be unnecessary complexity. Cheers, Simon
I disagree with Simon Marlow here. In practice I think Claus' definition of compatible is good enough: Simon Marlow wrote:
Claus Reinke wrote:
- consider adding a new monad transformer to a monad transformer library, or a new regex variant to a regex library - surely the new package version can still provide for clients of the old version
This case doesn't work - if you add *anything* to a library, I can write a module that can tell the difference. So whether your new version is compatible in practice depends on the client.
One can write such a module. But that is only a problem if the old client accidentally can tell the difference. As far as I can see, the only two things that can go wrong are name conflicts and new instances.

New names can only cause compilation to fail, and this can be fixed by using a mix of
(1) adding an explicit import list to the old import statement, or
(2) adding/expanding a hiding list on the old import statement, or
(3) using module-qualified names to remove the collision.

Fixing this sort of compile error is easy; nearly simple enough for a regexp script. And the fix does not break using the client with the old library. Adding things to the namespace should not always force a new API version number.

New instances, by contrast, can do harm, as they may overlap/duplicate instances imported/defined elsewhere. So new instances of pre-existing classes should trigger a new API version number.

<brown bag>I accidentally caused a new release of regex-base/regex-posix to drag along an instance which caused such a conflict. I rolled back that change. -- Chris
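The three fixes above can be sketched as follows (the library and the clashing name are hypothetical); any one of the three import lines resolves the ambiguity:

```haskell
module Client where

-- Suppose an upgraded Hypothetical.Regex now also exports 'match',
-- clashing with the local definition below.

-- (1) explicit import list: bring in only what we actually use
import Hypothetical.Regex (subst)

-- (2) hiding list: bring in everything except the clashing name
-- import Hypothetical.Regex hiding (match)

-- (3) qualified import: keep everything behind a module prefix
-- import qualified Hypothetical.Regex as R

match :: String -> Bool
match = not . null
```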
ChrisK wrote:
I disagree with Simon Marlow here. In practice I think Claus' definition of compatible is good enough:
I don't think you're disagreeing with me :-) In fact, you agreed that extending an API can break a client:
One can write such a module. But that is only a problem if the old client accidentally can tell the difference. As far as I can see, the only two things that can go wrong are name conflicts and new instances.
New names can only cause compilation to fail, and this can be fixed by using a mix of (1) adding an explicit import list to the old import statement, or (2) adding/expanding a hiding list on the old import statement, or (3) using module-qualified names to remove the collision. Fixing this sort of compile error is easy; nearly simple enough for a regexp script. And the fix does not break using the client with the old library. Adding things to the namespace should not always force a new API version number.
Yes, the errors can be fixed, but it's too late - the client already failed to compile out of the box against the specified dependencies. New instances are ruled out in the definition of an extended API in the version policy proposal, incidentally: http://haskell.org/haskellwiki/Package_versioning_policy

And I agree with you that name clashes are rare, which is why that page recommends specifying dependencies that are insensitive to changes in the minor version number (i.e. API extensions). But that still leaves the possibility of breakage if the client isn't using import lists, and Claus argued for a system with no uncertainty. So if you want no uncertainty in your dependencies, you either have to (a) not depend on API versions (including minor versions) that you haven't tested, or (b) use explicit import lists and allow minor version changes only.

Incidentally, this reminds me that GHC should have a warning for not using explicit import lists (perhaps only for external package imports). Cheers, Simon
On Oct 18, 2007, at 4:57 , Simon Marlow wrote:
depend on API versions (including minor versions) that you haven't tested, or (b) use explicit import lists and allow minor version changes only. Incidentally, this reminds me that GHC should have a warning for not using explicit import lists (perhaps only for external package imports).
Which reminds me that it would be nice to be able to ask for a list of what imports I need to specify (i.e. what names from the module are actually used). A case in point would be the example of "non-monadic I/O" I sent to the list the other day: I wanted to specify minimal imports, but couldn't think of a way to do it aside from specifying very small import lists and iteratively adding things as the compile failed. (This may actually already exist and I don't know enough ghc options to do it....) -- brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu electrical and computer engineering, carnegie mellon university KF8NH
Which reminds me that it would be nice to be able to ask for a list of what imports I need to specify (i.e. what names from the module are actually used). A case in point would be the example of "non-monadic I/O" I sent to the list the other day: I wanted to specify minimal imports, but couldn't think of a way to do it aside from specifying very small import lists and iteratively adding things as the compile failed.
(This may actually already exist and I don't know enough ghc options to do it....)
funny that you should send that to me:-) my haskellmode for vim plugins do just that when you hit _ie on an implicit import line. from the help file:

*_ie*
_ie  On an 'import <module>' line, in a correctly loadable module, temporarily comment out import and use :make 'not in scope' errors to explicitly list imported identifiers.

http://www.cs.kent.ac.uk/~cr3/toolbox/haskell/Vim/

claus
"Claus Reinke"
Incidentally, this reminds me that GHC should have a warning for not using explicit import lists (perhaps only for external package imports).
for package-level imports/exports, that sounds useful.
Isn't there a secret key combination in haskell-mode for Emacs that populates the import lists automatically? -k -- If I haven't seen further, it is by standing in the footprints of giants
Ketil Malde wrote:
"Claus Reinke"
writes: Incidentally, this reminds me that GHC should have a warning for not using explicit import lists (perhaps only for external package imports).
for package-level imports/exports, that sounds useful.
Isn't there a secret key combination in haskell-mode for Emacs that populates the import lists automatically?
No emacs command that I know of, but GHC has the -ddump-minimal-imports flag. Cheers, Simon
On Oct 19, 2007, at 8:18 , Simon Marlow wrote:
Ketil Malde wrote:
"Claus Reinke" writes:
Incidentally, this reminds me that GHC should have a warning for not using explicit import lists (perhaps only for external package imports).
for package-level imports/exports, that sounds useful.
Isn't there a secret key combination in haskell-mode for Emacs that populates the import lists automatically?
No emacs command that I know of, but GHC has the -ddump-minimal-imports flag.
I think Ketil may be thinking of a command in Shim. Unfortunately, haskell-mode is iffy in xemacs, and Shim essentially unusable... and my fingermacros have been tuned for xemacs for so long that I trip over myself trying to use fsfemacs. -- brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu electrical and computer engineering, carnegie mellon university KF8NH
These two cases could be solved by re-exports, no extra mechanism is required.
yes, good support for re-export would be nice to have. the reason it has so many applications is that it is a way to explain connections between providers, apis, and clients to the package manager.
- consider adding a new monad transformer to a monad transformer library, or a new regex variant to a regex library - surely the new package version can still provide for clients of the old version
This case doesn't work - if you add *anything* to a library, I can write a module that can tell the difference. So whether your new version is compatible in practice depends on the client.
of course you can. but then you're writing new code. my concern was with old code that was written to the old api and that would still work with an extended api. i was suggesting a warning if code was compiled against an older api in any case.

you're right, if you can design new code that will notice the additions, then there's a chance that old code might stumble over them. if that breaks compilation, the programmer would see the error message preceded by an old-api warning, so it should be obvious what to do - either adapt the package to the new api, or make sure that it cannot see anything but the old api.

unfortunately, packages are specified entirely at the level of module names. as far as the package manager is concerned, anything that exposes the same modules is "compatible"; the breakage comes when the package manager calls the compiler. so we can't say that the old package should import only specific items from the old api, because that api is specified only in the haddock comments, not in any package spec.

and then there are the nasty cases, where adding instances might change compilation results without breaking compilation.
- consider various packages providing different implementations of an api, say edison's - surely any of the implementations will do for clients who depend only on the api, not on specifics
Yes, and in this case we should have another package that just re-exports one of the underlying packages.
what you're suggesting is to make an intermediate package act as an api spec: instead of clients depending directly on providers, they depend on the api spec package, and the api spec package "knows" which providers can implement it.
You seem to want to add another layer of granularity in addition to packages, and I think that would be unnecessary complexity.
well, what is ever necessary?-) but i thought the discussion had already made clear that cabal specs do not provide sufficient information to

- guarantee package compatibility
- delimit usage of apis in clients
- find providers implementing apis
- keep old packages working with new providers, instead of preserving/packaging all packages forever

which means that there's more need to update cabal specs and to use other means (central repo at hackage with autobuild reports, etc) to ensure that everything keeps working. that means more work than necessary, more breakage and less safety than desirable.

(for a formal take on package safety, refer to Standard ML's system, with structures, functors, and interfaces as a statically typed functional package and module composition language; just add an rcs for package versions.)

claus
ChrisK
Once it is possible to have cabal register hsFoo-3.0.0 also as hsFoo-2, it will be easy to upgrade hsFoo. No old programs will fail to compile.
Who here knows enough about the ghc-pkg database to say how easy or hard this would be?
Ignoring disk space, I suppose the motivation is that it will ease the user experience by only having to download, compile and install a single package? And perhaps ease the maintenance a bit for the library author, too. One way to do this would be to have multiple .cabal files in the package, with small differences like different version numbering. You can use a Makefile or other hack to automate switching. -k -- If I haven't seen further, it is by standing in the footprints of giants
On Wednesday 17 October 2007 01:32, ChrisK wrote:
Simon Marlow wrote:
Several good points have been raised in this thread, and while I might not agree with everything, I think we can all agree on the goal: things shouldn't break so often.
I have another concrete proposal to avoid things breaking so often. Let us steal from something that works: shared library versioning on unixy systems.
On Mac OS X, I note that I have, in /usr/lib:
lrwxr-xr-x 1 root wheel 15 Jul 24 2005 libcurl.2.dylib -> libcurl.3.dylib lrwxr-xr-x 1 root wheel 15 Jul 24 2005 libcurl.3.0.0.dylib -> libcurl.3.dylib -rwxr-xr-x 1 root wheel 201156 Aug 17 17:14 libcurl.3.dylib lrwxr-xr-x 1 root wheel 15 Jul 24 2005 libcurl.dylib -> libcurl.3.dylib
The above declaratively expresses that libcurl 3.0.0 provides the version 3 API and the version 2 API.
This is the capability that should be added to Haskell library packages.
Right now a library can only declare a single version number. So if I update hsFoo from 2.1.1 to 3.0.0 then I cannot express whether or not the version 3 API is a superset of (backward compatible with) the version 2 API.
If 3.0.0 is a superset of 2.1.1 why was it necessary to bump to 3.0.0? Why not 2.2.0?
I wanted to generate some random table data, and decided to use QuickCheck to do this. I didn't want to be checking properties, I actually wanted to output the examples that QuickCheck came up with using arbitrary. In this case, I wanted to generate lists of lists of strings. In case this is of use to anyone else, here's an example...

One thing I don't understand is the purpose of the first argument to generate. If it's zero it's always the same data, so I made it a larger number (10000). Seems ok, but it would be nice to understand why. Or if there is a better way to accomplish this.

t.

{-# OPTIONS -fno-monomorphism-restriction #-}
module GenTestData where

import Test.QuickCheck
import Control.Monad
import System.Random
import Misc
import ArbitraryInstances

f >>=^ g = f >>= return . g
infixl 1 >>=^

rgenIntList = rgen (arbitrary :: Gen [Int]) :: IO [Int]
rgenInt     = rgen (arbitrary :: Gen Int)   :: IO Int
rgenFoo     = rgen (arbitrary :: Gen Foo)   :: IO Foo
rgenFoos    = rgen (arbitrary :: Gen [Foo]) :: IO [Foo]

rgenString' = rgen (arbitrary :: Gen [Char]) :: IO [Char]
rgenString len = rgenString' >>=^ take len

rgenStringRow' = rgen (arbitrary :: Gen [[Char]]) :: IO [[Char]]
rgenStringRow maxlenstr maxcols =
  rgenStringRow' >>=^ take maxcols
                 >>=^ map (take maxlenstr)

rgenStringTable' = rgen (arbitrary :: Gen [[[Char]]]) :: IO [[[Char]]]
rgenStringTable maxlenstr maxcols maxrows =
  rgenStringTable' >>=^ take maxrows
                   >>=^ map (take maxcols)
                   >>=^ (map . map) (take maxlenstr)

rgen gen = do
  sg <- newStdGen
  return $ generate 10000 sg gen

module ArbitraryInstances where

import Test.QuickCheck
import Data.Char
import Control.Monad

instance Arbitrary Char where
  arbitrary = choose ('\32', '\128')
  coarbitrary c = variant (ord c `rem` 4)

-- joel reymont's example, I think
data Foo = Foo Int | Bar | Baz deriving Show

instance Arbitrary Foo where
  coarbitrary = undefined
  arbitrary = oneof [ return Bar
                    , return Baz
                    , liftM Foo arbitrary
                    ]
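For what it's worth, in QuickCheck 1 the first argument to generate is a size bound: generate n rnd g picks a random size in the range 0..n and runs the generator with it, and sized generators (lists, Ints) scale with that size - which is why a bound of 0 always yields the same degenerate data (empty lists, zeros). A hedged sketch of rgen with a more modest bound than 10000:

```haskell
-- Sketch against QuickCheck 1's API, where
--   generate :: Int -> StdGen -> Gen a -> a
-- and the Int is the maximum size handed to 'sized' generators.
import Test.QuickCheck (Gen, arbitrary, generate)
import System.Random (newStdGen)

rgen :: Gen a -> IO a
rgen gen = do
  sg <- newStdGen
  -- A bound of 20 or so matches what quickCheck itself uses per test;
  -- 10000 also works, but makes enormous lists and numbers likely.
  return (generate 20 sg gen)

main :: IO ()
main = rgen (arbitrary :: Gen [Int]) >>= print
```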
* Simon Marlow wrote:
further sub-versions may be added after the x.y, their meaning is package-defined. Ordering on versions is lexicographic, given multiple versions that satisfy a dependency Cabal will pick the latest.
x.y.z should be ordered numerically, if possible.
As suggested by various people in this thread: we change the convention so that dependencies must specify a single x.y API version, or a range of versions with an upper bound. Cabal or Hackage can refuse to accept packages that don't follow this convention (perhaps Hackage is a better place to enforce it, and Cabal should just warn, I'm not sure).
Ack. Hackage is a good place to reject.
1. Document the version numbering policy.
agreed. just making everybody's interpretation explicit has already exposed subtle differences, so documenting common ground will help.
We should have done this earlier, but we didn't. The proposed policy, for the sake of completeness is: x.y where:
x changes ==> API changed x constant but y changes ==> API extended only x and y constant ==> API is identical
further sub-versions may be added after the x.y, their meaning is package-defined. Ordering on versions is lexicographic, given multiple versions that satisfy a dependency Cabal will pick the latest.
referring to a haskell function to compute ordering, or to parse version strings into lists of numbers, might remove ambiguities here. for instance, some people use patch-levels as sub-versions, some use dates.

also, compare Simon's (S) with Daniel's (D) version:

| If the convention for modifying package versions of form x.y.z is:
| - increment z for bugfixes/changes that don't alter the interface
| - increment y for changes that consist solely of additions to the interface,
|   parts of the interface may be marked as deprecated
| - increment x for changes that include removal of deprecated parts of the
|   interface

version D gives us strictly more information from a version number: just from number differences, we can tell what kind of changes happened to the api. i like that.

version S is closer to current practice, which is less informative but psychologically motivated:-) if one does a substantial rewrite without changing the api, or if one adds fundamentally new features without breaking backwards compatibility, one likes to bump the leading number (that is no doubt inspired by commercialism: paying customers are said to prefer higher version numbers, and to focus on new features).

corollary: after fixing the version numbering policy (policies?), the implications on usage need to be investigated (sorting wrt dates? does a version number tell us anything about which version can stand in for which dependency?).
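the reference function asked for above could look something like this sketch: parse dotted version strings into lists of numbers and compare those numerically, so that 2.10 sorts after 2.9 (plain string comparison gets that wrong):

```haskell
import Data.List (unfoldr)

-- "2.10.1" -> [2,10,1]
parseVersion :: String -> [Int]
parseVersion = map read . splitDots
  where
    splitDots = unfoldr $ \s -> case break (== '.') s of
      ("", "")  -> Nothing
      (x, rest) -> Just (x, drop 1 rest)

-- Lexicographic comparison of numeric components. Note the quirk
-- that "2.1" < "2.1.0" under this ordering.
compareVersions :: String -> String -> Ordering
compareVersions a b = compare (parseVersion a) (parseVersion b)

main :: IO ()
main = print (compareVersions "2.10" "2.9")  -- GT, though "2.10" < "2.9" as strings
```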
2. Precise dependencies.
As suggested by various people in this thread: we change the convention so that dependencies must specify a single x.y API version, or a range of versions with an upper bound. Cabal or Hackage can refuse to accept packages that don't follow this convention (perhaps Hackage is a better place to enforce it, and Cabal should just warn, I'm not sure).
Yes, earlier I argued that not specifying precise dependencies allows some packages to continue working even when dependencies change, and that having precise dependencies means that all packages are guaranteed to break when base is updated. However, I agree that specifying precise dependencies is ultimately the right thing, we'll get better errors when things break,
agreed. please note, however, that this is likely to flush out issues that have so far been swept under the carpet. this is a good thing, as it will lead to proposals for making cabal deal with these issues properly (replacing unspecified user complaints with concrete bugs and fixes). but it will increase the noise!-) claus
On Tue, Oct 16, 2007 at 01:08:49PM +0100, Simon Marlow wrote:
So rather than keep replying to individual points, I'd like to make some concrete proposals so we can make progress.
1. Document the version numbering policy.
We should have done this earlier, but we didn't. The proposed policy, for the sake of completeness is: x.y where:
x changes ==> API changed x constant but y changes ==> API extended only x and y constant ==> API is identical
further sub-versions may be added after the x.y, their meaning is package-defined.
This should be required for at least the GHC boot packages (and encouraged for others).

I would make "API extended only" a bit more precise: any module that uses explicit import lists will not be affected by the changes. So one can add classes, types and functions, but not instances (except where either the class or the type is new). You probably can't add data constructors or fields, and have to be careful with new methods.

I'd also prefer that major versions used two numbers, because that's common now, it supports the experimental 0.x versions apfelmus mentioned, and it makes it easier to leave room for development versions (possibly using an odd-even scheme). If you make your development repository available, and it contains API changes, you'll want its version number to have a larger major number.
Ross Paterson wrote:
I would make "API extended only" a bit more precise: any module that uses explicit import lists will not be affected by the changes. So one can add classes, types and functions, but not instances (except where either the class or the type is new).
okay
You probably can't add data constructors or fields, and have to be careful with new methods.
If they're exported and are new members of existing classes/datatypes, then you can't add them, because they might be imported with "class/typename(..)". (right?)

What about semantic changes to the API? For instance, adding a default to a class method changes the default from 'undefined', which someone might have relied on (although it seems unlikely).

Isaac
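Isaac's point can be made concrete (the module and names are invented): an explicit import list does not protect a client that uses the (..) form, because new methods and constructors ride in through it.

```haskell
module Client where

-- The (..) form imports all current *and future* methods of class C
-- and constructors of type T. If a later minor release adds a method
-- 'frob' to C, the use of 'frob' below becomes an ambiguous
-- occurrence, even though the import list looks precise.
import Hypothetical.Lib (C(..), T(..))

frob :: Int -> Int
frob = (+ 1)

useIt :: Int
useIt = frob 41
```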
I've written down the proposed policy for versioning here:

http://haskell.org/haskellwiki/Package_versioning_policy

It turned out there was a previous page written by Bulat that contained essentially this policy, but it wasn't linked from anywhere, which explains why it was overlooked. I took the liberty of rewriting the text. I took into account Ross's suggestions that the major version should have two components, and that we need to be more precise about what it means to extend an API.

After a round of editing, we can start to link to this page from everywhere, and start migrating packages to this scheme where necessary. Cheers, Simon
On Wed, Oct 17, 2007 at 12:54:12PM +0100, Simon Marlow wrote:
I've written down the proposed policy for versioning here:
This says:

If [...] instances were added or removed, then the new A.B must be greater than the previous A.B.

This presumably includes changing module imports, or depending on a newer version of a package, which results in the visible instances changing? I think this should be spelt out in the policy.

The example:

build-depends: mypkg == 2.1.1

should be:

build-depends: mypkg >= 2.1.1, mypkg < 2.1.2

with the current dependency syntax/semantics.

Thanks Ian
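In a .cabal file, Ian's corrected constraint would read as follows (the package names are illustrative):

```
name:          my-client
version:       1.0
build-depends: base, mypkg >= 2.1.1, mypkg < 2.1.2
```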
On Wed, Oct 17, 2007 at 12:54:12PM +0100, Simon Marlow wrote:
I've written down the proposed policy for versioning here:
http://haskell.org/haskellwiki/Package_versioning_policy
It turned out there was a previous page written by Bulat that contained essentially this policy, but it wasn't linked from anywhere which explains why it was overlooked. I took the liberty of rewriting the text.
You wrote:
A client that wants to specify that they depend on a particular version of the API can specify a particular A.B.C and be sure of getting that API only. For example, build-depends: mypkg-2.1.1
Are you proposing an extension along the lines of that proposed by Thomas (and Bulat, and others), i.e. this would be equivalent to build-depends: mypkg >= 2.1.1 && < 2.1.2 ? The current syntax of mypkg == 2.1.1 would match the initial release but not subsequent patch releases.
Ross Paterson wrote:
On Wed, Oct 17, 2007 at 12:54:12PM +0100, Simon Marlow wrote:
I've written down the proposed policy for versioning here:
http://haskell.org/haskellwiki/Package_versioning_policy
It turned out there was a previous page written by Bulat that contained essentially this policy, but it wasn't linked from anywhere which explains why it was overlooked. I took the liberty of rewriting the text.
You wrote:
A client that wants to specify that they depend on a particular version of the API can specify a particular A.B.C and be sure of getting that API only. For example, build-depends: mypkg-2.1.1
Are you proposing an extension along the lines of that proposed by Thomas (and Bulat, and others), i.e. this would be equivalent to build-depends: mypkg >= 2.1.1 && < 2.1.2 ?
Yes, I should have mentioned that, thanks. I'll update the page shortly with this suggestion and others. Cheers, Simon
On Thursday 18 October 2007 00:54, Simon Marlow wrote:
I've written down the proposed policy for versioning here:
Is there a technical reason for the major version number to consist of 2 components? Why not 3, 17 or (my preference) 1?

Using major.minor instead of A.B.C, and interpreting MUST, SHOULD, MAY as specified by whatever RFC it is that specifies them, I'd write the change rules as:

1. If any entity was removed, or the types of any entities or the definitions of datatypes or classes were changed, or instances were added or removed, then the new major MUST be greater than the previous major (other version components MAY change).
2. Otherwise, if only new bindings, types or classes were added to the interface, then major MUST remain the same and the new minor MUST be greater than the old minor (other version components MAY change).
3. Otherwise, major.minor MUST remain the same (other version components MAY change).

Why?

- It gives the reader of the version numbers more information, which in turn may allow hackage to do more automated enforcement/testing/upgrading.
- To safely specify dependencies you must use an upper bound of the next major version. The stricter change rules make it less likely that a package will miss out on the use of a new version of a dependency that is actually compatible but had its version bumped anyway.

The proposal isn't clear on whether this is allowed or not, but I think sets of version bounds are needed. Using A.B.C <==> major.minor.patch and interval notation for brevity:

build-depends: foo [2.1, 3) U [3.3, 3.4)

Dan
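For comparison with the range syntax discussed elsewhere in this thread (the package name is hypothetical): a single interval like [2.1, 3) is expressible with the proposed >= ... && < ... form, but the union Dan writes would need a disjunction operator that build-depends does not currently have:

```
-- [2.1, 3) alone:
build-depends: foo >= 2.1 && < 3
-- the union [2.1, 3) U [3.3, 3.4) would need something like:
-- build-depends: foo ((>= 2.1 && < 3) || (>= 3.3 && < 3.4))
```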
Daniel McAllansmith
3. Otherwise, major.minor MUST remain the same (other version components MAY change).
Is it an option to say SHOULD rather than MUST here? There are other reasons for a version bump than breaking compatibility. -k -- If I haven't seen further, it is by standing in the footprints of giants
On Thursday 18 October 2007 21:15, you wrote:
Daniel McAllansmith
writes: 3. Otherwise, major.minor MUST remain the same (other version components MAY change).
Is it an option to say SHOULD rather than MUST here?
Of course, SHOULD is an option just like MAY is. But both SHOULD and MAY reduce what you can reliably infer from a version number in the same way.

If the rule is SHOULD or MAY, and the freedom is exercised, compatible versions of a package will differ in major[.minor] and dependent packages will be unable to benefit from their release. You'll need more maintenance work on package dependencies if you want to use the latest and greatest versions. In a similar way, if packages are being retained for a 'long time' to ensure dependent packages remain buildable, you are losing garbage collection opportunities.

I'm pretty certain SHOULD will be far more socially acceptable than MUST. I can appreciate the fact that people are accustomed to incrementing version numbers in liberal ways. But if you look at version numbers dispassionately in the context of "The goal of a versioning system is to inform clients of a package of changes to that package that might affect them..." MUST seems a better choice.

Maybe the Right Way of informing clients is full-on metadata and typing of packages and maybe we'll have that soon, so maybe a socially acceptable, weaker versioning scheme is acceptable.
There are other reasons for a version bump than breaking compatibility.
Technical reasons? In some cases a major bump would just be devolving to a minor bump. Dan
Daniel McAllansmith
There are other reasons for a version bump than breaking compatibility.
Technical reasons?
Well - say I refactor everything, and use algorithms with different run-time complexities, and possibly introduce different bugs than the ones the applications have come to rely on/work around. Even if the interface is type-level compatible, a conservative application would still prefer to link with the old version. -k -- If I haven't seen further, it is by standing in the footprints of giants
simonmarhaskell:
Several good points have been raised in this thread, and while I might not agree with everything, I think we can all agree on the goal: things shouldn't break so often.
So rather than keep replying to individual points, I'd like to make some concrete proposals so we can make progress.
1. Document the version numbering policy.
We should have done this earlier, but we didn't. The proposed policy, for the sake of completeness is: x.y where:
x changes ==> API changed x constant but y changes ==> API extended only x and y constant ==> API is identical
further sub-versions may be added after the x.y, their meaning is package-defined. Ordering on versions is lexicographic, given multiple versions that satisfy a dependency Cabal will pick the latest.
2. Precise dependencies.
As suggested by various people in this thread: we change the convention so that dependencies must specify a single x.y API version, or a range of versions with an upper bound. Cabal or Hackage can refuse to accept packages that don't follow this convention (perhaps Hackage is a better place to enforce it, and Cabal should just warn, I'm not sure).
I agree. >= 1.0 isn't viable in the long term. Rather, a specific list, or bounded range of tested versions seems likely to be more robust. -- Don
Hi
I agree. >= 1.0 isn't viable in the long term. Rather, a specific list, or bounded range of tested versions seems likely to be more robust.
In general, if it compiles and type checks, it will work. It is rare that an interface stays sufficiently similar that the thing compiles, but then crashes at runtime. Given that, shouldn't the tested versions be something a machine figures out - rather than something each library author has to tend to with every new release of every other library in hackage? Thanks Neil
Neil Mitchell wrote:
Hi
I agree. >= 1.0 isn't viable in the long term. Rather, a specific list, or bounded range of tested versions seems likely to be more robust.
In general, if it compiles and type checks, it will work. It is rare that an interface stays sufficiently similar that the thing compiles, but then crashes at runtime.
True. GoboLinux's package system records the exact set of versions something compiles with (just for reference), and uses min version bounds (and max bounds where needed) for dependencies.

It's always possible for Haskell library implementation-bug-fixes to change relied-on behavior, as discussed in the original ECT description. I agree that compiling and type-checking is a pretty good sign of working. Passing tests (e.g. QuickCheck) could be checked too, where available. If optimizations and unsafePerformIO interact differently, different compiler versions could also affect whether something works correctly, but still compiles...

But the issue here is much more limited: we assume that there was some set of versions of these libraries that DID work, and that every version of each library, on its own (or with only the libraries it depends on), works. So it might be valuable to record subjectively-working exact version sets, somewhere.

Isaac
Neil Mitchell wrote:
Hi
I agree. >= 1.0 isn't viable in the long term. Rather, a specific list, or bounded range of tested versions seems likely to be more robust.
In general, if it compiles and type checks, it will work. It is rare that an interface stays sufficiently similar that the thing compiles, but then crashes at runtime. Given that, shouldn't the tested versions be something a machine figures out - rather than something each library author has to tend to with every new release of every other library in hackage?
The only reasonable way we have to test whether a package compiles with a new version of a dependency is to try compiling it. To do anything else would be duplicating what the compiler does, and risks getting it wrong.

But you're right that tools could help a lot: for example, after a base version bump, Hackage could try to build all its packages against the new base to figure out which ones need source code modifications and which can probably just have their .cabal files tweaked to allow the new version. Hackage could tentatively fix the .cabal files itself and/or contact the maintainer.

We'll really need some tool to analyse API changes too, otherwise API versioning is too error-prone. Anyone like to tackle this? It shouldn't be too hard using the GHC API.

Cheers,
Simon
Hi
In general, if it compiles and type checks, it will work. It is rare that an interface stays sufficiently similar that the thing compiles, but then crashes at runtime. Given that, shouldn't the tested versions be something a machine figures out - rather than something each library author has to tend to with every new release of every other library in hackage?
The only reasonable way we have to test whether a package compiles with a new version of a dependency is to try compiling it. To do anything else would be duplicating what the compiler does, and risks getting it wrong.
Agreed - that's what I meant by checking it. Ideally with multiple compilers.
But you're right that tools could help a lot: for example, after a base version bump, Hackage could try to build all its packages against the new base to figure out which ones need source code modifications and which can probably just have their .cabal files tweaked to allow the new version. Hackage could tentatively fix the .cabal files itself and/or contact the maintainer.
Even better, after any new package release hackage could compute all packages which depend on that one and try them. That way you can always guarantee that every hackage package will work with the latest hackage dependencies. Another way of saying this is that every package on hackage should compile out the box with cabal-install on a fresh system.
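The reverse-dependency computation this would need is simple; a minimal sketch in Haskell (the association-list representation and the example packages are my own invention, not an actual Hackage data structure):

```haskell
-- Direct dependencies of each package, as Hackage knows them
-- (an invented association-list representation for illustration).
type DepGraph = [(String, [String])]

-- All packages that directly depend on the given one: the candidates
-- to rebuild whenever a new version of it is uploaded.
reverseDeps :: DepGraph -> String -> [String]
reverseDeps graph pkg = [ p | (p, deps) <- graph, pkg `elem` deps ]

example :: DepGraph
example =
  [ ("base",       [])
  , ("bytestring", ["base"])
  , ("binary",     ["base", "bytestring"])
  ]

main :: IO ()
main = print (reverseDeps example "bytestring")  -- ["binary"]
```

After an upload of bytestring, Hackage would rebuild each package in `reverseDeps example "bytestring"` (and, transitively, their dependents) against the new version.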
We'll really need some tool to analyse API changes too, otherwise API versioning is too error-prone. Anyone like to tackle this? It shouldn't be too hard using the GHC API..
You can also do it with haddock and the --hoogle flag, to some extent. Thanks Neil
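At its core, such a tool is a set comparison over exported names, however they are obtained (GHC API, haddock, hoogle). A toy sketch in Haskell (the module names and export lists below are invented):

```haskell
import Data.List ((\\))

-- A crude classification of an API change from two exported-name
-- lists: removals force a major version bump, pure additions allow
-- a minor bump.  (A real tool would also compare types, instances,
-- and data constructors, per Ross Paterson's caveat elsewhere in
-- this thread.)
data Change = Major | Minor | None deriving (Show, Eq)

classify :: [String] -> [String] -> Change
classify old new
  | not (null (old \\ new)) = Major   -- something was removed
  | not (null (new \\ old)) = Minor   -- only additions
  | otherwise               = None

main :: IO ()
main = do
  print (classify ["foo", "bar"] ["foo", "bar", "baz"])  -- Minor
  print (classify ["foo", "bar"] ["foo"])                -- Major
```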
Following is a summary of my thoughts on the matter, in large part so I can figure out what I'm thinking... apologies if it's a bit of a ramble. All comments welcome.

Basically:
- version numbering which differs from Simon's proposal
- precise dependencies, I think the same as Simon is proposing
- 'permanent' availability of compatible package versions
- never a need to update working cabal files
- a cabal file installs exactly one version of a package

1) Package version numbers are of the form x.y.z

2) There are version-segment ordering functions cmpX, cmpY, and cmpZ. cmpX and cmpY are globally defined and operate over non-negative integers. Perhaps cmpZ is globally defined, or could be defined per package, or be lexicographic, or... something else. cmpZ could even be a partial ordering I suppose.

3) A cabal file specifies how to build a single version of a package.

    name: foo
    version: 2.12.5

This cabal file will build version 2.12.5 of package foo.

4) The dependencies in a cabal file define baseline versions of required packages.

    depends: bar [3.4] baz [1.2.6, 3]

Version 2.12.5 of foo requires a version of bar that is API-compatible with 3.4.0, and a version of baz that is API-compatible with 1.2.6 _or_ API-compatible with 3.0.0. Note that this doesn't imply that baz 3.0.0 is API-compatible with baz 1.2.6 (by definition it is not); it implies that foo is using a subset of the intersection of those two baz APIs. Note that baz 2.y.z would not satisfy the dependency. Perhaps a function was removed with the bump to 2 and restored only with the bump to 3.

5) Package version numbers encode whether one version of a package is API-compatible with another version of the package.
Given two versions x.y.z and i.j.k of a package:
- x == i && y == j ==> x.y.z is API-identical (hence API-compatible) with i.j.k; cmpZ can be used to determine the preferred version
- x == i && y > j ==> x.y.z is API-compatible with i.j.k; it has undergone compatibility-preserving changes, and x.y.z is preferred to i.j.k
- x > i ==> x.y.z is not API-compatible with i.j.k; it has undergone non-compatibility-preserving changes
- otherwise ==> x.y.z is not API-compatible with i.j.k; it is a lower version that has less functionality

6) A compatibility-preserving change is generally a change which just adds to the API. Ross Paterson points out that adding extra data constructors or instances may not be compatibility-preserving. A non-compatibility-preserving change is generally a change which includes the removal of some part of the API. It might also include changes which leave the API unmodified but significantly degrade usability, e.g. worse time or space performance.

7) Once a version of a package is building successfully it remains available for a 'long time'. If sufficient versions of a package remain available then API-compatible versions of required packages are always available, so the building of packages should never break. An uploaded cabal file should never need to be changed, regardless of what happens to the packages it depends upon.

8) If a version of a package is discovered to have security flaws or serious bugs it should remain available in a quarantined state until a fixed API-compatible version is available.

9) Something (hackage?) could enforce adherence to the version numbering policy. At the least, any new version uploaded that claims to be API-compatible can be test-compiled against packages which depend on it. Something (hackage?) could assist package maintainers in releasing a new version of their package with updated dependency information.
Hackage could attempt to compile against non-API-compatible versions and report the outcome; for example, foo 2.12.5 compiles with the new baz 3.0.0 but not with the latest baz 2.y.z.

Dan
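Dan's rule 5) above can be written down directly as a predicate. A sketch only: the tuple representation of versions is mine, cmpX and cmpY are taken to be ordinary integer comparison, and the z component is ignored (under the proposal, cmpZ merely selects a preferred build among API-identical versions):

```haskell
-- Does installed version x.y.z satisfy a dependency whose baseline
-- is i.j.k?  Compatible iff same major version and at least the
-- baseline minor version (rule 5 of the proposal in this thread).
type Version = (Int, Int, Int)

apiCompatible :: Version -> Version -> Bool
apiCompatible (x, y, _) (i, j, _) = x == i && y >= j

main :: IO ()
main = do
  print (apiCompatible (3,5,0) (3,4,0))  -- True: compatibility-preserving changes
  print (apiCompatible (3,3,9) (3,4,0))  -- False: older API with less functionality
  print (apiCompatible (4,0,0) (3,4,0))  -- False: non-compatibility-preserving bump
```

Note how this matches the `depends: baz [1.2.6, 3]` example: a version satisfies the dependency if it is `apiCompatible` with any one of the listed baselines, so baz 2.y.z fails both tests.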
[would it be possible to pick a single list to discuss this on please, so there is no danger of some people missing some subthreads if they aren't on all the lists, or getting messages 3 times if they are?] On Tue, Oct 16, 2007 at 01:08:49PM +0100, Simon Marlow wrote:
2. Precise dependencies.
While not directly related to this, I have the impression some people want precise dependencies so that things work properly when multiple versions of a library are installed. Personally I'm not a fan of that, as if I have

    package foo:
        module Foo where
        data T

    package bar:
        module Bar where
        bar :: T

    package baz:
        module Baz where
        baz :: T -> ()

then baz bar might be a type error if I have multiple versions of foo installed and bar and baz have been compiled against different versions.

Thanks
Ian
- if you provide a 'base' configuration that pulls in the stuff that used to be in base, the package will work
I don't know of a way to do that. The name of the package is baked into the object files at compile time, so you can't use the same compiled module in more than one package.
i've been wrong about this before, so check before you believe,-) but here is a hack i arrived at the last time we discussed this:

[using time:Data.Time as a small example; ghc-6.6.1]

1. create, build, and install a package QTime, with default Setup.hs

    -- QTime.cabal
    Name:            QTime
    Version:         0.1
    Build-depends:   base, time
    Exposed-modules: QTime.Data.Time

    -- QTime/Data/Time.hs
    module QTime.Data.Time (module Data.Time) where
    import Data.Time

2. create, build, and install a package Time2, with default Setup.hs

    -- Time2.cabal
    Name:            Time2
    Version:         0.1
    Build-depends:   base, QTime
    Exposed-modules: Data.Time

    -- Data/Time.hs
    module Data.Time (module QTime.Data.Time) where
    import QTime.Data.Time

3. write and build a client module

    -- Main.hs
    import Data.Time
    main = print =<< getCurrentTime

    $ ghc -hide-all-packages -package base Main.hs
    Main.hs:1:0:
        Failed to load interface for `Data.Time':
          it is a member of package Time2-0.1, which is hidden

    $ ghc -hide-all-packages -package base -package Time2 Main.hs
    $ ./main.exe
    2007-10-16 11:09:05.859375 UTC

    $ rm main.exe Main.hi Main.o
    $ ghc -hide-all-packages -package base -package time Main.hs
    $ ./main.exe
    2007-10-16 11:09:29.34375 UTC

as i said, i've misinterpreted such symptoms before, but it seems to me that Time2's Data.Time acts as a drop-in replacement for time's Data.Time here. doesn't it?

it is rather tedious, having to do something for every module in the package, twice (once to get a package-qualified name that differs from the original name, the second time to re-expose it under its original name), but that could be automated. and there would be an extra QBase package.

but until cabal supports such renamings directly, it might be a workaround for the current base issue?

claus
Claus Reinke wrote:
- if you provide a 'base' configuration that pulls in the stuff that used to be in base, the package will work
I don't know of a way to do that. The name of the package is baked into the object files at compile time, so you can't use the same compiled module in more than one package.
i've been wrong about this before, so check before you believe,-) but here is a hack i arrived at the last time we discussed this:
[using time:Data.Time as a small example; ghc-6.6.1]
1. create, build, and install a package QTime, with default Setup.hs ... 2. create, build, and install a package Time2, with default Setup.hs ... 3. write and build a client module
Ok, when I said above "I don't know a way to do that", I really meant there's no way to do it by modifying the package database alone, which I think is what Udo was after.

Your scheme does work, and you have discovered how to make a package that re-exports modules from other packages (I made a similar discovery recently when looking into how to add support to Cabal for this). As you can see, it's rather cumbersome, in that you need an extra dummy package, and two stub modules for each module to be re-exported.

One way to make this easier is to add a little extension to GHC, one that we've discussed before:

    module Data.Time (module Base1.Data.Time) where
    import "base-1.0" Data.Time as Base1.Data.Time

The extension is the "base-1.0" package qualifier on the import, which GHC very nearly supports (only the syntax is missing). Now you don't need the dummy package, and only one stub module per module to be re-exported. Cabal could generate these automatically, given some appropriate syntax.

Furthermore, this is better than doing something at the package level, because you're not stuck with module granularity: you can re-export just parts of a module, which is necessary if you're trying to recreate an old version of an API.

I was going to propose this at some point. Comments?

Cheers,
Simon
Simon Marlow wrote:
Claus Reinke wrote:
- if you provide a 'base' configuration that pulls in the stuff that used to be in base, the package will work
I don't know of a way to do that. The name of the package is baked into the object files at compile time, so you can't use the same compiled module in more than one package.
i've been wrong about this before, so check before you believe,-) but here is a hack i arrived at the last time we discussed this:
[using time:Data.Time as a small example; ghc-6.6.1]
1. create, build, and install a package QTime, with default Setup.hs ... 2. create, build, and install a package Time2, with default Setup.hs ... 3. write and build a client module
Ok, when I said above "I don't know a way to do that", I really meant there's no way to do it by modifying the package database alone, which I think is what Udo was after.
Your scheme does work, and you have discovered how to make a package that re-exports modules from other packages (I made a similar discovery recently when looking into how to add support to Cabal for this). As you can see, it's rather cumbersome, in that you need an extra dummy package, and two stub modules for each module to be re-exported.
Ah, I should add that due to technical limitations this scheme can't be used to make a base-2 that depends on base-3. Base is special in this respect: GHC only allows a single package called base to be linked into any given executable. The reason for this is that GHC can be independent of the version of the base package, and refer to it as just "base"; in theory it's possible to upgrade the base package independently of GHC.

So we're restricted at the moment to providing only completely independent base-2 and base-3 in the same installation, and essentially that means having (at least) two copies of every package, one that depends on base-2 and one that depends on base-3.

Perhaps we should revisit this decision; it would be better for GHC to depend explicitly on base-3, but allow a separate backwards-compatible base-2 that depends on base-3 to be installed alongside.

OTOH, this will still lead to difficulties when you try to mix base-2 and base-3. Suppose that the Exception type changed, so that base-2 needs to provide its own version of Exception. The base-2:Exception will be incompatible with the base-3:Exception, and type errors will ensue if the two are mixed. If the base-3:Exception only added a constructor, then you could hide it in base-2 instead of defining a new type. However, if base-3 changed the type of a constructor, you're stuffed. Ah, I think we've discovered a use for the renaming feature that was removed in Haskell 1.3!

Cheers,
Simon
participants (17):
- Bayley, Alistair
- Brandon S. Allbery KF8NH
- ChrisK
- Claus Reinke
- Daniel McAllansmith
- Don Stewart
- Duncan Coutts
- Ian Lynagh
- Isaac Dupree
- Ketil Malde
- Lutz Donnerhacke
- Neil Mitchell
- Ross Paterson
- Simon Marlow
- Stefan O'Rear
- Stuart Cook
- Thomas Hartman