
Simon PJ and I had a talk about the build system earlier today, and I thought I'd float the idea we discussed (I should admit that the idea was mine, lest Simon PJ think I'm attributing bad ideas to him :-). This is not completely thought through, but I'm pretty sure a solution exists along these lines that would improve things for us.

OK, the starting point is this:

- Cabal has code to generate Makefiles. Almost nobody uses it except for the GHC build system. It essentially duplicates the build system for compiling Haskell source (but not for installation, haddocking, registration, configuration, etc.)

- Cabal is a library.

I propose we do this:

- Extract the code from Cabal that generates Makefiles, and treat it as part of the GHC build system. Rather than generating a Makefile complete with build rules, we generate a Makefile that just has the package-specific metadata (list of modules, etc.), and put the code to actually build the package in the GHC build system.

  This means we still get to use 'make', we still get to use the .cabal files as metadata, but the build system is more private to GHC, more extensible, and hopefully more understandable and modifiable. We can express dependencies that Cabal currently doesn't know about. It would also let us avoid the current uncomfortable situation where we have to feed all kinds of configuration information from the GHC build system into Cabal.

- Cabal would then be essentially just a mechanism for translating the .cabal file into Makefile bindings and package metadata for ghc-pkg.

There will undoubtedly be some sticking points where we have to trade off duplicating things from Cabal against re-using parts of Cabal, which might require modifying Cabal itself. For instance, we could use Cabal for installation, but that means our build system has to leave everything in the places that Cabal's installation code expects, so it might be more feasible to do installation ourselves, but that means duplicating parts of Cabal.

It will probably mean that we have a tighter dependency on Cabal, because we use it as a library rather than a black box; but hopefully we can keep our branch of Cabal more stable and not have to update it so often.

Anyway, this is an idea that I think is interesting. Obviously it needs a lot more fleshing out to be a real proposal, but I'm interested in whether anyone thinks this idea is worth pursuing, or whether there are better alternatives. Cheers, Simon
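A rough sketch of the translation step this would reduce Cabal to: read the .cabal file with the Cabal library and print the metadata as Makefile bindings. This assumes the Cabal API of roughly that era (readPackageDescription, flattenPackageDescription); the output variable names are invented, not what the GHC build system actually uses.

    -- GenerateBindings.hs (sketch only): dump a package's metadata
    -- in make syntax, so a make-based build system can include it.
    import Distribution.PackageDescription
             (PackageDescription(package, library), Library(exposedModules))
    import Distribution.PackageDescription.Parse (readPackageDescription)
    import Distribution.PackageDescription.Configuration
             (flattenPackageDescription)
    import Distribution.Text (display)
    import Distribution.Verbosity (normal)
    import System.Environment (getArgs)

    main :: IO ()
    main = do
      [cabalFile] <- getArgs
      gpd <- readPackageDescription normal cabalFile
      -- Flatten conditionals naively; a real configure step would
      -- resolve them against the actual environment instead.
      let pd   = flattenPackageDescription gpd
          mods = maybe [] exposedModules (library pd)
      putStrLn ("PACKAGE = " ++ display (package pd))
      putStrLn ("MODULES = " ++ unwords (map display mods))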

On Tue, 2008-08-12 at 11:11 +0100, Simon Marlow wrote:
I propose we do this:
- Extract the code from Cabal that generates Makefiles, and treat it as part of the GHC build system. Rather than generating a Makefile complete with build rules, we generate a Makefile that just has the package-specific metadata (list of modules, etc.), and put the code to actually build the package in the GHC build system.
As you know, I've been trying to get rid of that code ever since it arrived :-)
It will probably mean that we have a tighter dependency on Cabal, because we use it as a library rather than a black box; but hopefully we can keep our branch of Cabal more stable and not have to update it so often.
If you don't need to update so often it would make life easier for Cabal hackers and Manuel would be pleased :-)
Anyway, this is an idea that I think is interesting. Obviously it needs a lot more fleshing out to be a real proposal, but I'm interested in whether anyone thinks this idea is worth pursuing, or whether there are better alternatives.
Right, so probably the crucial thing is how much you end up having to duplicate, and how much of that duplicated infrastructure has to be kept in sync. For example, if the path layout is different, does that make Cabal's haddocking support not work, forcing that to be duplicated too? Duncan

Duncan Coutts:
On Tue, 2008-08-12 at 11:11 +0100, Simon Marlow wrote:
I propose we do this:
- Extract the code from Cabal that generates Makefiles, and treat it as part of the GHC build system. Rather than generating a Makefile complete with build rules, we generate a Makefile that just has the package-specific metadata (list of modules, etc.), and put the code to actually build the package in the GHC build system.
As you know, I've been trying to get rid of that code ever since it arrived :-)
It will probably mean that we have a tighter dependency on Cabal, because we use it as a library rather than a black box; but hopefully we can keep our branch of Cabal more stable and not have to update it so often.
If you don't need to update so often it would make life easier for Cabal hackers and Manuel would be pleased :-)
Yes!
Anyway, this is an idea that I think is interesting. Obviously it needs a lot more fleshing out to be a real proposal, but I'm interested in whether anyone thinks this idea is worth pursuing, or whether there are better alternatives.
I think this is definitely an interesting idea. At the moment, it seems to me that all the metadata handling of Cabal is what's most useful to GHC, whereas the actual build procedure and its inflexibility causes a lot of grief, especially if you want to do something non-standard. The proposed idea would pick the best of both worlds: Cabal's metadata handling and make's build flexibility, plus the fact that many more people know how to tweak makefiles (even if that is a pain, it's a pretty well-understood pain). Manuel

Simon Marlow
This means we still get to use 'make', we still get to use the .cabal files as metadata, but the build system is more private to GHC, more extensible, and hopefully more understandable and modifiable.
This is essentially the same approach that nhc98 currently takes to building libraries. The Cabal file holds all the metadata, but the build system is Makefile-driven. There is a small separate tool (CabalParse) that extracts metadata from the cabal file. The Cabal *library* could be used to implement that extraction tool, but currently ours is hand-rolled. (One of the benefits of open specifications of file formats is that you can have multiple implementations for different purposes.)

Here is an example of how it works:

    CABALFILE = $(shell ls *.cabal | head -n 1)
    READ      = $(CABALPARSE) $(CABALFILE) -quiet
    MAP       = $(LOCAL)map

    THISPKG   = $(shell $(READ) name | cut -c2-)
    VERSION   = $(shell $(READ) version)
    SEARCH    = $(shell $(READ) build-depends | $(MAP) "echo -package") \
                $(shell $(READ) include-dirs | $(MAP) "echo -i" | cut -c1,2,4-) \
                $(shell $(READ) hs-source-dir | $(MAP) "echo -I" | cut -c1,2,4-) \
                $(shell $(READ) hs-source-dirs | $(MAP) "echo -I" | cut -c1,2,4-)
    CINCLUDES = $(shell $(READ) include-dirs | $(MAP) "echo -I" | cut -c1,2,4-)
    SRCS      = $(shell $(READ) -slash exposed-modules)
    EXTRA_SRCS = $(shell $(READ) -slash other-modules)
    SRCS_C    = $(shell $(READ) c-sources)
    DIRS      = $(shell $(READ) -slash exposed-modules other-modules \
                | $(MAP) dirname | sort | uniq)
    EXTRA_C_FLAGS = $(shell $(READ) cc-options)
    EXTRA_H_FLAGS = $(shell $(READ) nhc98-options)

Regards, Malcolm

Malcolm Wallace wrote:
Simon Marlow
wrote: This means we still get to use 'make', we still get to use the .cabal files as metadata, but the build system is more private to GHC, more extensible, and hopefully more understandable and modifiable.
This is essentially the same approach that nhc98 currently takes to building libraries.
Right, I was aware that nhc98 uses this method but forgot to mention it. Thanks for pointing it out. I think it makes a lot more sense for us to re-use parts of Cabal than to re-implement the whole thing, although the balance is probably different for nhc98. Cabal generates the InstalledPackageInfo from the .cabal file, for example, and this is certainly something we don't want to re-implement. Cheers, Simon

Simon PJ and I had a talk about the build system earlier today, and I thought I'd float the idea we discussed... I propose we do this:
- Extract the code from Cabal that generates Makefiles, and treat it as part of the GHC build system. Rather than generating a Makefile complete with build rules, we generate a Makefile that just has the package-specific metadata (list of modules, etc.), and put the code to actually build the package in the GHC build system.
This means we still get to use 'make', we still get to use the .cabal files as metadata, but the build system is more private to GHC, more extensible, and hopefully more understandable and modifiable...
... I'm interested in whether anyone thinks this idea is worth pursuing, or whether there are better alternatives.
Simon, This direction sounds very promising. I hope you will keep us posted. Norman

On 12/08/2008, at 20:11, Simon Marlow wrote:
- Extract the code from Cabal that generates Makefiles, and treat it as part of the GHC build system. Rather than generating a Makefile complete with build rules, we generate a Makefile that just has the package-specific metadata (list of modules, etc.), and put the code to actually build the package in the GHC build system.
Sounds good. It would be nice if the .cabal parser from Cabal could be made into a separate, stable library which ghc (and nhc?) could use. This makes me wonder, though. Wouldn't this model make more sense for Cabal in general than the current approach of duplicating the functionality of autoconf, make and other stuff? If it works for ghc, it ought to work for other projects, too. Cabal as a preprocessor seems much more attractive to me than as a universal build system. Roman

Roman Leshchinskiy wrote:
On 12/08/2008, at 20:11, Simon Marlow wrote:
- Extract the code from Cabal that generates Makefiles, and treat it as part of the GHC build system. Rather than generating a Makefile complete with build rules, we generate a Makefile that just has the package-specific metadata (list of modules, etc.), and put the code to actually build the package in the GHC build system.
Sounds good. It would be nice if the .cabal parser from Cabal could be made into a separate, stable library which ghc (and nhc?) could use.
This makes me wonder, though. Wouldn't this model make more sense for Cabal in general than the current approach of duplicating the functionality of autoconf, make and other stuff? If it works for ghc, it ought to work for other projects, too. Cabal as a preprocessor seems much more attractive to me than as a universal build system.
So packages would be required to provide their own build system? That sounds like it would make it a lot harder for people to just create a package that others can use. The ease of making a Cabal package has, I think, a lot to do with the wealth of software available on Hackage.

GHC is a special case: we already need a build system for other reasons.

It was a design decision early on with Cabal that we didn't want to rely on the target system having a Unix-like build environment. You might disagree with this, but it certainly has some value: a Windows user can download GHC and immediately start building and installing external packages without having to install Cygwin. Cheers, Simon

On 13/08/2008, at 17:47, Simon Marlow wrote:
Roman Leshchinskiy wrote:
On 12/08/2008, at 20:11, Simon Marlow wrote:
- Extract the code from Cabal that generates Makefiles, and treat it as part of the GHC build system. Rather than generating a Makefile complete with build rules, we generate a Makefile that just has the package-specific metadata (list of modules, etc.), and put the code to actually build the package in the GHC build system.

Sounds good. It would be nice if the .cabal parser from Cabal could be made into a separate, stable library which ghc (and nhc?) could use.

This makes me wonder, though. Wouldn't this model make more sense for Cabal in general than the current approach of duplicating the functionality of autoconf, make and other stuff? If it works for ghc, it ought to work for other projects, too. Cabal as a preprocessor seems much more attractive to me than as a universal build system.
So packages would be required to provide their own build system? That sounds like it would make it a lot harder for people to just create a package that others can use. The ease of making a Cabal package has I think a lot to do with the wealth of software available on Hackage.
Of course there should be a standard build system for simple packages. It could be part of Cabal or a separate tool (for which Cabal could, again, act as a preprocessor).
GHC is a special case: we already need a build system for other reasons.
I agree. I just don't think that adding a full-fledged build system to Cabal is the solution. In my experience, huge monolithic tools which try to do everything never work well. I much prefer small, modular tools. A Haskell-based build system is an interesting project but why does it have to be a part of Cabal?
It was a design decision early on with Cabal that we didn't want to rely on the target system having a Unix-like build environment. You might disagree with this, but it certainly has some value: a Windows user can download GHC and immediately start building and installing external packages without having to install Cygwin.
I agree with this decision but IIUC, this only really works for simple (wrt building) packages which don't even use configure. Making Cabal into a modular preprocessor and providing a thin wrapper for ghc --make which can act as a target for Cabal would achieve this just as well. Roman

Roman Leshchinskiy wrote:
Of course there should be a standard build system for simple packages. It could be part of Cabal or a separate tool (for which Cabal could, again, act as a preprocessor).
GHC is a special case: we already need a build system for other reasons.
I agree. I just don't think that adding a full-fledged build system to Cabal is the solution. In my experience, huge monolithic tools which try to do everything never work well. I much prefer small, modular tools. A Haskell-based build system is an interesting project but why does it have to be a part of Cabal?
Hmm, but you said above "there should be a standard build system for simple packages. It could be part of Cabal...". Cabal has two parts: some generic infrastructure, and a "simple" build system (under Distribution.Simple) that suffices for most packages. We distribute them together only because it's convenient; you don't have to use the simple build system if you don't want to. I think perhaps you're objecting to the fact that the "simple" build system isn't so simple, and we keep adding more functionality to it. This is true, but the alternative - forcing some packages to provide their own build system - seems worse to me. Cheers, Simon
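For reference, the Setup.hs of a package that uses the simple build system is just an import of it; this is the whole file:

    import Distribution.Simple (defaultMain)

    main :: IO ()
    main = defaultMain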

On Wed, 2008-08-13 at 11:34 +0100, Simon Marlow wrote:
Cabal has two parts: some generic infrastructure, and a "simple" build system (under Distribution.Simple) that suffices for most packages. We distribute them together only because it's convenient; you don't have to use the simple build system if you don't want to.
The two parts also have different degrees of stability. In particular there are lots of tools that rely on the declarative parts, the types and parsers so we try not to break those so often. Roman asks for a "separate, stable library which ghc (and nhc?) could use" but since this part of Cabal is fairly stable I don't think it needs to be separate. I don't think it'd be more stable by being in a different package. The reasons to change it are usually to add new fields, and that usually does not affect clients that do not need to know about the new fields.
I think perhaps you're objecting to the fact that the "simple" build system isn't so simple, and we keep adding more functionality to it. This is true, but the alternative - forcing some packages to provide their own build system - seems worse to me.
As Isaac used to say, it's not the Simple build system because it's simple; it's because it does complex things to simple packages. The Make build type was supposed to let people wrap existing make-based build systems. Unfortunately it's not used much, so it has never been well developed, and for ghc it's really the wrong way round. I think your approach of exporting the info in make syntax makes more sense for ghc, since ghc is not trying to pretend it's a cabal package anyway (which is what the Make build type was for: wrapping things up so that people building packages didn't need to know what was used underneath). Duncan
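The Make build type Duncan mentions is selected the same way, by importing a different defaultMain, one that delegates the actual work to the package's own make-based build system:

    import Distribution.Make (defaultMain)

    main :: IO ()
    main = defaultMain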

On 13/08/2008, at 20:34, Simon Marlow wrote:
Roman Leshchinskiy wrote:
Of course there should be a standard build system for simple packages. It could be part of Cabal or a separate tool (for which Cabal could, again, act as a preprocessor).

GHC is a special case: we already need a build system for other reasons.

I agree. I just don't think that adding a full-fledged build system to Cabal is the solution. In my experience, huge monolithic tools which try to do everything never work well. I much prefer small, modular tools. A Haskell-based build system is an interesting project but why does it have to be a part of Cabal?
Hmm, but you said above "there should be a standard build system for simple packages. It could be part of Cabal...".
On second thought, it shouldn't be part of Cabal :-)
Cabal has two parts: some generic infrastructure, and a "simple" build system (under Distribution.Simple) that suffices for most packages. We distribute them together only because it's convenient; you don't have to use the simple build system if you don't want to.
My impression of Cabal is that it is a build system with a bit of generic infrastructure. In particular, a large part of the .cabal syntax is specified in terms of this build system and some of it only really makes sense for this build system.
I think perhaps you're objecting to the fact that the "simple" build system isn't so simple, and we keep adding more functionality to it. This is true, but the alternative - forcing some packages to provide their own build system - seems worse to me.
Cabal packages do provide their own build system; it's just that they use Cabal syntax instead of, say, make. The advantage of doing this is, of course, that Cabal's syntax is simpler. Adding things to the "simple" build system erodes this advantage. Complex projects will still have complex build systems - the complexity will be in the .cabal files.

If Cabal's goal is to be able to build any project it will basically have to duplicate the functionality of autoconf, automake, libtool, make and a couple of other tools *and* be just as flexible. I think this is neither realistic nor necessary. So where do we stop? And what about the packages that Cabal won't support when we stop? IMO, we should have stopped some time ago.

A .cabal file should describe a package, not how to build it. Building should be handled by different tools with a clear interface between them and Cabal. If the build system of choice needs additional information, then that information should be provided in a separate file and not in the package description.

Again, I'm not arguing against a build system written in Haskell. I'd just like it to be completely separated from Haskell's packaging system. In particular, "polluting" a package description with build information seems wrong to me. Roman

On Wed, 2008-08-13 at 22:47 +1000, Roman Leshchinskiy wrote:
Again, I'm not arguing against a build system written in Haskell. I'd just like it to be completely separated from Haskell's packaging system. In particular, "polluting" a package description with build information seems wrong to me.
There is a huge overlap of course. The things needed to build a package tend to be the dependencies. The ability to automatically extract the dependencies from a package description is crucial, as it is what enables automatic package management, either directly or by conversion to distro packages. Tools like automake + autoconf do not give us that.

There is of course some separation possible, which in Cabal roughly corresponds to the stuff under Distribution.Simple vs everything else. We could split those two aspects into separate packages, but it's not clear to me that we'd gain much by doing that.

There is still the Make build type, which we could improve if people want it. That allows the declarative stuff to be given in the .cabal file (so that package managers can do their thing) while all the building is delegated to make. People have not shown any interest in this so it's never been improved much. The obvious disadvantage of using it is that you have to do a lot of work to make your build system do all the things that users expect. Duncan

On 14/08/2008, at 06:32, Duncan Coutts wrote:
On Wed, 2008-08-13 at 22:47 +1000, Roman Leshchinskiy wrote:
Again, I'm not arguing against a build system written in Haskell. I'd just like it to be completely separated from Haskell's packaging system. In particular, "polluting" a package description with build information seems wrong to me.
There is a huge overlap of course. The things needed to build a package tend to be the dependencies. The ability to automatically extract the dependencies from a package description is crucial as it is what enables automatic package management either directly or by conversion to distro packages. Tools like automake + autoconf do not give us that.
Right. Dependencies are part of a package description, and that's what Cabal should do. It should provide a nice clean interface to the dependency stuff for the build system to use. I don't think it does that at the moment; IIUC, it is all done by Distribution.Simple.
There is of course some separation possible, which in Cabal roughly corresponds to the stuff under Distribution.Simple vs everything else. We could split those two aspects into separate packages but it's not clear to me that we'd gain much by doing that.
My point isn't really about distribution, it's about coupling. My concern is that the syntax of .cabal files is increasingly based on what Distribution.Simple needs. This effectively makes all other build systems second class. It also loses us clean package descriptions, which is what .cabal files should be. It's not too bad at the moment but will get worse as Distribution.Simple gets more complex, since it will need more and more information. Just as an example, consider something like ld-options. This is obviously not a dependency, and it is basically only documented by how it is used by Distribution.Simple. It shouldn't be in .cabal, IMO. If a build system needs this information, it should be provided somewhere else.
There is still the Make build type which we could improve if people want it. That allows the declarative stuff to be given in the .cabal file (so that package managers can do their thing) and all the building is delegated to make. People have not shown any interest in this so it's never been improved much. The obvious disadvantage of using it is that you have to do a lot of work to make your build system do all the things that users expect.
But that is precisely my (other) point. A lot of that work is really unnecessary and could be done by Cabal, since it only or mostly depends on the package information. Instead, it is implemented somewhere in Distribution.Simple and not really usable from the outside. For instance, a lot of the functionality of setup sdist, setup register and so on could be implemented generically and used by a make-based build system as well.

Also, there is no easy way for build systems to work with the declarative stuff, because a lot of that functionality is, again, part of Distribution.Simple. IMO, this is a direct result of the tight coupling between the package management and build system parts of Cabal.

The other problem, of course, is that it isn't clear what exactly a build system should provide. IIUC, that's what "Building and installing a package" in the Cabal manual defines, but there we have things like this:

    setup test
      Run the test suite specified by the runTests field of
      Distribution.Simple.UserHooks. See Distribution.Simple for
      information about creating hooks and using defaultMainWithHooks.

As a matter of fact, a lot of Cabal is documented in terms of what Distribution.Simple does. Again, this effectively shuts out other build systems.

I'm sorry if this all sounds too negative; it shouldn't, really. I think you guys have done a great job in implementing a system which is obviously very important to the community. I just somewhat disagree with the direction in which it is heading now. Roman
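For concreteness, the hook mechanism that manual excerpt refers to looks roughly like this in a custom Setup.hs; the runTests signature is the one documented at the time, and the test body here is just a placeholder:

    import Distribution.Simple
             (defaultMainWithHooks, simpleUserHooks, UserHooks(runTests))

    main :: IO ()
    main = defaultMainWithHooks simpleUserHooks
      { runTests = \_args _verbose _pkgDescr _lbi ->
          putStrLn "no test suite yet"   -- placeholder test action
      }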

Roman Leshchinskiy wrote:
But that is precisely my (other) point. A lot of that work is really unnecessary and could be done by Cabal since it only or mostly depends on the package information. Instead, it is implemented somewhere in Distribution.Simple and not really usable from the outside. For instance, a lot of the functionality of setup sdist, setup register and so on could be implemented generically and used by a make-based build system as well.
That's exactly what I'm proposing we do in GHC: re-use Cabal's setup register and some of the other parts of the simple build system in a make-based build system for packages. It might require a bit of refactoring of Cabal, but I don't expect it to be a major upheaval at all.

I think what you're proposing is mostly a matter of abstracting parts of Cabal with cleaner and more modular APIs, which is absolutely a good thing, but doesn't require a fundamental redesign. The tight coupling and lack of separation between Cabal's generic parts and the simple build system is somewhat accidental (lazy implementors :-), and is actually a lot better than it used to be thanks to the work Duncan has put in. I'm sure it'll improve further over time.

The other part of your complaint is that the BuildInfo is in the .cabal file along with the PackageDescription (the types are pretty well separated internally). Again I don't think there's anything fundamental here, and in fact some packages have separate .buildinfo files. Cheers, Simon
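Such a .buildinfo file holds just BuildInfo fields, in the same field syntax as a .cabal file, and is typically written out by a configure script. A hypothetical example:

    -- mypackage.buildinfo, as a ./configure script might emit it
    -- after probing the system (contents invented for illustration)
    extra-libraries: curl
    include-dirs:    /usr/local/include
    cc-options:      -DHAVE_CURL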

On 14/08/2008, at 18:01, Simon Marlow wrote:
Roman Leshchinskiy wrote:
But that is precisely my (other) point. A lot of that work is really unnecessary and could be done by Cabal since it only or mostly depends on the package information. Instead, it is implemented somewhere in Distribution.Simple and not really usable from the outside. For instance, a lot of the functionality of setup sdist, setup register and so on could be implemented generically and used by a make-based build system as well.
That's exactly what I'm proposing we do in GHC: re-use Cabal's setup register and some of the other parts of the simple build system in a make-based build system for packages. It might require a bit of refactoring of Cabal, but I don't expect it to be a major upheaval at all.
Ah! I hadn't realised that you are going to reuse Cabal functionality. You wrote "Extract the code from Cabal that generates Makefiles", so I thought you wouldn't really be using anything from Cabal.
I think what you're proposing is mostly a matter of abstracting parts of Cabal with cleaner and more modular APIs, which is absolutely a good thing, but doesn't require a fundamental redesign. The tight coupling and lack of separation between Cabal's generic parts and the simple build system is somewhat accidental (lazy implementors :-), and is actually a lot better than it used to be thanks to the work Duncan has put in. I'm sure it'll improve further over time.
IMO, getting this right is absolutely crucial for Cabal's usability and should be the primary short-term goal. Then again, I guess I should contribute code instead of opinions already :-)
The other part of your complaint is that the BuildInfo is in the .cabal file along with the PackageDescription (the types are pretty well separated internally). Again I don't think there's anything fundamental here, and in fact some packages have separate .buildinfo files.
Well, it is fundamental in the sense that this is how Cabal is used (and is supposed to be used) at the moment. It is good that Cabal separates these things internally, but the separation should be enforced in the external interface as well. Roman

On Wed, Aug 13, 2008 at 01:31:55PM +1000, Roman Leshchinskiy wrote:
This makes me wonder, though. Wouldn't this model make more sense for Cabal in general than the current approach of duplicating the functionality of autoconf, make and other stuff? If it works for ghc, it ought to work for other projects, too. Cabal as a preprocessor seems much more attractive to me than as a universal build system.
I can't tell you how much I agree with this. The fact that cabal wants to be my build system as well as my configuration system means it is pretty much unusable to me in my projects. Features are something that _hurts_ a system such as this: between a build system, a configuration manager, a packaging system, etc., it is rare for any large project that at least one isn't imposed on you by some external constraint or just a better choice for the job. I would much rather see cabal's functionality split among a variety of different programs so the pieces can be used when appropriate, not as an all-or-nothing thing. (bring back hmake! :) ) John -- John Meacham - ⑆repetae.net⑆john⑈

John Meacham
(bring back hmake! :) ).
It never went away... http://www.cs.york.ac.uk/fp/hmake

I even have the idea to allow hmake to read the .cabal file format for configuration data (although that is waiting for a delivery of round tuits). Regards, Malcolm

On Wed, 2008-08-27 at 03:04 -0700, John Meacham wrote:
On Wed, Aug 13, 2008 at 01:31:55PM +1000, Roman Leshchinskiy wrote:
This makes me wonder, though. Wouldn't this model make more sense for Cabal in general than the current approach of duplicating the functionality of autoconf, make and other stuff? If it works for ghc, it ought to work for other projects, too. Cabal as a preprocessor seems much more attractive to me than as a universal build system.
I can't tell you how much I agree with this. The fact that cabal wants to be my build system as well as my configuration system means it is pretty much unusable to me in my projects.
Features are something that _hurts_ a system such as this: between a build system, a configuration manager, a packaging system, etc., it is rare for any large project that at least one isn't imposed on you by some external constraint or just a better choice for the job. I would much rather see cabal's functionality split among a variety of different programs so the pieces can be used when appropriate, not as an all-or-nothing thing. (bring back hmake! :) )
People are of course still free to use autoconf and make to implement their own build system and have it still be a Cabal package (which has the advantage of presenting the same meta-data and command interface to packaging tools). It's been that way since the original design. Quite a few packages do use autoconf, though the use seems to be slightly on the decline as people try to make their packages portable to Windows. Very few packages use make, as that involves re-implementing their own build system, which is a lot of work. That's partly a self-fulfilling prophecy of course: nobody uses that interface, so it does not get improved, so nobody uses it, etc.

Also, as far as I'm aware hmake still works, at least for nhc; I've not used it recently for building with ghc. So there's nothing stopping people from using that (except hard work), even as part of a cabal package.

The different parts of the system are relatively separated. The declarative bits that deal with package meta-data (.cabal files) are available through the Cabal library (Distribution.*) and many tools make use of this. Then the 'Simple' build system is in the same library but fairly cleanly separated (Distribution.Simple.*). As I mentioned, you do not have to use the 'Simple' build system, but the vast majority of packages do. Then there are the packaging tools, like the tools for converting to native packages and cabal-install, which use the Cabal library and the command line interface that Cabal packages present.

I'm not saying it's perfect, but it's not as monolithic as some would suggest. Duncan

The problem with the way cabal wants to mix with make/autoconf is that it is the wrong way round. make is very good at managing pre-processors, dependency tracking and calling external programs in the right order, in parallel, and as needed. cabal is generally good at building a single library or executable given relatively straightforward haskell source. (I know it _can_ do more, but this is mainly what it is good at.) The way this should work is that make determines what haskell libraries need to be built and what haskell files need to be generated to allow cabal to run, and calls cabal to build just the ones needed. cabal as a build tool that make calls is much more flexible and in tune with each tool's capabilities.

The other issue is with cabal files themselves, which are somewhat conflicted in purpose. On one hand, you have declarative stuff about a package: name, version, etc., information you want before you start to build something. But then you have build-depends, which is something that you cannot know until after your configuration manager (whatever it may be, autoconf being a popular one) is run. What packages you depend on are going to depend on things like what compiler you have installed, your configuration options, which packages are installed, what operating system you are running on, which kernel version you are running, which c libraries you have installed, etc. Things that cannot be predicted before the configuration is actually run.

Then you have cabal as a packaging system (or perhaps hackage/cabal considered together), which has its own warts. If it is meant to live in the niche of package managers such as rpm or deb, where are the 'release' version numbers that rpms and debs have, for one example? If it is meant to be a tarball-like format, where is the distinction between 'distribution' and 'source' tarballs? For instance, jhc from darcs for developers requires perl, ghc, DrIFT, pandoc, autotools, and happy; however the jhc tarball requires _only_ ghc, nothing else. This is because the make dist target is more interesting than just tarring up the source. (And posthooks/prehooks don't really help; they are sort of equivalent to saying 'write your own build system'.)

One of the biggest sources of conflict arises from using cabal as a configuration manager. A configuration manager's entire purpose is to examine the system and figure out how to adapt your program's build to the system. This is completely 100% at odds with the idea of users having to 'upgrade' cabal. Figuring out how to adapt your build to whatever cabal is installed, or failing gracefully if you can't, is exactly the job of the configuration manager. Something like autoconf. This is why _users_ need not install autoconf, just developers: autoconf generates a portable script, so users are never told to upgrade their autoconf. If a developer wants to use new features, he gets the new autoconf and reruns 'autoreconf'. The user is never asked to update anything that isn't actually needed for the project itself. This distinction is key for a configuration manager, and it really conflicts with cabal wanting to also be a build system and package manager. It is also what is needed for forwards and backwards compatibility.

All in all, I think these conflicting goals of cabal make it hard to use in projects and have led to very odd design choices. I think external tools should not be the exception but rather the rule. Not that cabal shouldn't come with a full set of said tools.
But as long as they are integrated, I don't see cabal's design problems being fixed, merely augmented with various work-arounds. John -- John Meacham - ⑆repetae.net⑆john⑈

On Wed, 2008-08-27 at 06:13 -0700, John Meacham wrote:
The problem with the way cabal wants to mix with make/autoconf is that it is the wrong way round. make is very good at managing pre-processors, dependency tracking and calling external programs in the right order, in parallel, and as needed. cabal is generally good at building a single library or executable given relatively straightforward haskell source. (I know it _can_ do more, but this is mainly what it is good at).
The way this should work is that make determines what haskell libraries need to be built, and what haskell files need to be generated to allow cabal to run, and calls cabal to build just the ones needed. cabal as a build tool that make calls is much more flexible and in tune with each tool's capabilities.
I'd say if you're using make for all that, then use it to build the haskell modules too. That gives the advantage of incremental and parallel builds, which Cabal does not do yet (though we've got a GSoC project just coming to an end which does this).
The other issue is with cabal files themselves which are somewhat conflicted in purpose. on one hand, you have declarative stuff about a package. name, version, etc... information you want before you start to build something. but then you have build-depends, which is something that you cannot know until after your configuration manager (whatever it may be, autoconf being a popular one) is run.
Ah, but that's where the autoconf and Cabal models part ways.
What packages you depend on are going to depend on things like what compiler you have installed, your configuration options, which packages are installed, what operating system you are running on, which kernel version you are running, which c libraries you have installed. etc. things that cannot be predicted before the configuration is actually run.
So Cabal takes the view that the relationship between features and dependencies should be declarative. autoconf is essentially a function from a platform environment to maybe a configuration. That's a very flexible approach: the function is opaque and can do whatever feature tests it likes. The downside is that it is not possible to work out what the dependencies are. It might be possible if autoconf explained the result of its decisions, but even then, it's not possible to work out what dependencies are required to get a particular feature enabled. With the Cabal approach these things are explicit. The conditionals in a .cabal file can be read in either direction, so it is possible for a package manager to automatically work out what deps would be needed for that optional libcurl feature, or GUI.

The other principle is that the packager, the environment, is in control over what things the package 'sees'. With autoconf, the script can take into account anything it likes, even if you'd rather it did not. E.g. it's important to be able to build a package that does not have that optional dependency, even though the C lib is indeed installed on the build machine, because I may be configuring it for a machine without the C lib. Sure, some good packages allow those automagic decisions to be overridden, but many don't, and of course there is no easy way to tell if a package is picking up deps it should not. So one of the principles in Cabal configuration is that all decisions about how to configure the package are transparent to the packager and can be overridden.

Now currently, Cabal only has a partial implementation of the concept, because when it tries to find a configuration that works in the current environment (which it only does if the configuration is not already fully specified by the packager) it only considers dependencies on haskell packages. Obviously there are a range of other dependencies specified in the .cabal file and it should use them all, in particular external C libs.

So I accept that we do not yet cover the range of configuration choices that are needed by the more complex packages (cf darcs), but I think that we can, and that the approach is basically sound. The fact that we can automatically generate distro packages for hundreds of packages is not insignificant. This is just not possible with the autoconf approach.
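A small illustration of those declarative conditionals; the package and flag are made up, but the syntax is Cabal's configurations syntax. A package manager can read the conditional in either direction: to see which deps the optional feature needs, or to work out how to switch it off.

    flag curl
      description: use libcurl for HTTP support
      default:     True

    library
      build-depends: base
      if flag(curl)
        build-depends:   curl
        exposed-modules: Network.FetchViaCurl

The packager can then override the automatic choice from outside with something like 'setup configure --flags=-curl'.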
Then you have cabal as a packaging system (or perhaps hackage/cabal considered together), which has its own warts. If it is meant to live in the niche of package managers such as rpm or deb, where are the 'release' version numbers that rpms and debs have, for one example? If it is meant to be a tarball-like format, where is the distinction between 'distribution' and 'source' tarballs?
Right, it's supposed to be the upstream release format, tarballs. Distro packages obviously have their additional revision numbers.
For instance, jhc from darcs for developers requires perl, ghc, DrIFT, pandoc, autotools, and happy; however the jhc tarball requires _only_ ghc, nothing else. This is because the make dist target is more interesting than just tarring up the source. (and posthooks/prehooks don't really help. they are sort of equivalent to saying 'write your own build system'.)
Right. Cabal does that too (or strictly speaking, the Simple build system can do this). For pre-processors that are platform independent (like alex, happy etc) it puts the pre-processed source into the release tarball. It's also possible to make tarballs without the pre-generated files if it's important.
One of the biggest sources of conflict arises from using cabal as a configuration manager. A configuration manager's entire purpose is to examine the system and figure out how to adapt your program's build to the system.
Well, that's the autoconf view. It's not the only way of looking at it as I explained above (perhaps not very clearly). I'd say a configuration manager should negotiate between the package and the packager/user/environment to find a configuration that is satisfactory to all (which requires information flow in both directions).
this is completely 100% at odds with the idea of users having to 'upgrade' cabal. Figuring out how to adapt your build to whatever cabal is installed, or failing gracefully if you can't, is exactly the job of the configuration manager. something like autoconf. This is why _users_ need not install autoconf, just developers: autoconf generates a portable script, so users are never told to upgrade their autoconf. if a developer wants to use new features, he gets the new autoconf and reruns 'autoreconf'. The user is never asked to update anything that isn't actually needed for the project itself. This distinction is key for a configuration manager and really conflicts with cabal wanting to also be a build system and package manager. It is also what is needed for forwards and backwards compatibility.
I suppose in principle it'd be possible to ship the build system in every package like autoconf/automake does. Perhaps we should allow that as an option. It's doable since the Setup.hs can import local modules.
All in all, I think these conflicting goals of cabal make it hard to use in projects and have led to very odd design choices. I think external tools should not be the exception but rather the rule. Not that cabal shouldn't come with a full set of said tools. But as long as they are integrated I don't see cabal's design problems being fixed, merely augmented with various work-arounds.
One issue with a pick-and-mix approach is: what is the top level interface that users/package managers use? The current choice (which I'm not at all sure is the right one) is a Setup.hs file that imports its build system from a library that's already on the system (or a custom one implemented locally). So a system that uses make underneath still has to present the Setup.hs interface so that package managers can use it in a uniform way. You mention at the top that you think the make/cabal relationship is the wrong way round, but the Cabal/Setup.hs interface has to be the top level one (at least at the moment), so you'd have Setup.hs call make, and make call it back again to build various bits like libs etc?

Do you think that separating the Simple build system from the declarative part of Cabal would help? It'd make it more obvious that the build system part really is replaceable, which currently is not so obvious since they're in the same package. I'm not averse to splitting them if it'd help. They're already completely partitioned internally. Duncan

On Wed, Aug 27, 2008 at 10:18:59PM +0100, Duncan Coutts wrote:
On Wed, 2008-08-27 at 06:13 -0700, John Meacham wrote:
The problem with the way cabal wants to mix with make/autoconf is that it is the wrong way round. make is very good at managing pre-processors, dependency tracking and calling external programs in the right order, in parallel, and as needed. cabal is generally good at building a single library or executable given relatively straightforward haskell source. (I know it _can_ do more, but this is mainly what it is good at).
The way this should work is that make determines what haskell libraries need to be built, and what haskell files need to be generated to allow cabal to run, and calls cabal to build just the ones needed. cabal as a build tool that make calls is much more flexible and in tune with each tool's capabilities.
I'd say if you're using make for all that, then use it to build the haskell modules too. That gives the advantage of incremental and parallel builds, which Cabal does not do yet (though we've got a GSoC project just coming to an end which does this).
So, don't use cabal at all? That is the solution I have been going with so far and am trying to remedy.
The other issue is with cabal files themselves which are somewhat conflicted in purpose. on one hand, you have declarative stuff about a package. name, version, etc... information you want before you start to build something. but then you have build-depends, which is something that you cannot know until after your configuration manager (whatever it may be, autoconf being a popular one) is run.
Ah, but that's where the autoconf and Cabal models part ways.
What packages you depend on are going to depend on things like what compiler you have installed, your configuration options, which packages are installed, what operating system you are running on, which kernel version you are running, which c libraries you have installed. etc. things that cannot be predicted before the configuration is actually run.
So Cabal takes the view that the relationship between features and dependencies should be declarative. autoconf is essentially a function from a platform environment to maybe a configuration. That's a very flexible approach, the function is opaque and can do whatever feature tests it likes. The downside is that it is not possible to work out what the dependencies are. It might be able to if autoconf explained the result of its decisions, but even then, it's not possible to work out what dependencies are required to get a particular feature enabled. With the Cabal approach these things are explicit.
unfortunately the cabal approach doesn't work. note, I am not saying a declarative configuration manager won't work; in fact, I have sketched a design for one on occasion. but cabal's particular choices are broken. It is treading the same waters that made 'imake' fail. the ideas of forwards and backwards compatibility are _the_ defining features of a configuration manager.

Think about this: I can take my old sunsite CD, burned _ten years_ ago, take the unchanged tarballs off that CD, and ./configure && make, and in general most will work. many were written before linux even existed, many were written with non-gcc compilers, yet they work today. The cabal way wasn't able to handle a single release of ghc and keep forwards or backwards compatibility. That any project ever had to be changed to use the flag 'split-base' is a travesty. What about all the projects on burnt cds or that don't have someone to update them? 20 years from now when we are all using 'fhc' (Fred's Haskell Compiler) will we still have this reference to 'split-base' in our cabal files? how many more flags will have accumulated by then?

Sure it's declarative, but in a language that doesn't make sense without the rule-book. autoconf tests things like 'does a library named foo exist and export bar', 'is char signed or unsigned on the target system'. those are declarative statements and have a defined meaning through all time (though, implemented in a pretty ugly imperative way). That is what allows autoconfed packages to be compiled by compilers on systems that were never dreamed of when the packages were written.
The conditionals in a .cabal file can be read in either direction so it is possible for a package manager to automatically work out what deps would be needed for that optional libcurl feature, or GUI.
In the cabal framework, will cabal be able to do things like cross-compile a C file to an object file and deconstruct the generated ELF file to determine parameters needed for an unknown embedded platform, _and_ do so without requiring the user to upgrade their cabal? This is an example of the type of autoconf test that comes up in the real world. You can never come up with a language that will have every needed primitive; any restricted set will ultimately not be enough for someone, and the only alternative is pretty much to not use cabal at all or hack around it in odd ways.
The other principle is that the packager, the environment is in control over what things the package 'sees'. With autoconf, the script can take into account anything it likes, even if you'd rather it did not. Eg it's important to be able to build a package that does not have that optional dependency, even though the C lib is indeed installed on the build machine, because I may be configuring it for a machine without the C lib. Sure, some good packages allow those automagic decisions to be overridden, but many don't and of course there is no easy way to tell if it's picking up deps it should not. So one of the principles in Cabal configuration is that all decisions about how to configure the package are transparent to the packager and can be overridden.
I am not sure what you mean by this. autoconf's flexibility in this regard is pretty exceptional when written properly. Native cross-compilation is one of autoconfs strengths and a big motivating factor in its design.
Now currently, Cabal only has a partial implementation of the concept because when it tries to find a configuration that works in the current environment (which it only does if the configuration is not already fully specified by the packager) it only considers dependencies on haskell packages. Obviously there are a range of other dependencies specified in the .cabal file and it should use them all, in particular external C libs.
And there are many other possible implementations of configuration managers. I fully believe that the next big one will come out of the haskell community; we are a good bunch of people. But it won't if innovation is stifled by cabal _insisting_ on using its own configuration manager and cabal being promoted as 'the way' to do things. This is completely independent of my opinions of cabal as a configuration manager: I would just hate to see such an enticing area of research be cut off prematurely. If cabal is going to be the way to do things with haskell, that means it cannot be the place to try out one's own pet projects about how one thinks things should be. A declarative configuration manager is an intriguing project, one I want to see people work on in different directions, but it is new research.
So I accept that we do not yet cover the range of configuration choices that are needed by the more complex packages (cf darcs), but I think that we can and that the approach is basically sound. The fact that we can automatically generate distro packages for hundreds of packages is not insignificant. This is just not possible with the autoconf approach.
This is just utterly untrue. autoconfed packages that generate rpms, debs, etc. are quite common. The only reason cabal can autogenerate distro packages for so many is that many interesting or hard ones just _aren't possible with cabal at all_. Cabal's inflexibility puts a huge selection bias on the population of cabalized programs.
Then you have cabal as a packaging system (or perhaps hackage/cabal considered together), which has its own warts. If it is meant to live in the niche of package managers such as rpm or deb, where are the 'release' version numbers that rpms and debs have, for one example? If it is meant to be a tarball-like format, where is the distinction between 'distribution' and 'source' tarballs?
Right, it's supposed to be the upstream release format, tarballs. Distro packages obviously have their additional revision numbers.
one might say hackage is a distro in and of itself, so it should have similar numbers. reusing the same file directly for the packager and the build system makes things like this trickier than they need to be.
For instance, jhc from darcs for developers requires perl, ghc, DrIFT, pandoc, autotools, and happy; however the jhc tarball requires _only_ ghc, nothing else. This is because the make dist target is more interesting than just tarring up the source. (and posthooks/prehooks don't really help. they are sort of equivalent to saying 'write your own build system'.)
Right. Cabal does that too (or strictly speaking, the Simple build system can do this). For pre-processors that are platform independent (like alex, happy etc) it puts the pre-processed source into the release tarball. It's also possible to make tarballs without the pre-generated files if it's important.
Sort of. but cabal can only do these things because they are _built in_ to cabal. make will happily use DrIFT, figure out dependencies for ghc, gcc, and jhc, and build my rpms without having to be modified itself. Because it was designed that way.
One of the biggest sources of conflict arises from using cabal as a configuration manager. A configuration manager's entire purpose is to examine the system and figure out how to adapt your program's build to the system.
Well, that's the autoconf view. It's not the only way of looking at it as I explained above (perhaps not very clearly). I'd say a configuration manager should negotiate between the package and the packager/user/environment to find a configuration that is satisfactory to all (which requires information flow in both directions).
this is completely 100% at odds with the idea of users having to 'upgrade' cabal. Figuring out how to adapt your build to whatever cabal is installed, or failing gracefully if you can't, is exactly the job of the configuration manager. something like autoconf. This is why _users_ need not install autoconf, just developers: autoconf generates a portable script, so users are never told to upgrade their autoconf. if a developer wants to use new features, he gets the new autoconf and reruns 'autoreconf'. The user is never asked to update anything that isn't actually needed for the project itself. This distinction is key for a configuration manager and really conflicts with cabal wanting to also be a build system and package manager. It is also what is needed for forwards and backwards compatibility.
I suppose in principle it'd be possible to ship the build system in every package like autoconf/automake does. Perhaps we should allow that as an option. It's doable since the Setup.hs can import local modules.
I don't see what you mean: autoconf doesn't "ship the build system" with the package any more than ghc ships ghc with every binary it produces. autoconf is, by design, a _compiler_ of a domain-specific language to a portable intermediate language. This means that autoconf need not be upgraded or installed by users, yet developers are free to take advantage of autoconf's newest features without troubling their users, because what is distributed is autoconf's compiled output. If a user has to upgrade their cabal to install a package, then cabal is broken as a configuration manager by design. If I were willing to make a user upgrade their system, there would be _no need_ for a configuration manager at all. The problem of building the most recent and updated library with the most recent and updated compiler on a fully up-to-date system is a _non-problem_.
All in all, I think these conflicting goals of cabal make it hard to use in projects and have led to very odd design choices. I think external tools should not be the exception but rather the rule. Not that cabal shouldn't come with a full set of said tools. But as long as they are integrated I don't see cabal's design problems being fixed, merely augmented with various work-arounds.
One issue with a pick-and-mix approach is: what is the top level interface that users/package managers use? The current choice (which I'm not at all sure is the right one) is a Setup.hs file that imports its build system from a library that's already on the system (or a custom one implemented locally). So a system that uses make underneath still has to present the Setup.hs interface so that package managers can use it in a uniform way. You mention at the top that you think the make/cabal relationship is the wrong way round, but the Cabal/Setup.hs interface has to be the top level one (at least at the moment), so you'd have Setup.hs call make, and make call it back again to build various bits like libs etc?
Right now I just have ./configure && make be the way to build things, and the ./configure generates an appropriate cabal file when needed. But the 'cabal proxy' stub cabal file, similar to what you are saying, is also something I have considered (only for haskell libraries I want to put on hackage), but it is far from ideal. As for programs written in haskell, I don't want people's first impression of haskell being "oh crap, I gotta learn a new way to build things just because this program is written in some odd language called 'haskell'". I don't care how awesome a language is, I am going to be annoyed by having to deal with it when I just want to compile/install a program. It will leave a bad taste in my mouth. I would much rather people's first impression be "oh wow, this program is pretty sweet. I wonder what it is written in?" Hence they all use ./configure && make by design rather than necessity.
Do you think that separating the Simple build system from the declarative part of Cabal would help? It'd make it more obvious that the build system part really is replaceable which currently is not so obvious since they're in the same package. I'm not averse to splitting them if it'd help. They're already completely partitioned internally.
Yes, it would help significantly if it were its own program, invoked by cabal just like hmake or make or mk or cook or bake would be. It would be a step in the right direction. But what I'd really like to see is a split of the configuration management from the parts that merely describe the package. I sometimes hear that I just shouldn't use cabal for some projects, but when it comes down to it, if cabal is a limited build/configuration system in any way, why would I ever choose it when starting a project, knowing it is either putting a limit on my project's ability to innovate or that at some point in the future I am going to have to switch build systems? If cabal isn't suitable or convenient for some projects (which we all admit) and cabal is the haskell way of doing things, then the perception will be that _haskell_ is not suitable for said projects. And that is what I fear. John -- John Meacham - ⑆repetae.net⑆john⑈

John Meacham wrote:
On Wed, Aug 27, 2008 at 10:18:59PM +0100, Duncan Coutts wrote:
So I accept that we do not yet cover the range of configuration choices that are needed by the more complex packages (cf darcs), but I think that we can and that the approach is basically sound. The fact
that we can automatically generate distro packages for hundreds of packages is not insignificant. This is just not possible with the autoconf approach.
This is just utterly untrue. autoconfed packages that generate rpms, debs, etc are quite common.
Can you give an example of how this works? I would expect autoconf scripts to be completely missing the necessary metadata to do this.
As for programs written in haskell, I don't want people's first impression of haskell being "oh crap, I gotta learn a new way to build things just because this program is written in some odd language called 'haskell'". I don't care how awesome a language is, I am going to be annoyed by having to deal with it when I just want to compile/install a program. It will leave a bad taste in my mouth. I would much rather people's first impression be "oh wow, this program is pretty sweet. I wonder what it is written in?" Hence they all use ./configure && make by design rather than necessity.
On the flip side, ./configure && make is completely useless on native windows (i.e. without cygwin, mingw or the like) platforms, whereas cabal works everywhere GHC does. Cheers, Ganesh

John Meacham wrote:
Unfortunately, the cabal approach doesn't work. Note, I am not saying a declarative configuration manager won't work; in fact, I have sketched a design for one on occasion. But cabal's particular choices are broken. It is treading the same waters that made 'imake' fail.
The ideas of forwards and backwards compatibility are _the_ defining features of a configuration manager. Think about this: I can take my old sunsite CD, burned _ten years_ ago, take the unchanged tarballs off that CD, and ./configure && make, and in general most will work. Many were written before linux even existed, many were written with non-gcc compilers, yet they work today. The cabal way wasn't able to handle a single release of ghc and keep forwards or backwards compatibility.
That any project ever had to be changed to use the flag 'split-base' is a travesty. What about all the projects on burnt CDs, or that don't have someone to update them? 20 years from now, when we are all using 'fhc' (Fred's Haskell Compiler), will we still have this reference to 'split-base' in our cabal files? How many more flags will have accumulated by then? Sure it's declarative, but in a language that doesn't make sense without the rule-book. autoconf tests things like 'does a library named foo exist and export bar', 'is char signed or unsigned on the target system'. Those are declarative statements and have a defined meaning through all time (though implemented in a pretty ugly imperative way). That is what allows autoconfed packages to be compiled by compilers on systems that were never dreamed of when the packages were written.
The important thing about Cabal's way of specifying dependencies is that they can be made sound with not much difficulty. If I say that my package depends on base==3.0 and network==1.0, then I can guarantee that as long as those dependencies are present then my package will build. ("but but but..." I hear you say - don't touch that keyboard yet!)

Suppose you used autoconf tests instead. You might happen to know that Network.Socket.blah was added at some point and write a test for that, but alas if you didn't also write a test for Network.Socket.foo (which your code uses but ends up getting removed in network-1.1) then your code breaks. Autoconf doesn't help you make your configuration sound, and you get no prior guarantee that your code will build.

Now, Cabal's dependencies have the well-known problem that they're exceptionally brittle, because they either overspecify or underspecify, and it's not possible to get it "just right". On the other hand, autoconf configurations tend to underspecify dependencies, because you typically only write an autoconf test for something that you know has changed in the past - you don't know what's going to change in the future, so you usually just hope for the best. For Cabal I can ask the question "if I modify the API of package P, which other packages might be broken as a result?", but I can't do that with autoconf.

Both systems are flawed, but neither fundamentally. For Cabal I think it would be interesting to look into using more precise dependencies (module.identifier::type, rather than package-version) and have them auto-generated. But this has difficult implications: implementing cabal-install's installation plans becomes much harder, for example.
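As a sketch (all names invented for illustration, not a concrete proposal for Cabal's types), such an auto-generated precise dependency might be represented as:

    -- Hypothetical representation: instead of "network == 1.0" we record
    -- each identifier a package uses together with the type it expects
    -- that identifier to have.
    data PreciseDep = PreciseDep
      { depModule :: String   -- e.g. "Network.Socket"
      , depIdent  :: String   -- e.g. "connect"
      , depType   :: String   -- e.g. "Socket -> SockAddr -> IO ()"
      } deriving (Eq, Show)

    -- A package's requirements are then a set of these, checkable
    -- directly against what an installed package actually exports.
    type Requires = [PreciseDep]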
So I accept that we do not yet cover the range of configuration choices that are needed by the more complex packages (cf darcs), but I think that we can and that the approach is basically sound. The fact that we can automatically generate distro packages for hundreds of packages is not insignificant. This is just not possible with the autoconf approach.
This is just utterly untrue. autoconfed packages that generate rpms, debs, etc. are quite common. The only reason cabal can autogenerate distro packages for so many is that many interesting or hard ones just _aren't possible with cabal at all_.
Exactly! Cabal is designed so that a distro packager can write a program that takes a Cabal package and generates a distro package for their distro. It has to do distro-specific stuff, but it doesn't typically need to do package-specific stuff. To generate a distro package from an autoconf package either the package author has to include support for that distro, or a distro packager has to write specific support for that package. There's no way to do generic autoconf->distro package generation, like there is with Cabal.

Yes this means that Cabal is less general than autoconf. It was quite a revelation when we discovered this during the design of Cabal - originally we were going to have everything done programmatically in the Setup.hs file, but then we realised that having the package configuration available *as data* gave us a lot more scope for automation, albeit at the expense of some generality. That's the tradeoff - but there's still nothing stopping you from using autoconf and your own build system instead if you need to!
As for programs written in haskell, I don't want people's first impression of haskell being "oh crap, I gotta learn a new way to build things just because this program is written in some odd language called 'haskell'". I don't care how awesome a language is, I am going to be annoyed by having to deal with it when I just want to compile/install a program. It will leave a bad taste in my mouth. I would much rather people's first impression be "oh wow, this program is pretty sweet. I wonder what it is written in?" Hence they all use ./configure && make by design rather than necessity.
Python packages don't have ./configure or make...
I sometimes hear that I just shouldn't use cabal for some projects, but when it comes down to it, if cabal is a limited build/configuration system in any way, why would I ever choose it when starting a project, when I know it is either putting a limit on my project's ability to innovate or that at some point in the future I am going to have to switch build systems?
Because if you *can* use Cabal, you get a lot of value-adds for free (distro packages, cabal-install, Haddock, source distributions, Hackage). What's more, it's really cheap to use Cabal: a .cabal file is typically less than a screenful, so it's no big deal to switch to something else later if you need to. Cheers, Simon

| Yes this means that Cabal is less general than autoconf. It was quite a
| revelation when we discovered this during the design of Cabal - originally
| we were going to have everything done programmatically in the Setup.hs
| file, but then we realised that having the package configuration available
| *as data* gave us a lot more scope for automation, albeit at the expense of
| some generality.

Now there's a useful insight for the paper I hope Duncan (or someone) is going to write:

    configuration as code [autoconf] vs configuration as data [cabal]

Simon

We do have, although not with easy access, an additional declarative layer "built in" 90% of the time: configuration as type signature. An autoconf-style approach to this, where each type signature dependency is declared separately, would be needlessly complex and painful. However, there is room for a fruitful middle ground.

Thanks to Hackage, at least for those packages that build and haddock properly on it, we have, although not in the best format, information on the type signatures of the functions of packages, across various package versions. So if I, when writing a cabal script, don't particularly want to figure out the exact range of, e.g., Network packages that provide the correct API, it would be fairly reasonable to statically determine which functions from the Network package are called, and which versions of Network on hackage provide them, and with the appropriate types no less. Thus, given that we need "Network", a tool could determine what the correct allowable range of versions is, and thus avoid both over- and under-specification. This same tool could be run over existing .cabal files on hackage, statically determining when they are likely to either over- or under-specify, and alerting package maintainers to this.
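A rough sketch of the core of such a tool (all names here are invented for illustration):

    -- Given a catalogue mapping each version of a package to the
    -- (qualified name, type) pairs it exports (harvested from Hackage),
    -- keep the versions that provide everything this package uses.
    import qualified Data.Map as Map
    import Data.Version (Version)

    type Export    = (String, String)        -- (qualified name, type)
    type Catalogue = Map.Map Version [Export]

    compatibleVersions :: [Export] -> Catalogue -> [Version]
    compatibleVersions used catalogue =
      [ v | (v, exports) <- Map.toList catalogue
          , all (`elem` exports) used ]

--Sterl.

On Aug 28, 2008, at 10:02 AM, Simon Peyton-Jones wrote: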
| Yes this means that Cabal is less general than autoconf. It was quite a | revelation when we discovered this during the design of Cabal - originally | we were going to have everything done programmatically in the Setup.hs | file, but then we realised that having the package configuration available | *as data* gave us a lot more scope for automation, albeit at the expense of | some generality.
Now there's a useful insight for the paper I hope Duncan (or someone) is going to write
configuration as code [autoconf] vs configuration as data [cabal]
Simon

On 2008 Aug 28, at 22:00, Sterling Clover wrote:
We do have, although not with easy access, an additional declarative layer "built in" 90% of the time as configuration as type signature.
Sure? I think it's easier than you think: someone's already written code to extract the information from .hi files (and indeed ghc will dump it for you: ghc --show-iface foo.hi). In theory there could be a master dictionary of these hosted on hackage, collected from each package's own dictionary, and a given package's dependencies could be computed with high accuracy from it.
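A minimal sketch along those lines, assuming the process library's readProcess is available (properly parsing the interface dump is left out):

    -- Dump a compiled module's interface so its export list can be
    -- harvested; a dictionary builder would parse the exports section
    -- of this output.
    import System.Process (readProcess)

    showIface :: FilePath -> IO String
    showIface hiFile = readProcess "ghc" ["--show-iface", hiFile] ""

-- 
brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com
system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu
electrical and computer engineering, carnegie mellon university KF8NH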

Brandon S. Allbery KF8NH wrote:
On 2008 Aug 28, at 22:00, Sterling Clover wrote:
We do have, although not with easy access, an additional declarative layer "built in" 90% of the time as configuration as type signature.
Sure? I think it's easier than you think: someone's already written code to extract the information from .hi files (and indeed ghc will dump it for you: ghc --show-iface foo.hi). In theory there could be a master dictionary of these hosted on hackage, collected from each package's own dictionary, and a given package's dependencies could be computed with high accuracy from it.
It's a good idea, but conditional compilation makes it quite a bit harder. Cheers, Simon

On Thu, 2008-08-28 at 15:02 +0100, Simon Peyton-Jones wrote:
| Yes this means that Cabal is less general than autoconf. It was quite a | revelation when we discovered this during the design of Cabal - originally | we were going to have everything done programmatically in the Setup.hs | file, but then we realised that having the package configuration available | *as data* gave us a lot more scope for automation, albeit at the expense of | some generality.
Now there's a useful insight for the paper I hope Duncan (or someone) is going to write
configuration as code [autoconf] vs configuration as data [cabal]
and there are more fine distinctions even than that. Each change in the power of the language used for configuration changes the range of things that the developer and packager/user can do, and in opposite directions. It's fairly similar to the tradeoffs between deep and shallow embeddings, but I think we have more intermediate points. The challenge is in characterising the relationship between the language and the things the developer and packager can do, so that we can pick a useful point (or points) in that tradeoff. Before anyone thinks about writing a paper on this topic, I recommend you read all of Eelco's papers[1] first just to make sure he's not already done it! :-) Which is another point: there's lots that Cabal (and ghc) can learn from Nix and related stuff. Duncan [1] http://nixos.org/docs/papers.html

On 28/08/2008, at 23:59, Simon Marlow wrote:
The important thing about Cabal's way of specifying dependencies is that they can be made sound with not much difficulty. If I say that my package depends on base==3.0 and network==1.0, then I can guarantee that as long as those dependencies are present then my package will build. ("but but but..." I hear you say - don't touch that keyboard yet!)
Suppose you used autoconf tests instead. You might happen to know that Network.Socket.blah was added at some point and write a test for that, but alas if you didn't also write a test for Network.Socket.foo (which your code uses but ends up getting removed in network-1.1) then your code breaks. Autoconf doesn't help you make your configuration sound, and you get no prior guarantee that your code will build.
Cabal doesn't give this guarantee, either, since it allows you to depend on just network or on network>x. To be perfectly honest, I think neither autoconf's approach (free-form feature tests) nor Cabal's (version-based dependencies) really work for all important use cases. And I have to disagree with what you write below - I think both systems are fundamentally flawed. As I said before, what does (mostly) work IMO is depending on interfaces which are independent of packages. Being required to specify the exact interface you depend on solves the problem you describe above. It also solves the problem of name clashes with functions defined in later versions of a package. And it is still nicely declarative.
Both systems are flawed, but neither fundamentally. For Cabal I think it would be interesting to look into using more precise dependencies (module.identifier::type, rather than package-version) and have them auto-generated. But this has difficult implications: implementing cabal-install's installation plans becomes much harder, for example.
Interesting. From our previous discussion I got the impression that you wouldn't like something like this. :-) Roman

Roman Leshchinskiy wrote:
On 28/08/2008, at 23:59, Simon Marlow wrote:
The important thing about Cabal's way of specifying dependencies is that they can be made sound with not much difficulty. If I say that my package depends on base==3.0 and network==1.0, then I can guarantee that as long as those dependencies are present then my package will build. ("but but but..." I hear you say - don't touch that keyboard yet!)
Suppose you used autoconf tests instead. You might happen to know that Network.Socket.blah was added at some point and write a test for that, but alas if you didn't also write a test for Network.Socket.foo (which your code uses but ends up getting removed in network-1.1) then your code breaks. Autoconf doesn't help you make your configuration sound, and you get no prior guarantee that your code will build.
Cabal doesn't give this guarantee, either, since it allows you to depend on just network or on network>x.
Indeed. That's why I was careful not to say that Cabal gives you the guarantee, only that it's easy to achieve it.
Both systems are flawed, but neither fundamentally. For Cabal I think it would be interesting to look into using more precise dependencies (module.identifier::type, rather than package-version) and have them auto-generated. But this has difficult implications: implementing cabal-install's installation plans becomes much harder, for example.
Interesting. From our previous discussion I got the impression that you wouldn't like something like this. :-)
Sorry for giving that impression. Yes I'd like to solve the problems that Cabal dependencies have, but I don't want the solution to be too costly - first-class interfaces seem too heavyweight to me. But I do agree with most of the arguments you gave in their favour. Cheers, Simon

On 29/08/2008, at 01:31, Simon Marlow wrote:
Roman Leshchinskiy wrote:
On 28/08/2008, at 23:59, Simon Marlow wrote:
The important thing about Cabal's way of specifying dependencies is that they can be made sound with not much difficulty. If I say that my package depends on base==3.0 and network==1.0, then I can guarantee that as long as those dependencies are present then my package will build. ("but but but..." I hear you say - don't touch that keyboard yet!)
Suppose you used autoconf tests instead. You might happen to know that Network.Socket.blah was added at some point and write a test for that, but alas if you didn't also write a test for Network.Socket.foo (which your code uses but ends up getting removed in network-1.1) then your code breaks. Autoconf doesn't help you make your configuration sound, and you get no prior guarantee that your code will build. Cabal doesn't give this guarantee, either, since it allows you to depend on just network or on network>x.
Indeed. That's why I was careful not to say that Cabal gives you the guarantee, only that it's easy to achieve it.
True, it's easy to specify. But IIUC, if you do so you have to update your package whenever any of the packages you depend on changes even if that change doesn't affect you. This is a very high (if not prohibitive) cost and one which the autoconf model doesn't force on you.
Both systems are flawed, but neither fundamentally. For Cabal I think it would be interesting to look into using more precise dependencies (module.identifier::type, rather than package-version) and have them auto-generated. But this has difficult implications: implementing cabal-install's installation plans becomes much harder, for example. Interesting. From our previous discussion I got the impression that you wouldn't like something like this. :-)
Sorry for giving that impression. Yes I'd like to solve the problems that Cabal dependencies have, but I don't want the solution to be too costly - first-class interfaces seem too heavyweight to me. But I do agree with most of the arguments you gave in their favour.
I'm not sure what you mean by first-class interfaces. Surely, if you specify the interfaces you depend on you'll want to share and reuse those specifications. Roman

On Thu, Aug 28, 2008 at 02:59:16PM +0100, Simon Marlow wrote:
The important thing about Cabal's way of specifying dependencies is that they can be made sound with not much difficulty. If I say that my package depends on base==3.0 and network==1.0, then I can guarantee that as long as those dependencies are present then my package will build. ("but but but..." I hear you say - don't touch that keyboard yet!)
I can easily achieve this with autoconf or even nothing: I can simply do a test to see if a system is running fedora core 9 using ghc 6.8.2 and be assured that my package will build properly. But this misses the entire point. I want my package to build not just on my exact system, I want it to build on _other_ people's systems. People running with compilers and libraries and on operating systems I never heard of.

However, this has the huge flaw of requiring a closed universe: a complete and universal definition of what 'network == 1.0' means for all time, which all future compilers must agree on. It places a huge burden on implementors to provide a 'network==1.0' compatible interface, simply so cabal doesn't complain, even though all programs would be happy with a jhc-network 0.7 or an internet-5.0 package. It means that with jhc-network, which has 90% of the functionality of network, including everything that 99.9% of all programs need, every program will have to either know about jhc-network and edit its cabal file to include it conditionally, or it just won't work at all.

Note, this is similar to the problem of symbol versioning placed on shared libraries. There is a fair amount of literature on the subject. Most unix .so's used to have something similar to the current cabal model, a version number with a minor/major part; it was found to lead to dll hell (well, .so hell), and we don't want to be in that place with haskell (package hell?). Linux hence switched to its current system that has an individual version number for every api function. I am not saying that is the solution for haskell, but I do not see the current cabal approach scaling any better than the old unix one; it will lead to the same problems.
Suppose you used autoconf tests instead. You might happen to know that Network.Socket.blah was added at some point and write a test for that, but alas if you didn't also write a test for Network.Socket.foo (which your code uses but ends up getting removed in network-1.1) then your code breaks. Autoconf doesn't help you make your configuration sound, and you get no prior guarantee that your code will build.
And with cabal it breaks there in addition to another 80% of times when it could have worked just fine. The autoconf feature test is strictly superior here.
Now, Cabal's dependencies have the well-known problem that they're exceptionally brittle, because they either overspecify or underspecify, and it's not possible to get it "just right". On the other hand, autoconf configurations tend to underspecify dependencies, because you typically only write an autoconf test for something that you know has changed in the past - you don't know what's going to change in the future, so you usually just hope for the best. For Cabal I can ask the question "if I modify the API of package P, which other packages might be broken as a result?", but I can't do that with autoconf.
But the only reason they are broken is cabal's sledgehammer approach to package versioning. There is no reason an autoconf-style system couldn't do the same thing. And again, you are assuming you can even enumerate all the packages that exist to find out which might be broken, and what does that really give you in any case? By changing the API you know you are going to break some things, but what about all the company-internal software out there that uses haskell? You can't look at all their packages. It just does not seem like a very useful thing to ask, as it is a question that can be answered by 'grep'.
Both systems are flawed, but neither fundamentally. For Cabal I think it would be interesting to look into using more precise dependencies (module.identifier::type, rather than package-version) and have them auto-generated. But this has difficult implications: implementing cabal-install's installation plans becomes much harder, for example.
Again, I would like to see this as another option. I think there are interesting ideas in cabal about configuration management. But there needs to be room for alternates, including old standbys like autoconf.
So I accept that we do not yet cover the range of configuration choices that are needed by the more complex packages (cf darcs), but I think that we can and that the approach is basically sound. The fact that we can automatically generate distro packages for hundreds of packages is not insignificant. This is just not possible with the autoconf approach.
This is just utterly untrue. autoconfed packages that generate rpms, debs, etc. are quite common. The only reason cabal can autogenerate distro packages for so many is that many interesting or hard ones just _aren't possible with cabal at all_.
Exactly! Cabal is designed so that a distro packager can write a program that takes a Cabal package and generates a distro package for their distro. It has to do distro-specific stuff, but it doesn't typically need to do package-specific stuff.
To generate a distro package from an autoconf package either the package author has to include support for that distro, or a distro packager has to write specific support for that package. There's no way to do generic autoconf->distro package generation, like there is with Cabal.
In cabal you only get it because you convinced the cabal people to put in code to support your distro, which isn't much different than asking the package manager people to. And besides, this ability has nothing to do with cabal's configuration management capabilities, simply its metadata format, which can easily be abstracted out and not tied to cabal. (Which I would love to see: cabal has a lot of good ideas, but due to its design, its bad ideas are complete showstoppers rather than things you can replace.) And there are many automatic package managers for autoconf-style packages. http://www.toastball.net/toast/ is a good one; it even downloads dependencies from freshmeat when needed. In fact, your projects can probably be auto-installed by 'toast projectname' and you didn't even know it! http://encap.org/ is another, one I use on pretty much all my systems, since it is distro-independent.
Yes this means that Cabal is less general than autoconf. It was quite a revelation when we discovered this during the design of Cabal - originally we were going to have everything done programmatically in the Setup.hs file, but then we realised that having the package configuration available *as data* gave us a lot more scope for automation, albeit at the expense of some generality.
Note, I wholeheartedly agree with the idea of package configuration as data. In fact, when cabal first started, I was a huge advocate of it, and in fact _lost interest_ in the project because of the decision to go with the programmatic Setup.hs rather than a declarative approach. However, I think cabal is a _poor execution_ of the idea. And this problem is compounded by the fact that it is being promoted as the haskell way to do things, so its design decisions are affecting the development and evolution of the base libraries. And its monolithic nature, and attitude of wanting to take over your whole project's build cycle, means that alternate approaches cannot be explored.
That's the tradeoff - but there's still nothing stopping you from using autoconf and your own build system instead if you need to!
But it is a false tradeoff. The only reason one needs to make that tradeoff is that cabal's design doesn't allow the useful ability to mix and match parts of it. I would prefer to see cabal improved so I _can_ use its metadata format, and its configuration manager for simple projects, autoconf's for more complex ones (with full knowledge of the tradeoffs), and without jumping through hoops.
As for programs written in haskell, I don't want people's first impression of haskell being "oh crap, I gotta learn a new way to build things just because this program is written in some odd language called 'haskell'". I don't care how awesome a language is, I am going to be annoyed by having to deal with it when I just want to compile/install a program. It will leave a bad taste in my mouth. I would much rather people's first impression be "oh wow, this program is pretty sweet. I wonder what it is written in?" Hence they all use ./configure && make by design rather than necessity.
Python packages don't have ./configure or make...
Some don't. And it bugs the hell out of me. They don't work with my autopackaging tools.
I sometimes hear that I just shouldn't use cabal for some projects, but when it comes down to it, if cabal is a limited build/configuration system in any way, why would I ever choose it when starting a project, when I know it is either putting a limit on my project's ability to innovate or that at some point in the future I am going to have to switch build systems?
Because if you *can* use Cabal, you get a lot of value-adds for free (distro packages, cabal-install, Haddock, source distributions, Hackage). What's more, it's really cheap to use Cabal: a .cabal file is typically less than a screenful, so it's no big deal to switch to something else later if you need to.
except suddenly you can't use hackage, and have to come up with a new build system, and perhaps upset my users as they have to learn a new way to build the project. The fact that it _is_ a big deal to replace cabal is the main issue I have. Switching involves changing your build system completely; you can't replace just parts of it easily, or integrate cabal from the bottom up rather than the top down. And it wants to be the _one true_ build system in your project. I'd like to see a standardized meta-info format for just haskell libraries, based on the current cabal format without the cabal-specific build information. (This is what jhc uses, and franchise too, I think.) Just like the 'lsm' linux software map files. Preferably YAML; we are pretty darn close already, and it would give us parsers in many languages for free. We already have several tools that can use the meta-info (jhc, cabal, franchise, hackage (for the web site layout)), so it seems like abstracting it from the build info would be a useful step in the right direction. John -- John Meacham - ⑆repetae.net⑆john⑈

On Thu, Aug 28, 2008 at 03:16:16PM -0700, John Meacham wrote:
On Thu, Aug 28, 2008 at 02:59:16PM +0100, Simon Marlow wrote:
To generate a distro package from an autoconf package either the package author has to include support for that distro, or a distro packager has to write specific support for that package. There's no way to do generic autoconf->distro package generation, like there is with Cabal.
In cabal you only get it because you convinced the cabal people to put in code to support your distro, which isn't much different than asking the package manager people to.
I don't understand this. Cabal doesn't have any distro-specific code.
And besides, this ability has nothing to do with cabal's configuration management capabilities, simply its metadata format.
I don't understand this either. You imply you like Cabal's metadata, which says "I depend on network version 1", right? But you don't like Cabal's configuration management? What is Cabal's configuration management, then?
and there are many automatic package managers for autoconf style packages.
http://encap.org/ - one I use on pretty much all my systems. since it is distro independent.
OK, so here's an encap package for DBI:
http://encap.org/search/encapinfo.fcgi?collection=cites&archive=DBI-1.21-encap-sparc-solaris8.tar.gz
ftp://ftp.encap.org/pub/encap/pkgs/cites/DBI-1.21-encap-sparc-solaris8.tar.gz
This tarball contains an encapinfo file that says
encap 2.1 # libencap-2.2.1
platform sparc-solaris8
date Mon Mar 11 19:38:02 CST 2002
contact "Mark D. Roth"

On Fri, Aug 29, 2008 at 12:21:10AM +0100, Ian Lynagh wrote:
You imply you like Cabal's metadata, which says "I depend on network version 1", right?
no, I mean a standard way to specify a package name, a description of it, a category, etc..
But you don't like Cabal's configuration management? What is Cabal's configuration management, then?
the build-depends field mainly; pretty much everything dealing with cabal building a package, rather than just describing the _result_ of a successful build. One is constant between build systems and would be generally useful to standardize independently; the other depends on the specific build system/configuration manager used.
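To make the split concrete, here is a hypothetical record (field names invented for illustration) for the build-system-independent half:

    -- Pure description of the package: nothing here says how to build,
    -- only what the result of a successful build is.
    data PackageMeta = PackageMeta
      { pkgName       :: String
      , pkgVersion    :: String
      , pkgSynopsis   :: String
      , pkgCategory   :: String
      , pkgLicense    :: String
      , pkgMaintainer :: String
      } deriving Show

John -- John Meacham - ⑆repetae.net⑆john⑈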

John Meacham wrote:
On Thu, Aug 28, 2008 at 02:59:16PM +0100, Simon Marlow wrote:
The important thing about Cabal's way of specifying dependencies is that they can be made sound with not much difficulty. If I say that my package depends on base==3.0 and network==1.0, then I can guarantee that as long as those dependencies are present then my package will build. ("but but but..." I hear you say - don't touch that keyboard yet!)
I can easily achieve this with autoconf or even nothing, I can simply do a test to see if a system is running fedora core 9 using ghc 6.8.2 and be assured that my package will build properly. But this misses the entire point, I want my package to build not on my exact system, I want it to build on _other_ peoples systems. People running with compilers and libraries and on operating systems I never heard of.
But you can only do that by carefully enumerating all the dependencies of your code. autoconf doesn't help you do that - you end up underspecifying the dependencies. Cabal makes you overspecify. It's a soundness/completeness thing: Cabal is sound(*1), autoconf is complete(*2). You complain that Cabal is incomplete and I complain that autoconf is unsound. I'd like to make Cabal's dependency specs more complete, but I don't want to make it unsound.

(*1) as long as you specify dependencies with both upper and lower bounds
(*2) as long as you don't overspecify dependencies

I'd be interested in discussing how to improve Cabal's dependency specifications, if you have any thoughts on that.
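A toy illustration (not Cabal's actual API) of the soundness point in (*1):

    import Data.Version (Version(..))

    -- A dependency with both bounds: every version accepted is one whose
    -- interface is known, so a build against it cannot be surprised by
    -- additions or removals.
    data Dep = Dep { depName :: String, depLo, depHi :: Version }

    satisfies :: String -> Version -> Dep -> Bool
    satisfies name v d = name == depName d && depLo d <= v && v < depHi d

    -- e.g. network >= 1.0 && < 1.1
    networkDep :: Dep
    networkDep = Dep "network" (Version [1,0] []) (Version [1,1] [])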
Again, I would like to see this as another option. I think there are interesting ideas in cabal about configuration management. But there needs to be room for alternates including old standby's like autoconf
autoconf isn't suitable as a replacement for Cabal's dependency specifications, because it doesn't specify dependencies. I couldn't use an autoconf-configured package with cabal-install, for example.
To generate a distro package from an autoconf package either the package author has to include support for that distro, or a distro packager has to write specific support for that package. There's no way to do generic autoconf->distro package generation, like there is with Cabal.
In cabal you only get it because you convinced the cabal people to put in code to support your distro. Which isn't much different than asking the package manager too.
False! All of the distro packaging tools for Cabal are separate entities built using the Cabal library.
and there are many automatic package managers for autoconf style packages.
http://www.toastball.net/toast/ is a good one, it even downloads dependencies from freshmeat when needed. in fact, your projects can probably be auto installed by 'toast projectname' and you didn't even know it!
As I understand it, toast doesn't download and build dependencies, you have to know what the dependencies are. (maybe I'm wrong, but that's the impression I got from looking at the docs, and if it *does* know about dependencies, I'd like to know how).
http://encap.org/ - one I use on pretty much all my systems. since it is distro independent.
Again, dependencies are not tracked automatically, you (or someone else) have to specify them by hand.
That's the tradeoff - but there's still nothing stopping you from using autoconf and your own build system instead if you need to!
But it is a false tradeoff. the only reason one needs to make that tradeoff is because cabals design doesn't allow the useful ability to mix-n-match parts of it. I would prefer to see cabal improved so I _can_ use its metadata format, its configuration manager for simple projects, autoconf's for more complex ones (with full knowledge of the tradeoffs) and without jumping through hoops.
No, it is a tradeoff. We want packages on Hackage to be automatically installable by cabal-install, for one thing. That means they have to say what their dependencies are.
The fact is that it _is_ a big deal to replace cabal is the main issue I have. switching involves changing your build system completely. you can't replace just parts of it easily. Or integrate cabal from the bottom up rather than the top down. And it wants to be the _one true_ build system in your project.
The counterexample again is the GHC build system: we integrate make and Cabal and autoconf, and we're planning to do more of it with make. Have you thought about how to change Cabal to do what you want? It's only code, after all :-)
I'd like to see a standardized meta-info format for just haskell libraries, based on the current cabal format without the cabal specific build information. (this is what jhc uses, and franchise too I think) Just like the 'lsm' linux software map files. Preferably YAML, we are pretty darn close already and it will give us parsers in many languages for free. We already have several tools that can use the meta-info, jhc, cabal, franchise, hackage (for the web site layout) so it seems like abstracting it from the build info would be a useful step in the right direction.
I think we considered YAML, but I don't remember off-hand what the arguments against were. Maybe someone else knows? Cheers, Simon

Hi,
On Thu, Aug 28, 2008 at 6:59 AM, Simon Marlow
Because if you *can* use Cabal, you get a lot of value-adds for free (distro packages, cabal-install, Haddock, source distributions, Hackage). What's more, it's really cheap to use Cabal: a .cabal file is typically less than a screenful, so it's no big deal to switch to something else later if you need to.
Well, I think this illustrates the current thinking of the Haskell community, where emphasis has been on making things either easy to do, or really hard/impossible to do (a kind of Mac approach to software development! :-). It has the benefit that it makes things seem really easy occasionally, but is it really honest? Concretely:

cabal-install: it does not work well with packages that have flags because it does not know what flags to use when building dependencies. Really, packages with conditionals are different packages in one cabal file.

Haddock: something seems wrong if I need to use a specific build system to document my code!

source distributions: cabal is really of very little help here, as one has to enumerate everything that should be in the distribution.

Hackage: again, something is wrong if I should have to use a specific build system to distribute my code.

distro-packages: I have not used these, but the only ones that I have heard about are Don's Arch packages, which are not binary packages, so there the problem is a bit simpler (still nice that it works though).

In summary, it seems to me that there are two or three components that are tangled in the term "cabal": 1) a machine-readable format for describing the meta-data associated with a package/application (+ a library that can process this meta-data). 2) a build tool that has support for interacting with Haskell compilers and other tools that it knows about, to build a package. It seems to me that most of the benefits of cabal come from (1), and for most "simple" cases, (2) is just a way to avoid writing a completely mundane Makefile, while for more complex cases (2) basically doesn't work. -Iavor

On Thu, 2008-09-04 at 09:59 -0700, Iavor Diatchki wrote:
Hi,
On Thu, Aug 28, 2008 at 6:59 AM, Simon Marlow wrote:
Because if you *can* use Cabal, you get a lot of value-adds for free (distro packages, cabal-install, Haddock, source distributions, Hackage). What's more, it's really cheap to use Cabal: a .cabal file is typically less than a screenful, so it's no big deal to switch to something else later if you need to.
Well, I think this illustrates the current thinking of the Haskell community, where emphasis has been on making things either easy to do, or really hard/impossible to do (a kind of Mac approach to software development! :-). It has the benefit that it makes things seem really easy occasionally, but is it really honest? Concretely:
cabal-install: it does not work well with packages that have flags because it does not know what flags to use when building dependencies. Really, packages with conditionals are different packages in one cabal file.
Packages are not supposed to expose different APIs with different flags so I don't think that's right. Under that assumption cabal-install can in principle resolve everything fine. I'm not claiming the current resolution algorithm is very clever when it comes to picking flags (though it should always pick ones that give an overall valid solution) but there is certainly scope for a cleverer one. Also, the user can always specify what features they want, which is what systems like Gentoo do. Do you have any specific test cases where the current algorithm is less than ideal? It'd be useful to report those for the next time someone hacks on the resolver.
Haddock: something seems wrong, if I need to use a specific build system to document my code!
You certainly do not need to use Cabal to use haddock. There are other build systems that integrate support for haddock (eg just makefiles).
source distributions: cabal is really of very little help here as one has to enumerate everything that should be in the distribution.
I think really in the end that the content does need to be specified. You can get quite a long way with inferring or discovering dependencies. Indeed it can work for build, but for source dist you need all the files, not just the ones used in this current build, so things like:

    #ifdef FOO
    import Foo
    #else
    import Bar
    #endif

mean that if we cpp and then chase imports then we're stuffed, we'll miss one or the other. Trying to discover the deps before cpp is a lost cause because it's not just cpp to think about, there's all the other pre-processors, standard and custom.

You can get your source control system to do your sdist, e.g. darcs can do that. It's great but not necessarily what you always want, if you want to have files that live in your devel repo that are not included in the source tarball. Also if you want pre-processed files in the tarball it needs some help from the build system.
Hackage: Again, something is wrong if I should have to use a specific build system to distribute my code.
No, you only need to make a Cabal package. You can choose whatever build system you like so long as it presents that standard external command interface and metadata. I guess the fact that very few packages do use any build system other than the "Simple" build system does give the misleading impression that it's the only choice.
distro-packages: I have not used these, but the only ones that I have heard about are Don's Arch packages, which are not binary packages, so there the problem is a bit simpler (still nice that it works though).
Don has done very well recently and generated a lot of excellent publicity. There are also disto packages maintained for Debian, Fedora, Gentoo, FreeBSD which have been around for years. I think Arch packages are binary packages, as are the Debian and Fedora ones. The FreeBSD, MacPorts and Gentoo packages are of course source based.
In summary, it seems to me that there are two or three components that are tangled in the term "cabal": 1) a machine readable format for describing the meta-data associated with a package/application (+ a library that can process this meta data).
1a) a standard interface for users and package managers to configure, build and install a package which can be implemented by multiple build systems including autoconf+make.
2) a build tool that has support for interacting with Haskell compilers and other tools that it knows about, to build a package.
Right, a particular implementation of that interface with a bunch of extra features.
It seems to me that most of the benefits of cabal come from (1), and for most "simple" cases, (2) is just a way to avoid writing a completely mundane Makefile, while for more complex cases (2) basically doesn't work.
I'm not sure the makefiles were completely mundane, I'd more describe them as gnarly. :-) I would not underestimate the advantage for simple projects of not having to re-implement a build system. Being able to use a standard one and inherit new features and fixes for free is quite an advantage. There's also the issue of portability. For many packages the only thing preventing them from working on Windows was the use of make. Certainly, the Simple build system is not yet up to the task of building our most complex packages and will require some major surgery before we get near that. It is something we're working on, though perhaps not directly inside the current Cabal code base. Saizan's GSoC project was step 1 in that direction. As I mentioned elsewhere, we should also take a step back and see what we need for Cabal-2, to think about what kind of design might scale. Duncan

cabal-install: it does not work well with packages that have flags because it does not know what flags to use when building dependencies. Really, packages with conditionals are different packages in one cabal file.
Packages are not supposed to expose different APIs with different flags so I don't think that's right. Under that assumption cabal-install can in principle resolve everything fine. I'm not claiming the current resolution algorithm is very clever when it comes to picking flags (though it should always pick ones that give an overall valid solution) but there is certainly scope for a cleverer one. Also, the user can always specify what features they want, which is what systems like Gentoo do.
Do you have any specific test cases where the current algorithm is less than ideal? It'd be useful to report those for the next time someone hacks on the resolver.
I have a package that builds a library and a test executable. The test executable uses QuickCheck 2, but I don't want to force random Jane who cabal-installs my package to install QuickCheck 2. One, it's not packaged up, and two, it's not necessary for using the library. The cleanest way I found to deal with this is to use a flag for hiding the build-depends of the test executable for the flag-less build:

    if flag(test)
      build-depends: QuickCheck >= 2.0
    else
      buildable: False

Sean

Hello,
On Thu, Sep 4, 2008 at 1:30 PM, Duncan Coutts
cabal-install: it does not work well with packages that have flags because it does not know what flags to use when building dependencies. Really, packages with conditionals are different packages in one cabal file.
Packages are not supposed to expose different APIs with different flags so I don't think that's right. Under that assumption cabal-install can in principle resolve everything fine. I'm not claiming the current resolution algorithm is very clever when it comes to picking flags (though it should always pick ones that give an overall valid solution) but there is certainly scope for a cleverer one. Also, the user can always specify what features they want, which is what systems like Gentoo do.
Do you have any specific test cases where the current algorithm is less than ideal? It'd be useful to report those for the next time someone hacks on the resolver.
The examples that I was thinking of arise when libraries can provide conditional functionality, depending on what is already installed on the system, a kind of "co-dependency". For a concrete example, take a look at the JSON library that I wrote (I think that it is on hackage). It provides a number of different modules containing parsers written with different parser combinators: one that does not use a library, one that uses ReadP, and one that uses Parsec. The idea was that we do not want to force people to install a particular parser combinator library; instead, we provide compatibility with many different ones. Certainly, JSON does not require _all_ libraries to be installed, but the way the flags are at the moment, I think that that is what happens if you just use cabal-install. (The same sort of thing happens when a library provides instances for datatypes defined in different packages---these instances are often not required by the library, but are useful _if you have the other package installed_.)

I guess you could say that we structured the library wrong---perhaps we should have had a core package that only provides manual parsing (no external libraries required), and then have separate packages for each of the parsers that use a different parsing combinator library. Conceptually, this might be better, but in practice it seems like a bit of a pain---each parser is a single module, but it would need a whole separate directory, with a separate cabal file, license, and a setup script, all of which would be almost copies of each other. Similarly, in the case of providing instances for datatypes from different packages, one would end up with a separate package for each set of instances, which would result in a proliferation of tiny packages. It seems that this problem should be solvable though... :-) (By the way, this has little to do with the GHC build system, so perhaps we should start a separate thread?) -Iavor

Just cleaning out my inbox and realised I meant to reply to this about 4 months ago :-) On Thu, 2008-09-04 at 23:15 -0700, Iavor Diatchki wrote:
On Thu, Sep 4, 2008 at 1:30 PM, Duncan Coutts
Packages are not supposed to expose different APIs with different flags so I don't think that's right. Under that assumption cabal-install can in principle resolve everything fine. I'm not claiming the current resolution algorithm is very clever when it comes to picking flags (though it should always pick ones that give an overall valid solution) but there is certainly scope for a cleverer one. Also, the user can always specify what features they want, which is what systems like Gentoo do.
Do you have any specific test cases where the current algorithm is less than ideal? It'd be useful to report those for the next time someone hacks on the resolver.
The examples that I was thinking of arise when libraries can provide conditional functionality, depending on what is already installed on the system, a kind of "co-dependency". [...]
I guess, you could say that we structured the library wrong---perhaps we should have had a core package that only provides manual parsing (no external libraries required), and then have a separate packages for each of the parsers that use a different parsing combinator library.
Conceptually, this might be better, but in practice it seems like a bit of a pain---each parser is a single module, but it would need a whole separate directory, with a separate cabal file, license, and a setup script, all of which would be almost copies of each other.
Right, I admit it might be handy. Unfortunately we could not translate such packages into other packaging systems because I don't know of any standard native packaging systems that allow such co-dependencies. They have to be translated into multiple packages. If we did support such conditional stuff it would have to be explicit to the package manager because otherwise choices about install order would change the exposed functionality (indeed it might not even be stable / globally solvable). In particular I've no idea what we should do about instances. Where we'd like to provide an instance for a class defined in another package that we do not directly need (except to be able to provide the instance). If we did not have the constraint of wanting to generate native packages then there are various more sophisticated things we could do, but generating native packages is really quite important to our plans for world domination. Duncan

| So Cabal takes the view that the relationship between features and
| dependencies should be declarative. ...
| The other principle is that the packager, the environment is in control
| over what things the package 'sees'. ...
| that we can and that the approach is basically sound. The fact that we
| can automatically generate distro packages for hundreds of packages is
| not insignificant. This is just not possible with the autoconf approach.
...
| Do you think that separating the Simple build system from the
| declarative part of Cabal would help? It'd make it more obvious that the
| build system part really is replaceable which currently is not so
| obvious since they're in the same package. I'm not averse to splitting
| them if it'd help. They're already completely partitioned internally.

Duncan, I'm not following every detail here, but it's clear that you have some clear mental infrastructure in your head that informs and underpins the way Cabal is. Cabal "takes the view that...", has "principles", and "is clearly partitioned internally". These things are clear to you, but my sense is that they are *not* clear even to other well-informed people. (I exclude myself from this group.) It's like the Loch Ness monster: the bits above the waves make sense only when you get an underwater picture that shows you the monster underneath, which explains why the humps surface in the way they do.

This isn't a criticism: one of the hardest things to do is to accurately convey this underwater stuff. But I wonder whether there might be a useful paper hiding here? Something that establishes terminology, writes down principles, explains the Cabal viewpoint, contrasts with alternatives, and thereby allows discussion about Cabal to be better informed.

Simon

PS: concerning your last point, about "separating the Simple build system", that might indeed be good. Indeed, the GHC plan described here http://hackage.haskell.org/trac/ghc/wiki/Design/BuildSystem is (I think) precisely using the declarative part but not the build-system part.

On Thu, Aug 28, 2008 at 10:27:22AM +0100, Simon Peyton-Jones wrote:
PS: concerning your last point, about "separating the Simple build system", that might indeed be good. Indeed, the GHC plan described here http://hackage.haskell.org/trac/ghc/wiki/Design/BuildSystem is (I think) precisely using the declarative part but not the build-system part.
The "Use Cabal for Haddocking, installing, and anything else we need to do" bullet point uses the build system part. Thanks Ian

On 28/08/2008, at 21:10, Ian Lynagh wrote:
On Thu, Aug 28, 2008 at 10:27:22AM +0100, Simon Peyton-Jones wrote:
PS: concerning your last point, about "separating the Simple build system", that might indeed be good. Indeed, the GHC plan described here http://hackage.haskell.org/trac/ghc/wiki/Design/BuildSystem is (I think) precisely using the declarative part but not the build-system part.
The "Use Cabal for Haddocking, installing, and anything else we need to do" bullet point uses the build system part.
Hmm, from the previous discussion I got the impression that (large parts of) this functionality would be extracted from Simple and could then be used by other build systems. Is this wrong?

Roman

On Fri, Aug 29, 2008 at 12:57:59AM +1000, Roman Leshchinskiy wrote:
On 28/08/2008, at 21:10, Ian Lynagh wrote:
On Thu, Aug 28, 2008 at 10:27:22AM +0100, Simon Peyton-Jones wrote:
PS: concerning your last point, about "separating the Simple build system", that might indeed be good. Indeed, the GHC plan described here http://hackage.haskell.org/trac/ghc/wiki/Design/BuildSystem is (I think) precisely using the declarative part but not the build-system part.
The "Use Cabal for Haddocking, installing, and anything else we need to do" bullet point uses the build system part.
Hmm, from the previous discussion I got the impression that (large parts of) this functionality would be extracted from Simple and could then be used by other build systems. Is this wrong?
I thought that the proposal was to split Cabal into the "declarative package specification part" and the "how to build the package" part? If so, then surely "how to run haddock on the sources" belongs in the "how to build the package" part?

Of course, you can call the haddocking code from another build system, provided your build system is compatible with the way the haddocking code works. That's more-or-less what "Setup makefile" does: it builds the package itself, but puts things in the same places as the simple build system, so the simple build system can be used for configuring, haddocking, installing, etc.

I guess in principle you could split the "how to build the package" part up into multiple packages (Cabal-configure, Cabal-haddock, Cabal-install, etc), but I don't see what benefit that would provide. It would still be the same modules containing the same code inside.
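For illustration, here is a rough sketch of a Setup.hs in that style: it overrides only the build step to call out to make, while reusing the simple build system for everything else. (defaultMainWithHooks, simpleUserHooks and buildHook are the real Cabal hooks API; the make invocation, and the assumption that make leaves its output where the simple build system expects it, are illustrative.)

  -- Sketch: replace the build step with an external 'make', keeping
  -- Cabal's own configure/haddock/install steps. For those steps to
  -- work, 'make' must put its output in the same places the simple
  -- build system would (e.g. under dist/build).
  import Distribution.Simple
  import System.Process (rawSystem)

  main :: IO ()
  main = defaultMainWithHooks simpleUserHooks
    { buildHook = \_pkg _lbi _hooks _flags -> do
        _ <- rawSystem "make" ["all"]
        return ()
    }

Thanks
Ian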

On 29/08/2008, at 03:11, Ian Lynagh wrote:
On Fri, Aug 29, 2008 at 12:57:59AM +1000, Roman Leshchinskiy wrote:
On 28/08/2008, at 21:10, Ian Lynagh wrote:
On Thu, Aug 28, 2008 at 10:27:22AM +0100, Simon Peyton-Jones wrote:
PS: concerning your last point, about "separating the Simple build system", that might indeed be good. Indeed, the GHC plan described here http://hackage.haskell.org/trac/ghc/wiki/Design/BuildSystem is (I think) precisely using the declarative part but not the build-system part.
The "Use Cabal for Haddocking, installing, and anything else we need to do" bullet point uses the build system part.
Hmm, from the previous discussion I got the impression that (large parts of) this functionality would be extracted from Simple and could then be used by other build systems. Is this wrong?
I thought that the proposal was to split Cabal into the "declarative package specification part" and the "how to build the package" part?
If so, then surely "how to run haddock on the sources" belongs in the "how to build the package" part?
Ignore me, I misunderstood your original mail. Sorry for the confusion.

Roman

On 2008 Aug 28, at 5:27, Simon Peyton-Jones wrote:
This isn't a criticism: one of the hardest things to do is to accurately convey this underwater stuff. But I wonder whether there might be a useful paper hiding here? Something that establishes terminology, writes down principles, explains the Cabal viewpoint, contrasts with alternatives, and thereby allows discussion about Cabal to be better informed.
I think we're at the point where such a document is necessary if we're going to have a coherent discussion. (And as an aside, I think the same is true of darcs.)

--
brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com
system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu
electrical and computer engineering, carnegie mellon university KF8NH

On 28/08/2008, at 19:27, Simon Peyton-Jones wrote:
Duncan, I'm not following every detail here, but it's clear that you have some clear mental infrastructure in your head that informs and underpins the way Cabal is. Cabal "takes the view that...", has "principles", and "is clearly partitioned internally".
These things are clear to you, but my sense is that they are *not* clear even to other well-informed people. (I exclude myself from this group.) It's like the Loch Ness monster: the bits above the waves make sense only when you get an underwater picture that shows you the monster underneath and explains why the humps surface in the way they do.
FWIW, I fully agree with this (although I'm not especially well-informed in this particular area). It would be immensely helpful if Cabal's "philosophy" was described somewhere.

Roman

On Thu, 2008-08-28 at 10:27 +0100, Simon Peyton-Jones wrote:
| So Cabal takes the view that the relationship between features and
| dependencies should be declarative. ...
| The other principle is that the packager, the environment is in control
| over what things the package 'sees'. ...
| that we can and that the approach is basically sound. The fact that we
| can automatically generate distro packages for hundreds of packages is
| not insignificant. This is just not possible with the autoconf approach.
| ...
| Do you think that separating the Simple build system from the
| declarative part of Cabal would help? It'd make it more obvious that the
| build system part really is replaceable, which currently is not so
| obvious since they're in the same package. I'm not averse to splitting
| them if it'd help. They're already completely partitioned internally.
Duncan, I'm not following every detail here, but it's clear that you have some clear mental infrastructure in your head that informs and underpins the way Cabal is. Cabal "takes the view that...", has "principles", and "is clearly partitioned internally".
These things are clear to you, but my sense is that they are *not* clear even to other well-informed people. (I exclude myself from this group.) It's like the Loch Ness monster: the bits above the waves make sense only when you get an underwater picture that shows you the monster underneath and explains why the humps surface in the way they do.
This isn't a criticism: one of the hardest things to do is to accurately convey this underwater stuff. But I wonder whether there might be a useful paper hiding here? Something that establishes terminology, writes down principles, explains the Cabal viewpoint, contrasts with alternatives, and thereby allows discussion about Cabal to be better informed.
Yes. Of course there is Isaac's existing Cabal paper from '05, but there are also some more recent ideas. http://www.cs.ioc.ee/tfp-icfp-gpce05/tfp-proc/24num.pdf

I think the way forward, after the upcoming GHC+Cabal release, is to take a step back and think about a design document for Cabal-2.x. It should incorporate the things we think were right from the original Cabal design document and the things we've learnt along the way, and try as much as possible to incorporate the criticisms that people have been making in recent months. The goal should be a design with a somewhat greater ambition than the original Cabal design, which was mostly aimed at relatively simple, single-package projects (a goal which has mostly been achieved).

This would also be the right place to explain the configuration model properly, so that the people who are familiar with the autoconf model don't think we're just crazy. Not that they have to agree with us, but at least we should explain the tradeoffs and limitations in either direction. That should also help to clarify what is a limitation in the model vs what is a limitation in the current implementation of the model in current Cabal.

The most obvious and immediate problem (apart perhaps from handling complex multi-package systems) is in specifying dependencies on Haskell code. I did start a discussion on this topic on the libraries mailing list in April/May. We didn't reach any really firm conclusions, but it's clear that specifying just module names or just package versions is only an inexact proxy for what we really mean. Modules can be renamed, modules can move between packages, and neither captures the names or types of exported/imported entities, let alone semantics. Various ideas were floated, like precisely specifying or inferring interfaces, in a style somewhat like ML functors.
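To make the contrast concrete, here is a toy sketch of what an "interface dependency" could record, as opposed to a package version range. This is purely illustrative, not a proposal for concrete syntax; the example module and function are just familiar ones from the network package:

  -- Today's proxy:  build-depends: network >= 2.0
  -- A toy model of what the code actually requires: named entities
  -- with types, wherever they happen to live.
  data Requirement = Requirement
    { reqModule :: String   -- e.g. "Network.Socket"
    , reqName   :: String   -- e.g. "connect"
    , reqType   :: String   -- e.g. "Socket -> SockAddr -> IO ()"
    } deriving Show

  type RequiredInterface = [Requirement]

Even this only captures names and types, not semantics, but it is rather closer to what we really mean than a package name and version range.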
PS: concerning your last point, about "separating the Simple build system", that might indeed be good. Indeed, the GHC plan described here http://hackage.haskell.org/trac/ghc/wiki/Design/BuildSystem is (I think) precisely using the declarative part but not the build-system part.
I don't think it affects that, really. GHC would still come with both parts if Cabal were split into two packages, and the exported modules and APIs would not be changed by splitting. If splitting allowed one part to be separated out and not need to be a boot lib, then it'd be a different matter, but I don't think that's the case (which I think Ian confirmed in his reply). Splitting might help nhc98 and external perceptions. A possible downside is that it might make bootstrapping Cabal harder.

Duncan

Hi John,

I've extracted, among others, two things you don't like too much about Cabal:

a) having to specify build dependencies, because they may change (for example the split of base, or different libraries providing network interfaces ..)

b) Cabal can't do everything easily. Examples are the multi-stage build of GHC, or wxWidgets ... Can't say much on this point. For my needs Cabal has been enough, and packaging many packages for NixOS has been easy (except adding foreign C lib dependencies).

You would also prefer a build command which never changes, solving the problem that your Setup.hs files maybe can't be read by Cabal versions of the future. You'd like to have a Cabal-independent package description so you can extract information easily? Maybe I got some things not right here..

Anyway, what does a package description look like which would satisfy "write once, let the one true build system build my package" on foreign architectures, machines and distros, without having me change network to jhc-network or such? And probably without the burden for implementors of having to provide a network-1.0 compatible interface?

Let's talk about an example:

  module Main where
  import System.IO
  import ClassProvidingShowDerivedAB
  data A = ... {- derive .. -}
  main = do
    printLn $ showDerivedA $ A ...
    printLn $ showDerivedB $ A ...

So the description must contain that we need
- DrIFT or the derive library (assuming that the derive library can read DrIFT syntax)
- a foreign module (probably called ClassProvidingShowDerivedAB) providing showDerivedAB of type (A .. -> String)
- a foreign module (probably called System.IO) providing printLn of type (String -> IO ())
where "a foreign module" may also mean any other set of modules exporting the given functions..

So a configure system should find packages containing those functions being used, and should know (or be able to figure out itself) that ClassProvidingShowDerivedAB has been split into ClassProvidingShowDerivedA and ClassProvidingShowDerivedB on some systems? The type system of Haskell will be able to give you much better matches compared to looking for C/C++ interfaces, but String -> IO () is not very unique.. So how can this dependency be expressed? Only consider package dependencies having the word "network" in either the name or the description? You could ask the configure system to test all functions of type String -> IO () whether they have the desired behaviour (hoping that none is implemented as showDerivedA = const $ removeRecursive "/").. Anyway, it would be quite interesting to see whether we can write an application guessing dependencies this way.. However, I don't like this being done automatically. Still, it would make the burden of writing the dependencies into a description file even smaller.

If we want a package to be buildable on a time frame of -10 years (past) -> now -> +10 years (future), either the configure script or the configure system can check for different substitutes or versions which existed from 10 years in the past up to now. However, we can't know the changes being made in the future, or whether different substitutes will become available.. However, some "magic knowledge" could know about my lib and its dependencies at publication, and know about the changes that have been made to those dependencies since then. Then it could try refactoring the published code to fit current versions of the libraries.. In the future maybe this magic knowledge can be found in darcs patches (it already provides support for changing names to some degree..), or it could be collected by volunteers.
Changes which could be resolved automatically are renaming, merging and splitting of functions or modules, refactorings such as adding additional parameters to functions, etc. This magic knowledge could be updated by running a tool such as autoreconf or whatever.. I even think this could work quite well if you have some kind of package database, such as Hackage, knowing about interface changes from package xx-1.0 to xx-2.0. But it would be hard to implement a refactoring tool which can refactor source containing some #ifdefs? Wait, we could throw away many #ifdef GHC_VERSION > or < version x.y, because the system would know about the differences?

To uniquely define a set of transformations of your code, which has been written against a set of dependencies A, so that it can be compiled on another system B, you need something like a *.cabal file again, to tell the nice configure-and-rewrite engine where to start.. This could look like this:

  network-1.0 from hackage
  HaXmL git repo hash XXYYZZ

Then you could tell the magic tool that instead of network-1.0, jhc-network-0.8 can be tried..

I would really appreciate such a magic tool knowing about thousands of changes/refactorings:
a) if it knows a transformation, I can just run it and compile;
b) if it knows the difference but there is no transformation, maybe there is a comment helping me fix the trouble faster. Such a hint could look like: "From now on there is a lazy and a strict variant, so import either X or Y." Or: "Function W has been removed, use lib K or L instead.";
c) the person doing the initial change (breaking things) knows most about the change and knows best how to tell people what to do.. Even if they didn't, you can upload a transformation to save other people's time..

I think this is the way to go in the long run: create a huge database of refactorings, either giving you hints on how to make a package compile on a different system, or maybe even applying some of them automatically. With some time this has the chance to become the kind of "old deep" knowledge that is buried in autotools now? Anyway, implementing this would be a lot of work but might pay off one day. Of course this applies not only to Haskell systems.. Hopefully it will not result in breakages occurring more often because they have become cheaper?

What do you think?
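A tiny sketch of what the core of such a refactoring database could look like (everything here is made up, of course):

  -- A toy catalogue of interface changes between library versions,
  -- plus a function applying the module-level changes to one import.
  data Change
    = RenamedFunction String String   -- old name, new name
    | MovedModule     String String   -- old module, new module
    | SplitModule     String [String] -- old module, replacement modules
    | RemovedFunction String String   -- removed name, hint for the user
    deriving Show

  -- Which modules to import instead of the given one (identity if
  -- the catalogue knows nothing about it).
  rewriteImport :: [Change] -> String -> [String]
  rewriteImport changes m =
    case [ ms    | SplitModule old ms  <- changes, old == m ] ++
         [ [new] | MovedModule old new <- changes, old == m ] of
      (ms:_) -> ms
      []     -> [m]

For the split example above, rewriteImport [SplitModule "ClassProvidingShowDerivedAB" ["ClassProvidingShowDerivedA", "ClassProvidingShowDerivedB"]] "ClassProvidingShowDerivedAB" would tell you to import both new modules.

Sincerely
Marc Weber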
participants (16)
- Brandon S. Allbery KF8NH
- Duncan Coutts
- Ian Lynagh
- Iavor Diatchki
- John Meacham
- Malcolm Wallace
- Manuel M T Chakravarty
- Marc Weber
- Norman Ramsey
- Roman Leshchinskiy
- Roman Leshchinskiy
- Sean Leather
- Simon Marlow
- Simon Peyton-Jones
- Sittampalam, Ganesh
- Sterling Clover