
On Thu, Aug 28, 2008 at 02:59:16PM +0100, Simon Marlow wrote:
> The important thing about Cabal's way of specifying dependencies is that they can be made sound with not much difficulty. If I say that my package depends on base==3.0 and network==1.0, then I can guarantee that as long as those dependencies are present then my package will build. ("but but but..." I hear you say - don't touch that keyboard yet!)
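For concreteness, the example above corresponds to a .cabal file along these lines (a sketch; the package and executable names are invented):

```
-- Hypothetical .cabal file pinning exact versions, as in the example above.
name:           my-package
version:        0.1
build-type:     Simple

executable my-program
  main-is:        Main.hs
  build-depends:  base == 3.0, network == 1.0
```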
I can easily achieve this with autoconf, or even with nothing: I can simply test whether the system is Fedora Core 9 running GHC 6.8.2 and be assured that my package will build properly. But that misses the entire point. I don't want my package to build just on my exact system, I want it to build on _other_ people's systems: people running compilers, libraries, and operating systems I have never heard of.

The Cabal guarantee, however, has the huge flaw of requiring a closed universe: a complete and universal definition of what 'network == 1.0' means, for all time, that all future compilers must agree on. It places a huge burden on implementors to provide a 'network == 1.0' compatible interface simply so cabal doesn't complain, even though all programs would be happy with a jhc-network 0.7 or an internet-5.0 package. It means that with jhc-network, which has 90% of the functionality of network (including everything that 99.9% of programs need), every program will either have to know about jhc-network and edit its cabal file to include it conditionally, or it just won't work at all.

Note that this is similar to the problem symbol versioning poses for shared libraries, and there is a fair amount of literature on the subject. Most Unix .so's used to have something similar to the current Cabal model, a version number with a major/minor part; it was found to lead to DLL hell (well, .so hell), and we don't want to end up in the same place with Haskell (package hell?). Linux hence switched to its current system, which gives an individual version number to every API function. I am not saying that is the solution for Haskell, but I do not see the current Cabal approach scaling any better than the old Unix one, or avoiding the same problems.
> Suppose you used autoconf tests instead. You might happen to know that Network.Socket.blah was added at some point and write a test for that, but alas if you didn't also write a test for Network.Socket.foo (which your code uses but ends up getting removed in network-1.1) then your code breaks. Autoconf doesn't help you make your configuration sound, and you get no prior guarantee that your code will build.
And with Cabal it breaks there too, in addition to the other 80% of the time when the code would have worked just fine. The autoconf feature test is strictly superior here.
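For reference, a feature test of the kind described above might look roughly like this in a configure.ac; this is a sketch, not from any real package. It keeps Simon's placeholder identifier 'blah', invents the HAVE_NETWORK_SOCKET_BLAH macro name, and assumes $GHC was located earlier in the configure script:

```
dnl Sketch of an autoconf feature test for a Haskell identifier.
dnl "blah" is the placeholder name from the discussion above.
AC_MSG_CHECKING([for Network.Socket.blah])
cat > conftest.hs <<EOF
import Network.Socket (blah)
main :: IO ()
main = return ()
EOF
if $GHC -c conftest.hs >/dev/null 2>&1; then
  AC_MSG_RESULT([yes])
  AC_DEFINE([HAVE_NETWORK_SOCKET_BLAH], [1],
            [Define if Network.Socket exports blah])
else
  AC_MSG_RESULT([no])
fi
rm -f conftest.hs conftest.hi conftest.o
```

The point of a test like this is that it probes what the installed library actually exports, rather than what its version number claims.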
> Now, Cabal's dependencies have the well-known problem that they're exceptionally brittle, because they either overspecify or underspecify, and it's not possible to get it "just right". On the other hand, autoconf configurations tend to underspecify dependencies, because you typically only write an autoconf test for something that you know has changed in the past - you don't know what's going to change in the future, so you usually just hope for the best. For Cabal I can ask the question "if I modify the API of package P, which other packages might be broken as a result?", but I can't do that with autoconf.
But the only reason they are brittle is Cabal's sledgehammer approach to package versioning; there is no reason an autoconf-style system couldn't answer the same question. And again, you are assuming you can even enumerate all the packages that exist in order to find out which might be broken, and what does that really give you in any case? By changing the API you know you are going to break some things, but what about all the company-internal software out there that uses Haskell? You can't look at their packages. It just does not seem like a very useful thing to ask, as for the code you can see it is a question that can be answered by 'grep'.
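The over/under-specification dilemma reads like this in .cabal terms (hypothetical bounds on the network package):

```
-- Overspecified: refuses jhc-network 0.7, internet-5.0, or even a
-- compatible network-1.0.1, though any of them might work fine.
build-depends: network == 1.0

-- Underspecified: happily accepts a future network-2.0 that removes
-- the very functions this package uses.
build-depends: network >= 1.0
```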
> Both systems are flawed, but neither fundamentally. For Cabal I think it would be interesting to look into using more precise dependencies (module.identifier::type, rather than package-version) and have them auto-generated. But this has difficult implications: implementing cabal-install's installation plans becomes much harder, for example.
Again, I would like to see this as another option. I think there are interesting ideas in Cabal about configuration management, but there needs to be room for alternatives, including old standbys like autoconf.
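One way to picture the "module.identifier::type" dependencies Simon mentions is as a check against what the installed packages actually export. This is a toy sketch, not Cabal's actual representation; all names are invented:

```haskell
-- A toy model of "module.identifier :: type" dependencies, as opposed
-- to package-version dependencies.
data Dep = Dep
  { depModule :: String   -- e.g. "Network.Socket"
  , depIdent  :: String   -- e.g. "connect"
  , depType   :: String   -- the type the caller was compiled against
  } deriving (Eq, Show)

-- An environment is whatever the installed packages actually export:
-- (module, identifier, type) triples.
type Exports = [(String, String, String)]

-- A package builds iff every identifier it uses is exported with the
-- expected type; no package names or version numbers are involved, so
-- jhc-network could satisfy a dependency written against network.
satisfied :: [Dep] -> Exports -> Bool
satisfied deps env = all ok deps
  where ok (Dep m i t) = (m, i, t) `elem` env
```

Under a scheme like this, the installation-planning problem Simon mentions gets harder precisely because any package exporting the right triples is a candidate.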
> So I accept that we do not yet cover the range of configuration choices that are needed by the more complex packages (cf darcs), but I think that we can and that the approach is basically sound. The fact that we can automatically generate distro packages for hundreds of packages is not insignificant. This is just not possible with the autoconf approach.
This is just utterly untrue. Autoconf'ed packages that generate rpms, debs, etc. are quite common. The only reason Cabal can autogenerate distro packages for so many is that many interesting or hard ones just _aren't possible with Cabal at all_.
> Exactly! Cabal is designed so that a distro packager can write a program that takes a Cabal package and generates a distro package for their distro. It has to do distro-specific stuff, but it doesn't typically need to do package-specific stuff.
> To generate a distro package from an autoconf package either the package author has to include support for that distro, or a distro packager has to write specific support for that package. There's no way to do generic autoconf->distro package generation, like there is with Cabal.
In Cabal you only get it because you convinced the Cabal people to put in code to support your distro, which isn't much different from asking the distro packagers to do it. Besides, this ability has nothing to do with Cabal's configuration-management capabilities, only with its metadata format, which could easily be abstracted out and not tied to Cabal. (I would love to see that. Cabal has a lot of good ideas, but due to its design, its bad ideas are complete showstoppers rather than things you can replace.)

And there are many automatic package managers for autoconf-style packages. http://www.toastball.net/toast/ is a good one; it even downloads dependencies from freshmeat when needed. In fact, your projects can probably be auto-installed by 'toast projectname' and you didn't even know it! http://encap.org/ is another, one I use on pretty much all my systems since it is distro-independent.
> Yes this means that Cabal is less general than autoconf. It was quite a revelation when we discovered this during the design of Cabal - originally we were going to have everything done programmatically in the Setup.hs file, but then we realised that having the package configuration available *as data* gave us a lot more scope for automation, albeit at the expense of some generality.
Note, I wholeheartedly agree with the idea of package configuration as data. In fact, when Cabal first started I was a huge advocate of it, and I actually _lost interest_ in the project because of the decision to go with the programmatic Setup.hs rather than a declarative approach. However, I think Cabal is a _poor execution_ of the idea. The problem is compounded by the fact that it is being promoted as the Haskell way to do things, so its design decisions are affecting the development and evolution of the base libraries. And its monolithic nature, and its attitude of wanting to take over your whole project's build cycle, mean that alternate approaches cannot be explored.
> That's the tradeoff - but there's still nothing stopping you from using autoconf and your own build system instead if you need to!
But it is a false tradeoff. The only reason one needs to make it is that Cabal's design doesn't allow the useful ability to mix and match its parts. I would prefer to see Cabal improved so I _can_ use its metadata format, and use its configuration manager for simple projects and autoconf's for more complex ones (with full knowledge of the tradeoffs), without jumping through hoops.
As for programs written in Haskell: I don't want people's first impression of Haskell to be "oh crap, I gotta learn a new way to build things just because this program is written in some odd language called 'Haskell'". I don't care how awesome a language is, I am going to be annoyed by having to deal with it when I just want to compile and install a program; it will leave a bad taste in my mouth. I would much rather people's first impression be "oh wow, this program is pretty sweet. I wonder what it is written in?" Hence my projects all use ./configure && make by design rather than by necessity.
> Python packages don't have ./configure or make...
Some don't. And it bugs the hell out of me. They don't work with my autopackaging tools.
I sometimes hear that I just shouldn't use Cabal for some projects, but when it comes down to it: if Cabal is a limited build/configuration system in any way, why would I ever choose it when starting a project, knowing either that it is putting a limit on my project's ability to innovate, or that at some point in the future I am going to have to switch build systems?
> Because if you *can* use Cabal, you get a lot of value-adds for free (distro packages, cabal-install, Haddock, source distributions, Hackage). What's more, it's really cheap to use Cabal: a .cabal file is typically less than a screenful, so it's no big deal to switch to something else later if you need to.
Except that suddenly you can't use Hackage, you have to come up with a new build system, and you perhaps upset your users as they have to learn a new way to build the project. The fact that it _is_ a big deal to replace Cabal is the main issue I have: switching involves changing your build system completely. You can't replace just parts of it easily, or integrate Cabal from the bottom up rather than the top down, and it wants to be the _one true_ build system in your project.

I'd like to see a standardized meta-info format for Haskell libraries, based on the current Cabal format but without the Cabal-specific build information, just like the 'lsm' Linux software map files. (This is what jhc uses, and franchise too I think.) Preferably YAML: we are pretty darn close already, and it would give us parsers in many languages for free. We already have several tools that can use the meta-info (jhc, cabal, franchise, and hackage for the web site layout), so abstracting it from the build info seems like a useful step in the right direction.

        John

-- 
John Meacham - ⑆repetae.net⑆john⑈