Haskell library infrastructure coordinator

Gentle Haskell librarians,

It is common ground that "better libraries" is perhaps the single thing that would boost Haskell most; and that the best way to better libraries is the bazaar model, not the cathedral. There has therefore been quite a lot of discussion recently on this list about

a) What does Joe Haskell Programmer have to do to build and distribute a Haskell library, so that it is easy for users to install and use?

b) What infrastructure exists to ease Joe's task, especially building distributions, perhaps for multiple platforms and multiple Haskell compilers?

Lots of suggestions emerged, and lots of people have knowledge and willingness to help. What we need, though, is someone to coordinate this group. It's difficult to reach technical agreement in a distributed group without one person who is willing to

* Make proposals
* Moderate discussion
* Be open minded; not impose his/her own views
* Drive discussion towards taking decisions
* Document the consensus
* Probably implement some core infrastructure, or at least coordinate the efforts of others to do so
* Restrain ambition: something modest that works is better than something aspirational that doesn't

Simon Marlow and I would like to propose Isaac Jones for this coordination role. He is knowledgeable, open minded, and not connected with any particular Haskell implementation. Most important of all, he is willing! (Very much not to be taken for granted -- it's a big job.) You should know him from his postings to this list.

Is that acceptable to everyone? Does anyone want to propose anyone else (check they are willing first), or themselves?

A huge thank-you to Isaac. Haskell only works because people contribute to it.

Simon and Simon

I wanted to write up a proposal regarding the organization of a »central« library infrastructure for weeks ... so I guess now is the time to do or die.

Shae Matijs Erisson owns the »haskell-libs« project at sourceforge.net, and we figured this would be a good platform for the effort. The idea is to set up a CVS repository which contains the following directories:

- hslibs

  This is where the actual source code lives, using a directory layout that complies with the hierarchical library structure, for example:

  - Text
  - Data
  - Foobar

  The contents are by-approval-only: any library accepted into this part of the repository has to undergo a review process, in order to ensure high quality and somewhat stable interfaces. A set of guidelines for this process would have to be created, but we could probably adapt the _excellent_ www.boost.org process with little effort. These libraries serve as a quasi-extension of the libraries shipped with ghc/hugs/nhc98. Ideally, everybody would be able to use them by just checking out this directory and adding it to the compiler's search path.

- Unstable

  This directory will contain a bunch of per-user directories, for example:

  - SimonsPeter
  - ErissonShaeMatijs
  - AgentSmith

  This directory serves as a private (but unique) namespace for _everybody_ to publish work-in-progress, or libraries that are too specific to be accepted into the general »hslibs« area. Everybody who checks out »Unstable« and adds it to the compiler's search path can access them like this:

    import Unstable.SimonsPeter.MyPrivateLib
    import Unstable.Whoever.Whatever

  Libraries from »hslibs« must not rely on libraries provided in the »Unstable« space, but user-space libraries may rely on »hslibs«. It's also a good idea to publish candidate libraries for inclusion in the »official« space here. But basically this space can contain _anything_ you feel others might be interested in.

- tools

  This directory will contain several small tools useful for users of the distribution. So far I am working on two things to go here:

  - cvssynch

    This is a small Haskell program to synchronize two CVS repositories. The idea is that you maintain your private sources in a CVS repository on your private machine. When you're ready to release, all you do is call »cvssynch« and it will import the code into the »vendor branch« of the CVS repository at sourceforge.net. Unlike »cvs import«, though, »cvssynch« will handle added and erased files/directories automatically. Since the files are imported into the vendor branch, you can still make sourceforge.net-specific changes to the files using CVS's merging feature. If done right, this allows you to maintain a private copy of the sources in parallel with the public version, with very little overhead. (A make target could take care of all this.)

  - hbuild

    hbuild is a monadic combinator library which implements a »make« utility in Haskell. Unlike normal makes, though, hbuild is an ordinary Haskell program! You can build your sources by executing »runhugs HBuild all« or »runhugs HBuild install«. This is still highly experimental, but -- not surprisingly -- the package blows any make variant I've seen out of the water in terms of flexibility and power. I'm still struggling with the design when it comes to building variants (say, you want to build a library for GHC and NHC98 simultaneously), but I hope to get a beta version ready within a few weeks at most. (A rough illustrative sketch follows at the end of this message.)

What this approach would achieve:

* Users could access the libraries contained in here with no effort at all.
* Users can choose whether they want only the tested and stable libraries, or whether they're up for some experimental stuff.
* The experimental code is guaranteed to have a unique module name, thanks to the »Unstable.UserID« path.
* SourceForge.net provides us with all kinds of nifty features, like web space, bug tracking, feature request tracking, etc.
* The »hslibs« space could evolve into a portable base library for Haskell, which can be re-used by _all_ compiler vendors, rather than each of them providing their own versions of standard modules.

I would like to help get this set up, if there is any interest. In particular, I could help work on the guidelines, etc., since I have some experience with the Boost project, which does almost exactly the same thing for the C++ community.

Any thoughts?

Peter
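(To make the hbuild idea above concrete: a minimal, hypothetical sketch of what a monadic build-combinator program might look like. The Task type, the combinators, and the targets are all invented for illustration -- Peter's actual design is unpublished work in progress.)

  import System.Cmd (system)   -- 2003-era module; System.Process nowadays

  -- A build task: a named target, its dependencies, and a build action.
  data Task = Task { target :: String, depends :: [Task], action :: IO () }

  -- Run a task's dependencies first, then the task itself. A real tool
  -- would also compare timestamps instead of rebuilding unconditionally.
  run :: Task -> IO ()
  run t = mapM_ run (depends t) >> action t

  sh :: String -> IO ()
  sh cmd = system cmd >> return ()

  obj, lib :: Task
  obj = Task "Foo.o"    []    (sh "ghc -c Foo.hs")
  lib = Task "libfoo.a" [obj] (sh "ar r libfoo.a Foo.o")

  main :: IO ()
  main = run lib   -- e.g. »runhugs HBuild« would invoke this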

Peter Simons writes:
Any thoughts?
Yes: perfect! A lot of the libraries discussion seems to be about how to provide binary packages, how to integrate builds for various compilers, etc. Give me a way to download the source hierarchy in full or in part, and a way to tell the compiler where to look for it, and I'll be quite happy. I think it makes sense to have the source around -- preferably a CVS checkout -- for all but the most basic and fundamental of functions; that way we can browse, learn, improve and enhance, and submit the modifications easily. (Do builds get so complex that compiling from source is unfeasible?) It seems to me that binary packages should be mostly optional, and can be deferred to the compiler developers to snarf from the "hslibs" (stable) hierarchy and include at their leisure?

-kzm
-- If I haven't seen further, it is by standing in the footprints of giants

On Wednesday 28 May 2003 1:13 pm, Ketil Z. Malde wrote:
I think it makes sense to have the source around -- preferably a CVS checkout -- for all but the most basic and fundamental of functions; that way we can browse, learn, improve and enhance, and submit the modifications easily. (Do builds get so complex that compiling from source is unfeasible?)
There are advantages in working with a CVS checkout (bugs fixed quickly, etc.) but there are also disadvantages:

- It relies on all library maintainers being able to fix problems promptly, or large chunks of the tree can get wedged. For example, greencard, libraries/x11 and libraries/win32 are (near the end of) undergoing large changes. I haven't made a public release of code with those changes in it, but there are already bug reports which reveal that things broken in one subtree have made it impossible to build and install the entire fptools tree.

- If you don't have official releases, you don't have version numbers, which means you can't write dependencies like:

    requires ghc >= 5.04
    requires greencard < 3.00

- Everybody has slightly different source trees, so it can be hard to reproduce a problem exactly.

- It doesn't fit at all well with any existing packaging mechanisms (like those Linux, *BSD, Windows, etc. use).

- We shouldn't force people onto the bleeding edge unless there's a good reason to.

So, sure, let's have people build their own systems from CVS checkouts if they want, but let's not make this the standard way of doing things. Let's make version-numbered releases the standard.

-- Alastair Reid

Alastair Reid writes:
So, sure, let's have people build their own systems from cvs checkouts if they want but let's not make this the standard way of doing things. Let's make version numbered releases the standard.
Yes, I absolutely agree, of course. The important part is distributing source, which is simple and will work well in most cases. I imagine the shipped-with-the-compiler binary libraries will be more or less cast in rock, the "stable" hierarchy will be fairly stable, and anything will go for the "unstable" hierarchy -- which is where CVS will be more appropriate.

There should probably be test suites to cover dependencies, to ensure that all libraries are adequately in sync -- I'm not sure how easy this is to do in practice.

BTW, Debian is unusual in that updates happen all the time. (There are numbered releases, but they aren't commonly used, I think.) They manage to make it work pretty well -- better than most others do for third-party stuff, IMHO. Perhaps that model can be used?

-kzm
-- If I haven't seen further, it is by standing in the footprints of giants

Ketil Z Malde writes:
Alastair Reid writes:
Let's make version numbered releases the standard.
Yes, I absolutely agree, of course. The important part is distributing source, which is simple and will work well in most cases.
I agree as well. Following the CVS repository is for library developers. Regular releases should be provided and, if possible, should come with the compilers. Regression testing (QuickCheck is great for that) should be provided for the stable branch, and I also envision semi-automatic verification of API compatibility with older releases. It should be possible to automatically generate a list of all functions and data types exported by _all_ the source code in a release, and to create a module that imports them. Then we could compile this module with a newer release to check that all of them are still there and have the same type signature. And the infrastructure for that could be re-used by others in the Unstable branch, if they want to.

Peter
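(As an illustration of what such a generated compatibility module might look like -- the module and function names below are invented; no such generator exists yet:)

  -- Hypothetical output of an API-listing generator: one binding per
  -- exported function, pinned to its published type signature.
  -- Compiling this module against a newer release fails if anything
  -- was removed or changed its type.
  module ApiCheck where

  import qualified Text.PrettyPrint as P   -- stands in for a stable library

  check_render :: P.Doc -> String
  check_render = P.render

  check_text :: String -> P.Doc
  check_text = P.text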

Peter Simons writes:
Ketil Z Malde writes:
Alastair Reid writes:
Let's make version numbered releases the standard.
Yes, I absolutely agree, of course. The important part is distributing source, which is simple and will work well in most cases.
I agree as well. Following the CVS repository is for library developers. Regular releases should be provided and, if possible, should come with the compilers.
Whenever people mention altering or shipping something with the compilers, red flags go up in my brain. I think that one of the reasons for this project is to decouple more stuff from the compiler distributions.

In general, I'd like to steer the discussion about distribution toward things that integrate with packaging systems like dpkg & FreeBSD's ports. I would like to see a system flexible enough that:

- Authors can distribute libraries & tools from their web pages in a way that makes it very easy for users to download and use them.

- Packagers for various OSs have a sane way to create packages.

- Compiler distributions can hook into a set of libraries that consensus says most people will want by default (but may not be a part of the Haskell standard). We might call this "hslibs". In Debian, ghc5 might depend on hslibs-ghc5, which will in turn depend on a bunch of small packages.

And more optionally:

- A central repository of useful Haskell libraries can be maintained in a way that blends the best of cathedral & bazaar.

How do you feel about that way of putting it?

peace,
isaac

Isaac Jones writes:
Regular releases should be provided and, if possible, should come with the compilers.
Whenever people mention altering or shipping something with the compilers, red flags go up in my brain.
I didn't mean to emphasize this fact, sorry. Whether the compiler ships the libraries, or whether your Linux distribution ships the package, or whether you compile and install the library yourself -- it is all the same to me. I mostly care about _having_ this platform, first of all.
I would like to see a system flexible enough that:
- Authors can distribute libraries & tools from their web pages in a way that makes it very easy for users to download and use them.
One idea would be to define an XML DTD for describing packages. Maybe work has even been done towards that end already. A conforming document could contain:

- Short description of the package
- Pointer (URI) to longer description / home page
- License information
- Author contact information
- Pointer (URI) to the source code
- Pointer (URI) to the binary package
- Shell script, or abstract information on how to build it

Then authors could simply provide this XML file, and all we'd do is maintain a list of them. From this list, the index on haskell.org could be generated, and you could set a package system on top of it, which can download tar.gz archives, check out CVS repositories, etc., and start the build process.
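(For illustration only, a conforming document might look roughly like this; the element names and the DTD are invented, since nothing of the sort exists yet:)

  <?xml version="1.0"?>
  <!DOCTYPE package SYSTEM "hspackage.dtd">
  <package name="myprettylib" version="0.1">
    <synopsis>Pretty-printing combinators</synopsis>
    <homepage>http://example.org/myprettylib/</homepage>
    <license>BSD</license>
    <author email="joe@example.org">Joe Haskell</author>
    <source href="http://example.org/myprettylib-0.1.tar.gz"/>
    <binary href="http://example.org/myprettylib-0.1-ghc5.tar.gz"/>
    <build>./configure &amp;&amp; make &amp;&amp; make install</build>
  </package>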
- Packagers for various OSs have a sane way to create packages.
Building binary packages is a whole new world in terms of complexity, and it is a quality assurance nightmare. Also, for _any_ binary package to be useful, it must be integrated with the package manager of your Linux/FreeBSD/Apple OS X system. We could provide RPMs for starters, but that's only a fraction of the "market". IMHO binary packages should be left to the distributors; the Haskell community should rather provide source code.
And more optionally: - A central repository of useful Haskell libraries can be maintained in a way that blends the best of cathedral & bazaar.
Well, I have to admit: For me, the whole process starts _here_. I think we should begin small and let the system evolve. Again, I refer to boost.org. The project is doing exactly what I proposed, and it has lifted the C++ language to another level. They have practically taken over the language and library development for the new C++ Standard there! So the idea can't be that bad after all. Peter

G'day all. On Thu, May 29, 2003 at 12:12:18AM +0200, Peter Simons wrote:
Again, I refer to boost.org. The project is doing exactly what I proposed, and it has lifted the C++ language to another level. They have practically taken over the language and library development for the new C++ Standard there! So the idea can't be that bad after all.
Boost is a little more structured than what we want, I think. They emphasise good software engineering practices, such as code reviews. For libraries that we want part of the standard, we may want to follow that. For others, we probably won't want to go that far. Cheers, Andrew Bromage

Peter Simons writes:
Isaac Jones writes: (snip)
Whenever people mention altering or shipping something with the compilers, red flags go up in my brain.
I didn't mean to emphasize this fact, sorry.
:)
One idea would be to define an XML DTD for describing packages. Maybe work has even been done towards that end already. A conforming document could contain: (snip)
I added those comments to the LibraryInfrastructureNotes page for now. They sound good.
- Packagers for various OSs have a sane way to create packages.
Building binary packages is a whole new world in terms of complexity, and it is a quality assurance nightmare.
Also, for _any_ binary package to be useful, it must be integrated with the package manager of your Linux/FreeBSD/Apple OS X system. We could provide RPMs for starters, but that's only a fraction of the "market".
Right, I'm not suggesting that we centralize these packages.
IMHO binary packages should be left to the distributors; the Haskell community should rather provide source code.
Yes, the packages will be left to the distributors. However, I think that the library infrastructure should be able to provide some metadata and tools for assisting packagers [1]. To me, this is a killer feature. Proposals for a library infrastructure should deal with the issue of integrating with packaging systems so as not to duplicate work or infrastructure, and so we don't make it harder than it has to be for those packagers. More points if you can make it easier for them.
And more optionally: - A central repository of useful Haskell libraries can be maintained in a way that blends the best of cathedral & bazaar.
Well, I have to admit: For me, the whole process starts _here_. I think we should begin small and let the system evolve.
So to be clear: What happens now (correct me if I'm wrong) is that a consensus is reached on this list about how new libraries should fit into the hierarchy. Then the compiler groups each integrate the packages into their distribution, and when the next release comes around, the end users get the libraries. How would your boost-like distribution system interact with different compilers and their different ways of packaging libraries? Isn't some kind of common build infrastructure like the one I propose still necessary so that they don't have to repackage it?
Again, I refer to boost.org. The project is doing exactly what I proposed, and it has lifted the C++ language to another level. They have practically taken over the language and library development for the new C++ Standard there! So the idea can't be that bad after all.
Some people think that the library infrastructure should look just like FreeBSD, I happen to think it should look more like Python's distutils, someone else might think that Debian has the answer. I'm rather convinced that duplicating any one system will leave us with 1) the flaws of that system and 2) the flaws that Haskell adds to the system by not being a perfect fit. Let's try to look at a lot of different tools and find the best synthesis.

peace,
isaac

[1] One issue, for instance, is that a whole plethora of separate binary packages needs to be provided for various compilers and compiler versions, since most of them are not binary compatible (see the previous discussions on packaging stuff for Debian). I think we could help automate the process of generating and maintaining a large set of binary packages.

Isaac Jones writes:
Yes, the packages will be left to the distributors. However, I think that the library infrastructure should be able to provide some metadata and tools for assisting packagers [1].
I just realize we really _are_ approaching the problem from different ends! :-) For the sake of avoiding misunderstandings, which I believe we had, let me re-state my approach in other words. I'll address (hopefully) all points you made, even though I don't quote your text:

A Haskell module is the most re-usable software component I have ever seen. It is virtually impossible to write something that is _not_ re-usable by someone else. The problem is, though, that many modules are really small, and thus do not warrant the effort of "releasing" them. Plus, not everybody has the infrastructure to do it. So if we could provide a platform where it takes you five minutes to put your source code into a, say, Unstable.YourName directory, hopefully many people would do just that. And then the source code would be _there_. It could be re-used by others -- whereas otherwise, nobody would ever have seen it.

Now, if the module is interesting and people are sending you e-mail, asking for features or saying they like it, you'll be writing a short manual and cleaning up the API before you know it. So in effect, the Haskell community would have one more good library to re-use. And if the library is nice, we can "publish" it in the Stable area, and users of our distribution would just have it on their hard disk. This is already an incredibly useful service, even if it doesn't do anything but distribute the sources -- no build system and nothing.

BUT, once you have two or three libraries in this repository, you could go ahead and create a build system for them. The more code comes flowing in, the more complex your build system will get, until some day it can

- build all the Stable libraries on virtually any platform and with virtually any compiler,
- run automated regression tests and produce result reports,
- generate all the library documentation (maybe based on Haddock) and install it, either as HTML, PDF, or man pages(!),
- generate a sophisticated web site, on which all the contents of the archive can be browsed, searched, etc.,
- create several different binary package formats and provide means for users to run the regression tests on them.

Assuming this system exists, then everybody can check his code into the Unstable area and provide build information for it. (And this has to be _really_ simple.) So everybody else can just check out parts of the Unstable branch and integrate them seamlessly into the rest, albeit in the Unstable namespace. Or programmers can choose to re-use the build system for their own projects, if they want to release their package themselves.
Isn't some kind of common build infrastructure [...] necessary so that [different compilers] don't have to repackage it?
I doubt there is such a thing as a common build infrastructure on which everybody could agree. Just look at the reality today: We have BSD Make, GNU Make, Jam, Boost.Jam, Odin, Cook, Automake and only god knows what else -- all being actively used for all kinds of projects. Even within the Haskell community, not everybody is using the fptools, even though it clearly can build pretty much everything. In my opinion, the reason for this lies in the nature of the task, not in the lack of a better system. Given this, I don't have a problem with the fact that people will use a different build infrastructure than the one I designed.
Again, I refer to boost.org.
[...] I'm rather convinced that duplicating any one system will leave us with 1) the flaws of that system and 2) the flaws that Haskell adds to the system by not being a perfect fit.
Uh, I was referring to Boost's model of running the source repository and their success with it -- not to their build infrastructure. Which, by the way, is entirely inappropriate for Haskell, unless we're interested in several hundred kilobytes worth of C++-compiler properties. :-) Peter

Hi, Peter,
Peter Simons writes:
Isaac Jones writes:
Yes, the packages will be left to the distributors. However, I think that the library infrastructure should be able to provide some metadata and tools for assisting packagers [1].
I just realize we really _are_ approaching the problem from different ends! :-)
Indeed! I've been thinking this all along :-)
I doubt there is such a thing as a common build infrastructure on which everybody could agree. Just look at the reality today: We have BSD Make, GNU Make, Jam, Boost.Jam, Odin, Cook, Automake and only god knows what else -- all being actively used for all kinds of projects. Even within the Haskell community, not everybody is using the fptools, even though it clearly can build pretty much everything.
This is the beauty of a layered tools approach like the one I describe. The "build" infrastructure can be any that the library author likes. We'll provide a nice default one, but the layer on top of that structure, like Debian and Distutils, will provide a standard interface.
[...] I'm rather convinced that duplicating any one system will leave us with 1) the flaws of that system and 2) the flaws that Haskell adds to the system by not being a perfect fit.
Uh, I was referring to Boost's model of running the source repository and their success with it -- not to their build infrastructure.
There are several systems which run source repositories that we can learn from, including Debian, FreeBSD, CPAN, Boost, etc. You've done a good job convincing me that there's a lot of work to be done both on the distribution end and on the "building" end. If we work closely together, we can probably meet in the middle with a successful system. Do you mind helping to convert the build system for the libraries you put together once the "building" side is ready?

Some points: We need to isolate overlap between distribution and "distutils" (from now on, when I say Library Infrastructure, I mean "make system" + "distutils" + "distribution" [1]).

- Meta information that the libraries should provide (you probably don't have to or want to worry about this yet) will be important to methods of download / depends / install.

- If not a standard make system, at least a standard method for invoking the make system ala distutils and Debian. You probably don't want to worry about this yet either, but keep it in mind.

Here are some suggestions on how you might proceed usefully and give us some time to come up with a plan for a build system:

- collect the libraries that are already included with various compilers and see what the differences are, both in which libraries are included and in the behavior of the libraries

- find orphaned libraries (ala Numeric Quest) and try to find the authors to give them licenses

- some documentation for standards, lightly based on the libraries currently included w/ compilers. These will probably want to include some reference to various licenses. Part of this task will be to help new library authors realize that they should include _some_ license with their stuff, and help them decide what license best fulfills their goals (my personal preference is GPL for applications, LGPL for libraries, and I have no trouble w/ BSD-style). Does anyone know of a good, unbiased reference re: licenses?

- a list of libraries that need better documentation (which I'll be glad to help write)

- develop the requirements / use cases or whatever for a distribution system that you plan to fulfill ala Boost

peace,
isaac

[1] Things that need names. Library Infrastructure includes:

* Building (like the fptools make system; maybe we could call it fpmake or fpmakefiles)
* Intermediate Layer (like Distutils; we'll hold off on naming it until it has more form)
* Distribution (like "the hump" (OCaml) or CPAN (Perl))
* A standard subset of all the libraries in the distribution, like what's currently included w/ the compilers (hslibs is taken, right?)

Also, I have an idea for a mascot (see section 3.5: Mascot): http://www.haskell.org/hawiki/LibraryInfrastructure :)

Isaac Jones writes:
This is the beauty of a layered tools approach like the one I describe. [...] there's a lot of work to be done in both on the distribution end and on the "building" end.
I guess I have to apologize. I didn't realize until now that you intend to separate the two stages so clearly. :-( In that case, forget everything I ever argued about. I'm completely with you.
Do you mind helping to convert the build system for the libraries you put together once the "building" side is ready?
Not at all ... Like you suggested, my next step will be to come up with a description of "my" project's goal, a repository layout, and a prototype build system. This document can then be discussed, revised, thrown away, and be re-written. :-)
- Meta information that the libraries should provide (you probably don't have to or want to worry about this yet) will be important to methods of download / depends / install
I am very fond of XML for this purpose, because it is so re-usable (and can be verified for correctness). If we'd store this meta information in a simple document, for which we provide a DTD, then I don't see a reason why this information couldn't be re-used by a build infrastructure wherever possible. In this case, there would be no overlap between the two efforts at all. Of course the same can be achieved with any format, admittedly, but for XML there might be really good editors available, like psgml for Emacs, etc.
- If not a standard make system, at least a standard method for invoking the make system ala distutils and Debian. You probably don't want to worry about this yet either, but keep it in mind.
In fact, I favor an Autoconf- and Make-based system over all alternatives, just because everybody knows how to use those tools. That's a huge advantage for the users and for the developers as well. If there is a "better" solution, I'd like it to be able to generate Makefiles and configure scripts, as a fallback. :-)
- collect the libraries that are already included with various compilers and see what the differences are both in what libraries are included and behaviors of the libraries
- find orphaned libraries (ala Numeric Quest) and try to find the authors to give them licenses.
If anybody is aware of candidate libraries, please let me know -- preferably including a URL. :-) Also, it would help immensely if everybody would look through their source code and just make useful modules available for release. Then we'd have more original material, too. I'm sure that licensing will not be a show-stopper issue.

Peter

P. S.: Sorry for the long delay in answering your e-mail, Isaac. For various reasons, I had to re-install my workstation from scratch lately, and it will still take a while until I'm fully operational again.

There should probably be test suites to cover dependencies to ensure that all libraries are adequately in sync -- I'm not sure how easy this is to do in practice.
Hmmm. It's best done as the library is being written, because it could actually help development and because the author likely has a better idea of what a function is supposed to do. But test suites are usually ignored or built long after the fact. Since QuickCheck is quite easy to use, I wonder what's missing? Is it infrastructure (a handy driver program and some make targets) or some well-worked examples to cut and paste from? Where do I find examples of how to quickly add QuickCheck to my library?
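(For what it's worth, roughly the minimum it takes looks like this; quickCheck is the library's real driver, while the property names are made up for the example:)

  import Test.QuickCheck   -- or Debug.QuickCheck, depending on your compiler's libraries

  -- Properties are ordinary Haskell functions; by convention their
  -- names start with prop_.
  prop_revRev :: [Int] -> Bool
  prop_revRev xs = reverse (reverse xs) == xs

  prop_revApp :: [Int] -> [Int] -> Bool
  prop_revApp xs ys = reverse (xs ++ ys) == reverse ys ++ reverse xs

  main :: IO ()
  main = do quickCheck prop_revRev
            quickCheck prop_revApp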
BTW, Debian is unusual in that updates happen all the time. (There are numbered releases, but they aren't commonly used, I think.) They manage to make it work pretty well -- better than most others do for third-party stuff, IMHO. Perhaps that model can be used?
Debian had some technical advantages in the beginning (better dependency annotations), but I understand that Red Hat and others have now caught up. But Debian still has a cultural advantage: packages are built, tested, debugged, etc. with the intention that people will update and install a little bit at a time, rather than buying a new set of CDs and updating the entire system in one go. This puts a lot of stress on the dependency annotations which keep the system consistent, but it also means that problems in the dependency annotations rapidly come to light. (I think it also means that the Debian testing procedures are more rigorous, but I don't know for sure.)

Debian does use numbered releases though. Here's a small sample of the version numbers installed on the machine I'm using just now:

  ii zip        2.30-5    Archiver for .zip files
  ii zlib1g     1.1.4-11  compression library - runtime
  ii zlib1g-dev 1.1.4-11  compression library - development
  ii zlibc      0.9j-6    Uncompressing C Library

The version numbers are of the form a.b.c.d-p, where a.b.c.d is whatever version number the authors use and the -p part is the version of the Debian package. So, for example, the package for zip version 2.30 has been revised 5 times (perhaps fixing dependencies, adding documentation, etc.). But Debian packages might apply patches to the software which are not included in the author's software release. These might be trivial, like changing a hardwired path from /usr/local/lib to /usr/lib, or they might be significant, like fixing a known security hole in the Linux kernel.

-- Alastair Reid

Alastair Reid writes:
There should probably be test suites to cover dependencies to ensure that all libraries are adequately in sync -- I'm not sure how easy this is to do in practice.
It's best done as the library is being written because it could actually help development and because the author likely has a better idea of what a function is supposed to do.
(I was thinking of cross-module dependencies, quite possibly involving multiple authors)
But test suites are usually ignored or built long after the fact.
Like documentation? I suppose this is a programmers' mindset thing.
Since quickcheck is quite easy to use, I wonder what's missing? Is it infrastructure (a handy driver program and some make targets) or some well-worked examples to cut and paste from? Where do I find examples of how to quickly add quickcheck to my library?
Examples to cut and paste from, would be my vote. Which is why I like to stress installing libraries as source.
Debian does use numbered releases though.
Of individual components, but not of the whole repository as such. Anyway, I'm not sure dependencies will be so crucial for Haskell source; most incompatibilities should be caught at compile time by type checking, a luxury Linux distributions don't have to the same extent. Some dependencies on specific behaviour are caused by the need to work around bugs; this could be alleviated by having source and an obvious path for updates -- i.e. instead of working around a bug, I can fix it and send the fix upstream.

-kzm
-- If I haven't seen further, it is by standing in the footprints of giants

Greetings all.
Simon Marlow and I would like to propose Isaac Jones for this coordination role.
Thank you :) I'll wait to hear if any objections come up, but in the meantime...
"Simon Peyton-Jones"
Lots of suggestions emerged, and lots of people have knowledge and willingness to help.
There are already some comments from others that I'll reply to separately.
What we need, though, is someone to coordinate this group. It's difficult to reach technical agreement in a distributed group without one person who is willing to
* Make proposals
I have started on a proposal that I put on the wiki and will paste into this email [1]. Perhaps before proposals are really addressed, we should come up with a set of requirements. Does anyone want to propose a set of requirements? For what it's worth, my initial proposal, to get the ball rolling, is here: http://www.haskell.org/hawiki/LibraryInfrastructure
* Moderate discussion
I feel that discussion should happen on this mailing list and perhaps a little on the wiki.
* Document the consensus
I plan to document the consensus on the Wiki page mentioned above.
* Restrain ambition: something modest that works is better than something aspirational that doesn't
It's already been said that my proposal is too ambitious, but one of its strengths is that it can be implemented incrementally and in layers, so that each layer is useful on its own but is specifically designed with "hooks" to be useful to the upper layers.
A huge thank-you to Isaac. Haskell only works because people contribute to it.
You're very welcome. I think this issue is indeed an extremely important one, and I am excited to be a part of the solution.

peace,
isaac

[1] Much of the discussion on this issue has focused on ideas to retrofit "pickOneFrom [Python, Debian, FreeBSD, fptools, hmake, CPAN]" to Haskell. This may not, however, be appropriate. I propose a combination of appropriate concepts from each system, reimplemented for Haskell. As such, this proposal is somewhat more ambitious than the other suggestions, but I feel that this is warranted. This proposal is based on a layered tools approach. The layers can be built and used incrementally, but it is important to keep the higher layers in mind when implementing the lower layers.

== Building ==

* Each sufficiently complex library should have a robust build system, perhaps based on the fptools makefiles. (Alastair Reid seems to be working on cleaning this up.) A program to generate a skeleton can be provided.

* Standard library search paths might be necessary, and standard flags would be useful. hmake could go some of the way (maybe it already does?) in hiding different compiler behavior where possible.

== DistUtils ==

* The make system should be wrapped with a Python-style distutils system which provides a ./setup.hs with standard targets, so that every library gets built and installed the same way (this standardization will help to build tools around the infrastructure, as Debian does). Simpler libraries that don't need something as complex as fptools can use just this script. Projects that don't want to use the fptools system can wrap their own system with this script. Python provides some standard useful classes that can be imported to help with building and installation, and Haskell can do the same. This would require some support like a standard /usr/bin/haskell interpreter that works like runhugs but invokes the user's default compiler. hmake could probably be altered to serve this purpose, considering that 'hi' gets us part way there. The tool that Henrik Nilsson mentioned might help a lot too.

* Some meta information could be kept in a setup.cfg file or in a subdirectory. This would be similar to the way Debian is able to standardize the build process so well. This setup file might have build dependencies, a package name, version number, etc. More about the meta information later.

* hmake already keeps a list of installed compilers and a default compiler. The setup.hs script could have default targets like "./setup.hs build all" or "./setup.hs build default" or "./setup.hs build ghc5", which would build for the given compiler. Then "./setup.hs install" would install and perhaps register each package that was compiled on this machine, on a per-compiler basis. When a new compiler is installed, it'll go about recompiling registered libraries where necessary, so that end users don't get "permission denied" errors when building their own software that depends on libraries built with older versions. When a compiler is removed, it could remove the binaries for libraries associated with that version. This might be the hardest point: getting over binary incompatibility. The registering is for software compiled on the end user's machine. This is specifically for systems like Solaris, or for libraries dropped into /usr/local, i.e. packages whose source is available but which are outside a packaging system that solves this problem some other way.

== Packaging ==

* Now how would this situation interact with systems like FreeBSD and Debian?
I envision a tool for Debian which will look at the meta information in the setup.cfg file and the installed compilers, and build a skeleton "debian" directory which would help a packager adhere to a Haskell policy, so packages are all consistent: man pages are put in the right section, libraries are put in the right places, etc. This could also help in the creation of separate binary packages (from one source package) for each compiler. I don't know much about FreeBSD's ports system, but I imagine that a similar thing would be a bit easier for a source distribution.

== Distribution ==

* We might also want a CPAN-like repository for authors to put Haskell-related libraries which use this build infrastructure, so that dependencies can be easily downloaded, etc. This is a pretty separate issue, but it would be really great to have. SourceForge might be good for now, but something like what Debian has would be great.
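(To make the ./setup.hs idea concrete, a minimal hypothetical sketch of such a dispatcher, written against the Haskell 98 'System' module. The targets and the wrapping of plain make are illustrative assumptions, not a finished design:)

  module Main where

  import System (getArgs, system, exitWith)

  -- Dispatch the standard targets to whatever build system the
  -- library actually uses -- here, plain make.
  main :: IO ()
  main = do args <- getArgs
            case args of
              ["build", hc] -> run ("make build HC=" ++ hc)
              ["build"]     -> run "make build HC=ghc"   -- default compiler
              ["install"]   -> run "make install"
              _             -> putStrLn "usage: setup.hs (build [compiler] | install)"

  -- Propagate the build system's exit code.
  run :: String -> IO ()
  run cmd = system cmd >>= exitWith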

Isaac Jones writes:
* Standard library search paths might be necessary, and standard flags would be useful. hmake could go some of the way (maybe it already does?) in hiding different compiler behavior where possible.
Some of the issues I've found in adapting the fptools infrastructure are:

1) It's possible to install packages in global filespace (e.g., /usr/lib/greencard) or in local filespace (e.g., $HOME/lib/greencard). If you're putting it in global filespace, it is appropriate to register it in the global package list (e.g., /usr/lib/ghc-5.04.2/package.conf), but where do you register it if installing it locally? The fix seems to be to have ghc-pkg default to recording packages in the local filespace (e.g., $HOME/lib/ghc-5.04.2/package.conf). The same will go for any other compiler and for anything that maintains a list of installed packages. There has to be a standard place in local filespace ($HOME/...) for the list to be maintained. An easy way of doing this seems to be to copy and extend the ghc flag --print-libdir, which tells you where the global copy is stored. That is, we should add an additional flag to tell you where the local copy is supposed to be.

2) When compiling and installing libraries with hugs, ghc, nhc, it's useful to store them in a directory which includes both the compiler name and the compiler version. For example:

  /usr/lib/greencard/$compiler.$version

This makes it easy to have multiple versions of your compiler installed. It's also useful to be able to find the HsFFI.h file required by the ffi extension. For this, it would be nice if all major tools provided a way to extract things like version number in an easily parsed way. For example, it'd be nice if compilers could act like this:

  $ hc --print-install-info
  compiler = GHC
  major-version = 5
  minor-version = 04
  patch-level = 2
  include-dir = /usr/lib/ghc-5.04.2/include
  global-libdir = /usr/lib/ghc-5.04.2
  local-libdir = $HOME/lib
  ...

Note that I want the tool itself to print this information, not to have a separate program which prints the info. This seems to work better if I ask for the library to be compiled and built with a particular compiler (e.g., $HOME/cvs-dirs/nhc98), because there's no risk that I'll run the wrong program to get the install info.

3) When installing libraries, it initially seems fine to install the greencard library in /usr/lib/greencard. But then we move onto the X11 library and try to install into /usr/lib/X11. Hmmmm, that doesn't work so well. It'd be good to have some standard suggestions for people to use. Some which spring to mind are:

  /usr/lib/HS$package
  /usr/lib/haskell-libraries/$package
  /usr/lib/haskell-libraries/$compiler.$version/$package
  /usr/lib/$compiler.$version/$package

Note that with the last, we're using the directory that the compiler itself uses, so there's a possibility of conflicts.
* The make system should be wrapped with a python-style distutils system which provides a ./setup.hs with standard targets so that every library gets built and installed the same way
I think it is a great idea to have this extra level of indirection. It means that people can use makefiles, configure scripts, etc. suited to their application, preferences, choice of license, etc. and still play with everyone else. Obviously this needs a different version of setup.hs for each infrastructure, but this hopefully isn't too hard. I think it's a mistake to make it a Haskell script, though, because that's bound to lead to bootstrapping issues. I'd use a shell script instead.

-- Alastair Reid

On Wed, May 28, 2003 at 05:00:20PM +0100, Alastair Reid wrote:
Isaac Jones writes:
* The make system should be wrapped with a python-style distutils system which provides a ./setup.hs with standard targets so that every library gets built and installed the same way
I think it is a great idea to have this extra level of indirection. It means that people can use makefiles, configure scripts, etc. suited to their application, preferences, choice of license, etc. and still play with everyone else. Obviously this needs a different version of setup.hs for each infrastructure, but this hopefully isn't too hard.
I think it's a mistake to make it a Haskell script though because that's bound to lead to bootstrapping issues. I'd use a shell script instead.
The shell script would lead to issues on Windows, though. I think it's probably better to assume that, since you need the Haskell compiler in order to compile the libraries, it's safe to use it for the setup. It seems like a lot of the point of the library infrastructure is to separate the compiler build process from the library build process. Libraries should be much easier to build than compilers...

-- David Roundy
http://civet.berkeley.edu/droundy/

Alastair Reid wrote:
I think it's a mistake to make it a Haskell script though because that's bound to lead to bootstrapping issues. I'd use a shell script instead.
On Wednesday 28 May 2003 10:38 pm, David Roundy wrote:
The shell script would lead to issues on windows, though. [...]
Remember, we're talking about a script for _building_ packages, not for installing them. For Windows, people will use .msi packages to install. People who want to build packages generally need to install a bunch of other things too. Obviously, GHC users will have installed cygwin (which includes shells), but even Hugs users need to install a C compiler if they're using a package that uses the foreign function interface (and quite a lot of libraries involve the ffi). Also, my experience from Hugs is that people do not want to build packages from source and, though I think binary packages are a flawed concept, I choose to use binary packages for my Linux system even though I could theoretically build my entire system from source.

-- Alastair Reid

On Wed, 28 May 2003 17:04:07 +0100, Alastair Reid wrote:
Simon Marlow and I would like to propose Isaac Jones for this coordination role.
I've exchanged a bunch of email with him about how libraries ought to work and I think Isaac would be great as coordinator.
-- Alastair Reid
He seems enthusiastic and capable, and he takes the initiative. A smaller example being some Control.Monad.State documentation (http://www.syntaxpolice.org/~ijones/tmp/Control.Monad.State.html), and prodding me (implicitly/explicitly) to write up what I could for other Control.Monad libraries (http://haskell.org/hawiki/MonadTemplateLibrary) so that there is at least basic documentation. So consider this my 'Me too' post to Alastair's.
participants (9):
- Alastair Reid
- Andrew J Bromage
- David Roundy
- Derek Elkins
- Isaac Jones
- Ketil Z Malde
- ketil@ii.uib.no
- Peter Simons
- Simon Peyton-Jones