
Following lots of useful discussion and evaluation of the available DVCSs out there, the GHC team have made a decision: we're going to switch to git.

It came down to two things: the degree of support available, and the flexibility of the tools (git is much happier to let you modify the history than Mercurial). Speed ruled out bzr, and Windows support is less of an issue: git appears to work reasonably well on Windows these days.

So we need a plan for switching. We aim to make the switch shortly before branching the repository for 6.10, which would mean we need to make the switch in early September, in around 5 weeks' time. Before then, the goal is to get the infrastructure to the point where we can switch with minimum fuss.

We already have an up-to-date git mirror thanks to Thomas Schilling:

    git clone http://darcs.haskell.org/ghc.git

(notice how fast that is :-)

darcs-all will be able to work with either the git repository or the darcs repository (Max Bolingbroke is working on this, I believe). We can switch the automatic builds over to git as soon as darcs-all is working, and as long as the git mirror is kept up to date (Thomas: is the mirror being automatically updated now?).

I'd urge people to try out the git mirror and let us know how you get on. We'll also work on updating the build documentation on the wiki and creating a page of getting-started info on using git.

Cheers, Simon
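[Editor's note: the darcs-all changes described above would need some kind of per-repository dispatch. The following is a purely hypothetical sketch of that idea, not the real script; the function name and demo layout are made up for illustration.]

```shell
# Hypothetical sketch of a darcs-all-style dispatch: detect whether a
# subtree is a git or a darcs working copy, and run the matching tool.
pull_repo() {
  dir=$1
  if [ -d "$dir/.git" ]; then
    echo "git pull in $dir"        # the real script would run: (cd "$dir" && git pull)
  elif [ -d "$dir/_darcs" ]; then
    echo "darcs pull in $dir"      # the real script would run: (cd "$dir" && darcs pull -a)
  else
    echo "skipping $dir: no repository found"
  fi
}

# Demo layout standing in for a mixed checkout (ghc in git, libs in darcs).
mkdir -p demo/ghc/.git demo/libraries/base/_darcs
pull_repo demo/ghc                 # -> git pull in demo/ghc
pull_repo demo/libraries/base      # -> darcs pull in demo/libraries/base
```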

marlowsd:
Following lots of useful discussion and evaluation of the available DVCSs out there, the GHC team have made a decision: we're going to switch to git.
Hooray, this will generate a lot of open source good will, and help make GHC more accessible to the outside world. Just see the comments here, http://www.reddit.com/comments/6v2nl/ghc_project_switches_to_git/ "Great news!" "I'm trying to clone this now," If this means a few more eyes on the code, then that's all win. -- Don

On Tue, 2008-08-05 at 22:12 -0700, Don Stewart wrote:
marlowsd:
Following lots of useful discussion and evaluation of the available DVCSs out there, the GHC team have made a decision: we're going to switch to git.
Hooray, this will generate a lot of open source good will, and help make GHC more accessible to the outside world.
Heh, you still need darcs to build it, because all the libs are using darcs, and that's not going to change any time soon.
Just see the comments here,
http://www.reddit.com/comments/6v2nl/ghc_project_switches_to_git/
"Great news!"
"I'm trying to clone this now,"
Let's see what they say when they find out :-) Duncan

2008/8/6 Duncan Coutts
On Tue, 2008-08-05 at 22:12 -0700, Don Stewart wrote:
marlowsd:
Following lots of useful discussion and evaluation of the available DVCSs out there, the GHC team have made a decision: we're going to switch to git.
Hooray, this will generate a lot of open source good will, and help make GHC more accessible to the outside world.
Heh, you still need darcs to build it, because all the libs are using darcs, and that's not going to change any time soon.
One thing that might be a good idea is setting up Git mirrors of the libraries etc that we cannot convert to Git since other people depend on them. This would give us nice integration with Git's submodule support, allowing us to check out a consistent snapshot of the entire tree (including the libraries, Cabal etc) at any point in time straightforwardly. Of course, as a bonus you wouldn't have to install Darcs to clone. Cheers, Max
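[Editor's note: a sketch of the submodule mechanism Max describes, using tiny local repos as stand-ins for the GHC tree and a library mirror; all names and paths here are illustrative, not the real darcs.haskell.org layout.]

```shell
# Sketch: a submodule pins a library mirror at one specific commit in
# the superproject, which is what gives the "consistent snapshot".
set -e
tmp=$(mktemp -d)
cd "$tmp"

# A stand-in for a git mirror of a darcs library repo.
git init -q base-mirror
git -C base-mirror -c user.email=ghc@example.org -c user.name=ghc \
    commit -q --allow-empty -m "mirror snapshot"

# The superproject records the mirror as a submodule at a fixed commit.
git init -q ghc
cd ghc
git -c protocol.file.allow=always submodule add "$tmp/base-mirror" libraries/base
git -c user.email=ghc@example.org -c user.name=ghc \
    commit -q -m "pin libraries/base"

# .gitmodules now records where the library snapshot comes from.
cat .gitmodules
```

(Whether darcs.haskell.org could host such mirrors and keep them in sync automatically is exactly the open question in this thread.)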

On Wed, 2008-08-06 at 11:31 +0100, Max Bolingbroke wrote:
2008/8/6 Duncan Coutts:
On Tue, 2008-08-05 at 22:12 -0700, Don Stewart wrote:
marlowsd:
Following lots of useful discussion and evaluation of the available DVCSs out there, the GHC team have made a decision: we're going to switch to git.
Hooray, this will generate a lot of open source good will, and help make GHC more accessible to the outside world.
Heh, you still need darcs to build it, because all the libs are using darcs, and that's not going to change any time soon.
One thing that might be a good idea is setting up Git mirrors of the libraries etc that we cannot convert to Git since other people depend on them. This would give us nice integration with Git's submodule support, allowing us to check out a consistent snapshot of the entire tree (including the libraries, Cabal etc) at any point in time straightforwardly. Of course, as a bonus you wouldn't have to install Darcs to clone.
If that means I can continue to use darcs for Cabal development then I'm happy. Duncan

Max Bolingbroke:
2008/8/6 Duncan Coutts:
On Tue, 2008-08-05 at 22:12 -0700, Don Stewart wrote:
marlowsd:
Following lots of useful discussion and evaluation of the available DVCSs out there, the GHC team have made a decision: we're going to switch to git.
Hooray, this will generate a lot of open source good will, and help make GHC more accessible to the outside world.
Heh, you still need darcs to build it, because all the libs are using darcs, and that's not going to change any time soon.
One thing that might be a good idea is setting up Git mirrors of the libraries etc that we cannot convert to Git since other people depend on them. This would give us nice integration with Git's submodule support, allowing us to check out a consistent snapshot of the entire tree (including the libraries, Cabal etc) at any point in time straightforwardly. Of course, as a bonus you wouldn't have to install Darcs to clone.
I seriously hope the plan is to move all *core* libraries (including GHC's cabal repo) etc over to git, too. In other words, everything that you need to build the development version of GHC should come via git. Having a mix of VCSs would be the worst option of all. Manuel

On Fri, Aug 08, 2008 at 12:04:15PM +1000, Manuel M T Chakravarty wrote:
I seriously hope the plan is to move all *core* libraries (including GHC's cabal repo) etc over to git, too. In other words, everything that you need to build the development version of GHC should come via git. Having a mix of VCSs would be the worst option of all.
No, the plan is to move only the GHC and testsuite repos to git, as the others are also used by hugs, nhc98, etc. It would be possible to move GHC's Cabal repo over too, as that is private to GHC, but given the other libraries will be using darcs anyway I think it is simpler to keep all darcs repos using the same VCS. Thanks Ian

On Fri, 2008-08-08 at 14:03 +0100, Ian Lynagh wrote:
On Fri, Aug 08, 2008 at 12:04:15PM +1000, Manuel M T Chakravarty wrote:
I seriously hope the plan is to move all *core* libraries (including GHC's cabal repo) etc over to git, too. In other words, everything that you need to build the development version of GHC should come via git. Having a mix of VCSs would be the worst option of all.
No, the plan is to move only the GHC and testsuite repos to git, as the others are also used by hugs, nhc98, etc.
It would be possible to move GHC's Cabal repo over too, as that is private to GHC, but given the other libraries will be using darcs anyway I think it is simpler to keep all darcs repos using the same VCS.
If there's some way of having automated git mirrors of the upstream darcs repos then that might be convenient for people building ghc. Asking the maintainers of all other libs to switch is a bit much though. Duncan

On Sat, Aug 09, 2008 at 01:32:51AM +0100, Duncan Coutts wrote:
If there's some way of having automated git mirrors of the upstream darcs repos then that might be convenient for people building ghc.
I don't think that that really helps. If all you want to do is build then the sync-all script will do the get/pull for you (as long as you have both git and darcs installed). If you want to make any changes at all then you really need to be using the "native" repo format. If someone thinks it is worth doing then it is possible, though. Thanks Ian

Ian Lynagh:
On Fri, Aug 08, 2008 at 12:04:15PM +1000, Manuel M T Chakravarty wrote:
I seriously hope the plan is to move all *core* libraries (including GHC's cabal repo) etc over to git, too. In other words, everything that you need to build the development version of GHC should come via git. Having a mix of VCSs would be the worst option of all.
No, the plan is to move only the GHC and testsuite repos to git, as the others are also used by hugs, nhc98, etc.
It would be possible to move GHC's Cabal repo over too, as that is private to GHC, but given the other libraries will be using darcs anyway I think it is simpler to keep all darcs repos using the same VCS.
I think all *core* libraries must switch. Seriously, requiring GHC developers to use a mix of two vcs during development is a Very Bad Idea.

Don was excited about getting more people to look at the source when it is in git (see the comments he posted from reddit). By requiring two vcs you will get *fewer* people to look at the source. This is not only about getting the sources to hack on them; you effectively require developers to learn the commands for two vcs (when they are already reluctant to learn one).

For example, often enough somebody who changes something in GHC will modify the base package, too. Then, to commit the overall work, you need to commit using both vcs. If you need to branch for your work, you need to create branches in two vcs (no idea whether the semantics of a branch in git and darcs are anywhere similar). When you merge your branch, you need to merge in both vcs. You can't seriously propose such a set-up!

Duncan wrote,
If there's some way of having automated git mirrors of the upstream darcs repos then that might be convenient for people building ghc. Asking the maintainers of all other libs to switch is a bit much though.
I am not talking about all libs, I am talking about the core libs. Most developers of the core libs are also GHC developers. So, you already ask them to change by changing the vcs of GHC. Asking them to work with two vcs at the same time is worse IMHO.

I *strongly* object to moving to git before this is sorted out.

As Roman said before, GHC is heading in a dangerous direction. It is getting progressively harder to contribute to the project at the moment. First, changing the build system to Cabal. Now, proposing to use two vcs. Somebody who is new to the project not only has to learn the internals of GHC, but they also have to learn two new vcs, and if they need to change the build system, they need to learn a new build tool.

Raising the bar for developers to contribute to a project has been proven to be a very bad idea many times. Let's not take GHC down that path. Manuel

On Sat, 2008-08-09 at 15:46 +1000, Manuel M T Chakravarty wrote:
Raising the bar for developers to contribute to a project has been proven to be a very bad idea many times. Let's not take GHC down that path.
I don't especially relish having to learn another vcs tool or raising the bar for contributions to Cabal either (we have lots of people who make small one-off contributions). Duncan

Duncan Coutts wrote:
On Sat, 2008-08-09 at 15:46 +1000, Manuel M T Chakravarty wrote:
Raising the bar for developers to contribute to a project has been proven to be a very bad idea many times. Let's not take GHC down that path.
I don't especially relish having to learn another vcs tool or raising the bar for contributions to Cabal either (we have lots of people who make small one-off contributions).
I wonder how many of the libraries are "core" in that they need to be changed a lot for GHC?

- all the ones that depend on GHC internals, such as base. (Except the current system has many of them use preprocessor conditionals so that they can depend on various compilers' internals, including nhc98 and hugs? Because a lot of that code is actually shared between implementations)
- Cabal, since it's needing a lot of extension to make GHC work with it.

Do boot-libraries like unix typically need work by GHC devs? On the other hand, it's looking like there's enough intersection between GHC and other-haskell that it's not such a helpful path to pursue.

not quite related: I wonder about various haskell libs switching to darcs2 format. A few new programs use it already. As distros include darcs2, it should become less painful. The conversion is less painful for code that's branched less. So maybe in the future a lot of Haskell libs will be in the superior darcs2 format. what an unpleasant situation! But cross-converting between darcs and git format for the same repo is probably even worse.

Last time I tried the darcs-all script (maybe a month ago, using darcs 2.0.2), IIRC, it hung, or had some other problem in one of the libraries. Even though it was a clean copy that I'd only ever pulled into (many times, and which was originally fetched with darcs 1.0.9, but still). And darcs-all on the libraries has always been a slow sequential task. So I'm not actually all that enamoured of darcs for ghc development, even for the libs. Since I couldn't update anymore (despite going into ghc-head/libraries/something and mucking around with darcs revert and such), I just deleted the tree and decided to wait until GHC switches VCS before getting a new copy.

(I tried git-cloning ghc.git sometime; it took about 10 minutes, nearly no CPU time, and 80 MB, so I'm pretty happy about that random experience, but I didn't try to do anything with the repo) -Isaac

Isaac Dupree:
Duncan Coutts wrote:
On Sat, 2008-08-09 at 15:46 +1000, Manuel M T Chakravarty wrote:
Raising the bar for developers to contribute to a project has been proven to be a very bad idea many times. Let's not take GHC down that path.
I don't especially relish having to learn another vcs tool or raising the bar for contributions to Cabal either (we have lots of people who make small one-off contributions).
I wonder how many of the libraries are "core" in that they need to be changed a lot for GHC?
The boot libraries, ie, those needed to build the HEAD of the ghc repo:

    SUBDIRS = ghc-prim $(INTEGER_LIBRARY) base array packedstring
    SUBDIRS += containers bytestring old-locale old-time filepath directory
    ifeq "$(GhcLibsWithUnix)" "YES"
    SUBDIRS += unix
    endif
    ifeq "$(Windows)" "YES"
    SUBDIRS += $(wildcard Win32)
    endif
    SUBDIRS += process pretty hpc template-haskell editline Cabal random haskell98

Here, Cabal is the ghc variant of the Cabal repo, not the actual Cabal head. The whole point is to make sure that anybody who decides to hack GHC needs to install and learn just one vcs, not two. Manuel

Duncan Coutts:
On Sat, 2008-08-09 at 15:46 +1000, Manuel M T Chakravarty wrote:
Raising the bar for developers to contribute to a project has been proven to be a very bad idea many times. Let's not take GHC down that path.
I don't especially relish having to learn another vcs tool or raising the bar for contributions to Cabal either (we have lots of people who make small one-off contributions).
I don't think it matters what vcs Cabal uses. GHC has already been using a separate repo for its version of Cabal for a while, and the GHC Cabal repo needs to be explicitly updated to ensure that changes to Cabal do not randomly break GHC. To be honest, if I had to say anything, I would say that GHC has to use fixed, stable versions of Cabal (like it does of gmp). So, it really doesn't matter what vcs Cabal uses. A completely different matter are libraries like base, which are deeply connected to GHC. Manuel

On Sun, Aug 10, 2008 at 02:16:25PM +1000, Manuel M T Chakravarty wrote:
Duncan Coutts:
I don't especially relish having to learn another vcs tool or raising the bar for contributions to Cabal either (we have lots of people who make small one-off contributions).
I don't think it matters what vcs Cabal uses. GHC has already been using a separate repo for its version of Cabal for a while, and the GHC Cabal repo needs to be explicitly updated to ensure that changes to Cabal do not randomly break GHC. To be honest, if I had to say anything, I would say that GHC has to use fixed, stable versions of Cabal (like it does of gmp). So, it really doesn't matter what vcs Cabal uses.
Unless we do get to a point where we are literally using tarballs[1] of Cabal, I don't think using a mixture of VCSs for Cabal is a good idea. Having to convert patches from one VCS format to the other sounds like a recipe for a lot of pain and suffering. [1] which I think is a bad idea anyway, as it makes it a lot more hassle to fix Cabal bugs that GHC+bootlibs expose. Thanks Ian

Ian Lynagh:
On Sun, Aug 10, 2008 at 02:16:25PM +1000, Manuel M T Chakravarty wrote:
Duncan Coutts:
I don't especially relish having to learn another vcs tool or raising the bar for contributions to Cabal either (we have lots of people who make small one-off contributions).
I don't think it matters what vcs Cabal uses. GHC has already been using a separate repo for its version of Cabal for a while, and the GHC Cabal repo needs to be explicitly updated to ensure that changes to Cabal do not randomly break GHC. To be honest, if I had to say anything, I would say that GHC has to use fixed, stable versions of Cabal (like it does of gmp). So, it really doesn't matter what vcs Cabal uses.
Unless we do get to a point where we are literally using tarballs[1] of Cabal, I don't think using a mixture of VCSs for Cabal is a good idea. Having to convert patches from one VCS format to the other sounds like a recipe for a lot of pain and suffering.
[1] which I think is a bad idea anyway, as it makes it a lot more hassle to fix Cabal bugs that GHC+bootlibs expose.
The hassle of having two different repo types for Cabal head and Cabal GHC is part of the price of switching from darcs to git for ghc.

Incidentally, that you are concerned about Cabal devel in the GHC tree is a consequence of using GHC as a guinea pig for Cabal development, which by itself is IMHO a Very Bad Idea. Cabal is supposed to be a tool like Happy or Alex. If Cabal *were* mature enough to be used in GHC's build system in the way it is now, GHC would just use the latest stable release of Cabal and we wouldn't have a problem.

So, let's please not use one bad idea (using an immature and constantly changing build tool whose use in GHC's build tree barely anybody understands) to justify another bad idea (using two vcs for one project). Manuel

On Sat, Aug 09, 2008 at 03:46:50PM +1000, Manuel M T Chakravarty wrote:
Don was excited about getting more people to look at the source when it is in git (see the comments he posted from reddit).
I am skeptical that this initial excitement and cloning will translate into more developers. Also, for someone who's never used either VCS, I think the overhead of learning to use darcs is far lower than of learning to use git. The move to git is more likely to help by not driving away people who have had problems working with GHC in darcs, than by attracting developers in the first place. New GHC developers come from GHC users, not darcs/git users.
I am not talking about all libs, I am talking about the core libs. Most developers of the core libs are also GHC developers.
I'm not sure that's true. e.g. Malcolm and Ross both commit to the bootlibs, and we get a lot of patches from various people in the community.
I *strongly* object to moving to git before this isn't sorted out.
FWIW, personally I would prefer staying with darcs. I prefer its underlying philosophy, and I find its UI far more intuitive and easy to use. I don't suffer from its problems, though - but then, I don't maintain a long-running HEAD branch, and I mostly don't use it on Windows.

However, there certainly are a number of people who are having problems working with darcs (although in some cases this may be because they are working in a way incompatible with darcs, e.g. one person had replaced libraries/ with a symlink, for reasons he didn't explain). Given darcs certainly has some problems, and I seem to be in a minority, I don't feel I can stand in the way of a move.

But I think we need a wider discussion before we can think about moving the bootlibs to git. If we are going to have a changeover, then the most convenient time in GHC's development cycle to make it is in 4 or 5 weeks' time. Thanks Ian

Ian Lynagh:
On Sat, Aug 09, 2008 at 03:46:50PM +1000, Manuel M T Chakravarty wrote:
I am not talking about all libs, I am talking about the core libs. Most developers of the core libs are also GHC developers.
I'm not sure that's true. e.g. Malcolm and Ross both commit to the bootlibs, and we get a lot of patches from various people in the community.
Ross does commit patches to ghc (according to darcs changes). So, either he stops that or has to learn git anyway. I don't think we are talking about random contributions from the community. If anything, we need to compare two numbers (1) developers who need to start using git when the ghc repo changes and (2) library developers (ie, people with commit bits regularly contributing to the boot libs) who do not contribute to ghc and hence could avoid learning git if the boot libs stay in a darcs repo.
I *strongly* object to moving to git before this isn't sorted out.
FWIW, personally I would prefer staying with darcs. I prefer its underlying philosophy, and I find its UI far more intuitive and easy to use.
Personally, I am more than happy to stay with darcs, too, but my understanding was that at least the Simons decided that we are going to move from darcs to git.

All I am saying is that whatever vcs ghc uses, you need to be able to *easily* get, modify, and commit patches to the HEAD and the boot libs with *just one* vcs. Using two vcs is going to make the current situation worse, not better.

For example, SimonPJ said one reason for switching vcs is that interns had trouble getting started because they had trouble obtaining the head, as darcs caused them grief. If the boot libs stay under darcs control, nothing is won; the same interns still won't get going any quicker. Presumably, they are going to take even longer, because they can now get into trouble with darcs and git.

We want to lower the barrier to entry, not raise it. By effectively adding a complication (namely git) and not removing any, matters will get worse. Manuel

On 10/08/2008, at 14:40, Manuel M T Chakravarty wrote:
Personally, I am more than happy to stay with darcs, too, but my understanding was that at least the Simons decided that we are going to move from darcs to git. All I am saying is that whatever vcs ghc uses, you need to be able to *easily* get, modify, and commit patches to the HEAD and the boot libs with *just one* vcs. Using two vcs is going to make the current situation worse, not better.
I suspect that if GHC switches to git, it will become the standard vcs in the Haskell community sooner or later. Expecting that people (especially newcomers) will use different vcs for different libraries/compilers is just unrealistic. Really, why should they? Any advantages in usability that darcs might have over git will be overshadowed by the inconvenience of having to remember two different sets of commands. I expect that many new projects will use git and old projects will start switching to it over time.

So if the move is made, it should IMO include as big a chunk of the infrastructure as possible. Eventually, it will migrate to git anyway, and the earlier it does, the simpler life will be for the developers.

As to whether the switch should be made at all, I'm not sure. I've had my share of problems with darcs and I don't think it's suitable for a project of GHC's size at the moment. On the other hand, I suspect that a mixture of git and darcs repos will be even more problematic than what we have now. Maybe investing some time in fixing the most obvious darcs problems would be a better solution? Roman

On Sat, Aug 9, 2008 at 10:44 PM, Roman Leshchinskiy wrote:
Maybe investing some time in fixing the most obvious darcs problems would be a better solution?
We're working on that over at Darcs HQ, but there is no guarantee that we'd come close to fixing the problems within the 4-5 week window that Ian mentioned. Supposing that the main problems GHC has with darcs 2 format get solved in the next month, would that give GHC reason enough to keep using darcs? It seems many of you are eager to use git; perhaps even if darcs was working to satisfaction.

People will be working on making darcs work better with the GHC repo as a test case either way. And personally, since I'm not a GHC dev, the decision doesn't affect my life. Having said that, I'm still obviously biased. I'd love for darcs to work well enough that this never came up.

Let me throw out one more idea: What if, as a GHC contributor, I could pick equally between git and darcs? My understanding is that, while not optimal, you could use tailor[1] to synchronize a darcs repository with a git one. Offer up both repositories and keep them in sync. Let the masses decide? Jason

[1] http://progetti.arstecnica.it/tailor

On 2008 Aug 10, at 2:12, Jason Dagit wrote:
On Sat, Aug 9, 2008 at 10:44 PM, Roman Leshchinskiy wrote:
Maybe investing some time in fixing the most obvious darcs problems would be a better solution?
We're working on that over at Darcs HQ, but there is no guarantee that we'd come close to fixing the problems within the 4-5 week window that Ian mentioned. Supposing that the main problems GHC has with darcs 2 format get solved in the next month, would that give GHC reason enough to keep using darcs? It seems many of you are eager to use git; perhaps even if darcs was working to satisfaction.
Some people are. I'm more on the side of "are we creating a bigger problem than we already have?" It's not at all clear to me that switching to git would solve more problems than it would cause --- and if you toss in core libraries possibly needing to stay in darcs, or other projects being abruptly forced to switch to git because the core libs did, it's pretty clearly on the "biting off more than we can chew" side of things.
Let me throw out one more idea: What if, as a GHC contributor, I could pick equally between git and darcs? My understanding is that, while not optimal, you could use tailor[1] to synchronize a darcs repository with a git one. Offer up both repositories and keep them in sync. Let the masses decide?
There has been some discussion along those lines, but doing that bidirectionally is logistically difficult. -- brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu electrical and computer engineering, carnegie mellon university KF8NH

Jason Dagit:
On Sat, Aug 9, 2008 at 10:44 PM, Roman Leshchinskiy wrote:
Maybe investing some time in fixing the most obvious darcs problems would be a better solution?
We're working on that over at Darcs HQ, but there is no guarantee that we'd come close to fixing the problems within the 4-5 week window that Ian mentioned. Supposing that the main problems GHC has with darcs 2 format get solved in the next month, would that give GHC reason enough to keep using darcs? It seems many of you are eager to use git; perhaps even if darcs was working to satisfaction.
People will be working on making darcs work better with the GHC repo as a test case either way. And personally, since I'm not a GHC dev, the decision doesn't affect my life. Having said that, I'm still obviously biased. I'd love for darcs to work well enough that this never came up.
Same here, and fwiw I won't change any of my many other darcs repos any time soon. However, as I have said before, if ghc is to switch, it must be a clean switch, and no messy use of two vcs at the same time for ghc and boot libs.
Let me throw out one more idea: What if, as a GHC contributor, I could pick equally between git and darcs? My understanding is that, while not optimal, you could use tailor[1] to synchronize a darcs repository with a git one. Offer up both repositories and keep them in sync. Let the masses decide?
I don't think that this is technically feasible. I used tailor once to convert a CVS repo to darcs, and while that was better than throwing away the history, it was pretty messy and nothing that you would want to do on a regular basis. Besides, even if the actual conversion would work smoothly (which I strongly doubt), you'd immediately be faced with problems of atomicity and associated race conditions of commits to the two repos. Manuel

chak:
Ian Lynagh:
On Fri, Aug 08, 2008 at 12:04:15PM +1000, Manuel M T Chakravarty wrote:
I seriously hope the plan is to move all *core* libraries (including GHC's cabal repo) etc over to git, too. In other words, everything that you need to build the development version of GHC should come via git. Having a mix of VCSs would be the worst option of all.
No, the plan is to move only the GHC and testsuite repos to git, as the others are also used by hugs, nhc98, etc.
It would be possible to move GHC's Cabal repo over too, as that is private to GHC, but given the other libraries will be using darcs anyway I think it is simpler to keep all darcs repos using the same VCS.
I think all *core* libraries must switch. Seriously, requiring GHC developers to use a mix of two vcs during development is a Very Bad Idea.
I agree with this. As Audrey says, you have to lower the barrier to entry. That means:

* one build system
* one vcs

to build ghc (and anything it requires, such as the core libraries). This is a chance to make a big step towards accessibility, let's make that step. -- Don

dons:
chak:
Ian Lynagh:
On Fri, Aug 08, 2008 at 12:04:15PM +1000, Manuel M T Chakravarty wrote:
I seriously hope the plan is to move all *core* libraries (including GHC's cabal repo) etc over to git, too. In other words, everything that you need to build the development version of GHC should come via git. Having a mix of VCSs would be the worst option of all.
No, the plan is to move only the GHC and testsuite repos to git, as the others are also used by hugs, nhc98, etc.
It would be possible to move GHC's Cabal repo over too, as that is private to GHC, but given the other libraries will be using darcs anyway I think it is simpler to keep all darcs repos using the same VCS.
I think all *core* libraries must switch. Seriously, requiring GHC developers to use a mix of two vcs during development is a Very Bad Idea.
I agree with this.
As Audrey says, you have to lower the barrier to entry. That means:
* one build system
* one vcs
to build ghc (and anything it requires, such as the core libraries).
This is a chance to make a big step towards accessibility, let's make that step.
I just want to add to this. We're offering an unusual product: a lazy, purely functional language. This already separates us from the mainstream. So how do we ensure we minimise the stress of adopting something so unfamiliar? By ensuring that in all other respects the environment they have to learn is familiar and simple.

I think we risk isolating ourselves yet further -- and creating new barriers to adoption, beyond those we can't avoid -- by adding more complicated dependencies (two revision control systems, one of which, darcs, is now firmly out of the mainstream). Instead, if we just use ubiquitous, common tools -- like git -- for everything, we minimise the pain for people, and sit firmly in the mainstream of open source.

If anything has been learnt by Spencer and me while working on xmonad, it is: dependencies, dependencies, dependencies. Reduce these, and you gain eyeballs, and ultimately developers. Increase these, and you end up isolated and marginalised.

So let's capitalise on this switch to git, and take the opportunity to remove one big dependency from the system. -- Don

Listen to Don, he is a wise man! Manuel Don Stewart:
I agree with this.
As Audrey says, you have to lower the barrier to entry. That means:
* one build system
* one vcs
to build ghc (and anything it requires, such as the core libraries).
This is a chance to make a big step towards accessibility, let's make that step.
I just want to add to this. We're offering an unusual product: a lazy, purely functional language. This already separates us from the mainstream. So how do we ensure we minimise the stress of adopting something so unfamiliar?
By ensuring that in all other respects the environment they have to learn is familiar and simple.
I think we risk isolating ourselves yet further -- and creating new barriers to adoption, beyond those we can't avoid -- by adding more complicated dependencies (two revision control systems, one of which, darcs, is now firmly out of the mainstream).
Instead, if we just use ubiquitous, common tools -- like git -- for everything, we minimise the pain for people, and sit firmly in the mainstream of open source.
If anything has been learnt by Spencer and me while working on xmonad, it is: dependencies, dependencies, dependencies
reduce these, and you gain eyeballs, and ultimately developers. Increase these, and you end up isolated and marginalised.
So let's capitalise on this switch to git, and take the opportunity to remove one big dependency from the system.
-- Don

On 10/08/2008, at 05:38, Don Stewart wrote:
Instead, if we just use ubiquitous, common tools -- like git -- for everything, we minimise the pain for people, and sit firmly in the mainstream of open source.
While I agree with this in general, I'm not sure it really applies to vcs (especially darcs) all that much. I don't think anyone who has ever worked with a vcs will need more than a day to learn how to use darcs (or any other sane vcs, for that matter). Really, the problem with darcs is not that it is not mainstream; rather, it's just that it simply doesn't work sometimes. Roman

I seriously hope the plan is to move all *core* libraries (including GHC's cabal repo) etc over to git, too.
* one build system
* one vcs

This is a chance to make a big step towards accessibility, let's make that step.
Ultimately, I don't think git would make ghc any more accessible to new contributors. Darcs is not especially offputting to any beginner who already knows something about VCS in general. What the move to git is about, is making life easier for the *existing* HQ and core contributors. Evaluate it on that basis, and not in terms of unknown (and unknowable) benefits to current non-contributors. Indeed, you should also consider how many contributors you might lose in a move. I do hear some significant current contributors having doubts. I can certainly appreciate that having to run 2 VCS in parallel might be confusing and simply make matters worse than at present.

The libraries question is a difficult one. We have made a lot of effort over the last 5 years to build infrastructure and code that is shared and portable across multiple implementations of the language. Is this the time to fork those supposedly "common" core libraries into ghc versions vs the rest?

As someone who is not a contributor to GHC, and has never experienced anything more than trivial problems with darcs, I have not felt qualified to comment on the proposal to change GHC's VCS. But as a frequent fixer of breakage in the core libraries, I would be reluctant to have to move to a different VCS there. If the core libraries do move, it will be increasingly difficult to avoid also needing to move nhc98 and Hugs and goodness-knows how many other libraries. For me, it would be un-forced, annoying, and I may not have the extra time available to keep up. So there is a danger that the community will be left with a single (albeit very high quality) compiler, with no need for a Haskell Prime (or any other Standard) in future.

If there are technical solutions that can reduce the pain, whilst keeping multiple stake-holders happy, then I think they should be investigated.

Regards, Malcolm

As a very part-time, temporarily inactive GHC developer I will offer some opinions which should carry no weight:

* When I saw the announcement, I cheered! Last fall, I lost 2 weeks of a 9-week visit to darcs hell. While the alleged features may be alluring, the software simply doesn't do what it says on the box. The highly touted 'theory of patches' is not published anywhere in any form that can be understood and checked by anyone with a little bit of mathematics (e.g., group theory or algebra). All these truths make me eager to be rid of darcs.

* It seems clear that git offers a richer user interface than darcs and that the UI is harder to master. Moreover, because everyone I have talked to finds the git documentation less than completely helpful, git is probably harder to learn than darcs. Git's advantages are that it is robust, fast, and popular.

* A number of people I trust and respect have urged me to adopt git. Since 2005, nobody has urged me to adopt darcs. A number of people I trust and respect have said they wished to abandon darcs. Nobody has suggested abandoning git.

* I violently agree with whoever (Don? Malcolm?) said that the Haskell community will prosper to the degree that we have *one* build system and *one* version-control system. And when the build system or version-control system is standard, we gain eyeballs and developers. I haven't found a standard build system that I am willing to use, but I think git is good enough to be used.

* Our long-term goal should be to get the *entire* Haskell development community to agree on a version-control system---one that is not darcs. We should expect this process to take several years, and we should expect it to cost money. Would Microsoft or Galois or York or other large players be willing to donate part of an expert's time to migrate to the new version-control system?

I'm no particular fan of git. But in a worse-is-better sort of way, I think it's in---it will fill the niche of free, distributed version control. It would be good to identify a way of helping to smooth the path not only for GHC and *all* the libraries but for Hugs, York, xmonad, everybody.

Norman, whose most popular open-source software still lives under RCS!

On Sat, Aug 09, 2008 at 06:56:23PM -0400, Norman Ramsey wrote:
* I violently agree with whoever (Don? Malcolm?) said that the Haskell community will prosper to the degree that we have *one* build system and *one* version-control system. And when the build system or version-control system is standard, we gain eyeballs and developers. I haven't found a standard build system that I am willing to use, but I think git is good enough to be used.
* Our long-term goal should be to get the *entire* Haskell development community to agree on a version-control system---one that is not darcs. We should expect this process to take several years, and we should expect it to cost money. Would Microsoft or Galois or York or other large players be willing to donate part of an expert's time to migrate to the new version-control system?
It is, of course, up to people with money what they spend it on, but personally I would much prefer to see money spent on making darcs better, for reasons I won't repeat again. If it makes a difference, I would expect a research paper on how conflictors work would be easy to produce as a side-effect, as we would need to get a good description of how it works, and proofs that it does, anyway. Also, I expect we could get a BSDed darcs as a result. Thanks Ian

On Sat, Aug 09, 2008 at 06:56:23PM -0400, Norman Ramsey wrote:
* Our long-term goal should be to get the *entire* Haskell development community to agree on a version-control system---one that is not darcs. We should expect this process to take several years, and we should expect it to cost money. Would Microsoft or Galois or York or other large players be willing to donate part of an expert's time to migrate to the new version-control system?
It is, of course, up to people with money what they spend it on, but personally I would much prefer to see money spent on making darcs better, for reasons I won't repeat again.
I missed them and wouldn't mind receiving a private note. For the last year I have been hoping to make 'a new darcs-like thing, with a real theory founding it' an important part (one of three) of a grant proposal in distributed computing. So you can see I am in favor of spending money to create a better darcs (which is not quite the same thing as making darcs better; I want to start with a new theory). But I am having second thoughts because I think by the time a proposal reaches a review committee, git may be so firmly entrenched (worse is better) that the work would be considered not worth funding. I realize that I am now firmly off topic, but if people here have opinions, I would be grateful to receive them (perhaps off-list). Norman

On 2008 Aug 10, at 20:17, Norman Ramsey wrote:
For the last year I have been hoping to make 'a new darcs-like thing, with a real theory founding it' an important part (one of three) of a grant proposal in distributed computing. So you can see I am in favor of spending money to create a better darcs (which is not quite the same thing as making darcs better; I want to start with a new theory).
Can you elucidate what's wrong with the current one?

-- 
brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com
system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu
electrical and computer engineering, carnegie mellon university KF8NH

Brandon S Allbery wrote:
On 2008 Aug 10, at 20:17, Norman Ramsey wrote:
For the last year I have been hoping to make 'a new darcs-like thing, with a real theory founding it' an important part (one of three) of a grant proposal in distributed computing. So you can see I am in favor of spending money to create a better darcs (which is not quite the same thing as making darcs better; I want to start with a new theory).
Can you elucidate what's wrong with the current one?
No one knows how to formalise it, and (AFAIK) only David understands the new conflicts handling, and he hasn't managed to completely communicate that understanding to anyone else.

Ganesh

============================================================================== Please access the attached hyperlink for an important electronic communications disclaimer: http://www.credit-suisse.com/legal/en/disclaimer_email_ib.html ==============================================================================

On Sun, Aug 10, 2008 at 08:17:50PM -0400, Norman Ramsey wrote:
On Sat, Aug 09, 2008 at 06:56:23PM -0400, Norman Ramsey wrote:
personally I would much prefer to see money spent on making darcs better, for reasons I won't repeat again.
I missed them and wouldn't mind receiving a private note.
OK, I'll send to the list so that I have somewhere convenient to point people if this comes up in the future:

* A lot of darcs's functionality could be refactored into generally usable Haskell libraries, e.g. LCS-finding, downloading-with-libcurl.

* darcs was once a flagship Haskell application, supporting the idea that Haskell can be used in the real world. That image has mostly faded away now due to the problems it has, but I think we can get it back if we can get a high quality darcs out there. That would be good for the community's image.

* darcs has (in my opinion, at least) a much simpler, more intuitive interface than the other version control systems. I don't think I'm alone here, as I think this is where a lot of the resistance against moving to git is coming from.

* I think darcs is the Obvious, Right way to do version control. Phil Wadler (at least, I think it was him; and probably many others too) has said that the lambda calculus is universal, in the sense that if we were to meet a sufficiently advanced alien culture, it is almost inconceivable that they would not have also discovered the lambda calculus. Darcs-style patch theory, before conflicting patches are introduced, falls into the same category in my opinion. (I'm not yet sure if it can be extended to include some definition of conflictors too.) By contrast, the heuristics and multiple merge algorithms of other systems feel very ad hoc.

Thanks Ian

Ian, I completely agree with you. I love the darcs vcs model, too. However, we have three discussions here:

(1) Do we want the darcs vcs model? Except Thomas Schilling, who seems to be dead set on getting rid of darcs, everybody who voiced their opinion seems to be in favour of the darcs model.

(2) Is the current implementation of darcs up to a project the size of ghc? Due to problems in the past & performance regressions with darcs 2, a serious number of (important) people believe that the current implementation is not good enough.

(3) If we change the vcs for the ghc repo, do we change the vcs for the boot libs, too? This is just an open-source project maintenance question. It has nothing to do with which vcs is better. This is the only point I have been arguing: *if* GHC's repo changes, all boot lib repos must change, too.

Manuel

Ian Lynagh:
On Sun, Aug 10, 2008 at 08:17:50PM -0400, Norman Ramsey wrote:
On Sat, Aug 09, 2008 at 06:56:23PM -0400, Norman Ramsey wrote:
personally I would much prefer to see money spent on making darcs better, for reasons I won't repeat again.
I missed them and wouldn't mind receiving a private note.
OK, I'll send to the list so that I have somewhere convenient to point people if this comes up in the future:
* A lot of darcs's functionality could be refactored into generally usable Haskell libraries, e.g. LCS-finding, downloading-with-libcurl.
* darcs was once a flagship Haskell application, supporting the idea that Haskell can be used in the real world. That image has mostly faded away now due to the problems it has, but I think we can get it back if we can get a high quality darcs out there. That would be good for the community's image.
* darcs has (in my opinion, at least) a much simpler, more intuitive interface than the other version control systems. I don't think I'm alone here, as I think this is where a lot of the resistance against moving to git is coming from.
* I think darcs is the Obvious, Right way to do version control. Phil Wadler (at least, I think it was him; and probably many others too) has said that the lambda calculus is universal, in the sense that if we were to meet a sufficiently advanced alien culture, it is almost inconceivable that they would not have also discovered the lambda calculus. Darcs-style patch theory, before conflicting patches are introduced, falls into the same category in my opinion. (I'm not yet sure if it can be extended to include some definition of conflictors too.) By contrast, the heuristics and multiple merge algorithms of other systems feel very ad hoc.
Thanks Ian

Hello,
On Tue, Aug 12, 2008 at 5:49 PM, Manuel M T Chakravarty
Ian, I completely agree with you. I love the darcs vcs model, too. However, we have three discussions here:
(1) Do we want darcs vcs model?
Except Thomas Schilling, who seems to be dead set on getting rid of darcs, everybody who voiced their opinion seems to be in favour of the darcs model.
I also don't think that the darcs model has much to offer over git; in fact, I find that it lacks some useful features (not counting a reliable implementation). Examples include good support for branching, and being able to easily determine the version of the software that is in a repository (git uses a hash of the content to identify the current state, so it is easy to check if two developers have the same version of the content).

By the way, git's UI is really not as bad as people seem to think. For everyday development "git gui" works very well, and provides a nice GUI that lets you see what you have modified, choose what you want to record, and push/pull from other repos.

-Iavor
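To illustrate the content-hash point, here is a small sketch run against throwaway repositories (assuming git is installed; file names and repo paths are invented for the demo). A tree of identical content always hashes to the same id, even in unrelated repositories, which is what makes "do we have the same sources?" a one-hash comparison.

```shell
# Demo of git's content addressing in two throwaway repositories.
# Identical content produces identical tree ids, so two developers
# can compare a single hash to know they have the same sources.
set -e
make_repo() {
  dir=$(mktemp -d)
  git -C "$dir" init -q
  git -C "$dir" config user.email you@example.com
  git -C "$dir" config user.name You
  printf 'main = putStrLn "hi"\n' > "$dir/Main.hs"
  git -C "$dir" add Main.hs
  git -C "$dir" commit -q -m demo
  # The tree id depends only on the recorded content:
  git -C "$dir" rev-parse 'HEAD^{tree}'
}
tree1=$(make_repo)
tree2=$(make_repo)
# Same content => same tree hash (commit ids would differ, since
# they also cover author, timestamp, and history):
[ "$tree1" = "$tree2" ] && echo "trees match: $tree1"
```

Note that commit ids, unlike tree ids, also cover metadata and ancestry, which is why git-bisect and history comparison work reliably.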

Iavor Diatchki wrote:
I also don't think that the darcs model has much to offer over git; in fact, I find that it lacks some useful features (not counting a reliable implementation). Examples include good support for branching, and being able to easily determine the version of the software that is in a repository (git uses a hash of the content to identify the current state, so it is easy to check if two developers have the same version of the content).
I think these things are possible in darcs's model, just not its implementation. For example, under _darcs it could have enough info in various states to allow one to switch branches within the same physical directory tree (and if there aren't many changes between the two branches/patchsets, the switch can be quick). And if it weren't for the varying ways the same patch can be stored, hashes of history ought to work too (although that's certainly very built in to the current implementation of darcs; whether it's technically part of the model probably depends whether you can provide the exact same interface, semantics, and computational complexity with a different representation).

And I wonder why (it sounds like) Git doesn't have tools to do some kind of smart cherry-picking, using a heuristic to decide which patches in a branch are definitely dependencies of the cherry-picked patch. In any case, I notice a few times with ghc/darcs/Trac tickets, more than one commit has to be listed explicitly to be merged into the stable branch. Maybe it's not very useful/reliable for these purposes anyway?

Since I've only ever used Darcs (besides read-only CVS/SVN/etc.), I personally can't speak to what model is better for me!

-Isaac

On Thu, Aug 14, 2008 at 12:10 PM, Isaac Dupree
And I wonder why (it sounds like) Git doesn't have tools to do some kind of smart cherrypicking, using a heuristic to decide which patches in a branch are definitely dependencies of the cherry-picked patch. In any case, I notice a few times with ghc/darcs/Trac tickets, more than one commit has to be listed explicitly to be merged into the stable branch. Maybe it's not very useful/reliable for these purposes anyway?
The intent with git is that you would do such cherry-picks at the branch level, not at the individual commit level -- i.e., if you have dependent patches that also need to be backported or whatever, you really ought to have developed the feature as a branch in the first place. You could then rebase such a branch to a prior version, and merge it into both old and new; or you could just rebase it on top of wherever you're backporting to, if you don't intend to do big merges much between the two (as the commit IDs would be different in this case).

You can of course just use git cherry-pick, but this doesn't have any intelligence at all when it comes to avoiding duplicate patches -- it basically just diffs from the old commit and applies it somewhere else. The git merging logic does have some heuristics to detect duplicate patches and do the right thing, however.

The limitations come from git's relatively simple history model, in which commits have parent commits but no 'sideways' relationships. In practice I don't think it will be a problem -- how often will there be branches which will receive cherry picks and then later have a merge from or to the same source?
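The branch-level backport workflow described above can be sketched as follows, run in a throwaway repository; the branch and file names (stable, fix-topic, etc.) are invented for the demo, not GHC's actual branches.

```shell
# Sketch of a branch-level backport: develop a fix on its own topic
# branch, then rebase the whole branch onto the stable branch and
# merge it there. All names are illustrative.
set -e
repo=$(mktemp -d); cd "$repo"
git init -q
git checkout -q -b master
git config user.email you@example.com
git config user.name You
echo base > base.txt; git add base.txt; git commit -q -m "base"
git branch stable              # the release branch forks here
echo new > new.txt; git add new.txt; git commit -q -m "new HEAD work"
# The fix lives on its own topic branch, so dependent patches
# travel together:
git checkout -q -b fix-topic
echo fix > fix.txt; git add fix.txt; git commit -q -m "the fix"
# Backport: replay master..fix-topic onto stable, then merge.
git checkout -q -b fix-topic-stable
git rebase -q --onto stable master fix-topic-stable
git checkout -q stable
git merge -q fix-topic-stable
ls   # stable now has base.txt and fix.txt, but not new.txt
```

The rebased copy gets new commit ids, which is why, as noted above, you would pick one of the two strategies up front rather than expect later merges to deduplicate.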

On Wed, Aug 13, 2008 at 2:49 AM, Manuel M T Chakravarty
Ian, I completely agree with you. I love the darcs vcs model, too. However, we have three discussions here:
(1) Do we want darcs vcs model?
Except Thomas Schilling, who seems to be dead set on getting rid of darcs, everybody who voiced their opinion seems to be in favour of the darcs model.
I'm also in favor of the switch to Git. The Git model has proved to be both more productive and more reliable. And the interface, as far as I'm concerned, is *better*. Cheers, Johan

Excerpts from Johan Tibell's message of Wed Aug 13 02:09:00 -0500 2008:
I'm also in favor of the switch to Git. The Git model has proved to be both more productive and more reliable. And the interface, as far as I'm concerned, is *better*.
Seconded. The git documentation these days I find is excellent; many of the man pages have fairly lucid examples and explanations of their usage. The documentation on kernel.org is also very extensive. This software is used by a lot of people already and has proven to be reliable for projects the size of e.g. the linux kernel.

While it may be true that in the past git was cumbersome and difficult to use, I can't agree with such an assessment these days. In many instances, you can probably get away with ignoring most of the more 'advanced' commands and simply using it like any other DVCS. "Everyday git", in fact, is surprisingly simple I think. I certainly don't think that the GHC devs will need anything under the porcelain commands either. You don't have to use super-advanced commands that might endanger your history (and commands like git-reflog are a safety net around that too.) You can get far with just a little.

Also, this tutorial (and its sequel in particular) should do well for anybody new to git: http://www.kernel.org/pub/software/scm/git/docs/gittutorial.html It's really not that bad and should take you no more than a few minutes to read it and the second part. I encourage those who might be confused to read it.

And besides familiarity, what does darcs give us that git doesn't? I find git's core model (objects, trees, blobs and tags are it) more simple and straightforward than the blackbox that is darcs. (And although it's already been elaborated upon before I'm sure, waiting on your tools is nothing but a waste of time. If I even want to get the latest HEAD, I find I can spend a lot of time waiting on darcs. I rarely wait on git for anything - it's fairly instant most of the time.)

So I'm with Johan and Thomas for a switch to git.

Austin

Malcolm Wallace:
I seriously hope the plan is to move all *core* libraries (including GHC's cabal repo) etc over to git, too.
* one build system
* one vcs

This is a chance to make a big step towards accessibility, let's make that step.
Ultimately, I don't think git would make ghc any more accessible to new contributors. Darcs is not especially offputting to any beginner who already knows something about VCS in general.
What the move to git is about, is making life easier for the *existing* HQ and core contributors. Evaluate it on that basis, and not in terms of unknown (and unknowable) benefits to current non- contributors. Indeed, you should also consider how many contributors you might lose in a move.
I am not advocating a move. I am just saying, if ghc moves, every component needs to move on which the HEAD build depends and that is needed in its current development form (eg, *not* alex, happy, cabal).
I do hear some significant current contributors having doubts. I can certainly appreciate that having to run 2 VCS in parallel might be confusing and simply make matters worse than at present.
It is confusing and it is going to make matters worse, as two failure points are worse than one, and two extra tools to learn are worse than one.
The libraries question is a difficult one. We have made a lot of effort over the last 5 years to build infrastructure and code that is shared and portable across multiple implementations of the language. Is this the time to fork those supposedly "common" core libraries into ghc versions vs the rest?
It would be a pity to fork, but to be honest, I'd rather fork the libs than have to use two vcs for GHC. The only other alternative is to decouple more library releases from ghc releases. Manuel

On Sat, Aug 09, 2008 at 09:30:52PM +0100, Malcolm Wallace wrote:
The libraries question is a difficult one. We have made a lot of effort over the last 5 years to build infrastructure and code that is shared and portable across multiple implementations of the language. Is this the time to fork those supposedly "common" core libraries into ghc versions vs the rest?
I think the non-GHC implementations have been struggling for development time as it is. As you say, we've been trying to increase the amount of shared code, to reduce the burden on them. I think forking the bootlibs would represent a huge step the other way, and, as you said later in your e-mail, may be what finally kills them off. Thanks Ian

I had my share of problems with Darcs; working on the GHC API I constantly have to avoid conflicts. My temporary workaround is to not update at all. Maybe switching to the Darcs 2 format would help here, but there are other issues.

I initially converted GHC to Git to be able to more easily check out older versions (e.g., to find a build bug using git-bisect), but with external core libraries this just doesn't work. Right now, there is simply no practical way to check out an old, building version of GHC! Even if we'd switch to Darcs 2 this problem could not be solved. We would also still need to turn to the Git repo to get change histories for specific files or to run commands such as 'git-blame' (unless you don't mind getting a cup of coffee and some biscuits each time you run those commands).

I think we can make things easier for existing library contributors by providing a darcs/git cheat sheet or even a command line wrapper. Previous attempts at creating such a wrapper have been abandoned, possibly because some commands cannot easily be modelled in Git. However, if we accept some limitations this is doable. In particular the tricky commands are:

  darcs pull     -- (safe) cherry-picking requires patch dependency information
  darcs push     -- same as above
                    (darcs pull -a and darcs push -a can both be modelled easily)
  darcs replace  -- not directly supported in Git, but could be modelled with a script

If these missing features don't feel like too big a handicap the change should be fairly easy for existing contributors. (And with some time they can start to learn Git's other features.)

For our build woes, integrating the libraries and the main GHC repo in one Git repo will be very helpful, since we can now just instruct build bots to try and build revision 12345deadbeef and be happy.

/ Thomas

-- My shadow / Change is coming. / Now is my time. / Listen to my muscle memory. / Contemplate what I've been clinging to. / Forty-six and two ahead of me.
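A first cut of such a darcs/git cheat sheet might look like the following, exercised against a throwaway repository. The darcs-to-git correspondences in the comments are our own approximate equivalences, not an official mapping.

```shell
# Approximate darcs-to-git command equivalences, demonstrated in a
# throwaway repository. The mapping is a rough sketch, not official.
set -e
repo=$(mktemp -d); cd "$repo"
git init -q                      # ~ darcs initialize
git config user.email you@example.com
git config user.name You
echo one > a.txt
git add a.txt                    # ~ darcs add a.txt
git commit -q -m "first patch"   # ~ darcs record -a -m "first patch"
echo two >> a.txt
git status --short               # ~ darcs whatsnew -s
git diff                         # ~ darcs whatsnew
git commit -q -a -m "second patch"
git log --oneline                # ~ darcs changes
# darcs pull -a / darcs push -a correspond to git pull / git push.
# Single-patch "darcs pull" only maps loosely onto git cherry-pick,
# which tracks no patch dependencies; and darcs replace has no git
# equivalent (a sed run plus an ordinary commit comes closest).
```

The two commands called out as tricky above are exactly the ones the comments hedge on: anything relying on darcs's patch dependency tracking has no faithful git counterpart.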

Hello, I think that we should switch the repositories of the core libraries to git too, not just because GHC is switching, but simply because git is a more reliable VCS. It seems that this does not prevent other implementations from using them---the code in the repositories will be still the same! -Iavor

Thomas Schilling:
I had my share of problems with Darcs; working on the GHC API I constantly have to avoid conflicts. My temporary workaround is to not update at all. Maybe switching to Darcs 2 format would help here, but there are other issues.
I initially converted GHC to Git to be able to more easily checkout older versions (e.g., to find a build bug using git-bisect) but with external core libraries this just doesn't work. Right now, there is simply no practical way to check out an old, building version of GHC!
Correct me if I am wrong, but this sounds as if you support my point that switching the GHC repo to git without doing the same for the core libs (in an integrated way) would not address the problems you experienced with darcs. Manuel

On 11 Aug 2008, at 05:38, Manuel M T Chakravarty wrote:
Correct me if I am wrong, but this sounds as if you support my point that switching the GHC repo to git without doing the same for the core libs (in an integrated way) would not address the problems you experienced with darcs.
Partly. It does address some issues (fear of conflict, speed, case-sensitivity bugs, easier branches). I personally wouldn't mind having both Darcs and Git repositories, although I can understand why having a mixture of both is bad. I was just mentioning some other advantages of also having the libraries in Git.

However, I think that it would be really disappointing if we would not move to Git for the main GHC repository. Simon M reported that a merge took him over a whole day, Norman reported two weeks of lost work, Don reported corrupted repos, Simon PJ reported that in order to avoid conflicts he constantly unrecords and re-records one big patch; all that doesn't give much confidence in Darcs. Additionally, no-one except David seems to actually understand Darcs' theory (and we don't even know if David actually does.) Darcs 2 claims to fix those problems, but I don't know how many are actually using it. Darcs 1 had the exponential runtime bug and it wasn't discovered right away. I don't believe that Darcs 2 can fulfil GHC's needs anytime soon, especially since it is always a bad idea to use a brand-new release of a not much used VCS.

(I am also no longer convinced that Darcs' automatic patch dependency calculations are actually a good idea. Just because two patches don't touch the same files, doesn't mean they aren't semantically dependent. Take for example "monadification" patches, which are typically submitted split up for each file. A branch captures those dependencies just fine.)

/ Thomas

-- Push the envelope. Watch it bend.

Thomas Schilling wrote:
(I am also no longer convinced that Darcs' automatic patch dependency calculations are actually a good idea. Just because two patches don't touch the same files, doesn't mean they aren't semantically dependent. Take for example "monadification" patches, which are typically submitted split up for each file. A branch captures those dependencies just fine.)
But the darcs approach to dependency is what underlies cherry-picking, which many people consider the most worthwhile feature of darcs. In fact many people would like it to be possible to override even the dependencies that darcs *does* find to cherry-pick patch A without patch B that A depends on, at the expense of producing a conflict that then has to be fixed up by hand.

Ganesh

On 11 Aug 2008, at 12:38, Sittampalam, Ganesh wrote:
Thomas Schilling wrote:
(I am also no longer convinced that Darcs' automatic patch dependency calculations are actually a good idea. Just because two patches don't touch the same files, doesn't mean they aren't semantically dependent. Take for example "monadification" patches, which are typically submitted split up for each file. A branch captures those dependencies just fine.)
But the darcs approach to dependency is what underlies cherry-picking, which many people consider the most worthwhile feature of darcs. In fact many people would like it to be possible to override even the dependencies that darcs *does* find to cherry-pick patch A without patch B that A depends on, at the expense of producing a conflict that then has to be fixed up by hand.
Cherry-picking just a single patch is simple in Git: "git cherry-pick <commit-id>"[1]. What's missing in Git is the automatic detection of dependent patches. Otherwise it would be straightforward to write a Darcs frontend for Git.

[1]: http://www.kernel.org/pub/software/scm/git/docs/git-cherry-pick.html

/ Thomas

-- Push the envelope. Watch it bend.

-----Original Message-----
From: Thomas Schilling [mailto:nominolo@googlemail.com]
Sent: 11 August 2008 12:18
To: Sittampalam, Ganesh
Cc: Manuel Chakravarty; Don Stewart; Ian Lynagh; Simon Peyton-Jones; GHC Users Mailing List
Subject: Re: Version control systems

Thomas Schilling wrote:
On 11 Aug 2008, at 12:38, Sittampalam, Ganesh wrote:
Thomas Schilling wrote:
(I am also no longer convinced that Darcs' automatic patch dependency calculations are actually a good idea. Just because two patches don't touch the same files, doesn't mean they aren't semantically dependent. Take for example "monadification" patches, which are typically submitted split up for
each file. A branch captures those dependencies just fine.)
But the darcs approach to dependency is what underlies cherry-picking, which many people consider the most worthwhile feature of darcs. In fact many people would like it to be possible to override even the dependencies that darcs *does* find to cherry-pick patch A without patch B that A depends on, at the expense of producing a conflict that then has to be fixed up by hand.
Cherry-picking just a single patch is simple in Git: "git cherry-pick <commit-id>"[1].
I wasn't saying that Git doesn't support cherry-picking, just that you would expect dependencies to restrict what you can and can't cherry-pick; if you specify dependencies just in a linear fashion along each branch (i.e. each patch depends on all those before it on that branch), as I thought you were suggesting, then you enormously restrict what cherry-picks are possible.

Ganesh

On Mon, 2008-08-11 at 12:21 +0200, Thomas Schilling wrote:
However, I think that it would be really disappointing if we would not move to Git for the main GHC repository. Simon M reported that a merge took him over a whole day, Norman reported two weeks of lost work, Don reported corrupted repos, Simon PJ reported that in order to avoid conflicts he constantly unrecords and re-records one big patch; all that doesn't give much confidence in Darcs.
We all accept there are problems with darcs v1 and the darcs v1 repo format for larger projects that do lots of development in branches and then merge back.
Additionally, no-one except David seems to actually understand Darcs' theory (and we don't even know if David actually does.) Darcs 2 claims to fix those problems, but I don't know how many are actually using it.
It's not clear to me that we've really bothered to find out. The last evaluation in relation to ghc that I'm aware of was prior to the 2.0 release. My impression is that we've all complained about the darcs v1 problems (justly) but spent the most effort investigating things other than darcs v2 which would be the easiest to upgrade to and not have the problems of using two different systems for ghc vs other libs. On a slightly related issue, we're currently evaluating upgrading to darcs 2 for code.h.o. We'll let people know how that goes. It's not directly relevant to ghc since we'd not be switching to darcs v2 format (that's the prerogative of the repo owners, not code.h.o admins). Duncan

On 11 Aug 2008, at 13:00, Duncan Coutts wrote:
It's not clear to me that we've really bothered to find out. The last evaluation in relation to ghc that I'm aware of was prior to the 2.0 release. My impression is that we've all complained about the darcs v1 problems (justly) but spent the most effort investigating things other than darcs v2 which would be the easiest to upgrade to and not have the problems of using two different systems for ghc vs other libs.
I converted the ghc repo to darcs2 (locally):

Getting file local history:

* darcs changes --last 20 compiler/main/HscTypes.lhs
  very quick but prints only two patches

* darcs changes compiler/hsSyn/HsTypes.lhs
  1m22s (16s for the second time)
  Git: <1s

* darcs get ghc2 ghc-test (creating a *local* branch)
  real 13m25.365s
  user 0m14.677s
  sys 0m29.541s
  (at least it seems it actually worked, though)

  git clone ghc g2 (the slow method of creating a local branch)
  real 0m6.742s
  user 0m0.335s
  sys 0m0.652s

* I haven't tested a remote pull yet. At 80 Kb/s, it should take about 15min to clone via Git (70 MB). A test of darcs would be interesting.

Finally, of course, we have to hope that Darcs2's conflict problems are actually solved. I also had some weird automerges with Darcs when pulling from Max's repository, so Darcs isn't flawless there, either (this seemed to be one of the main critiques of Git).

/ Thomas

-- Awareness is the enemy of sanity, for once you hear the screaming, it never stops.

On Mon, 2008-08-11 at 14:29 +0200, Thomas Schilling wrote:
On 11 Aug 2008, at 13:00, Duncan Coutts wrote:
It's not clear to me that we've really bothered to find out. The last evaluation in relation to ghc that I'm aware of was prior to the 2.0 release. My impression is that we've all complained about the darcs v1 problems (justly) but spent the most effort investigating things other than darcs v2 which would be the easiest to upgrade to and not have the problems of using two different systems for ghc vs other libs.
I converted the ghc repo to darcs2 (locally):
Getting file local history:
* darcs changes --last 20 compiler/main/HscTypes.lhs
very quick but prints only two patches
* darcs changes compiler/hsSyn/HsTypes.lhs
1m22s (16s for the second time)
Interesting that you get so much variance between runs. I get 32s user time the first time and 30s the second. In this test darcs 2 is faster than darcs 1 on v1 format repos, and darcs 2 is faster on v2 format repos than on v1 format, though only by a few seconds. At a guess, the issue here is that darcs is not indexing those changes per-file, which is why --last 20 doesn't give the last 20 for that file and why asking for all changes takes so long. Perhaps if it did cache this info per-file it'd help with annotate too.
Git <1s
* darcs get ghc2 ghc-test (creating a *local* branch)
real 13m25.365s user 0m14.677s sys 0m29.541s
(at least it seems it actually worked, though)
That's an order of magnitude different to what I see:

$ time darcs2 get ghc2 ghc-test
Copying patches, to get lazy repository hit ctrl-C...
Finished getting.
real 0m21.428s
user 0m11.221s
sys 0m1.380s

Note that this is much faster in the darcs v2 format than darcs 2 using the darcs v1 format:

$ time darcs2 get ghc ghc-test1
Finished getting.
real 1m51.959s
user 1m15.449s
sys 0m11.877s

However darcs v1 is faster still:

$ time darcs1 get ghc ghc-test1_
Copying patch 19084 of 19084... done.
Finished getting.
real 0m8.851s
user 0m3.668s
sys 0m0.708s

It doesn't seem to spend any time applying the patches, unlike what darcs 2 is doing for v1 or v2 formats. Though in any case, one doesn't need to darcs get locally since one can use cp -a, right?
git clone ghc g2 (the slow method of creating a local branch)
real 0m6.742s user 0m0.335s sys 0m0.652s
* I haven't tested a remote pull yet. At 80 Kb/s, it should take about 15min to clone via Git (70 MB). A test of darcs would be interesting.
We'll be testing this for the code.h.o conversion. We'll keep you posted. Duncan

Duncan Coutts wrote:
It's not clear to me that we've really bothered to find out. The last evaluation in relation to ghc that I'm aware of was prior to the 2.0 release. My impression is that we've all complained about the darcs v1 problems (justly) but spent the most effort investigating things other than darcs v2 which would be the easiest to upgrade to and not have the problems of using two different systems for ghc vs other libs.
I promised to put together our reasoning on why we don't think moving to darcs2 would help enough. Here's a summary:

- Using the darcs2 format may well fix the exponential-time merge problem, but the UI for merging conflicts is still lacking in many important ways in darcs:
  * The conflict markers are not annotated with the patch that they came from, and the ordering of patches in conflict markers is non-deterministic (when I asked about this problem, I was told it was hard to fix).
  * The 'darcs changes' output only shows one of the patches that is conflicting; you have to guess at the other one(s). Also, it doesn't show which patches are conflict resolutions.

- Performance. darcs2 regressed in performance for many operations we commonly use. I've submitted some measurements for some things, but it's pretty easy to find your own test cases: things like "darcs add", "darcs whatsnew", "darcs unrecord" are all slower than darcs 1. When simple operations take multiple seconds to complete, it really slows down your workflow.

- I still can't use 'darcs annotate' because it's too slow. Also, we can't browse the GHC repository on the web, because the web interface wants to do 'darcs changes <file>', and that takes minutes. It's possible with caching, but you still have to regenerate the cache after a change.

- Why can I do a complete git clone of a remote GHC repo in a few minutes, but it takes hours to do a complete 'darcs get'?

- Bugs. Many bugs have been fixed in darcs2, which is great, but we did already encounter one (hard to reproduce) bug on Windows, when trying to get an up-to-date repo. Perhaps bugs will be less of an issue in the future, but we have had painful experiences, particularly on Windows, and I know the darcs developers are still not actively testing on Windows.

FWIW, I'd also like to stay with darcs because it has the right model, but unfortunately the current implementation is not usable for us, and it's holding us back.
I'll say something about core libs in a separate mail. Cheers, Simon
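[Editorial sketch: the git side of the per-file-history comparison made above can be reproduced on any repo. Here it is run on a tiny throwaway repo; the GHC-like path is purely illustrative.]

```shell
#!/bin/sh
# Build a one-commit throwaway repo and run the git analogues of the slow
# darcs operations discussed above. The GHC-like path is illustrative only.
set -e
rm -rf log-demo && mkdir log-demo && cd log-demo
git init -q
git config user.email demo@example.org
git config user.name demo
mkdir -p compiler/main
echo "module HscTypes where" > compiler/main/HscTypes.lhs
git add . && git commit -qm "add HscTypes"
# Per-file history: the analogue of 'darcs changes <file>'
git log --oneline -- compiler/main/HscTypes.lhs
# Per-line annotation: the analogue of 'darcs annotate'
git blame -- compiler/main/HscTypes.lhs
```

On a full GHC checkout both commands complete in well under a second, which is the "Git <1s" figure quoted earlier in the thread.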

On Mon, 2008-08-11 at 13:57 +0100, Simon Marlow wrote:
- Performance. darcs2 regressed in performance for many operations we commonly use. I've submitted some measurements for some things, but it's pretty easy to find your own test cases: things like "darcs add", "darcs whatsnew", "darcs unrecord" are all slower than darcs 1. When simple operations take multiple seconds to complete, it really slows down your workflow.
Turns out that the reason for the slow darcs whatsnew is ghc bug #2093:

http://hackage.haskell.org/trac/ghc/ticket/2093

Because getSymbolicLinkStatus is broken on 32-bit systems in 6.8.2, the 'stat' optimisation does not work, so darcs has to read the actual contents of many files. Obviously that's very slow, especially over NFS. That explains why it worked for me in 0.2 seconds but took you several seconds of user time (and even more real time, due to NFS).

If you were using http://darcs.haskell.org/ghc-hashedrepo/ then there's a further explanation. According to the darcs devs that repo is "in some weird intermediate (not final) hashed format that doesn't keep (original) filesizes in filenames. So in effect, it's running like with --ignore-times still".

So I suggest we get rid of that old repo so as not to further the confusion.

Duncan
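[Editorial sketch: the 'stat' optimisation referred to above can be modelled in a few lines. This is not darcs code, just a toy illustration; with a broken stat call the cheap branch below is never taken, so every file's contents must be re-read, which is the slowdown described. GNU stat flags are used; on BSD the equivalent is `stat -f '%z %m'`.]

```shell
#!/bin/sh
# Toy model of a stat-based change check: remember a file's (size, mtime)
# and skip re-reading its contents when they are unchanged.
set -e
rm -rf stat-demo && mkdir stat-demo && cd stat-demo
echo hello > f
stat -c '%s %Y' f > saved             # remember (size, mtime)
if [ "$(stat -c '%s %Y' f)" = "$(cat saved)" ]; then
  echo "unchanged: skipped reading contents"   # the cheap path
else
  echo "changed: must diff the contents"       # the slow path darcs was stuck on
fi
```

When getSymbolicLinkStatus returns garbage, the comparison always fails, so the tool behaves as if invoked with --ignore-times and falls back to reading every file.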

Duncan Coutts:
On Mon, 2008-08-11 at 13:57 +0100, Simon Marlow wrote:
- Performance. darcs2 regressed in performance for many operations we commonly use. I've submitted some measurements for some things, but it's pretty easy to find your own test cases: things like "darcs add", "darcs whatsnew", "darcs unrecord" are all slower than darcs 1. When simple operations take multiple seconds to complete, it really slows down your workflow.
Turns out that the reason for slow darcs whatsnew is ghc bug #2093
http://hackage.haskell.org/trac/ghc/ticket/2093
because getSymbolicLinkStatus is broken on 32bit systems in 6.8.2 it means that the 'stat' optimisation does not work so darcs has to read the actual contents of many files. Obviously that's very slow, especially over nfs. That explains why it worked for me in 0.2 seconds but for you took several seconds user time and (even more real time due to nfs).
LOL - that is funny. GHC devel slowed down by slow darcs due to GHC bug. The bug is fixed, isn't it? So, recompiling darcs with 6.8.3 should improve matters. Manuel

Duncan Coutts wrote:
Turns out that the reason for slow darcs whatsnew is ghc bug #2093
http://hackage.haskell.org/trac/ghc/ticket/2093
because getSymbolicLinkStatus is broken on 32bit systems in 6.8.2 it means that the 'stat' optimisation does not work so darcs has to read the actual contents of many files. Obviously that's very slow, especially over nfs. That explains why it worked for me in 0.2 seconds but for you took several seconds user time and (even more real time due to nfs).
Yes, I was aware of the #2093 problem (someone else pointed it out to me earlier), but it's not the cause of the slow whatsnew I'm seeing: my darcs is compiled with 6.8.3.

~/darcs/ghc-testing/testsuite-hashed > darcs +RTS --info
[("GHC RTS", "Yes")
,("GHC version", "6.8.3")
,("RTS way", "rts_thr")
,("Host platform", "x86_64-unknown-linux")
,("Build platform", "x86_64-unknown-linux")
,("Target platform", "x86_64-unknown-linux")
,("Compiler unregisterised", "NO")
,("Tables next to code", "YES")
]

~/darcs/ghc-testing/testsuite-hashed > time darcs wh
No changes!
[2] 15793 exit 1  darcs wh
21.35s real  9.56s user  4.28s system  64%  darcs wh

~/darcs/ghc-testing/testsuite-hashed > darcs --version
2.0.1rc2 (2.0.1rc2 (+ -1 patch))

~/darcs/ghc-testing/testsuite-hashed > darcs query repo
Type: darcs
Format: hashed
Root: /home/simonmar/darcs-all/work/ghc-testing/testsuite-hashed
Pristine: HashedPristine
Cache: thisrepo:/home/simonmar/darcs-all/work/ghc-testing/testsuite-hashed
boringfile Pref: .darcs-boring
Default Remote: /home/simonmar/darcs-all/work/ghc-testing/testsuite
Num Patches: 2834

It's better on the darcs-2 version of the repo:

~/darcs/ghc-testing/testsuite-hashed2 > darcs query repo
Type: darcs
Format: hashed, darcs-2
Root: /home/simonmar/darcs-all/work/ghc-testing/testsuite-hashed2
Pristine: HashedPristine
Cache: thisrepo:/home/simonmar/darcs-all/work/ghc-testing/testsuite-hashed2
Num Patches: 2834

~/darcs/ghc-testing/testsuite-hashed2 > time darcs wh
No changes!
[2] 15824 exit 1  darcs wh
3.69s real  1.08s user  0.53s system  43%  darcs wh

Better, but still a factor of ~4 slower than on the darcs-1 repo.
If you were using http://darcs.haskell.org/ghc-hashedrepo/ then there's a further explanation. According to the darcs devs that repo is: "in some weird intermediate (not final) hashed format that doesn't keep (original) filesizes in filenames. So in effect, it's running like with --ignore-times still"
Nope, I'm not using that repo, these were ones I created freshly yesterday. I will try building a fresh darcs to see if that helps. Cheers, Simon

2008/8/11 Thomas Schilling
(I am also no longer convinced that Darcs' automatic patch dependency calculations are actually a good idea. Just because two patches don't touch the same files, doesn't mean they aren't semantically dependent. Take for example "monadification" patches, which are typically submitted split up for each file. A branch captures those dependencies just fine.)
Darcs has a feature to deal with patches that are unrelated in patch theory but are related from the user's point of view. When you record you can use --ask-deps to specify dependent patches. These dependencies are then artificially enforced in commute (where dependencies are normally detected). Note: I'm not trying to advocate anything here, I just wanted to let you know that others noticed this and added a feature for it long ago. Jason

Manuel M T Chakravarty wrote:
I think all *core* libraries must switch. Seriously, requiring GHC developer to use a mix of two vcs during development is a Very Bad Idea. Don was excited about getting more people to look at the source when it is in git (see the comments he posted from reddit). By requiring two vcs you will get *less* people to look at the source.
This is not only to get the sources to hack them, but you effectively require developers to learn the commands for two vcs (when they are already reluctant to learn one). For example, often enough somebody who changes something in GHC will modify the base package, too. Then, to commit the overall work, you need to commit using both vcs. If you need to branch for your work, you need to create branches in two vcs (no idea whether the semantics of a branch in git and darcs is anywhere similar). When you merge your branch, you need to merge in both vcs. You can't seriously propose such a set up!
I completely agree this is a problem. The main obstacle with just switching the core libraries is that they are shared by other implementations and other maintainers. So I see no alternative but to create forks of those repositories for use by GHC, unless/until the other projects/maintainers want to migrate to git. Some of the repositories are not shared - for example ghc-prim, integer-gmp, template-haskell - and these don't need to be forked.

One way we could create the forks would be to create a git repo for each package with two branches: the master branch that GHC builds, and a separate branch that tracks the main darcs repository and is synced automatically whenever patches are pushed to the main darcs repo. We'd have to explicitly merge the tracking branch into the master branch from time to time. When we want to make changes locally, we can just commit them to the GHC branch and push the changes upstream in a batch later (and then we'd end up having to merge them back in to the GHC branch... but hopefully git's merge is clever enough to avoid manual intervention here). This is complicated and ugly of course; better suggestions welcome.
I *strongly* object to moving to git before this isn't sorted out. As Roman said before, GHC is heading into a dangerous direction. It gets progressively harder to contribute to the project at the moment. First, changing the build system to Cabal. Now, proposing to use two vcs. Somebody who is new to the project not only has to learn the internals of GHC, but they also have to learn two new vcs, and if they need to change the build system, they need to learn a new build tool. Raising the bar for developers to contribute to a project has been proven to be a very bad idea many times. Let's not take GHC down that path.
I'm not completely convinced we need to have this all worked out before GHC switches, although it would be nice of course. We currently have infrastructure in place for the build to work with a mixture of darcs and git repositories, and existing developers already have to learn git anyway. They just need to remember to use darcs for libraries and git for the main GHC repo, and this is only a temporary situation. As for Cabal - we had a thread on cvs-ghc last week, and as I said there we'd love to hear suggestions for how to improve things, including wild and crazy ideas for throwing it all away and starting again. However, as I explained, there are good reasons for the way things are done now, the main one being that the build system for packages is not written twice. Cheers, Simon
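[Editorial sketch: the two-branch fork scheme proposed above can be mocked up on a toy repo. All branch and file names are illustrative, and the automatic darcs-to-git sync job is simulated here by an ordinary commit on the tracking branch.]

```shell
#!/bin/sh
# Mock-up of the proposed layout: 'trunk' is the branch GHC builds,
# 'upstream' mirrors the main darcs repo, and upstream is explicitly
# merged into trunk from time to time.
set -e
rm -rf fork-demo && mkdir fork-demo && cd fork-demo
git init -q
git config user.email demo@example.org
git config user.name demo
echo v1 > Lib.hs; git add Lib.hs; git commit -qm "import from darcs"
git branch -m trunk            # GHC's build branch
git branch upstream            # branch mirroring the darcs repo

# a GHC-local change on the build branch...
echo local > GhcOnly.hs; git add GhcOnly.hs; git commit -qm "ghc-only fix"

# ...and, independently, a patch arriving via the darcs mirror
git checkout -q upstream
echo v2 > Lib.hs; git commit -qam "upstream change"

# the periodic explicit merge of the tracking branch into trunk
git checkout -q trunk
git merge -q upstream -m "merge upstream into ghc branch"
```

Since the two branches touched different files, git merges them without manual intervention: trunk ends up with the upstream Lib.hs change alongside the GHC-only file.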

On Mon, Aug 11, 2008 at 04:17:59PM +0100, Simon Marlow wrote:
The main obstacle with just switching the core libraries is that they are shared by other implementations and other maintainers. So I see no alternative but to create forks of those repositories for use by GHC, unless/until the other projects/maintainers want to migrate to git.
Forking is much worse than using multiple vcs's, and if we don't fork, anyone working on those libraries will have to use git at least to get GHC HEAD to check that they're not breaking it. And clearly GHC developers outnumber developers of other implementations. (I don't think a move to git will lead to more GHC developers, but I buy the interns argument.) My concern is that there are rather more developers of libraries and assorted other packages, and this will place an arbitrary divide across those. Unless everyone moves to git, of course.

Ross Paterson:
On Mon, Aug 11, 2008 at 04:17:59PM +0100, Simon Marlow wrote:
The main obstacle with just switching the core libraries is that they are shared by other implementations and other maintainers. So I see no alternative but to create forks of those repositories for use by GHC, unless/until the other projects/maintainers want to migrate to git.
Forking is much worse than using multiple vcs's, and if we don't fork, anyone working on those libraries will have to use git at least to get GHC HEAD to check that they're not breaking it. And clearly GHC developers outnumber developers of other implementations. (I don't think a move to git will lead to more GHC developers, but I buy the interns argument.)
Ah, good point! Changing ghc to git means *all* developers of boot libraries need to use git *regardless* of what repo format the boot libraries are in. After all, they need to validate against the current ghc head before pushing. In other words, the decision to move the ghc repo affects all core library developers anyway. No use pretending that changing only the ghc repo (and leaving the rest in darcs) would make anything simpler for anybody.
My concern is that there are rather more developers of libraries and assorted other packages, and this will place an arbitrary divide across those. Unless everyone moves to git, of course.
There are surely more developers of libraries in general than there are GHC developers. However, I doubt that there are more developers of boot libraries, who are not also ghc developers, than there are ghc developers. The change doesn't have to affect anybody, but ghc developers and *core* library developers. Manuel

"Manuel" == Manuel M T Chakravarty
writes:
Manuel> In other words, the decision to move the ghc repo affects all
Manuel> core library developers anyway. No use pretending that changing
Manuel> only the ghc repo (and leaving the rest in darcs) would make
Manuel> anything simpler for anybody.

I'd say that moving GHC to git affects the WHOLE haskell community, and we can already think about having all of HackageDB running git. I'm not at all pleased with it (personally I switched from darcs to bzr), but better to face what the future brings instead of putting our heads in the sand. :-)

Sincerely,
Gour

--
Gour | Zagreb, Croatia | GPG key: C6E7162D
----------------------------------------------------------------

On 12 Aug 2008, at 01:35, Manuel M T Chakravarty wrote:
Ah, good point! Changing ghc to git means *all* developers of boot libraries need to use git *regardless* of what repo format the boot libraries are in. After all, they need to validate against the current ghc head before pushing.
It is worth pointing out that I *never* validate against ghc head when I commit to the core libraries. (Actually, I don't even keep any checkout of ghc head.) Generally I'm fixing something that has unintentionally broken the nhc98 build of the libraries, *despite* the breaking patch being validated against ghc. To be honest I don't particularly care if my fixing patch then breaks ghc again. Why not? Because the "chain of blame" effectively leads back past me to the earlier patch. (In practice, re-breaking ghc is very rare.)

Now, there is only one person taking care of nhc98 (me), and probably I'm its only user as well, but I do still think it is worth the 30 secs or so every day it takes to check the nightly build logs, and the 30 mins it occasionally takes to fix breakage when necessary. Building a full Haskell'98 compiler is a significant undertaking, and it would be a great shame to simply discard it because the libraries are no longer available in a shared format. Who knows, maybe someone will find it easier to port to their iPhone than ghc. :-)

What I'm not really prepared to do is to extend the fixing time by an extra 30 mins just to validate against ghc. I might be prepared to learn a new VCS, but from what I've seen so far, git looks rather complex and difficult to use.

It is also worth noting that where a larger community of developers has gathered around a core library (e.g. Cabal), ghc has found it necessary to branch off a ghc-only version of that library, so that commits to the library head do not need to be validated against ghc head. Igloo takes care of merging across a large bunch of patches every once in a while. This model seems to work well. In theory, the core library head could remain in darcs, with the ghc branch of it in git. All the pain of merging would be dumped on one person (sorry Igloo!) but everyone else gets the benefit.

Regards,
Malcolm

| It is worth pointing out that I *never* validate against ghc head when
| I commit to the core libraries.

I think that's perfectly reasonable for the reasons you explain.

Simon

Simon Peyton-Jones:
| It is worth pointing out that I *never* validate against ghc head when | I commit to the core libraries.
I think that's perfectly reasonable for the reasons you explain.
Sorry, but I think the only reason it's halfway acceptable is that Malcolm hasn't broken the GHC build yet. If he does, I'll be screaming as loudly as for anybody else. What Malcolm is basically saying is that he doesn't contribute to the functionality of the boot libraries; he simply makes sure they compile with nhc98. That's a valuable contribution, of course, but to be honest, I don't think it's a valid reason for us to go to the trouble of having two vcs for ghc. Manuel

Manuel wrote:
| It is worth pointing out that I *never* validate against ghc head when | I commit to the core libraries.
Sorry, but I think the only reason its halfway acceptable is that Malcolm didn't break the GHC build yet. If he does, I'll be screaming as loudly as for anybody else.
Whilst I'm in no way saying that a working nhc98 head is anything like as important as a working ghc head, are you saying that I should scream louder every time someone breaks nhc98 too? It is happening several times a week at the moment. It can be jolly frustrating when I have other things I could be doing. But I accept that it is simply the price to pay for keeping up-to-date with the libraries everyone else is using. Ghc has no monopoly on the "core" libraries. They are a shared resource.
to be honest, I don't think its a valid reason for us to go to the trouble of having two vcs for ghc.
Well indeed, I don't want to stand in the way of ghc. There are far more people contributing to it, so their needs have greater weight. But I am raising the libraries question because I think it has an impact much more widely than just ghc (or Hugs or nhc98, for that matter).

Git may turn out to be sufficiently easy to use that this will all seem like a storm in a teacup once the dust has settled. (I'm not filled with confidence by blog postings that say "granted, git is a usability disaster zone", and "[you] may find git to be hostile, unfriendly and needlessly complex", but those seem to be minority opinions.)

Regards,
Malcolm

As someone who is not contributing to the core libraries I find a few things in this discussion a bit puzzling.

- Why does NHC98 break so often? Is it because people are checking in code that is not Haskell 98 compatible?

- It seems to me that implementations "share" libraries using CPP. At least there are plenty of ifdefs on symbols like __HUGS__ in the implementation. That seems like a bad approach to me. It doesn't factor out the specifics of an implementation to one place, but instead litters the code with them, making it hard to read and change without breaking it. I would imagine that this slows down library development. You could compare it to a scenario where Windows and Linux shared their libc implementation!

If it's so difficult to share code without continuously breaking the build then we're better off keeping the code separate. I might have gotten this wrong, so could someone please explain to me what exactly is the problem and why we are in this situation?

Thank you,
Johan

- Why does NHC98 break so often? Is it because people are checking in code that is not Haskell 98 compatible?
Yes, there is a bit of that. Also, as you point out, there is quite a lot of CPP conditionally compiled code in the libraries, and I would say that it is the major contributor to breakage. It is often unclear which parts of the code are shared and which separate, so a lot of breakage arises from e.g. exporting a name that is defined for ghc only. In addition, there are some (once obscure) bugs in nhc98 that are now triggered increasingly frequently. (We can't blame anyone except nhc98 for those of course.) These include complex import renaming resolution, and contexts on pattern variables.
- It seems to me that implementations "share" libraries using CPP. That seems like a bad approach to me.
Agreed. The CPP was always intended to be as temporary as possible, with the goal to share as much as possible. One of the problems is that the primitives provided by compilers are different. Really, there should be a package below 'base' in the dependency tree, specific to each compiler, and non-shared. Then everything from base upwards would be cleaner and more portable.
If it's so difficult to share code without continuously breaking the build then we're better of keeping the code separate.
I don't agree. The only way to achieve convergence is to start from some semi-merged point, and work to eliminate the differences. Igloo is doing a fantastic job of determining the dependencies and gradually moving stuff around to enable this to happen. Regards, Malcolm

On Wed, Aug 13, 2008 at 12:21 PM, Malcolm Wallace
- Why does NHC98 break so often? Is it because people are checking in code that is not Haskell 98 compatible?
Yes, there is a bit of that. Also, as you point out, there is quite a lot of CPP conditionally compiled code in the libraries, and I would say that it is the major contributor to breakage. It is often unclear which parts of the code are shared and which separate, so a lot of breakage arises from e.g. exporting a name that is defined for ghc only.
In addition, there are some (once obscure) bugs in nhc98 that are now triggered increasingly frequently. (We can't blame anyone except nhc98 for those of course.) These include complex import renaming resolution, and contexts on pattern variables.
Can we make sure that these libraries are always built with some Haskell 98 compatibility flag by GHC so people find out when they add non Haskell 98 stuff?
- It seems to me that implementations "share" libraries using CPP. That seems like a bad approach to me.
Agreed. The CPP was always intended to be as temporary as possible, with the goal to share as much as possible. One of the problems is that the primitives provided by compilers are different. Really, there should be a package below 'base' in the dependency tree, specific to each compiler, and non-shared. Then everything from base upwards would be cleaner and more portable.
Some code rearrangement could go a long way too. We could put compiler-specific implementations of certain functions in a separate module. We could then import the right compiler-specific module with just one ifdef and re-export the functions from e.g. Data.Array:

module Data.Array ( map, filter ) where

#ifdef __GLASGOW_HASKELL__
import Ghc.Data.Array as Impl
#endif
#ifdef __HUGS__
import Hugs.Data.Array as Impl
#endif

-- | Documentation
map = Impl.map

-- | This implementation is shared.
filter p xs = ...

I don't know if this is the best way (I need to give the subject some more thought).
If it's so difficult to share code without continuously breaking the build then we're better of keeping the code separate.
I don't agree. The only way to achieve convergence is to start from some semi-merged point, and work to eliminate the differences. Igloo is doing a fantastic job of determining the dependencies and gradually moving stuff around to enable this to happen.
Trying to come up with a solution that allows us to share code in a sane way would of course be better. But if that doesn't work (maybe because the problem is inherent in the way we share code) and the build breakages continue, I would argue strongly for always keeping HEAD buildable over sharing implementation. Cheers, Johan

Hello Johan,

Wednesday, August 13, 2008, 3:43:15 PM, you wrote:
- Why does NHC98 break so often? Is it because people are checking in code that is not Haskell 98 compatible?
Can we make sure that these libraries are always built with some Haskell 98 compatibility flag by GHC so people find out when they add non Haskell 98 stuff?
on ghc we use many non-h98 features. actually, there are even non-h98 features that are common to all three compilers (e.g. rank-2 types, AFAIR). we need to check that non-nhc features are not used in *shared* code, and the best tool to do such checks is nhc itself

--
Best regards,
Bulat                          mailto:Bulat.Ziganshin@gmail.com

On Wed, Aug 13, 2008 at 1:54 AM, Malcolm Wallace <malcolm.wallace@cs.york.ac.uk> wrote:
Manuel wrote:
| It is worth pointing out that I *never* validate against ghc head when
| I commit to the core libraries.
Sorry, but I think the only reason it's halfway acceptable is that Malcolm
hasn't broken the GHC build yet. If he does, I'll be screaming as loudly as for anybody else.
Whilst I'm in no way saying that a working nhc98 head is anything like as important as a working ghc head, are you saying that I should scream louder every time someone breaks nhc98 too? It is happening several times a week at the moment. It can be jolly frustrating when I have other things I could be doing. But I accept that it is simply the price to pay for keeping up to date with the libraries everyone else is using. Ghc has no monopoly on the "core" libraries. They are a shared resource.
to be honest, I don't think it's a valid reason for us to go to the trouble
of having two vcs for ghc.
Well indeed, I don't want to stand in the way of ghc. There are far more people contributing to it, so their needs have greater weight. But I am raising the libraries question, because I think it has an impact much more widely than just ghc (or Hugs or nhc98, for that matter).
Git may turn out to be sufficiently easy to use that this will all seem like a storm in a teacup once the dust has settled. (I'm not filled with confidence by blog postings that say "granted, git is a usability disaster zone", and "[you] may find git to be hostile, unfriendly and needlessly complex", but those seem to be minority opinions.)
I'm not a contributor to hugs, nhc, jhc, ghc, or any other project that is affected here, but when I see this part of the discussion come up again and again, I have to wonder if anyone has done the obvious thing of asking these other communities whether they would mind switching to git? Of course each of them is free to say "No, we won't switch" for any reason they like, and you'd then have to deal with the situation. But it seems that it can't hurt to ask, and I get the impression no one has asked them formally. If everyone did happen to agree on using git for the shared libraries, wouldn't that end this part of the debate? Just my $0.02, Jason

On Tue, Aug 12, 2008 at 10:10:31AM +0100, Malcolm Wallace wrote:
On 12 Aug 2008, at 01:35, Manuel M T Chakravarty wrote:
Ah, good point! Changing ghc to git means *all* developers of boot libraries need to use git *regardless* of what repo format the boot libraries are in. After all, they need to validate against the current ghc head before pushing.
It is worth pointing out that I *never* validate against ghc head when I commit to the core libraries.
Also, all of the people who send us patches don't need to validate (and I suspect most of them don't), as we validate the patches before pushing them. Thanks Ian

Ian Lynagh:
On Tue, Aug 12, 2008 at 10:10:31AM +0100, Malcolm Wallace wrote:
On 12 Aug 2008, at 01:35, Manuel M T Chakravarty wrote:
Ah, good point! Changing ghc to git means *all* developers of boot libraries need to use git *regardless* of what repo format the boot libraries are in. After all, they need to validate against the current ghc head before pushing.
It is worth pointing out that I *never* validate against ghc head when I commit to the core libraries.
Also, all of the people who send us patches don't need to validate (and I suspect most of them don't), as we validate the patches before pushing them.
Well, it's up to you whether you want to validate for other people, but I don't think that is the right policy. Everybody (including Malcolm) should validate. If you contribute code to the Linux kernel, comprehensive testing of the code is a requirement, too. It's not as if it were a strange requirement for GHC as an open source project to ask people to test their code properly. In fact, I even tell my first-year students that they should test their code properly. Maybe we should hand out introductory software engineering books to potential ghc and library contributors. (Sorry to get cynical, but I can't believe that we are even discussing this.) Manuel

Hello Manuel, Wednesday, August 13, 2008, 4:39:25 AM, you wrote:
Well, it's up to you whether you want to validate for other people, but I don't think that is the right policy. Everybody (including Malcolm) should validate.
as far as we have people validating patches for their platforms (Igloo for GHC, Malcolm for nhc...), this reduces the amount of time required to keep things working. ideal solutions are too expensive for the real world :) -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

I don't think that is the right policy. Everybody (including Malcolm) should validate.
If you contribute code to the linux kernel, comprehensive testing of the code is a requirement, too.
The analogy is flawed. It is like asking the developers of _gcc_ to ensure that the Linux kernel still builds after every modification to the gcc project code base. The projects are different, so the suggested requirement would be an unreasonable burden. Regards, Malcolm

On Wed, Aug 13, 2008 at 1:52 PM, Malcolm Wallace
I don't think that is the right policy. Everybody (including Malcolm) should validate.
If you contribute code to the linux kernel, comprehensive testing of the code is a requirement, too.
The analogy is flawed. It is like asking the developers of _gcc_ to ensure that the Linux kernel still builds after every modification to the gcc project code base. The projects are different, so the suggested requirement would be an unreasonable burden.
I think an even better analogy is probably comparing it to a developer of GCC changing the libc implementation of another compiler, or vice versa. -- Johan

I think an even better analogy is probably comparing it to a developer of GCC changing the libc implementation of another compiler, or vice versa.
Our shared libraries do not belong to any one compiler. They are joint creations, with a lot of community (non-compiler-hacker) involvement. Regards, Malcolm

On Wed, Aug 13, 2008 at 3:13 PM, Malcolm Wallace
I think an even better analogy is probably comparing it to a developer of GCC changing the libc implementation of another compiler, or vice versa.
Our shared libraries do not belong to any one compiler. They are joint creations, with a lot of community (non-compiler-hacker) involvement.
I'm very grateful these people took the time to write these libraries. However, how these modules were created is irrelevant when it comes to addressing the current problem. Parts of their implementation are compiler dependent, and having compiler-specific code live together is bound to lead to problems, because the people hacking on those modules are likely to use and validate on only one compiler. It would be difficult to require them to do otherwise, too. To avoid this problem, either the compiler-dependent code has to be abstracted out from these modules so people can ignore the differences, or the implementations of these files need to be kept separate.

Consider the following scenario. GHC hackers implement Data.Array in A.hs; Hugs and nhc98 hackers develop separate implementations in B.hs and C.hs respectively. We now run something like

    diff A.hs B.hs C.hs | sed <replace diff markers with if-defs> > X.hs

X.hs could be likened to what the current implementations of these files look like. If the three groups don't validate their changes on all compilers, they risk breaking someone's build. Especially note the poor scaling properties of this approach, where each new implementation adds one more compiler for everyone to verify on. I think the reason this works at all right now is that most work is happening on GHC, and that's also where most of the users are. Cheers, Johan
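A minimal sketch (module name and strings invented for the example) of the kind of merged X.hs that scenario produces, and why only one branch ever gets exercised by a given hacker:

```haskell
{-# LANGUAGE CPP #-}
-- Hypothetical sketch of a merged X.hs: one file, one branch per compiler.
-- Whoever edits one branch typically never compiles the other two.
module Main (main) where

implName :: String
#if defined(__GLASGOW_HASKELL__)
-- GHC hackers edit and validate only this branch...
implName = "ghc"
#elif defined(__HUGS__)
-- ...while this one can silently rot...
implName = "hugs"
#else
-- ...and so can the nhc98 fallback.
implName = "nhc98"
#endif

main :: IO ()
main = putStrLn implName
```

Each extra compiler adds another `#elif` branch that everyone else can break without noticing, which is exactly the scaling problem above.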

On Mon, Aug 11, 2008 at 04:17:59PM +0100, Simon Marlow wrote: [...]
As for Cabal - we had a thread on cvs-ghc last week, and as I said there we'd love to hear suggestions for how to improve things, including wild and crazy ideas for throwing it all away and starting again. However, as I explained, there are good reasons for the way things are done now, the main one being that the build system for packages is not written twice.
Well, at least the Makefile creation was a step (the first step?) in the wrong direction, IMHO. I'll run a GHC build to get some of those generated Makefiles and follow up on cvs-ghc, but for a start, Cabal shouldn't know anything about implementation-specific internal build systems; instead it should rely only on its own metadata. Implementation-specific stuff (such as how to run the compiler) should be supplied by the implementation, not by Cabal. I see more and more workarounds for workarounds for an unmaintainable (and unusable) build system, and after the latest discussions about git vs. darcs, maintaining GHC-specific branches of libraries, etc., I think I'll just drop maintainership from all GHC-related OpenBSD ports until the GHC build system chaos settles down a little bit. Ciao, Kili

kili:
On Mon, Aug 11, 2008 at 04:17:59PM +0100, Simon Marlow wrote: [...]
As for Cabal - we had a thread on cvs-ghc last week, and as I said there we'd love to hear suggestions for how to improve things, including wild and crazy ideas for throwing it all away and starting again. However, as I explained, there are good reasons for the way things are done now, the main one being that the build system for packages is not written twice.
Well, at least the Makefile creation was a step (the first step?) in the wrong direction, IMHO. I'll run a GHC build to get some of those generated Makefiles and follow up on cvs-ghc, but for a start, Cabal shouldn't know anything about implementation-specific internal build systems; instead it should rely only on its own metadata. Implementation-specific stuff (such as how to run the compiler) should be supplied by the implementation, not by Cabal.
I see more and more workarounds for workarounds for an unmaintainable (and unusable) build system, and after the latest discussions about git vs. darcs, maintaining GHC-specific branches of libraries etc., I think I'll just drop maintainership from all GHC-related OpenBSD ports until the GHC build system chaos settles down a little bit.
Ian, please read this. The inability to build GHC reliably is a problem. Can someone with a plan please describe what measures are in place to ensure GHC emerges buildable, and the tree regains the status of a tree that *does not break*? -- Don

Don Stewart wrote:
kili:
On Mon, Aug 11, 2008 at 04:17:59PM +0100, Simon Marlow wrote: [...]
As for Cabal - we had a thread on cvs-ghc last week, and as I said there we'd love to hear suggestions for how to improve things, including wild and crazy ideas for throwing it all away and starting again. However, as I explained, there are good reasons for the way things are done now, the main one being that the build system for packages is not written twice.
Well, at least the Makefile creation was a step (the first step?) in the wrong direction, IMHO. I'll run a GHC build to get some of those generated Makefiles and follow up on cvs-ghc, but for a start, Cabal shouldn't know anything about implementation-specific internal build systems; instead it should rely only on its own metadata. Implementation-specific stuff (such as how to run the compiler) should be supplied by the implementation, not by Cabal.

If we're going to kick on Cabal, I might throw in my two cents.
I see an increasing problem in that every community comes up with their own package system, instead of reusing existing frameworks. Dependencies on other, non-Haskell libraries have to be addressed for every other coexisting package system (such as apt-get), if they are addressed at all. Likewise, other languages depending on Haskell will have trouble resolving dependencies. So my point is: if there is going to be any bigger reworking of Cabal, I think one should consider how it could work as a module in a bigger (maybe future) meta-packaging framework, lifting binaries up to, for example, .deb, .exe installer, .dmg, or whatever is most native for the platform. I see a point in language-specific package systems, as they have more insight into the build process, but the current implementations assume a very ideal world in which there are no other dependencies involved. /Johan

On Tue, 2008-08-12 at 01:15 +0200, Johan Henriksson wrote:
I see an increasing problem in that every community comes up with their own package system, instead of reusing existing frameworks.
That's because there are no usable existing frameworks. It would be wonderful, of course, if there were some standard, language-neutral build and packaging system where each language just wrote some lib and could integrate nicely into multi-language systems.
Dependencies on other, non-Haskell libraries have to be addressed for every other coexisting package system (such as apt-get), if they are addressed at all. Likewise, other languages depending on Haskell will have trouble resolving dependencies.
So my point is: if there is going to be any bigger reworking of Cabal, I think one should consider how it could work as a module in a bigger (maybe future) meta-packaging framework, lifting binaries up to, for example, .deb, .exe installer, .dmg, or whatever is most native for the platform.
There are tools to convert Cabal packages to native packages for rpm, deb, ebuild and arch. The Cabal format was designed to allow this translation. This includes dependencies on C libs and external programs. Note that this is in contrast to existing frameworks like autoconf, which do not allow dependencies to be extracted automatically for conversion into native packages.
I see a point in language-specific package systems, as they have more insight into the build process, but the current implementations assume a very ideal world in which there are no other dependencies involved.
I don't think this is true. Duncan

I see an increasing problem in that every community comes up with their own package system, instead of reusing existing frameworks.
That's because there are no usable existing frameworks.
I couldn't agree more. I have been working on this problem off and on since 1993, and the situation now is even worse than it was then. It's become a full-time job just to keep track of the frameworks. (For example, I'm not as informed about Omake as I'd like to be.) As someone who hangs out in a bunch of different language communities, I see two needs driving unnecessary diversity in build/package systems:

1. Language implementors see that they are serving multiple platforms (debian, red hat, bsd, windows, macos, ...), each of which has its own native packaging system, with its own way of expressing dependencies. The bad idea: invent a new system which works across all these platforms but serves only one language. This path was the genesis of 'Lua rocks', for example. And although I can't speak of Ruby and Java of my own knowledge, I suspect that Ruby gems and Java beans are similar. I don't really understand Cabal, but to the degree that I do, Cabal fills a similar role for Haskell.

2. The technique of 'smart recompilation', described in a 1986 journal article by Walter Tichy (who also invented RCS), has (again to my knowledge) been reinvented again and again for one language after another---it has *never* been packaged as a reusable framework. I know of two valiant efforts: Clemm and Osterweil's "Odin" build tool and Blume and Appel's Compilation Manager. But the Odin people never really had a programming-language background, and Geoff Clemm was the only one who could extend the system. And the Compilation Manager, while really interesting, didn't make it obvious how to use it with another compiler---in fact I'm not sure if the hooks to support smart recompilation were even exported.

I also see repeatedly that the distinction between the build system and packaging system is blurry: both have to know about build targets, dependencies, and so on.
At the time of the wonderful GHC Hackathon in Portland, where the GHC API was first introduced to the public, I urged Simon PJ to consider taking ghc --make and generalising it to support other languages. I still think this would be a good project.
There are tools to convert Cabal packages to native packages for rpm, deb, ebuild and arch. The Cabal format was designed to allow this translation. This includes dependencies on C libs and external programs.
I think this is an essential property for any language-dependent packaging system to be successful. I think this is a very good path for Haskell, even though Cabal is a work in progress. What I like is that it overcomes an impedance mismatch:

* The developer of a Haskell package is presented with *one* packaging interface (Cabal), which will create a package native to any widely used platform.

* The client of a Haskell package treats it like any other native package: rpm, apt-get, emerge, or InstallShield just *work*---Haskell programs are not marginalized.

Of course this model puts a heavy weight on the shoulders of the Cabal team, but given the current state of play, I don't see how it is possible to do better for developers and users. It's certainly better than the 'Lua rocks' model, which requires the end user to run both the platform-native packaging system *and* the Lua rocks packaging system. Such an outcome for Haskell is to be avoided at all costs. Norman

Norman Ramsey wrote:
At the time of the wonderful GHC Hackathon in Portland, where the GHC API was first introduced to the public, I urged Simon PJ to consider taking ghc --make and generalising it to support other languages. I still think this would be a good project.
As well as supporting any mix of languages (and platforms, since it's often difficult to separate the two), I would love to see such a tool! As flexible as make if necessary, but as easy as doing nothing if possible. Sean

Norman Ramsey wrote:
I also see repeatedly that the distinction between the build system and packaging system is blurry: both have to know about build targets, dependencies, and so on.
At the time of the wonderful GHC Hackathon in Portland, where the GHC API was first introduced to the public, I urged Simon PJ to consider taking ghc --make and generalising it to support other languages. I still think this would be a good project.
I don't want to speak for those involved, but I believe this is what the "make-like dependency framework for Cabal" SoC project is doing: http://vezzosi.blogspot.com/2008/06/my-summer-of-code-project-dependency.htm... Cheers, Simon

Matthias Kilian:
On Mon, Aug 11, 2008 at 04:17:59PM +0100, Simon Marlow wrote: [...]
As for Cabal - we had a thread on cvs-ghc last week, and as I said there we'd love to hear suggestions for how to improve things, including wild and crazy ideas for throwing it all away and starting again. However, as I explained, there are good reasons for the way things are done now, the main one being that the build system for packages is not written twice.
Well, at least the Makefile creation was a step (the first step?) in the wrong direction, IMHO. I'll run a GHC build to get some of those generated Makefiles and follow up on cvs-ghc, but for a start, Cabal shouldn't know anything about implementation-specific internal build systems; instead it should rely only on its own metadata. Implementation-specific stuff (such as how to run the compiler) should be supplied by the implementation, not by Cabal.
I see more and more workarounds for workarounds for an unmaintainable (and unusable) build system, and after the latest discussions about git vs. darcs, maintaining GHC-specific branches of libraries etc., I think I'll just drop maintainership from all GHC-related OpenBSD ports until the GHC build system chaos settles down a little bit.
Thanks for demonstrating my point... Complicated build infrastructure and lack of portability used to be a big problem for GHC in the past. Over the last years, the situation got much better (to a large extent due to SimonM sanitising the makefile-based build system). Why are we so keen to throw it all away now? Manuel

Matthias Kilian wrote:
On Mon, Aug 11, 2008 at 04:17:59PM +0100, Simon Marlow wrote: [...]
As for Cabal - we had a thread on cvs-ghc last week, and as I said there we'd love to hear suggestions for how to improve things, including wild and crazy ideas for throwing it all away and starting again. However, as I explained, there are good reasons for the way things are done now, the main one being that the build system for packages is not written twice.
Well, at least the Makefile creation was a step (the first step?) in the wrong direction, IMHO. I'll run a GHC build to get some of those generated Makefiles and follow up on cvs-ghc, but for a start, Cabal shouldn't know anything about implementation-specific internal build systems; instead it should rely only on its own metadata.
I'm not completely sure, but I think you may have misunderstood how Cabal's makefile generation currently works. It has no specific knowledge of GHC's build system, and it does rely on its own metadata. (In my other message I'm suggesting moving the Makefile generation into GHC's build system so that it could be made specific to GHC, though.)
Implementation-specific stuff (such as how to run the compiler) should be supplied by the implementation, not by Cabal.
This is what makes me unsure. Implementation of what? Are you suggesting a redesign of Cabal, or just changing the way something works? Cheers, Simon

On Tue, Aug 12, 2008 at 11:59:37AM +0100, Simon Marlow wrote:
Well, at least the Makefile creation was a step (the first step?) in the wrong direction, IMHO. I'll run a GHC build to get some of those generated Makefiles and follow up on cvs-ghc, but for a start, Cabal shouldn't know anything about implementation-specific internal build systems; instead it should rely only on its own metadata.
I'm not completely sure, but I think you may have misunderstood how Cabal's makefile generation currently works. It has no specific knowledge of GHC's build system, and it does rely on its own metadata.
I mean the GHC-specific template used for building the Makefile (Distribution/Simple/GHC/Makefile.in) and the function `makefile` in Distribution/Simple/GHC.hs (this function even spills out some make rules in addition to what's in Makefile.in, which looks very wrong to me). Yes, it relies only on the Cabal metadata, but the output is a Makefile only useful for building GHC. It'd be better to be able to run

    $ ./Setup mkmetadata -prefix foo-

which just produces some simple variable declarations like

    foo-impl-ghc-build-depends = rts
    foo-impl-ghc-exposed-modules = Data.Generics Data.Generics.Aliases ...
    foo-exposed-modules = Control.Applicative Control.Arrow ...
    foo-c-sources = cbits/PrelIOUtils.c cbits/WCsubst.c ...
    foo-windows-extra-libraries = wsock32 msvcrt kernel32 user32 shell32
    foo-extensions = CPP
    foo-ghc-options = -package-name base
    foo-nhc98-options = -H4M -K3M

Basically, the .cabal file is just converted into some other format that may be included by another Makefile. And since it's a really simple output format, it could even be used by different implementations of make(1) or even other build tools. The `foo-' prefix just shields variables in the including Makefile.

Take this output, write it to some cabalmetadata.mk, and then use a (GHC-specific) Makefile copied over into all library directories that does an

    include cabalmetadata.mk
    ...
    GHC_OPTS += $(foo-ghc-options)
    EXPOSED_MODULES = $(foo-exposed-modules) $(foo-impl-ghc-exposed-modules)
    EXTRA_LIBS = $(foo-extra-libraries) $(foo-$(HostOS_CPP)-extra-libraries)

Thus, Cabal dumps the metadata without knowing how it's used. All the remaining stuff is some (implementation-specific) Makefiles relying on recursive variable expansion. I'll implement this for GHC when I've a little bit more spare time (in three or four weeks).
(in my other message I'm suggesting moving the Makefile generation into GHC's build system so that it could be made specific to GHC, though).
Generated files should be as simple, primitive and portable as possible. Generating complete Makefiles makes things more difficult, and it doesn't matter whether they're generated by Cabal or by GHC's build system. If you have to tweak the build system, you don't want to tweak generators, but just an existing Makefile.
Implementation-specific stuff (such as how to run the compiler) should be supplied by the implementation, not by Cabal.
This is what makes me unsure. Implementation of what?
The Haskell compiler. Or, to be more exact, the Cabal library shipped with the Haskell compiler (or some supplementary compiler-specific library -- I didn't think much about this part yet). However, my main concern is the usage of Cabal from within the GHC build system, so please just forget this part ;-)
Are you suggesting a redesign of Cabal, or just changing the way something works?
I don't think that a large redesign is necessary. It should just try to be as implementation-agnostic as possible. Ciao, Kili -- do yourself a favor and let the le(4)s rot on the junkpile of history... -- Henning Brauer

On Tue, Aug 12, 2008 at 10:29:03PM +0200, Matthias Kilian wrote:
Basically, the .cabal file is just converted into some other format that may be included by another Makefile.
Oops! I again read your (SimonM's) proposal on changing Cabal and the GHC build system in exactly this way. Sorry for the noise. Ciao, Kili

Matthias Kilian wrote:
I mean the GHC-specific template used for building the Makefile (Distribution/Simple/GHC/Makefile.in) and the function `makefile` in Distribution/Simple/GHC.hs (this function even spills out some make rules in addition to what's in Makefile.in, which looks very wrong to me).
Yes, it relies only on the Cabal metadata, but the output is a Makefile only useful for building GHC.
Ok, this statement is plainly not true, since I can use 'cabal makefile' to build any package outside of the GHC build tree. So perhaps I've misunderstood your point?
It'd be better to be able to run
$ ./Setup mkmetadata -prefix foo-
which just produces some simple variable declarations like
foo-impl-ghc-build-depends = rts
foo-impl-ghc-exposed-modules = Data.Generics Data.Generics.Aliases ...
foo-exposed-modules = Control.Applicative Control.Arrow ...
foo-c-sources = cbits/PrelIOUtils.c cbits/WCsubst.c ...
foo-windows-extra-libraries = wsock32 msvcrt kernel32 user32 shell32
foo-extensions = CPP
foo-ghc-options = -package-name base
foo-nhc98-options = -H4M -K3M
Yes, we could use this to implement GHC's build system. It's somewhat similar to the scheme I suggested in the other thread, but more generic. I'd be completely happy to do it this way if the functionality would be useful to others outside GHC too. Cheers, Simon

On Wed, Aug 13, 2008 at 09:03:34AM +0100, Simon Marlow wrote:
Yes, it relies only on the Cabal metadata, but the output is a Makefile only useful for building GHC.
Ok, this statement is plainly not true, since I can use 'cabal makefile' to build any package outside of the GHC build tree. So perhaps I've misunderstood your point?
No, I was confused (a little bit over-worked).
$ ./Setup mkmetadata -prefix foo-
which just produces some simple variable declarations like
foo-impl-ghc-build-depends = rts
foo-impl-ghc-exposed-modules = Data.Generics Data.Generics.Aliases
[...]

Yes, we could use this to implement GHC's build system. It's somewhat similar to the scheme I suggested in the other thread, but more generic. I'd be completely happy to do it this way if the functionality would be useful to others outside GHC too.
I've a little bit of spare time from August 25th to August 31st. This should be enough time to implement it (in Cabal and in the GHC build system) to see how it feels. Ciao, Kili -- CRM114 isn't ugly like PERL. It's a whole different kind of ugly. -- John Bowker

On Mon, Aug 11, 2008 at 04:17:59PM +0100, Simon Marlow wrote:
One way we could create the forks would be to create a git repo for each package with two branches: the master branch that GHC builds, and a separate branch that tracks the main darcs repository, and is synced automatically whenever patches are pushed to the main darcs repo. We'd have to explicitly merge the tracking branch into the master branch from time to time. When we want to make changes locally, we can just commit them to the GHC branch and push the changes upstream in a batch later (and then we'd end up having to merge them back in to the GHC branch... but hopefully git's merge is clever enough to avoid manual intervention here). This is complicated and ugly of course; better suggestions welcome.
I don't think that this will work well. At some point we'll have to resolve conflicts that accumulate (by rebasing GHC-only patches after conflicting upstream patches come in? This may not be trivial, as the rebaser may not be familiar with the patches), and I suspect that the forks will end up diverging for significant periods of time, as there's little impetus to merge them. Even the current situation with Cabal is a bit of a pain, as it's easy to forget to push patches upstream as well as GHC's repo, and that's just with 2 repos of the same VCS. I actually think the original plan, where only the ghc repo (plus one or two others) is in git is preferable. You may have to use a different VCS for different subprojects, but after that it's downhill all the way. You don't have to worry about patches being converted from one VCS to another, moved to another repo, converted back and merged back into the first repo. I expect people will be using different VCSs for different /projects/, if not subprojects, anyway. Thanks Ian

Ian Lynagh:
Even the current situation with Cabal is a bit of a pain, as it's easy to forget to push patches upstream as well as GHC's repo, and that's just with 2 repos of the same VCS.
As I said before, IMHO it is a big mistake for ghc to depend on development versions of Cabal. GHC should only depend on stable Cabal versions.
I actually think the original plan, where only the ghc repo (plus one or two others) is in git is preferable. You may have to use a different VCS for different subprojects, but after that it's downhill all the way. You don't have to worry about patches being converted from one VCS to another, moved to another repo, converted back and merged back into the first repo.
Having two vcs for one project is bad. One reason to switch to git (I am told) is that people had problems with darcs on some platforms (Windows and Solaris, for example). How is that going to be any better if part of the project is still in darcs? So, can we please make up our minds? If darcs has problems on some platforms, then we should not use darcs at all for ghc. If darcs does not have problems on some platforms, then there is one less reason to switch. All core library developers need to use git anyway to validate their core library patches. So, let's just move the ghc repo and *all core libraries* over to git. If git is good enough for the ghc repo, it should be good enough for the core library repos as well, shouldn't it? Manuel

On Tue, Aug 12, 2008 at 2:46 AM, Manuel M T Chakravarty
Ian Lynagh: Having two vcs for one project is bad. One reason to switch to git (I am told) is that people had problems with darcs on some platforms (Windows and Solaris, for example). How is that going to be any better if part of the project is still in darcs? So, can we please make up our minds? If darcs has problems on some platforms, then we should not use darcs at all for ghc. If darcs does not have problems on some platforms, then there is one less reason to switch.
The reason I switched all my Haskell projects over to Git, as a developer on Linux, is that I wasted way too much time fighting with Darcs that I should have spent programming. It's way too slow. I've run into exponential merges with only two developers committing to the same repository. It randomly freezes -- exponential merge, general sluggishness, who knows! -- or crashes. And merging, perhaps the most important operation in a DVCS, is a pain. I don't trust Darcs to keep my source code safe anymore. Cheers, Johan

Simon Marlow:
Manuel M T Chakravarty wrote:
I think all *core* libraries must switch. Seriously, requiring GHC developers to use a mix of two vcs during development is a Very Bad Idea.

Don was excited about getting more people to look at the source when it is in git (see the comments he posted from reddit). By requiring two vcs you will get *fewer* people to look at the source. This is not only about getting the sources to hack on them; you effectively require developers to learn the commands for two vcs (when they are already reluctant to learn one).

For example, often enough somebody who changes something in GHC will modify the base package, too. Then, to commit the overall work, you need to commit using both vcs. If you need to branch for your work, you need to create branches in two vcs (no idea whether the semantics of a branch in git and darcs is anywhere similar). When you merge your branch, you need to merge in both vcs. You can't seriously propose such a set up!
I completely agree this is a problem. The main obstacle with just switching the core libraries is that they are shared by other implementations and other maintainers. So I see no alternative but to create forks of those repositories for use by GHC, unless/until the other projects/maintainers want to migrate to git. Some of the repositories are not shared - for example ghc-prim, integer-gmp, template-haskell, and these don't need to be forked.
One way we could create the forks would be to create a git repo for each package with two branches: the master branch that GHC builds, and a separate branch that tracks the main darcs repository, and is synced automatically whenever patches are pushed to the main darcs repo. We'd have to explicitly merge the tracking branch into the master branch from time to time. When we want to make changes locally, we can just commit them to the GHC branch and push the changes upstream in a batch later (and then we'd end up having to merge them back in to the GHC branch... but hopefully git's merge is clever enough to avoid manual intervention here). This is complicated and ugly of course; better suggestions welcome.
Yes, it's a pain. However, it is better than two vcs for one project.
I *strongly* object to moving to git before this is sorted out. As Roman said before, GHC is heading in a dangerous direction. It gets progressively harder to contribute to the project at the moment. First, changing the build system to Cabal. Now, proposing to use two vcs. Somebody who is new to the project not only has to learn the internals of GHC, but they also have to learn two new vcs, and if they need to change the build system, they need to learn a new build tool. Raising the bar for developers to contribute to a project has been proven to be a very bad idea many times. Let's not take GHC down that path.
I'm not completely convinced we need to have this all worked out before GHC switches, although it would be nice of course. We currently have infrastructure in place for the build to work with a mixture of darcs and git repositories, and existing developers already have to learn git anyway. They just need to remember to use darcs for libraries and git for the main GHC repo, and this is only a temporary situation.
As far as I am concerned, building GHC is turning into a big mess. We discussed ways to improve it again, BUT I'd rather not see it getting any messier before it gets better. Hence, please let's have a complete plan that we are convinced will work before making any more changes.
As for Cabal - we had a thread on cvs-ghc last week, and as I said there we'd love to hear suggestions for how to improve things, including wild and crazy ideas for throwing it all away and starting again. However, as I explained, there are good reasons for the way things are done now, the main one being that the build system for packages is not written twice.
Yes, we need Cabal for packages because we don't want two build systems. However, this does not justify the use of Cabal outside of libraries/. Nobody explained to me why that was necessary. Why change all the rest of the build system? What is the benefit for the ghc project?

To be honest, if you ask me, I'd go back to the old makefile-based system and remove Cabal from everywhere except the building of the library packages.

Manuel

PS: Just for some more collateral damage. Did anybody check whether the Mac OS installer support, and the (unfortunately, only partially working) support to compile for older OS X versions, that I added to the *makefiles* still works with the Cabal-based system? I doubt it. It took me quite a while to get all this going, and I am not very motivated to spend a lot of time figuring out how it might work with Cabal. IMHO using Cabal for anything but the libraries was a step back for no good reason.
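The two-branch fork scheme Simon proposed earlier in the thread (a master branch that GHC builds, plus a tracking branch that follows the darcs mirror) can be sketched locally. Everything below is hypothetical: the repo names, file names, and branch names are invented, and "upstream" merely stands in for the auto-synced git mirror of a darcs repo.

```shell
# Local sketch of a per-package fork with a darcs-tracking branch.
set -e
cd "$(mktemp -d)"

# Simulated upstream mirror of the darcs repo.
git init -q -b master upstream
( cd upstream
  git config user.email mirror@example.com
  git config user.name mirror
  echo 'module Data.List where' > List.hs
  git add List.hs
  git commit -q -m 'initial import' )

# GHC's fork: master is what GHC builds; darcs-tracking follows upstream.
git clone -q upstream ghc-fork
cd ghc-fork
git config user.email dev@example.com
git config user.name dev
git branch darcs-tracking origin/master

# A GHC-only change is committed straight onto master.
echo '-- GHC-specific helper' > GhcOnly.hs
git add GhcOnly.hs
git commit -q -m 'ghc-local change'

# Meanwhile a darcs patch lands upstream (auto-synced in reality).
( cd ../upstream
  echo 'sort :: [a] -> [a]' >> List.hs
  git commit -q -a -m 'upstream patch' )

# Sync the tracking branch, then merge it into master.
git fetch -q origin
git checkout -q darcs-tracking
git merge -q --ff-only origin/master
git checkout -q master
git merge -q -m 'merge darcs-tracking' darcs-tracking
git log --oneline    # both lines of development plus a merge commit
```

Pushing a GHC-local change back to the darcs master copy would be the reverse trip (merge or cherry-pick from master, convert, apply in darcs), which is exactly where the two-VCS pain discussed above comes in.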

Hello, On Mon, Aug 11, 2008 at 8:20 PM, Manuel M T Chakravarty < chak@cse.unsw.edu.au> wrote:
[...]
I am a new developer with GHC, and most of my background is with C programming and Makefile-based build systems, such as the Linux kernel's. Thus, it was much easier for me to get started hacking on GHC when I only needed to modify Makefiles, as compared to learning an entirely different build system; therefore, I think you lower the barrier to entry for gaining new GHC developers if you stick with the Makefile build system, which is far more common, stable, robust, and definitely taught/used in many university projects. (Some of these reasons may be the same reasons the GHC repo is switching to git.)

As I said, I am new to hacking on GHC, so I am not sure what reasons there are to switch to Cabal for the build system; but I am not an expert in build systems, and I was able to figure out the GHC Makefiles to add static and run-time flags to GHC, etc. I definitely think the current Makefile build system could be improved, but overall I found it quite manageable for my needs.

Disclaimer: I really have very little experience with Cabal other than using it for installing packages, so take everything I have said with a grain of salt.

--
Donnie Jones

Manuel M T Chakravarty wrote:
As far as I am concerned, building GHC is turning into a big mess. We discussed ways to improve it again, BUT I'd rather not see it getting any messier before it gets better. Hence, please let's have a complete plan that we are convinced will work before making any more changes.
As for Cabal - we had a thread on cvs-ghc last week, and as I said there we'd love to hear suggestions for how to improve things, including wild and crazy ideas for throwing it all away and starting again. However, as I explained, there are good reasons for the way things are done now, the main one being that the build system for packages is not written twice.
Yes, we need cabal for packages because we don't want two build systems. However, this does not justify the use of Cabal outside of libraries/. Nobody explained to me why that was necessary. Why change all the rest of the build system. What is the benefit for the ghc project?
GHC is a package, just like any other. The GHC package was the main reason we still had a lot of the old infrastructure for building packages still in the build system, so there was a compelling reason to switch the compiler itself to Cabal, at least. It's true that this change wasn't all win. We gained in some places and lost in others - the build system is more unfriendly to developers now, as opposed to people just building GHC, and that really is something we need to address.
To be honest, if you ask me, I'd go back to the old makefile based system and remove Cabal from everywhere except building of the library packages.
I wouldn't object to dropping the use of Cabal for other tools in the build tree; the reasons for using it elsewhere are certainly not as compelling as for packages. Ian, I realise this means backing out a lot of the work you've been doing recently, and it would mean that we'd lose a lot of time in the runup to 6.10.1, but perhaps it's a step that we need to take to get us back on the right track again? Cheers, Simon

Simon Marlow:
Manuel M T Chakravarty wrote:
To be honest, if you ask me, I'd go back to the old makefile based system and remove Cabal from everywhere except building of the library packages.
I wouldn't object to dropping the use of Cabal for other tools in the build tree; the reasons for using it elsewhere are certainly not as compelling as for packages.
Ian, I realise this means backing out a lot of the work you've been doing recently, and it would mean that we'd lose a lot of time in the runup to 6.10.1, but perhaps it's a step that we need to take to get us back on the right track again?
I do realise that this would mean backing out a lot of Ian's recent work, and that's why I haven't proposed going back to the old system before you explicitly asked. However, I am increasingly getting the feeling that the move to Cabal was premature, and the overall loss will be minimised by backing out now.

In a sense, it was an interesting experiment, and it should still be useful to the development of Cabal. In fact, I see no reason why the experiment cannot be continued on a branch. Who knows, maybe Cabal will be sufficiently mature in a year to make a switch worthwhile? I just object to using the whole GHC developer community as guinea pigs.

Manuel

On Wed, 2008-08-13 at 16:19 +1000, Manuel M T Chakravarty wrote:
In a sense, it was an interesting experiment and it should still be useful to the development of Cabal. In fact, I see no reason why the experiment cannot be continued on a branch. Who knows, maybe Cabal is sufficiently mature in a year to make a switch worthwhile? I just object to using the whole GHC developer community as guinea pigs.
Sadly, I'm not so sure we've really learnt much that helps Cabal itself. While there's been a lot of general pain, I can't think of many specific issues we've discovered in Cabal. We added a couple of minor features, some of which we'd have needed anyway for building the libs for 6.10 (e.g. due to the base-3/4 thing). As far as I can see, most of the problems have been in the change itself and the makefile glue code. I may well be missing some things, since I've not been intimately involved in the changes.

I would most appreciate specific problems or missing features being filed as tickets in the Cabal trac, so that we can learn things and not forget them. Roman filed #276 "Add support for convenience libraries" and I appreciate that. I know about dph's longer-term need for a more general 'ways' system in ghc's package system, which will need support in Cabal. I'll file a ticket for that one.

Duncan

On Wed, Aug 13, 2008 at 04:19:37PM +1000, Manuel M T Chakravarty wrote:
Simon Marlow:
Manuel M T Chakravarty wrote:
To be honest, if you ask me, I'd go back to the old makefile based system and remove Cabal from everywhere except building of the library packages.
I wouldn't object to dropping the use of Cabal for other tools in the build tree; the reasons for using it elsewhere are certainly not as compelling as for packages.
Ian, I realise this means backing out a lot of the work you've been doing recently, and it would mean that we'd lose a lot of time in the runup to 6.10.1, but perhaps it's a step that we need to take to get us back on the right track again?
I do realise that this would mean backing out a lot of Ian's recent work, and that's why I haven't proposed going back to the old system before you explicitly asked. However, I am increasingly getting the feeling that the move to Cabal was premature, and the overall loss will be minimised by backing out now.
We're only talking about "other tools", not the libraries (including the GHC library), right? This seems like it would be a step backwards to me (after all, I wouldn't have spent the time moving to Cabal if I didn't think it was a step forwards), and I'm not really sure what benefit you see in it: how much time do you spend working in utils/? Thanks Ian

On Tue, Aug 12, 2008 at 10:20:14AM +1000, Manuel M T Chakravarty wrote:
To be honest, if you ask me, I'd go back to the old makefile based system and remove Cabal from everywhere except building of the library packages.
Manuel
PS: Just for some more collateral damage. Did anybody check whether the Mac OS installer support and the -unfortunately, only partially working- support to compile for older OS X versions that I added to the *makefiles* still works with the Cabal-based system? I doubt it. Took me quite a while to get all this going, and I am not very well motivated to spend a lot of time to figure out how it might work with Cabal. IMHO using Cabal for anything but the libraries was a step back for no good reason.
Do you mean the "rebuilding the tools with stage2" stuff? If so, that's an interesting example to pick, as that was the impetus behind changing how the build system worked for all the non-libraries/ghc.

Those changes made the build non-idempotent: we would build something with the bootstrapping compiler, build some other stuff, then come back, clean it, and build it again with the in-tree compiler. This was a little annoying at the best of times, as e.g. rerunning make at the top level would needlessly rebuild some stuff. However, when my local changes meant that programs built by GHC segfaulted, it was especially irritating to find that after (hopefully) fixing the bug I couldn't just run make in compiler/ or rts/, because ghc-pkg etc now just segfaulted! It was at that point that I half-reverted the changes, and later I reimplemented something similar using Cabal. Now we make, for example, ghc-pkg with the bootstrapping compiler in utils/ghc-pkg/dist-inplace, and then later on we make it with the stage1 compiler in utils/ghc-pkg/dist-install.

To answer your actual question: No, not having OS X, I haven't yet tested it, but I did make an effort to keep it working. In mk/cabal-flags.mk we say:

    USE_STAGE_CONFIGURE_FLAGS = \
        ... \
        $(addprefix --cc-option=,$(MACOSX_DEPLOYMENT_CC_OPTS)) \
        $(addprefix --ld-option=,$(MACOSX_DEPLOYMENT_LD_OPTS))

which will hopefully do the trick, and (IMO) in a much cleaner, more maintainable way than would have been possible with the old build system.

Thanks
Ian
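For readers unfamiliar with GNU make's addprefix, here is a tiny self-contained demonstration of what calls like the ones quoted above expand to; the deployment flag value is invented for illustration, not GHC's real configuration:

```shell
# addprefix prepends the given prefix to each whitespace-separated word,
# turning plain cc flags into Cabal's --cc-option=... form.
set -e
cd "$(mktemp -d)"
printf 'MACOSX_DEPLOYMENT_CC_OPTS := -mmacosx-version-min=10.4\nFLAGS := $(addprefix --cc-option=,$(MACOSX_DEPLOYMENT_CC_OPTS))\nall:\n\t@echo $(FLAGS)\n' > Makefile
make
# prints: --cc-option=-mmacosx-version-min=10.4
```

With several words in MACOSX_DEPLOYMENT_CC_OPTS, each word would get its own --cc-option= prefix, which is exactly what Cabal's configure step expects.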

Ian Lynagh:
On Tue, Aug 12, 2008 at 10:20:14AM +1000, Manuel M T Chakravarty wrote:
To be honest, if you ask me, I'd go back to the old makefile based system and remove Cabal from everywhere except building of the library packages.
Manuel
PS: Just for some more collateral damage. Did anybody check whether the Mac OS installer support and the -unfortunately, only partially working- support to compile for older OS X versions that I added to the *makefiles* still works with the Cabal-based system? I doubt it. Took me quite a while to get all this going, and I am not very well motivated to spend a lot of time to figure out how it might work with Cabal. IMHO using Cabal for anything but the libraries was a step back for no good reason.
Do you mean the "rebuilding the tools with stage2" stuff? If so, that's an interesting example to pick, as that was the impetus behind changing how the build system worked for all the non-libraries/ghc.
Rebuilding with stage1 was already needed to build GHC with a builtin readline. In general, it is a bad idea to build distributed binaries of Haskell programs with the *bootstrap compiler*. It must be done with the stage1 compiler. (If you are unsure why, I'll happily elaborate.)

What I was mainly referring to is the building of GHC.framework with xcodebuild and the accompanying packaging with packagemaker. Building for older versions of Mac OS X requires the MACOSX_DEPLOYMENT_TARGET and related infrastructure.
Those changes made the build non-idempotent: we would build something with the bootstrapping compiler, build some other stuff, then come back, clean it, and build it again with the in-tree compiler. This was a little annoying at the best of times, as e.g. rerunning make at the top level would needlessly rebuild some stuff.
However, when my local changes meant that programs built by GHC segfaulted, it was especially irritating to find that after (hopefully) fixing the bug I couldn't just run make in compiler/ or rts/, because ghc-pkg etc now just segfaulted!
It was at that point that I half-reverted the changes, and later I reimplemented something similar using Cabal. Now we make, for example, ghc-pkg with the bootstrapping compiler in utils/ghc-pkg/dist-inplace, and then later on we make it with the stage1 compiler in utils/ghc-pkg/dist-install.
It's of course much cleaner to build inplace versions of everything with the bootstrap compiler and separate distributeable versions with stage1. I think we briefly talked about that during the run up to 6.8.3.
To answer your actual question: No, not having OS X, I haven't yet tested it, but I did make an effort to keep it working. In mk/cabal-flags.mk we say:
    USE_STAGE_CONFIGURE_FLAGS = \
        ... \
        $(addprefix --cc-option=,$(MACOSX_DEPLOYMENT_CC_OPTS)) \
        $(addprefix --ld-option=,$(MACOSX_DEPLOYMENT_LD_OPTS))
which will hopefully do the trick, and (IMO) in a much cleaner, more maintainable way than would have been possible with the old build system.
I appreciate that you tried to preserve it, but things like that usually don't work until explicitly tested and debugged.

I think this illustrates the issue I am having with the current process. I don't think large changes that have not been properly tested should be committed to the head. I appreciate that you cannot test everything for every patch, and that you don't have all the platforms at hand. That's why major rejigging of the build system should be done on a branch. Then, you can ask other people to test it, once it is all working well for you. Ripping the guts out of the head and leaving some of them on the floor just means everybody else is going to trip over them.

Manuel

On Wed, Aug 13, 2008 at 04:35:42PM +1000, Manuel M T Chakravarty wrote:
Rebuilding with stage1 was already needed to build GHC with a builtin readline. In general, it is a bad idea to build distributed binaries of Haskell programs with the *bootstrap compiler*. It must be done with the stage1 compiler. (If you are unsure why, I'll happily elaborate.)
No, I understand why it was necessary, I just had problems with the way it was done. Of course, it would have been possible to extend the old build system to build the two versions in separate directories, but it would have meant adding more complexity to an already-complex system. I believe that using a Cabal-based build system instead has made things simpler. Thanks Ian

Friends

| > I see more and more workarounds for workarounds for an unmaintainable
| > (and unusable) build system, and after the latest discussions about
| > git vs. darcs, maintaining GHC-specific branches of libraries etc.,
| > I think I'll just drop maintainership from all GHC-related OpenBSD
| > ports until the GHC build system chaos settles down a little bit.
|
| Ian, please read this.
| The inability to build GHC reliably is a problem.
|
| Can someone with a plan please describe what measures are in place
| to ensure GHC emerges buildable, and the tree regains the status of a
| tree that *does not break*?

I don't think we should over-react here. There's been lots of email on this thread, some of which IMHO makes things sound rather worse than they really are. Let me say how it looks to me. There are two separate but loosely-related conversations going on.

1. Changes to GHC's build system

Cabal is used to build Haskell libraries. We started to use it to build the libraries that come with GHC; and we recently moved over to Cabal to build GHC itself (which is, these days, just another library). The old makefile-based system was essentially duplicating much of the functionality of Cabal, and that duplication was painful.

In retrospect, we should have made this change in a branch, and tested it thoroughly before applying it to the HEAD. Build systems tend to be platform dependent, so testing on one platform isn't enough. Nor did we consult, or even communicate, enough before going ahead. And we need more Wiki documentation about how to drive the new system. The net effect of these omissions has been a lot of pain to our collaborators. I am very sorry about that.

But I think it'd be a pity to confuse the pain of transition with the destination. The build system is settling down. For the moment, it probably makes sense not to aggressively pull patches from the GHC repo if you don't have to, but we absolutely do not expect that situation to persist. We'll make an announcement when we're ready for you to give it a try. The clear goal is: it simply builds flawlessly.

There is an element of "dogfooding" here. GHC is a stress test for what Cabal can do, and Cabal is itself not fully mature. But the pain we experience thereby leads to bug-fixes and significant features for Cabal that are useful for everyone. Perhaps we made the move too early, though! The new design is not set in stone, and we are actively thinking about ways to improve it, *including* backing off from Cabal in places where it appears too inflexible. Of course, any such further changes would extend the period of upheaval, but (a) we'll publish a design before executing, and (b) we'll do it on a branch.

2. The version control system (VCS)

At the same time, we had an extended conversation about changing the version control system we use for GHC. There was a lot of consultation here, as a result of which we chose git. I won't rehearse again the reasons we are unhappy with darcs, except to say that darcs is a thing of beauty, but the scale of GHC's repository seems to flush out many darcs bugs and performance problems that have proved difficult to fix.

Unlike the build system, we have not yet executed this decision. In particular, the earlier discussions focused mainly on the relative merits of the various systems. But it's more complicated than that. GHC needs "core libraries" without which it cannot be built. It is obviously highly desirable that a developer can build GHC with just one VCS, which suggests that the core libraries should be in git too. But those same core libraries are used by nhc98 and Hugs (I think that's all), and the last thing we want to do is to impose new costs on other implementations. Diversity in implementation is a Good Thing.

It's unclear exactly what to do about this. The most plausible possibility is to keep the core libraries that are shared with other implementations in darcs as now, and mirror them in git for GHC developers. That will impose pain on GHC developers to keep the git stuff in sync with the darcs master copies; but at least other developers would be unaffected. It's a hard judgement call to say which pain is greatest: the pain of staying with darcs, or the pain of managing the two-VCS problem.

Regardless, though, if all you want to do is build GHC from scratch, it absolutely will be a question of getting the relevant VCS, installing support software (Happy, Alex, an earlier GHC), and typing 'make'. You won't have to know about funny branches.

We (GHC HQ) are still learning to manage the transition to wider participation in building and hacking on GHC, which we *very much* welcome. Bear with us if we don't get it right first time. We're trying!

Simon

Hello Simon, Tuesday, August 12, 2008, 5:46:59 PM, you wrote:
GHC needs "core libraries" without which it cannot be built. It is obviously highly desirable that a developer can build GHC with just one VCS, which suggests that the core libraries should be in git too. But those same core libraries are used by nhc98 and Hugs (I think that's all), and the last thing we want to do is to impose new costs on other implementations. Diversity in implementation is a Good Thing.
Why not ask the hugs/nhc maintainers to switch to git too? It seems that darcs, while being a good solution for small/medium programs, can hardly be used for ghc. So it should probably be divided into 2 parts: "large things", including compilers and core libs, should go into git, and "small things", including all the 3rd-party libs, should stay with darcs.

--
Best regards,
Bulat                            mailto:Bulat.Ziganshin@gmail.com

On 12 Aug 2008, at 15:46, Simon Peyton-Jones wrote:
It's unclear exactly what to do about this. The most plausible possibility is to keep the core libraries that are shared with other implementations in darcs as now, and mirror them in git for GHC developers. That will impose pain on GHC developers to keep the git stuff in sync with the darcs master copies; but at least other developers would be unaffected.
FWIW, I started a wiki page that tries a direct comparison between Darcs and Git:

http://hackage.haskell.org/trac/ghc/wiki/GitForDarcsUsers

Some mappings are simple; some are more complicated and will require adopting a different workflow. I still recommend reading a tutorial, but this cheat sheet should be a good start if you don't want to spend much time learning Git just yet. Where no directly corresponding command exists, or emulating it would be too messy, I try to hint towards other workflows.

I encourage everyone to add useful tips and examples, both from users who already use Git and, later on, once we have gathered more experience. I believe that Git has some features which can improve our productivity, and I'd like this page to also collect tips on how to do so.

/ Thomas
--
Push the envelope. Watch it bend.
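To give a flavour of the cheat sheet, here is a short local sketch of some common darcs-to-git mappings; the repo, file, and message names are invented, and the darcs equivalents in the comments are only approximate:

```shell
# A throwaway repo to try the mappings in.
set -e
cd "$(mktemp -d)"
git init -q -b master demo
cd demo
git config user.email you@example.com
git config user.name you

echo 'main = putStrLn "hello"' > Main.hs
git add Main.hs                       # roughly: darcs add Main.hs
git commit -q -m 'initial record'     # roughly: darcs record -am '...'

echo '-- a comment' >> Main.hs
git status --short                    # roughly: darcs whatsnew -s
git diff                              # roughly: darcs whatsnew
git commit -q -a -m 'add a comment'   # record the change

git log --oneline                     # roughly: darcs changes
git revert --no-edit HEAD             # roughly: darcs rollback
                                      # (records an inverse commit,
                                      #  keeping history intact)
```

Note that git revert is the history-preserving undo; darcs unrecord, which rewrites history, maps more closely to git reset, one of the places where the workflows genuinely differ.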

Thomas Schilling wrote:
I encourage everyone to add useful tips and examples both from users who already use Git and later on, once we have gathered more experience. I believe that Git has some features which can improve our productivity and I'd like this page to also collect tips how to do so.
what about `darcs send --dry-run`? It's not perfect, but I use it in my old repos in conjunction with `darcs wh [-l]` to find out what of value I'd lose by deleting an old checkout. (e.g., patches merged into HEAD aren't of value. But they still aren't of value even if they've been amend-recorded, rewritten, or equivalent by simon/ian/etc., but Darcs can't tell this, unfortunately.) -Isaac

On Tue, Aug 12, 2008 at 03:17:59PM -0400, Isaac Dupree wrote:
Thomas Schilling wrote:
I encourage everyone to add useful tips and examples both from users who already use Git and later on, once we have gathered more experience. I believe that Git has some features which can improve our productivity and I'd like this page to also collect tips how to do so.
what about `darcs send --dry-run`? It's not perfect, but I use it in my old repos in conjunction with `darcs wh [-l]` to find out what of value I'd lose by deleting an old checkout. (e.g., patches merged into HEAD aren't of value. But they still aren't of value even if they've been amend-recorded, rewritten, or equivalent by simon/ian/etc., but Darcs can't tell this, unfortunately.)
-Isaac
Hi Isaac,

git rebase can do this partially. See this example; that's what I know about. (Make sure you don't have important data in /tmp/xx.) How intelligently git behaves on partially applied / cherry-picked commits I don't know.

#!/bin/sh
echO(){ echo; echo " >>>> $@"; echo 'return to continue'; read; }
evaL(){ echo; echo "cmd: $@"; eval "$@"; }
cd /tmp/xx || exit 1
rm -fr * .*
set -e
git init
addfile(){
  echo $1 > $1
  git add $1
  git commit -m $1 -a
}
evaL 'addfile a'
evaL 'addfile b'
evaL 'addfile c'
evaL 'addfile d'
echO 'a,b,c,d recorded successfully'
evaL 'git checkout HEAD~2'
echO 'gone back two commits'
evaL 'git checkout -b mutate'
echO 'branch mutate created'
evaL 'addfile new'
echO 'new file new added which would be lost'
evaL 'git cherry-pick master'
evaL 'git cherry-pick master^'
echO 'cherry picked d c in reverse order, look at popping up gitk now (you may want to keep it open)'
evaL 'gitk --all &'
echO 'continue after gitk has popped up, you should see one branch'
evaL 'git checkout -b rebased'
evaL 'git rebase master rebased'
echO 'tried rebasing, data which would be lost should be ahead of master now'
echO 'opening second gitk showing current repo state'
evaL 'gitk --all'
echO 'if this is not enough, you can always use git-diff:'
evaL 'git diff mutate master'

| FWIW, I started a wiki page that tries a direct comparison between
| Darcs and Git:
|
| http://hackage.haskell.org/trac/ghc/wiki/GitForDarcsUsers

Very helpful, thank you! Simon

Isaac: see the third tip below.
FWIW, I started a wiki page that tries a direct comparison between Darcs and Git:
http://hackage.haskell.org/trac/ghc/wiki/GitForDarcsUsers
Some mappings are simple, some are more complicated and will require adopting a different workflow. I still recommend reading a tutorial, but this cheat sheet should be a good start if you don't want to spend much time to learn Git just yet. Where no directly corresponding command exists or emulating it would be too messy, I try to hint towards other work flows.
I encourage everyone to add useful tips and examples both from users who already use Git and later on, once we have gathered more experience. I believe that Git has some features which can improve our productivity and I'd like this page to also collect tips how to do so.
Hi Thomas,

Great work! There is not much I could add (although I've used git quite often during the last few weeks). However, I'm missing four small tips:

First: man git-rev-parse (or git rev-parse --help). HEAD HEAD^ HEAD^^ .. is equal to HEAD HEAD~1 HEAD~2 .. So, to drop one of the last ten commits (don't remember which one):
git rebase -i HEAD~10

Second: you forgot to mention gitk. It helps you get an overview of when which branches were created. You can use Google image search to see what it looks like, or just play around (try the script in my other post). You can look at the history and branches easily. You can even highlight commits by changes made to a filepath (must be relative to the repo path!) or by added/removed strings, etc. And it's a nice tool to keep all the hashes in memory in case you mess up your repo by accident :-) But recent gitk can do more: when you get conflicts on git merge or git rebase, gitk --merge will show you all the commits causing the conflict.

Third: #git on freenode. I bet you'll get help there as well; I got this tip there:
< doener> MarcWeber: you could, for example, do "git log" or "git rev-list" or "gitk --left-right --cherry-pick A...B"
This lists all commits present on one branch or the other, but not on both.

Fourth: you should know one thing about git's history. There used to be no difference between git-log (now deprecated; it no longer works in the git git version) and git log. Thus git log --help = git-log --help = man git-log (more convenient to type). The only exception: git-clone doesn't work in all cases, git clone does (?) (Don't ask me why.)

Maybe git show commit-id:file is of interest as well (you mentioned git show).

Sincerely,
Marc Weber
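The first tip is easy to check mechanically. A minimal sketch in a throwaway repository (assumes git is on the PATH; the identity and file names are placeholders, not anything from GHC's setup):

```shell
#!/bin/sh
# Sketch of the first tip: HEAD^^ and HEAD~2 name the same commit.
# Runs entirely inside a throwaway repository.
set -e
dir=$(mktemp -d)
cd "$dir"
git init -q
git config user.email you@example.com   # placeholder identity for the demo
git config user.name "Demo User"
for f in a b c; do
  echo "$f" > "$f"
  git add "$f"
  git commit -q -m "$f"
done
# Both notations resolve to the grandparent of HEAD:
test "$(git rev-parse HEAD^^)" = "$(git rev-parse HEAD~2)" && echo "same commit"
```

The same equivalence is what makes `git rebase -i HEAD~10` mean "interactively edit the last ten commits".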

Simon Peyton-Jones:
2. The version control system (VCS)
GHC needs "core libraries" without which it cannot be built. It is obviously highly desirable that a developer can build GHC with just one VCS, which suggests that the core libraries should be in git too. But those same core libraries are used by nhc98 and Hugs (I think that's all), and the last thing we want to do is to impose new costs on other implementations.
What are these costs? I don't believe there are serious costs for those developers. Malcolm told us that all he contributes to the core libraries is fixing them for nhc when they break. He doesn't even validate, so I am sure he doesn't use branches or anything similar. The cost for him is to learn how to get, record & push with git. AFAIK, the only person who works on Hugs is Ross. He contributes to GHC, too, and hopefully validates his library patches before pushing. So, he'll have to learn to use git anyway.
It's unclear exactly what to do about this. The most plausible possibility is to keep the core libraries that are shared with other implementations in darcs as now, and mirror them in git for GHC developers. That will impose pain on GHC developers to keep the git stuff in sync with the darcs master copies; but at least other developers would be unaffected.
Everybody who contributes to the boot/core libraries needs to validate their patches. If the GHC version of the libraries is in git, then all library code needs to be validated against the git version of the libraries before it can enter the master repository. I don't see how that makes anything easier for anybody. As I said before, I believe there is exactly one sane solution: all boot libraries use the same vcs as ghc. Manuel

Manuel M T Chakravarty wrote:
Everybody who contributes to the boot/core libraries needs to validate their patches. If the GHC version of the libraries is in git, then all library code needs to be validated against the git version of the libraries before it can enter the master repository. I don't see how that makes anything easier for anybody.
As I said before, I believe there is exactly one sane solution: all boot libraries use the same vcs as ghc.
I don't think this is completely sane :-) It's not fair or reasonable for the GHC project to require everyone else contributing to the core libraries to validate their changes against GHC. We need to be branching the shared repositories, so that we can keep the GHC branches working and sync up with the shared repositories as necessary. Morally this is the right thing; technically it's a lot more difficult than not forking. What would make it much easier is for both the original shared repository and GHC's branch to be git repositories (or branches of the same repo): git is really good at having two parallel lines of development that sync occasionally. Having one in darcs and one in git would be painful, but doable. So I suggest we propose moving all the core packages to git, and we translate all those for which nobody objects to the change. For the others, we'll keep them in darcs and live with the pain. Cheers, Simon
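The "two parallel lines of development that sync occasionally" workflow can be sketched in a throwaway repository; the branch and file names below are hypothetical, not GHC's actual layout:

```shell
#!/bin/sh
# Sketch: a long-lived "GHC" branch of a shared library that occasionally
# merges in upstream's work. Everything runs in a throwaway repository.
set -e
dir=$(mktemp -d)
cd "$dir"
git init -q
git config user.email you@example.com   # placeholder identity
git config user.name "Demo User"
trunk=$(git symbolic-ref --short HEAD)  # "master" or "main", depending on config
echo base > Shared.hs && git add Shared.hs && git commit -q -m "shared: initial"
git branch ghc-branch                   # GHC's long-lived branch
echo upstream-fix >> Shared.hs && git commit -q -am "upstream: fix"
git checkout -q ghc-branch
echo ghc-change > GhcOnly.hs && git add GhcOnly.hs && git commit -q -m "ghc: change"
# Occasional sync: merge upstream's line of development into the GHC branch.
git merge -q -m "sync with upstream" "$trunk"
grep -q upstream-fix Shared.hs && echo "synced"
```

Both lines of history survive the merge, which is the property Simon is pointing at: neither side has to rebase or abandon its commits to stay in sync.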

Hi
So I suggest we propose moving all the core packages to git, and we translate all those for which nobody objects to the change. For the others, we'll keep them in darcs and live with the pain.
Does this mean my (now the community's) FilePath library is going to get moved over to git? I personally don't know Git, and while I'm sure I'll be learning it at some point, I'm always nervous about learning a VCS on something I care about, as mistakes can go quite wrong. In addition, things like the Yhc build scripts already check out the darcs version, so they will have to be modified*.

If it really makes life easier for people who are having lots of VCS pain at the moment, then it's hard to object. But many of the comments in this discussion, about how everyone is going to flock to GHC just as soon as it switches to Git, seem overly optimistic. I think GHC is a few years off becoming drive-by hacker friendly, for many other reasons.

The halfway house of switching the compiler, and leaving the libraries in darcs, seems desirable. If Git turns out to be wonderful, as people claim, moving the whole way over is fairly easy and a simple choice.

Thanks
Neil

* Modifying the Yhc build scripts is much harder than modifying the GHC build script, as they are 10,000 lines of Python (a language I don't know) in a very complex framework (which I also don't know)! Of course, this is something for the Yhc team to deal with...

Neil Mitchell:
If it really makes life easier for people who are having lots of VCS pain at the moment, then it's hard to object. But many of the comments in this discussion, about how everyone is going to flock to GHC just as soon as it switches to Git, seem overly optimistic. I think GHC is a few years off becoming drive-by hacker friendly, for many other reasons.
It's not about becoming "drive-by hacker friendly". It is about not becoming even less friendly than it is right now. Manuel

Are you advocating for ease of use by new developers or for existing
developers? Current GHC hackers have to learn Git anyways and know
Darcs already. Library patches still have to be recorded separately,
so it would be a bit weird, but not much harder, really.
On Fri, Aug 15, 2008 at 1:59 AM, Manuel M T Chakravarty
Neil Mitchell:
If it really makes life easier for people who are having lots of VCS pain at the moment, then it's hard to object. But many of the comments in this discussion, about how everyone is going to flock to GHC just as soon as it switches to Git, seem overly optimistic. I think GHC is a few years off becoming drive-by hacker friendly, for many other reasons.
It's not about becoming "drive-by hacker friendly". It is about not becoming even less friendly than it is right now.
Manuel
_______________________________________________ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

Thomas Schilling:
Are you advocating for ease of use by new developers or for existing developers? Current GHC hackers have to learn Git anyways and know Darcs already. Library patches still have to be recorded separately, so it would be a bit weird, but not much harder, really.
I am arguing for both. It would be more than weird. For example, if you branch ghc, you usually need to branch the core libraries, too. Doing that in two different vcs sounds like a mess to me. Moreover, as I wrote a few times before, some reasons for switching in the first place are invalidated by not having the core libraries in git, too. For example, one complaint about darcs is that it either doesn't build (on the Sun Solaris T1 and T2 machines) or is buggy (on Mac OS with MacPorts), and hence people have trouble getting the sources out of darcs in the first place. How is that going to be addressed if some crucial code still needs to be obtained using darcs? Manuel
On Fri, Aug 15, 2008 at 1:59 AM, Manuel M T Chakravarty
wrote: Neil Mitchell:
If it really makes life easier for people who are having lots of VCS pain at the moment, then it's hard to object. But many of the comments in this discussion, about how everyone is going to flock to GHC just as soon as it switches to Git, seem overly optimistic. I think GHC is a few years off becoming drive-by hacker friendly, for many other reasons.
It's not about becoming "drive-by hacker friendly". It is about not becoming even less friendly than it is right now.

Hi Manuel, On Aug 14, 2008, at 9:12 PM, Manuel M T Chakravarty wrote:
Moreover, as I wrote a few times before, some reasons for switching in the first place are invalidated by not having the core libraries in git, too. For example, one complaint about darcs is that it either doesn't build (on the Sun Solaris T1 and T2 machines) or is buggy (on Mac OS with MacPorts), and hence people have trouble getting the sources out of darcs in the first place. How is that going to be addressed if some crucial code still needs to be obtained using darcs?
Regarding darcs on OS X from MacPorts, I am not aware of (nor have I been sent any bug reports about) problems with the latest darcs-2.0.0 port. Is there something that I should know (and try to fix)? The latest port defaults to wget instead of libcurl, since I have noticed darcs spinning endlessly when using libcurl. I haven't had time to dtrace what is going on, but I'm guessing the underlying problem is likely some misunderstanding of the signal handling API or some corner case of blocking/nonblocking IO. -Greg

Gregory Wright:
On Aug 14, 2008, at 9:12 PM, Manuel M T Chakravarty wrote:
Moreover, as I wrote a few times before, some reasons for switching in the first place are invalidated by not having the core libraries in git, too. For example, one complaint about darcs is that it either doesn't build (on the Sun Solaris T1 and T2 machines) or is buggy (on Mac OS with MacPorts), and hence people have trouble getting the sources out of darcs in the first place. How is that going to be addressed if some crucial code still needs to be obtained using darcs?
Regarding darcs on OS X from MacPorts, I am not aware of (nor have I been sent any bug reports about) problems with the latest darcs-2.0.0 port. Is there something that I should know (and try to fix)?
The latest port defaults to wget instead of libcurl since I have noticed darcs spinning endlessly when using libcurl. I haven't had time to dtrace what is going on but I'm guessing the underlying problem is likely some misunderstanding of the signal handling API or some corner case of blocking/nonblocking IO.
Well, that "spinning endlessly" is the bug I am referring to. I re-checked my MacPorts darcs2 installation and, you are right, there was an update that removes the use of libcurl. It seems to work *much* better now. Thanks for the fix! You may want to publicise this a bit further. When I asked on #darcs about the problem a few days ago, nobody knew about this update to the port. Manuel

Manuel M T Chakravarty wrote:
Thomas Schilling:
Are you advocating for ease of use by new developers or for existing developers? Current GHC hackers have to learn Git anyways and know Darcs already. Library patches still have to be recorded separately, so it would be a bit weird, but not much harder, really.
I am arguing for both. It would be more than weird. For example, if you branch ghc, you usually need to branch the core libraries, too. Doing that in two different vcs sounds like a mess to me.
So let's figure out how it would work (I have doubts too!) So, within the directory that's a git repo (ghc), we have some other repos, git (testsuite) and darcs (some libraries). Does anyone know how git handles nested repos even natively? Then, adding complexity, git branches are normally done by switching in-place. So how does this interact with VCS like darcs that doesn't have a concept of in-place switching of branches? (Now, I wouldn't be surprised if git, the monstrosity that it is, has already invented answers for these sort of questions :-) But we need to figure out the answers for whatever situation we choose for the 6.11 development cycle, and probably document them somewhere on the wiki (that I lazily didn't bother to check again before writing this message). -Isaac

2008/8/15 Isaac Dupree
So let's figure out how it would work (I have doubts too!) So, within the directory that's a git repo (ghc), we have some other repos, git (testsuite) and darcs (some libraries). Does anyone know how git handles nested repos even natively?
You can explicitly tell Git about nested Git repos using http://www.kernel.org/pub/software/scm/git/docs/git-submodule.html. This essentially associates a particular version of each subrepo with every version of the repo that contains them, so e.g. checking out GHC from 2 weeks ago could check out the libraries from the same point in time. AFAIK, nothing in Git caters for subrepos of a different VCS.
Then, adding complexity, git branches are normally done by switching in-place. So how does this interact with VCS like darcs that doesn't have a concept of in-place switching of branches?
Since we will set up Git to ignore the contents of the Darcs repos, it will simply leave them unmodified. This is exactly like the current situation, where rolling back / patching the GHC repo does not affect the others. If you want Darcs-like behaviour (one branch per repo) you are free to do this in Git as well, in which case since you never switch branches the nested Darcs repos should never be inappropriate for your branch. Personally, since I only ever hack GHC and tend to leave the libraries alone, I could still use the in-place branching without difficulty.
(Now, I wouldn't be surprised if git, the monstrosity that it is, has already invented answers for these sort of questions :-) But we need to figure out the answers for whatever situation we choose for the 6.11 development cycle, and probably document them somewhere on the wiki (that I lazily didn't bother to check again before writing this message).
The situation above is pretty much the whole story, if we are taking the route where we just convert the GHC+testsuite repo to Git. I don't think it's particularly confusing, but maybe that's because I've spent too long thinking about VCSs :-).

This thread has got quite large, and doesn't appear to have made much progress towards a resolution. Let me try to sum up the discussion so far. There seem to be four stakeholders in this switch:

a) Current GHC developers
b) Future GHC developers
c) People who just contribute to the libraries
d) Maintainers of other compilers GHC shares repos with

And there are at least 5 options for how to proceed:

1) Convert just GHC and Testsuite to Git, leave everything else in Darcs

Pros:
- No change in habits required for stakeholders c, d
- Resolves all Darcs issues discussed at length before, pleasing stakeholders a, b

Cons:
- Requires two VCSs to be installed and learnt (more points of failure, makes the source tree less accessible, doesn't solve any of Darcs' build+install problems), affecting stakeholders a and b
- Difficult to check out a consistent version of the source tree (no submodules), affecting stakeholders a and b

2) Wait for Darcs 2 to get better

Pros:
- No change in habits required for any stakeholders (though we still have the one-off switching cost)
- Potentially resolves all Darcs issues, pleasing stakeholders a, b
- The only option that will not require a workflow change for GHC developers (more topic branches rather than "spontaneous branches" and cherry-picking), pleasing stakeholders a

Cons:
- Darcs will probably continue to be less popular and well supported than Git (see the Debian popcon graphs for the trend difference). Reduced popularity will affect the ability of stakeholders b to contribute (learning barrier), and less support/real-world use may lead to a higher incidence of bugs, affecting stakeholders a-d. This point is certainly debatable.
- Apparently somewhat vaporware at the moment

3) Convert all repos to Git

Pros:
- Native Git submodule integration makes life easier for stakeholders a-b
- Single (popular) command set to learn, single thing to install: makes life better for stakeholder b at least

Cons:
- Significant inconvenience for stakeholders c-d, as they have to change their own projects

4) Branch all repos into Git but leave the Darcs repos alone and push Darcs patches into the Git repos automatically. Never push to these Git repos in any other way (similar to the Cabal repo currently)

Pros:
- As option 3
- Stakeholders c-d do not need to do anything

Cons:
- Makes it harder to hack on the libraries within a GHC checkout, affecting a, b
- Automatic synchronisation will require occasional maintenance by someone

5) Branch all repos into Git and then set up a manual merging/sync process that tries to turn Git commits into Darcs patches and vice versa

Pros:
- As option 3
- Hack on the libraries in a GHC checkout with ease, pleasing a, b
- Stakeholders c-d do not need to do anything

Cons:
- Synchronisation much more fragile than 4); it will likely require constant maintenance

This summary is probably incomplete and inaccurate. However, if people find it useful for organising the various lines of discussion on this issue, perhaps someone could Wikify it so we can get a complete, clear picture? My personal preference is for 3), but that's because I'm a stakeholder "a" who isn't a great fan of spontaneous branches! Anyway, there are good arguments on every side, so I don't want to advocate a particular position (and indeed, my opinions quite rightly do not carry any weight! :-). However, I'd really like for us to work out what is going on so that we have a clear plan for moving away from Darcs 1, which is an inadequate VCS for GHC for reasons that have been discussed to death. I hope (perhaps naively) that this email can provide a framework for reaching a consensus agreeable to all parties.

All the best,
Max

On Fri, Aug 15, 2008 at 01:01:08PM +0100, Max Bolingbroke wrote:
2008/8/15 Isaac Dupree
: So let's figure out how it would work (I have doubts too!) So, within the directory that's a git repo (ghc), we have some other repos, git (testsuite) and darcs (some libraries). Does anyone know how git handles nested repos even natively?
You can explicitly tell Git about nested Git repos using http://www.kernel.org/pub/software/scm/git/docs/git-submodule.html. This essentially associates a particular version of each subrepo with every version of the repo that contains them, so e.g. checking out GHC from 2 weeks ago could check out the libraries from the same point in time.
We were talking about this last night on #ghc, and AIUI this doesn't play well with the in-tree branching style that is advocated, e.g. if you want to branch ghc and base then as you change between ghc branch X and Y, git won't automatically change base between branches X' and Y'.
Then, adding complexity, git branches are normally done by switching in-place. So how does this interact with VCS like darcs that doesn't have a concept of in-place switching of branches?
The in-tree branching style also sounds like it won't work well with trees you are working in: If you have a tree built with branch X, and then you swap to branch Y for a minute and then back to branch X, then the timestamps on any source files that differ between the branches will have changed, so the build won't think it is up-to-date any more and you will get needless recompilation. Working only in the "master" branch, and using different repos for branches (i.e. doing what we do with darcs), is an option, although git users seem to think it is a worse way to work; I'm not really clear on the main reasons why. One way that it is worse is that you will get a lot more "automatic merge" commits when you pull changes from the central repo into a repo in which you have local commits. I don't think that there is anything bad about these, as such; they're just noise in the history. (I'm not sure if it's possible to automatically rebase these away, or something?). Hopefully a git person will correct me if I've got something wrong! Thanks Ian
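The darcs-style workflow Ian describes (one working directory per branch, never switching in place) can be sketched like this; the repo and file names are hypothetical:

```shell
#!/bin/sh
# Darcs-style branching with git: one working directory per line of
# development, so switching "branches" never disturbs another tree's
# timestamps. Local clones use hardlinks, so they are cheap.
set -e
top=$(mktemp -d)
cd "$top"
git init -q ghc
cd ghc
git config user.email you@example.com   # placeholder identity
git config user.name "Demo User"
echo main > Main.hs
git add Main.hs
git commit -q -m "initial"
cd "$top"
# A "branch" is just another local clone; work there independently,
# so builds in the original ghc/ tree never see changed timestamps.
git clone -q ghc ghc-ticket-1234
test -f ghc-ticket-1234/Main.hs && echo "separate tree ready"
```

This avoids the needless-recompilation problem at the cost of extra disk space and the merge-commit noise discussed above.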

On Fri, Aug 15, 2008 at 4:38 PM, Ian Lynagh
One way that it is worse is that you will get a lot more "automatic merge" commits when you pull changes from the central repo into a repo in which you have local commits. I don't think that there is anything bad about these, as such; they're just noise in the history. (I'm not sure if it's possible to automatically rebase these away, or something?).
This is the use case for "git pull --rebase". Instead of creating an automatic merge commit, it rebases your local changes on top of the newly pulled changes (ignoring patches already present, which could happen if you had sent one change as a patch via mail.) The timestamp issue seems tricky, though.

On Fri, Aug 15, 2008 at 05:09:55PM +0200, Thomas Schilling wrote:
On Fri, Aug 15, 2008 at 4:38 PM, Ian Lynagh
wrote: One way that it is worse is that you will get a lot more "automatic merge" commits when you pull changes from the central repo into a repo in which you have local commits. I don't think that there is anything bad about these, as such; they're just noise in the history. (I'm not sure if it's possible to automatically rebase these away, or something?).
This is the use case for "git pull --rebase". Instead of creating an automatic merge commit, it rebases your local changes on top of the newly pulled changes
Hmm, last night the conversation went: < nominolo> malcolmw: so i'm advocating "git pull --rebase" for that use case < glguy_> rebasing can be less successful than merging when dealing with big changes < glguy_> since the rebase happens one commit at a time so I'm confused as to what the best practice is. Thanks Ian

On Fri, Aug 15, 2008 at 04:24:12PM +0100, Ian Lynagh wrote:
On Fri, Aug 15, 2008 at 05:09:55PM +0200, Thomas Schilling wrote:
On Fri, Aug 15, 2008 at 4:38 PM, Ian Lynagh
wrote: One way that it is worse is that you will get a lot more "automatic merge" commits when you pull changes from the central repo into a repo in which you have local commits. I don't think that there is anything bad about these, as such; they're just noise in the history. (I'm not sure if it's possible to automatically rebase these away, or something?).
This is the use case for "git pull --rebase". Instead of creating an automatic merge commit, it rebases your local changes on top of the newly pulled changes
Hmm, last night the conversation went:
< nominolo> malcolmw: so i'm advocating "git pull --rebase" for that use case < glguy_> rebasing can be less successful than merging when dealing with big changes < glguy_> since the rebase happens one commit at a time
so I'm confused as to what the best practice is.
We discussed this in #ghc, and the conclusion seems to be: If you have lots of local changes (e.g. the sorts of long-running branch that give darcs 1 problems), then you need to use merge. If you use rebase then you might end up with lots of conflicts to resolve manually. Using merge gives you automatic merge commits. If you think these are ugly (opinion is divided on that amongst git people; I guess for GHC we'd want to make a global decision about it) then you can use rebase when you have few local changes, and thus are unlikely to get many conflicts. Using merge you also get a more accurate reflection of the project history, i.e. you can see that the two branches were being developed independently. Thanks Ian
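The rebase side of this conclusion can be demonstrated with throwaway repositories (the repo and file names below are placeholders): after `git pull --rebase`, the local commit sits on top of upstream's, and no merge commit exists.

```shell
#!/bin/sh
# Sketch: "git pull --rebase" replays local commits on top of the newly
# fetched upstream ones, so no automatic merge commit appears.
set -e
top=$(mktemp -d)
cd "$top"
git init -q upstream
cd upstream
git config user.email you@example.com   # placeholder identity
git config user.name "Demo User"
echo one > one && git add one && git commit -q -m "upstream: one"
cd "$top"
git clone -q upstream local
cd upstream
echo two > two && git add two && git commit -q -m "upstream: two"
cd "$top/local"
git config user.email you@example.com
git config user.name "Demo User"
echo mine > mine && git add mine && git commit -q -m "local: mine"
git pull -q --rebase        # replay "local: mine" on top of "upstream: two"
# The resulting history is linear: no merge commits were created.
test -z "$(git log --merges --oneline)" && echo "linear history"
```

Running the same sequence with a plain `git pull` instead would leave an automatic merge commit in the log, which is exactly the noise being debated.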

If you have lots of local changes (e.g. the sorts of long-running branch that gives darcs 1 problems), then you need to use merge. If you use rebase then you might end up with lots of conflicts to manually resolve.
Using merge gives you automatic merge commits. If you think these are ugly (opinion is divided on that amongst git people; I guess for GHC we'd want to make a global decision about that) then you can use rebase when you have few local changes, and thus you are unlikely to get many conflicts.
Using merge you also get a more accurate reflection of the project history, i.e. you can see that the two branches were being developed independently.
That's not quite accurate: if you have conflicts, you have conflicts and have to resolve them manually. In the case of a branch, however, you only have to resolve them once you do the merge, so when _you_ decide, not whenever some upstream change breaks things. Some projects encourage having one development branch, periodically updating the master branch, and rebasing the development branch on top of it. I think it's a matter of taste and we should probably advocate one usage. I think rebase should only be used for smaller changes. The usefulness of the automatic merge message varies. I think it makes sense when it names public repos, e.g. "Merge 'master' from git://github.com/chak/ghc", but it is less useful for pulls from local repos, e.g. "Merge 'master' from '/home/igloo/tmp/trash/ghc/fix-stupid-osx-bug/'". However, if we prefer merges we get those pretty git history graphs: http://www.flickr.com/photos/malcolmtredinnick/1516857444/

Ian Lynagh:
On Fri, Aug 15, 2008 at 04:24:12PM +0100, Ian Lynagh wrote:
On Fri, Aug 15, 2008 at 05:09:55PM +0200, Thomas Schilling wrote:
On Fri, Aug 15, 2008 at 4:38 PM, Ian Lynagh
wrote: One way that it is worse is that you will get a lot more "automatic merge" commits when you pull changes from the central repo into a repo in which you have local commits. I don't think that there is anything bad about these, as such; they're just noise in the history. (I'm not sure if it's possible to automatically rebase these away, or something?).
This is the use case for "git pull --rebase". Instead of creating an automatic merge commit, it rebases your local changes on top of the newly pulled changes
Hmm, last night the conversation went:
< nominolo> malcolmw: so i'm advocating "git pull --rebase" for that use case < glguy_> rebasing can be less successful than merging when dealing with big changes < glguy_> since the rebase happens one commit at a time
so I'm confused as to what the best practice is.
We discussed this in #ghc, and the conclusion seems to be:
If you have lots of local changes (e.g. the sorts of long-running branch that gives darcs 1 problems), then you need to use merge. If you use rebase then you might end up with lots of conflicts to manually resolve.
Using merge gives you automatic merge commits. If you think these are ugly (opinion is divided on that amongst git people; I guess for GHC we'd want to make a global decision about that) then you can use rebase when you have few local changes, and thus you are unlikely to get many conflicts.
Using merge you also get a more accurate reflection of the project history, i.e. you can see that the two branches were being developed independently.
Sorry for being a git n00b, but does using merge mean that we need to use in-place branch switching (which you earlier said won't work well for ghc anyways)? Manuel

On Mon, Aug 18, 2008 at 12:28:03PM +1000, Manuel M T Chakravarty wrote:
does using merge mean that we need to use in-place branch switching
No; when you "git pull" (the equivalent of darcs pull -a) it will pull and merge the changes (unless you ask it to rebase them instead of merging them). Thanks Ian

Sorry for being a git n00b, but does using merge mean that we need to use in-place branch switching (which you earlier said won't work well for ghc anyways)?
You have two kinds of branches: local ones and remote-tracking ones. Remote-tracking branches represent the state of branches in a remote repository. The only way I know of to change them is by using git fetch (which is called by git pull as well) or by editing the files manually. On the other hand, you normally push your local ones.

So if you have /tmp/a/.git (heads master and mybranch) and then do

cd /tmp
git clone a b

git will set up .git/refs/remotes/origin/{master,mybranch} and .git/refs/heads/master. Now you can make mybranch local as well with

git branch mybranch origin/mybranch

(indeed you then have four branches: two tracking the remote repo and two you are working with). Now you can do

git checkout master
git merge mybranch          # merge in place within the same repo

or

git merge origin/mybranch   # merge with the remote branch, which is what you'll do when using darcs branch style

etc. After committing to mybranch, git push will change the head of the remote repository. You can set up each local head branch to "track" a remote one automatically, so that git pull will rebase or merge depending on your settings (AFAIK). So if you have an active project you end up having dozens of remote branches but only some heads you are working on or want to back up (in case someone else rewrites history or such).

HTH
Marc Weber

2008/8/15 Ian Lynagh
You can explicitly tell Git about nested Git repos using http://www.kernel.org/pub/software/scm/git/docs/git-submodule.html. This essentially associates a particular version of each subrepo with every version of the repo that contains them, so e.g. checking out GHC from 2 weeks ago could check out the libraries from the same point in time.
We were talking about this last night on #ghc, and AIUI this doesn't play well with the in-tree branching style that is advocated, e.g. if you want to branch ghc and base then as you change between ghc branch X and Y, git won't automatically change base between branches X' and Y'.
If you change the submodules in branch X to point to the X' commit in base, and do the corresponding thing for Y and Y', I believe you /would/ get this behaviour (though you might have to remember to do "git submodule update" when switching; this can probably be automated). Proviso: I'm also not a Git expert, but this is my understanding of how it works. Cheers, Max
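A sketch of the submodule mechanics being described, with hypothetical repo names standing in for GHC and a library (newer git requires `protocol.file.allow=always` to add a local-path submodule; older versions ignore the setting):

```shell
#!/bin/sh
# Sketch: a super-repo (standing in for GHC) pins an exact commit of a
# sub-repo (standing in for a library) via git submodule.
set -e
top=$(mktemp -d)
cd "$top"
git init -q base
cd base
git config user.email you@example.com   # placeholder identity
git config user.name "Demo User"
echo lib > Lib.hs && git add Lib.hs && git commit -q -m "base: initial"
cd "$top"
git init -q ghc
cd ghc
git config user.email you@example.com
git config user.name "Demo User"
echo readme > README && git add README && git commit -q -m "ghc: initial"
# Record base as a submodule; this GHC commit now pins base's exact commit.
git -c protocol.file.allow=always submodule --quiet add "$top/base" libraries/base
git commit -q -m "add base submodule"
# After switching branches, "git submodule update" re-syncs the checkout
# to whatever commit the current branch pins.
git submodule status libraries/base
```

The `.gitmodules` file plus the pinned commit id in the tree are what make "check out GHC from 2 weeks ago, get the libraries from the same point in time" work.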

On Fri, Aug 15, 2008 at 4:38 PM, Ian Lynagh
One way that it is worse is that you will get a lot more "automatic merge" commits when you pull changes from the central repo into a repo in which you have local commits. I don't think that there is anything bad about these, as such; they're just noise in the history. (I'm not sure if it's possible to automatically rebase these away, or something?).
I'm not sure if this is what you want but I always use git pull --rebase when I'm pulling to have my local commits lie on top of the one in the published repo. -- Johan

you don't use local branches?
On Sat, Aug 16, 2008 at 12:04 AM, Johan Tibell
On Fri, Aug 15, 2008 at 4:38 PM, Ian Lynagh
wrote: One way that it is worse is that you will get a lot more "automatic merge" commits when you pull changes from the central repo into a repo in which you have local commits. I don't think that there is anything bad about these, as such; they're just noise in the history. (I'm not sure if it's possible to automatically rebase these away, or something?).
I'm not sure if this is what you want but I always use git pull --rebase when I'm pulling to have my local commits lie on top of the one in the published repo.
-- Johan _______________________________________________ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

On Sat, Aug 16, 2008 at 12:21 AM, Thomas Schilling
you don't use local branches?
I do. I like to keep a clean linear history on top of the upstream repo. So I might do work in a topic branch, rebase it on my master branch which is synced with upstream and then push. -- Johan
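The topic-branch-plus-rebase workflow Johan describes can be sketched in a throwaway repo. This is illustrative only (branch and file names invented); it shows why no merge commit appears in the final history.

```shell
set -e
tmp=$(mktemp -d); cd "$tmp"
git init -q -b master r; cd r
git config user.email you@example.com; git config user.name You
echo a > a; git add a; git commit -qm upstream-1
# local work happens on a topic branch
git checkout -q -b topic
echo b > b; git add b; git commit -qm local-work
# meanwhile "upstream" (here simulated on master) moves on
git checkout -q master
echo c > c; git add c; git commit -qm upstream-2
# replay the topic commits on top of the new upstream tip:
# no merge commit is created, so the history stays linear
git checkout -q topic
git rebase -q master
git log --oneline   # local-work now sits directly on top of upstream-2
```

`git pull --rebase` does the fetch and this rebase in one step, which is why the "automatic merge" noise Ian mentions never appears in the history.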

From what you are saying, it seems that one "advantage" of git (in-place branch switching) is not going to be useful to GHC in any case (because we use nested repositories). Manuel

Ian Lynagh:
On Fri, Aug 15, 2008 at 01:01:08PM +0100, Max Bolingbroke wrote:
2008/8/15 Isaac Dupree
: So let's figure out how it would work (I have doubts too!) So, within the directory that's a git repo (ghc), we have some other repos, git (testsuite) and darcs (some libraries). Does anyone know how git handles nested repos even natively?
You can explicitly tell Git about nested Git repos using http://www.kernel.org/pub/software/scm/git/docs/git-submodule.html. This essentially associates a particular version of each subrepo with every version of the repo that contains them, so e.g. checking out GHC from 2 weeks ago could check out the libraries from the same point in time.
We were talking about this last night on #ghc, and AIUI this doesn't play well with the in-tree branching style that is advocated, e.g. if you want to branch ghc and base then as you change between ghc branch X and Y, git won't automatically change base between branches X' and Y'.
Then, adding complexity, git branches are normally done by switching in-place. So how does this interact with VCS like darcs that doesn't have a concept of in-place switching of branches?
The in-tree branching style also sounds like it won't work well with trees you are working in: If you have a tree built with branch X, and then you swap to branch Y for a minute and then back to branch X, then the timestamps on any source files that differ between the branches will have changed, so the build won't think it is up-to-date any more and you will get needless recompilation.
Working only in the "master" branch, and using different repos for branches (i.e. doing what we do with darcs), is an option, although git users seem to think it is a worse way to work; I'm not really clear on the main reasons why.
One way that it is worse is that you will get a lot more "automatic merge" commits when you pull changes from the central repo into a repo in which you have local commits. I don't think that there is anything bad about these, as such; they're just noise in the history. (I'm not sure if it's possible to automatically rebase these away, or something?).
Hopefully a git person will correct me if I've got something wrong!
Thanks Ian
_______________________________________________ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

On Mon, Aug 18, 2008 at 12:21:47PM +1000, Manuel M T Chakravarty wrote:
From what you are saying, it seems that one "advantage" of git (in-place branch switching) is not going to be useful to GHC in any case
Yes.
(because we use nested repositories).
That does make it harder, but the main problem is that switching between branches changes the timestamp of files that differ, meaning the build system thinks that recompilation needs to be done. Also, if you have 2 in-place branches of GHC then only one of them can be built at any one time, as they share a working directory. Thanks Ian

Ian Lynagh:
On Mon, Aug 18, 2008 at 12:21:47PM +1000, Manuel M T Chakravarty wrote:
From what you are saying, it seems that one "advantage" of git (in-place branch switching) is not going to be useful to GHC in any case
Yes.
(because we use nested repositories).
That does make it harder, but the main problem is that switching between branches changes the timestamp of files that differ, meaning the build system thinks that recompilation needs to be done.
Also, if you have 2 in-place branches of GHC then only one of them can be built at any one time, as they share a working directory.
Those don't sound like GHC-specific issues. So, if in-place branches are useful for other projects (such as the Linux kernel), why shouldn't they be useful for us? Manuel

On Thu, Aug 28, 2008 at 04:31:16PM +1000, Manuel M T Chakravarty wrote:
Ian Lynagh:
On Mon, Aug 18, 2008 at 12:21:47PM +1000, Manuel M T Chakravarty wrote:
From what you are saying, it seems that one "advantage" of git (in-place branch switching) is not going to be useful to GHC in any case
Yes.
(because we use nested repositories).
That does make it harder, but the main problem is that switching between branches changes the timestamp of files that differ, meaning the build system thinks that recompilation needs to be done.
Also, if you have 2 in-place branches of GHC then only one of them can be built at any one time, as they share a working directory.
Those don't sound like GHC-specific issues. So, if in-place branches are useful for other projects (such as the Linux kernel), why shouldn't they be useful for us?
I don't know. Git people, can you fill us in please? Thanks Ian

Manuel M T Chakravarty wrote:
From what you are saying, it seems that one "advantage" of git (in-place branch switching) is not going to be useful to GHC in any case (because we use nested repositories).
As far as I can tell, in-place branches are not a lot of use to us compared to just having separate checkouts for each local branch. For one thing, having separate source trees lets you keep multiple builds, whereas with in-place branches you can only have one build at a time, and switching branches probably requires a complete rebuild. However, I think I am convinced that using in-place branches for the master repo makes sense. That way we don't need to publish the names of new branches when we make them, and everyone can easily see which branches of GHC are available from the main repo. Cheers, Simon

On Mon, Aug 18, 2008 at 12:21:47PM +1000, Manuel M T Chakravarty wrote:
From what you are saying, it seems that one "advantage" of git (in-place branch switching) is not going to be useful to GHC in any case (because we use nested repositories). Manuel

I don't agree. I find it convenient. But I make full copies as well, because switching shells with my window manager is faster than checking out another branch. It depends on what I want to do.
Ian Lynagh:
The in-tree branching style also sounds like it won't work well with trees you are working in: If you have a tree built with branch X, and then you swap to branch Y for a minute and then back to branch X, then the timestamps on any source files that differ between the branches will have changed, so the build won't think it is up-to-date any more and you will get needless recompilation.

Which is the fault of make, not of git. Why can't we configure make to use checksum-based recompilation (that's possible using scons)? Maybe it's possible to hack this in some way? I'd recommend having one working clone and one for browsing. Then you need two clones, but not the n you would have to maintain with darcs. Why do you want to switch for a minute? There are tools such as gitk/qgit that let you browse the repository (and all file contents) without switching. I don't think recompilation is a real issue.
Marc Weber

Max Bolingbroke:
Then, adding complexity, git branches are normally done by switching in-place. So how does this interact with VCS like darcs that doesn't have a concept of in-place switching of branches?
Since we will set up Git to ignore the contents of the Darcs repos, it will simply leave them unmodified. This is exactly like the current situation, where rolling back / patching the GHC repo does not affect the others. If you want Darcs-like behaviour (one branch per repo), you are free to do this in Git as well, in which case, since you never switch branches, the nested Darcs repos will never be inappropriate for your branch.
This ignores that the ability to have branches, switch between them, and merge has been cited as one of the reasons for switching to git. Embedded darcs library repos would hence nullify, or at least reduce, one of the advantages. Manuel
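For reference, the ignore setup Max mentions might look like the following. The layout and paths are hypothetical; the point is that once the embedded trees are listed in .gitignore, git reports a clean status and branch switches leave them untouched.

```shell
set -e
tmp=$(mktemp -d); cd "$tmp"
git init -q -b master ghc; cd ghc
git config user.email you@example.com; git config user.name You
# hypothetical layout: tell git to ignore the embedded darcs trees
printf 'libraries/base/\n' > .gitignore
git add .gitignore; git commit -qm 'ignore embedded darcs repos'
# an embedded darcs working tree appears...
mkdir -p libraries/base/_darcs
echo x > libraries/base/_darcs/inventory
# ...and git does not see it; switching git branches leaves it alone
git status --porcelain   # prints nothing
```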

On Fri, Aug 15, 2008 at 11:12:20AM +1000, Manuel M T Chakravarty wrote:
Moreover, as I wrote a few times before, some reasons for switching in the first place are invalidated by not having the core libraries in git, too. For example, one complaint about darcs is that it either doesn't build (on the Sun Solaris T1 and T2 machines)
I don't remember seeing this mentioned before, and googling for "Solaris T1" darcs doesn't find anything. What goes wrong? I'd expect darcs to build anywhere GHC does. Thanks Ian

On 16/08/2008, at 00:12, Ian Lynagh wrote:
On Fri, Aug 15, 2008 at 11:12:20AM +1000, Manuel M T Chakravarty wrote:
Moreover, as I wrote a few times before, some reasons for switching in the first place are invalidated by not having the core libraries in git, too. For example, one complaint about darcs is that it either doesn't build (on the Sun Solaris T1 and T2 machines)
I don't remember seeing this mentioned before, and googling for "Solaris T1" darcs doesn't find anything. What goes wrong? I'd expect darcs to build anywhere GHC does.
I only vaguely remember what was wrong but IIRC, the problem was that darcs 1.0.? didn't build with GHC 6.8.? because of some incompatibility in the libs and darcs 2 built ok but didn't work, probably because of libcurl issues. At that point I gave up. Roman

On Fri, 2008-08-15 at 15:12 +0100, Ian Lynagh wrote:
On Fri, Aug 15, 2008 at 11:12:20AM +1000, Manuel M T Chakravarty wrote:
Moreover, as I wrote a few times before, some reasons for switching in the first place are invalidated by not having the core libraries in git, too. For example, one complaint about darcs is that it either doesn't build (on the Sun Solaris T1 and T2 machines)
I don't remember seeing this mentioned before, and googling for "Solaris T1" darcs doesn't find anything.
That's probably because, in the entire world, there are probably only two T1/T2 machines that people are using to run GHC. :-) One of them is at UNSW and the other was recently donated by Sun to the community and is just about to go online at Chalmers.
What goes wrong? I'd expect darcs to build anywhere GHC does.
So would I usually, though I've had to turn down cc flags to get darcs to build on ia64 before (SHA1.hs generates enormous register pressure). Duncan

Duncan Coutts wrote:
On Fri, 2008-08-15 at 15:12 +0100, Ian Lynagh wrote:
On Fri, Aug 15, 2008 at 11:12:20AM +1000, Manuel M T Chakravarty wrote:
Moreover, as I wrote a few times before, some reasons for switching in the first place are invalidated by not having the core libraries in git, too. For example, one complaint about darcs is that it either doesn't build (on the Sun Solaris T1 and T2 machines)

I don't remember seeing this mentioned before, and googling for "Solaris T1" darcs doesn't find anything.
That's probably because, in the entire world, there are probably only two T1/T2 machines that people are using to run GHC. :-)
One of them is at UNSW and the other was recently donated by Sun to the community and is just about to go online at Chalmers.
What goes wrong? I'd expect darcs to build anywhere GHC does.
So would I usually, though I've had to turn down cc flags to get darcs to build on ia64 before (SHA1.hs generates enormous register pressure).
We should really use a C implementation of SHA1, the Haskell version isn't buying us anything beyond being a stress test of the register allocator. Cheers, Simon

On 18/08/2008, at 8:13 PM, Simon Marlow wrote:
So would I usually, though I've had to turn down cc flags to get darcs to build on ia64 before (SHA1.hs generates enormous register pressure).
We should really use a C implementation of SHA1, the Haskell version isn't buying us anything beyond being a stress test of the register allocator.
.. and perhaps a test case for too much code unfolding in GHC? Sounds like bugs to me. :) If you turn down GHC flags the pressure also goes away. Ian: Did this problem result in Intel CC / GCC register allocator freakouts? Ben.

Git 1.6.0 was just released [1]. Might be of interest given the current discussion. I cherry picked some highlights that might matter to us: * Source changes needed for porting to MinGW environment are now all in the main git.git codebase. * even more documentation pages are now accessible via "man" and "git help". * "git-add -i" has a new action 'e/dit' to allow you edit the patch hunk manually. 1. http://lkml.org/lkml/2008/8/17/174 Cheers, Johan

On 19/08/2008, at 8:57 PM, Ian Lynagh wrote:
On Mon, Aug 18, 2008 at 09:20:54PM +1000, Ben Lippmeier wrote:
Ian: Did this problem result in Intel CC / GCC register allocator freakouts?
Have you got me confused with someone else? I don't think I've ever used Intel CC.
Sorry, I couldn't find the rest of the preceding message. Someone wrote that they had to turn down cc flags to get SHA1.hs to compile on IA64. What C compiler was being used, and what were the symptoms?

SHA1.hs creates vastly more register pressure than any other code I know of (or could find), but only when -O or -O2 is enabled in GHC. If -O and -prof are enabled then the linear allocator runs out of stack slots (last time I checked).

I'm wondering three things:

1) If the C compiler could not compile the C code emitted by GHC then maybe we should file a bug report with the CC people.

2) If the register pressure in SHA1.hs is more due to excessive code unfolding than the actual SHA algorithm, then maybe this should be treated as a bug in the simplifier(?) (sorry, I'm not familiar with the core level stuff)

3) Ticket #1993 says that the linear allocator runs out of stack slots, and the graph coloring allocator stack overflows when trying to compile SHA1.hs with -funfolding-use-threshold20. I'm a bit worried about the stack-overflow part. The graph size is O(n^2) in the number of vreg conflicts, which isn't a problem for most code. However, if register pressure in SHA1.hs is proportional to the unfolding threshold (especially if more than linearly) then you could always blow up the graph allocator by setting the threshold arbitrarily high. In this case maybe the allocator should give a warning when the pressure is high and suggest turning the threshold down. Then we could close this issue and prevent it from being re-opened.

Cheers, Ben.

On Tue, 2008-08-19 at 23:55 +1000, Ben Lippmeier wrote:
On 19/08/2008, at 8:57 PM, Ian Lynagh wrote:
On Mon, Aug 18, 2008 at 09:20:54PM +1000, Ben Lippmeier wrote:
Ian: Did this problem result in Intel CC / GCC register allocator freakouts?
Have you got me confused with someone else? I don't think I've ever used Intel CC.
Sorry, I couldn't find the rest of the preceding message. Someone wrote that they had to turn down cc flags to get SHA1.hs to compile on IA64.
Yep.
What C compiler was being used, and what were the symptoms?
GCC. As I recall the symptoms were that gcc used more than 32 registers and then the mangler balked. The reason is that a registerised ia64 build expects to only use the first 32 registers but does not take any precautions to make sure that this is the case. It just relies on the fact that most code coming out of the ghc backend cannot make use of more than a handful of registers. If gcc does actually use more then the mangler catches this. We tried some flags to make gcc restrict itself to a subset of the registers but could not get it to obey. Duncan

I personally don't know Git, and while I'm sure I'll be learning at some point, I'm always nervous about learning a VCS on something I care about, as mistakes can go quite wrong.

If I can lend you (or someone else) a hand, don't hesitate to contact me. (I'm not a git guru though.) With git you can't get too much wrong, because it's very cheap to create additional pointers / branches. So if you make a copy of a branch before taking any action, you can always reset the messed-up branch to the "backup" with

  git reset --{soft or hard} backupbranchname

Or you could write a two-line sh script writing all hashes to a temp file, etc. If you just start gitk it will keep all hashes in memory, so you can recover from those as well (unless you use the update menu item or run the garbage collector).
Sincerely, Marc Weber
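Marc's backup-branch safety net can be sketched end to end (all names invented): a branch is just a cheap pointer to a commit, so saving one before an experiment makes the experiment trivially reversible.

```shell
set -e
tmp=$(mktemp -d); cd "$tmp"
git init -q -b master r; cd r
git config user.email you@example.com; git config user.name You
echo good > state; git add state; git commit -qm good
# a branch is just a cheap pointer: save one before experimenting
git branch backup
# ...the experiment goes wrong...
echo broken > state; git commit -qam 'risky experiment'
# roll the current branch (and the working tree) back to the saved pointer
git reset -q --hard backup
cat state
```

With `--soft` instead of `--hard`, only the branch pointer moves and the working tree is left as-is, which is useful when you want to keep the experimental changes around uncommitted.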

We (GHC HQ) are still learning the transition to wider participation in building and hacking on GHC, which we *very much* welcome. Bear with us if we don't get it right first time. We're trying!
And I very much like the steps I've seen recently in explaining what you're doing (sometimes even before you're doing it;-). However, there are so many lists to choose from, and many often opinionated discussions on many of them that will bury those informative messages of yours rather quickly. So if anyone joins the world of GHC in a few weeks, they will be just as lost as everyone was before you started outlining your plans and giving high-level summaries of on-going work in more detail. Perhaps it would be useful for GHC HQ to have a GHC project blog, like other non-trivial projects that like to talk about what they are doing/planning and how the pieces fit together (for examples, see: google, opera, ..)? The follow-on discussions should still be on cvs-ghc, or on cvs-libraries, or on libraries, or on glasgow-haskell-users, or on ghc wiki or trac, or whatever the topic requires. But the original information would be collected in a single place, on a blog with an RSS feed to which interested parties could be referred. Given the number of things going on, I'm sure such a blog would become required reading rather quickly, even for those not subscribed to cvs-ghc, etc. Just another suggestion;-) Claus

Claus Reinke wrote:
Perhaps it would be useful for GHC HQ to have a GHC project blog,
Actually we have talked about doing that, and it's highly likely we'll set one up in due course. I think it's worth letting the current discussion(s) run their course and then we'll have a set of concrete decisions to act upon, one of which will probably be to set up a blog so that GHC devs can communicate what they're up to. Cheers, Simon

Alexander Dunlap wrote:
On Tue, Aug 5, 2008 at 2:23 AM, Simon Marlow
wrote: (notice how fast that is :-)
git clone has been running for about 45 minutes so far without finishing... is that an improvement over darcs?
I think http is still bandwidth-throttled on darcs.haskell.org. You should get better results cloning the github mirror: git://github.com/ghc-hq/ghc.git Thomas Schilling set this up yesterday. Cheers, Simon

Simon Marlow

2008/8/6 david48
cat: _darcs/prefs/defaultrepo: No such file or directory Couldn't work out defaultrepo at ./darcs-all line 27.
You can't yet build from the Git repo, alas. I've added the necessary patches and scripts (you need sync-all, not darcs-all) to http://hackage.haskell.org/trac/ghc/wiki/DarcsConversion but they haven't yet been committed to HEAD. Cheers, Max

"Simon" == Simon Marlow
writes:
Simon> We already have an up-to-date git mirror thanks to Thomas Simon> Schilling: Simon> git clone http://darcs.haskell.org/ghc.git Simon> (notice how fast that is :-) It would be even faster if you (Thomas?) set up a git server. It is as easy as "touch git-daemon-export-ok" in the git repository and launching "git-daemon /path/to/parent/of/git/repo" at boot time, as shown by Chris Double at http://www.bluishcoder.co.nz/2007/09/how-to-publish-git-repository.html Then the "git://" protocol can be used, which makes intelligent decisions on what needs to be transferred. Sam -- Samuel Tardieu -- sam@rfc1149.net -- http://www.rfc1149.net/
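A minimal sketch of the setup Samuel describes (all paths hypothetical; the daemon invocation is shown but not started here, since it runs as a long-lived server):

```shell
set -e
tmp=$(mktemp -d)
# a bare repository to publish
git init -q --bare "$tmp/ghc.git"
# opt the repository in to export via the git:// protocol
touch "$tmp/ghc.git/git-daemon-export-ok"
# then start the daemon at boot, e.g.:
#   git daemon --base-path="$tmp" --detach
# after which clients can fetch with the native protocol:
#   git clone git://your.host/ghc.git
```

The daemon refuses to export any repository that lacks the `git-daemon-export-ok` marker file (unless `--export-all` is given), which is why that single `touch` is the whole opt-in.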

On 6 Aug 2008, at 12:35, Samuel Tardieu wrote:
"Simon" == Simon Marlow
writes: Simon> We already have an up-to-date git mirror thanks to Thomas Simon> Schilling:
Simon> git clone http://darcs.haskell.org/ghc.git
Simon> (notice how fast that is :-)
It would be even faster if you (Thomas?) set up a git server. It is as easy as "touch git-daemon-export-ok" in the git repository and launching "git-daemon /path/to/parent/of/git/repo" at boot time, as shown by Chris Double at
http://www.bluishcoder.co.nz/2007/09/how-to-publish-git-repository.html
Then the "git://" protocol can be used, which makes intelligent decisions on what needs to be transferred.
Thanks, I will look into it. I need to talk to our admin anyway. / Thomas -- My shadow / Change is coming. / Now is my time. / Listen to my muscle memory. / Contemplate what I've been clinging to. / Forty-six and two ahead of me.

Hi @ll, I'd like to tell you about a small script I've written to make life easier with git:

  git clone git://mawercer.de/git-test-merge

It remembers test-merge setups so that you can merge different feature branches by typing:

  $ gtm set setup1 branch1 remotes/branch2 branch3
  $ gtm update setup1
  $ gtm continue # after resolving conflicts

To remove branches from the setup you have to edit .git/config. Additionally it uses a commit message warning about it being a test merge only. Unfortunately I don't yet know a nice way to share the git-rerere cache, which remembers conflict resolutions automatically. It works best on orthogonal branches, of course :) Read about Linus' complaint in man git-rerere to find out why I've written this script.

Sincerely, Marc Weber
participants (35)
- Alexander Dunlap
- Austin Seipp
- Ben Lippmeier
- Brandon S. Allbery KF8NH
- Bryan Donlan
- Bulat Ziganshin
- Claus Reinke
- david48
- Don Stewart
- Donnie Jones
- Duncan Coutts
- Gour
- Gregory Wright
- Ian Lynagh
- Iavor Diatchki
- Isaac Dupree
- Jason Dagit
- Johan Henriksson
- Johan Tibell
- Malcolm Wallace
- Malcolm Wallace
- Manuel M T Chakravarty
- Marc Weber
- Matthias Kilian
- Max Bolingbroke
- Neil Mitchell
- Norman Ramsey
- Roman Leshchinskiy
- Ross Paterson
- Samuel Tardieu
- Sean Leather
- Simon Marlow
- Simon Peyton-Jones
- Sittampalam, Ganesh
- Thomas Schilling