How to develop on a (GHC) branch with darcs

Hello,

I am doing some work on a GHC branch and I am having a lot of trouble (and spending a lot of time) trying to keep my branch up to date with HEAD, so I would be very grateful for any suggestions from fellow developers on how I might improve the process. Here is what I have tried so far:

First Attempt
~~~~~~~~~~~~~

My branch, called 'ghc-tn', was an ordinary darcs repo. I recorded my changes as needed, and every now and then would pull from the HEAD repo. If conflicts occurred, I would resolve them and record a patch.

Very quickly I ran into what, apparently, is a well-known darcs problem where trying to pull from HEAD would not terminate in a reasonable amount of time.

Second Attempt
~~~~~~~~~~~~~~

Avoid "conflict patches" by constantly changing my patches. This is how I've been doing this:

Initial state:
  ghc:    a repository with an up-to-date version of GHC HEAD
  ghc-tn: my feature repo based on a slightly out-of-date GHC HEAD.

Goal:
  Merge ghc-tn with ghc (i.e., integrate developments in GHC HEAD into my branch)

Process:
  1. Create a temporary repository for the merge:
       darcs clone --lazy ghc ghc-tn-merge

  2. Create a backup of the feature branch (strictly speaking not necessary, but past experience shows that it is a good idea to have one of those).
       darcs clone --lazy ghc-tn ghc-tn-backup

  3. Pull feature patches from 'ghc-tn' into 'ghc-tn-merge', one at a time.
       darcs pull ghc-tn
       y
       d

  3.1. If a feature patch causes a conflict, then resolve the conflict and create a new patch, obliterating the old one:
       darcs amend-record (creates a new patch, not a conflict patch, I think)

After repeating this for all branch patches, I have an updated branch in 'ghc-tn-merge' with two caveats:

  1. The new repository does not contain my previous build so I have to re-build the entire GHC and libraries from scratch. This is a problem because GHC is a large project and rebuilding everything takes a while, even on a pretty fast machine. I work around this problem like this:

     1.1 Obliterate all branch patches from 'ghc-tn'. This, essentially, rewinds the repository to the last point when I synchronised with HEAD. To do this properly I need to know which patches belong to my branch, and which ones are from GHC. (I've been a bit sloppy about this--- I just use the e-mails of the branch developers to identify these and then look at the patches. A better way would be to have some kind of naming convention which marks all branch patches).

     1.2 Pull from 'ghc-tn-merge' into 'ghc-tn'. By construction we know that this will succeed and reintroduce the feature changes, together with any new updates to GHC, into 'ghc-tn'. Now 'ghc-tn-merge' and 'ghc-tn-backup' can be deleted.

  2. The new repository contains rewritten versions of the branch patches so---if I understand correctly---it is not compatible with the old one (i.e., I cannot just push from my newly updated branch to the public repo for my branch as there will be confusion between the old feature patches and the new ones). I can think of only one solution to this problem, and it is not great:

     2.1 Delete the original public repo, and publish the new updated repo, preferably with a new name. In this way, other developers who have the old patches can either just clone the new repo, or go through steps 1.1--1.2, but will not accidentally get into a confused state by mixing up the new feature patches with the old ones.

For background, my solution is essentially a manual implementation of what is done by git's "rebase" command---except that there "branch patches" and various "repository states" are automatically managed by the system, so there is no need to follow various naming conventions, which tend to be error prone.

Apologies for the longish e-mail but this seems like an important problem and I am hoping that there's a better way to do things.

-Iavor
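The process in this message can be sketched as a small shell script. This is only a sketch under stated assumptions: the four repositories are sibling directories of the current directory, `BRANCH_AUTHOR` and `DRY_RUN` are hypothetical variables introduced here for illustration, and selecting branch patches with `--matches "author ..."` is just one reading of "use the e-mails of the branch developers". The darcs commands are left interactive, so each patch is still reviewed by hand.

```shell
#!/bin/sh
# Sketch of the manual "rebase" from the message above.
# DRY_RUN and BRANCH_AUTHOR are illustrative names, not darcs settings.

# Run a command inside a directory, or just print it when DRY_RUN is set.
run_in() {
  dir=$1
  shift
  if [ -n "${DRY_RUN:-}" ]; then
    echo "+ (cd $dir && $*)"
  else
    (cd "$dir" && "$@")
  fi
}

# Steps 1-3: build the updated branch in a scratch repository.
merge_branch() {
  run_in . darcs clone --lazy ghc ghc-tn-merge      # step 1: fresh copy of HEAD
  run_in . darcs clone --lazy ghc-tn ghc-tn-backup  # step 2: safety copy
  # Step 3: interactive pull; answer "y" then "d" to take one patch at a
  # time, and run "darcs amend-record" after resolving a conflict (3.1).
  run_in ghc-tn-merge darcs pull ../ghc-tn
}

# Steps 1.1-1.2: reuse the old build tree instead of rebuilding everything.
rewind_and_refill() {
  # 1.1: interactively obliterate the patches recorded by the branch author.
  run_in ghc-tn darcs obliterate \
    --matches "author ${BRANCH_AUTHOR:?set BRANCH_AUTHOR first}"
  # 1.2: pull the rewritten branch patches plus the new HEAD patches.
  run_in ghc-tn darcs pull ../ghc-tn-merge
}
```

Running `DRY_RUN=1 merge_branch` prints the commands without touching any repository, which is a cheap way to check the plan before obliterating anything.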

Hi,

On 6 December 2010 01:57, Iavor Diatchki wrote:
> I am doing some work on a GHC branch and I am having a lot of troubles (and spending a lot of time) trying to keep my branch up to date with HEAD, so I would be very grateful for any suggestions by fellow developers of how I might improve the process.

Unfortunately I don't have any useful advice on how to avoid the problem. I've found exactly the same issues for every GHC branch I've ever developed, but haven't found a nice workaround.

I'd really like to see some solution though. Back in the day this is the reason that I pushed for moving GHC to Git (though that process sort of ran out of steam), and I still think that would be a good move, but any sort of solution would be good.

Cheers,
Max

It seems a shame that it would be so difficult to maintain a separate GHC branch. Having no long-term branches myself, I haven't yet felt the pain, but reading this email chain I am rather discouraged from attempting it. It makes sense to me to have tool support for branching and merging in a large open source project like GHC, and doing so could increase the number of people interested in maintaining active branches. I would certainly support a move to git.

P.S. Apparently Linus used to use Lennart's method of diff and patch for version control before switching to bitkeeper and then git: http://www.youtube.com/watch?v=4XpnKHJAok8 about 10:30 minutes in. I guess it's a sign of a true hacker :)

On 12/7/10 21:42, David Peixotto wrote:
> P.S. Apparently Linus used to use Lennart's method of diff and patch for version control before switching to bitkeeper and then git: http://www.youtube.com/watch?v=4XpnKHJAok8 about 10:30 minutes in. I guess it's a sign of a true hacker :)

Right up until it bit him in the butt and he released a trashed kernel source tree as a result. (Which is about when most people finally figure out that VCSes aren't pointless busywork.)

--
brandon s. allbery [linux,solaris,freebsd,perl] allbery@kf8nh.com
system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu
electrical and computer engineering, carnegie mellon university KF8NH

I too wish there was a good solution here. I've taken to making dated repos, thus
  http://darcs.haskell.org/ghc-new-co-17Nov10

When it becomes unusable, I make a brand new repo, with a new date, starting from HEAD, pull all the old patches, unrecord them all, rerecord a mega-patch, and commit.

This is darcs's primary shortcoming. It is well known, and the darcs folk are working on it. But I don't think they expect to have a solution anytime soon. (Please correct me if I'm wrong.)

Is the pain of this more than the pain of switching to git? Until now we have not had many active collaborators with their own trees. Now we have at least three: Iavor (numeric types), Brent (new coercions), Pedro (new generics). So it's becoming a much bigger issue.

One thing:

| Pull features patches from 'ghc-tn' into 'ghc-tn-merge', one at a time.
|     darcs pull ghc-tn
|     y
|     d

Darcs can help with that. Use 'darcs pull --skip-conflicts' to pull all non-conflicting patches. Then you can pull a single conflicting patch. That speeds things up quite a bit.

Simon
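The two workflows in this message can be sketched in shell as follows. This is a hedged sketch, not Simon's actual procedure: the function names, the `DRY_RUN` variable and the repository paths are assumptions made here for illustration, and the commands are meant to be run from inside the target repository. `darcs unrecord` is left interactive so that the branch patches can be picked by hand.

```shell
#!/bin/sh
# DRY_RUN is an illustrative variable: when set, commands are only printed.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Pull everything that applies cleanly first, then deal with the rest.
pull_clean_first() {
  src=$1
  run darcs pull --skip-conflicts --all "$src"  # non-conflicting patches, no prompts
  run darcs pull "$src"                         # remaining patches, one by one
}

# Dated-repo workflow: run inside a fresh clone of HEAD.
squash_into_megapatch() {
  old_repo=$1
  patch_name=$2
  run darcs pull --all "$old_repo"             # bring over the old branch patches
  run darcs unrecord                           # interactively unrecord them all
  run darcs record --all --name "$patch_name"  # rerecord as a single patch
}
```

For example, `pull_clean_first ../ghc-tn` from inside 'ghc-tn-merge' leaves only the conflicting patches to be pulled and amended individually.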

On Mon, 6 Dec 2010, Simon Peyton-Jones wrote:
> I too wish there was a good solution here. I've taken to making dated repos, thus http://darcs.haskell.org/ghc-new-co-17Nov10
>
> When it becomes unusable, I make a brand new repo, with a new date starting from HEAD, pull all the old patches, unrecord them all, rerecord a mega-patch, and commit.
>
> This is darcs's primary shortcoming. It is well known, and the darcs folk are working on it. But I don't think they expect to have a solution anytime soon. (Please correct me if I'm wrong.)
I think there are three things that can help with this problem:

1) a darcs rebase command. This will give you a nice way to manage the workflow already discussed, and you won't have to squish everything through into a mega-patch. You'll still have to periodically abandon one branch for another though (but I think that's also the case with git rebase). I also have some hope, though this is more speculative, of offering a clean way of tracking the relationship between the old branch and the new branch so that any stray patches against the old branch can be cleanly rebased to the new branch later on.

I'm actively working on rebase (with some gaps to refactor the darcs codebase to make working on it easier) and very much hope to have it in the next darcs release. Simon M has already tried out an experimental version and was quite positive about it, though there's significant work yet to do. If anyone else wants to try it, please do: see the thread at http://lists.osuosl.org/pipermail/darcs-users/2010-August/024924.html

2) multi-branch repos. We've pretty much agreed we need these; I think the strongest motivation is being able to keep the same build products around when switching branches. No concrete plans, but perhaps the release after next if we can manage it?

3) Better performance when there are conflicts, so you don't have to rebase as often/ever. For this you need a new patch format. GHC is using v1 patches, but darcs also now has v2 patches, which get into exponential merges much less often - but it's still possible, and we know of bugs in the merging which can hit in complex cases (v1 patches also have a few buggy corner cases). You also have to go through an explicit conversion step to switch to v2. I think we need to have another go at figuring out the problem once and for all (i.e. v3) but we don't know for sure how to do this.

Something related, but not exactly addressing the problems you all describe, is:

4) Better UI around managing conflicts - one frequently requested thing is to be able to see the names of the patches that caused the conflicts. I'm working on this actively (it's also useful for rebase) and I also hope/expect to have this in the next release. Another thing that'll definitely be in the next release is that conflict marks will include the original text as well, so you can work out what changes each side of the conflict made. In my experience that actually makes a huge difference and it's very annoying we didn't do it earlier.

and, once we've got better at the basics,

5) we'd love to add new patch types that reduce the number of conflicts you get at all. Some ideas include "hunk move" patches that track when you move code from place to place, indentation patches, and patches that track character changes to an individual line. Again, no timescale, but having refactored some of the core patch code recently it's now much clearer how we could do this.

Finally, I think the future holds more hybrid environments where different people use different VCSes and bridge between them. (At least, I hope so, it's the only hope darcs has of staying relevant in the wider world :-). Petr Rockai's recent darcs-fastconvert tool offers incremental darcs-git conversions, which I think should allow people who are happier with git to use that instead and only convert back to darcs to submit their patches. [It may be that previous tools also offered this, I'm not certain.]

Cheers,

Ganesh

How could a darcs guy educate himself about this problem, by following your workflow and trying out some things? Is there an accessible developer's repo I could pull from to produce conflicts at a similar rate to you? My usual repos are not so conflictful.

Like everyone else I have no good solution.
When I had a ghc branch I used diff and patch to move my patches forward.
Not exactly what you expect to have to do with a version control system.
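For readers unfamiliar with the diff-and-patch approach, a minimal sketch follows. It is not Lennart's actual script: the function name and directory layout are assumptions made here. It expects three sibling directories in the current directory (the old HEAD tree the branch started from, the branch tree, and a fresh HEAD tree) and skips darcs metadata with `-x _darcs`.

```shell
#!/bin/sh
# carry_forward OLD_BASE BRANCH NEW_BASE
# All three arguments must be plain directory names in the current
# directory, so that "patch -p1" strips exactly the leading directory.
carry_forward() {
  old_base=$1
  branch=$2
  new_base=$3
  patch_file=$(mktemp)
  status=0
  # Capture the branch's changes relative to the HEAD it was based on.
  # diff exits with 1 when the trees differ; only >1 signals an error.
  diff -ruN -x _darcs "$old_base" "$branch" > "$patch_file" || status=$?
  if [ "$status" -gt 1 ]; then
    rm -f "$patch_file"
    return "$status"
  fi
  # Replay those changes on top of the up-to-date tree.
  status=0
  (cd "$new_base" && patch -p1 < "$patch_file") || status=$?
  rm -f "$patch_file"
  return "$status"
}
```

Hunks that no longer apply are reported by `patch` and left in `.rej` files, which then have to be merged by hand; none of the patch history survives this route.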

On 06/12/2010 01:57, Iavor Diatchki wrote:
> Hello,
>
> I am doing some work on a GHC branch and I am having a lot of troubles (and spending a lot of time) trying to keep my branch up to date with HEAD, so I would be very grateful for any suggestions by fellow developers of how I might improve the process.

Firstly, in GHC we never have conflicting patches in the main trunk, and we never commit conflict resolutions. This is due to darcs' performance and UI issues with conflicts - life is much easier if we have no conflicts in the trunk. (one or two have slipped in by accident in the past, though).

In case you haven't seen this, there are some guidelines for using darcs with GHC in the wiki:

  http://hackage.haskell.org/trac/ghc/wiki/WorkingConventions/Darcs

So, when merging a branch with HEAD, you have to rebase, as you noticed. With darcs as it stands, you can't rebase a series of patches with dependencies, so you have to squash your local patch history into one big patch.

For my branches, however, I've been using Ganesh's pre-release rebase support. The UI has a few issues, but I've found that if you follow the workflow carefully, it does the job.

  http://wiki.darcs.net/Ideas/RebaseStatus

Don't forget about --skip-conflicts. I have it on by default for pulls in my ~/.darcs/defaults.

Cheers,
Simon
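As an illustration of the last point, a defaults entry along these lines should make --skip-conflicts the default for pulls. This is a sketch based on the darcs defaults format (command name followed by the long option without its leading dashes), not a copy of Simon's actual file. In ~/.darcs/defaults:

```
pull skip-conflicts
```

With this in place, a plain 'darcs pull' offers only the patches that apply cleanly; the conflicting ones can then be pulled explicitly and resolved one at a time.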
participants (9)
- Brandon S Allbery KF8NH
- David Peixotto
- Ganesh Sittampalam
- Iavor Diatchki
- Lennart Augustsson
- Max Bolingbroke
- Simon Marlow
- Simon Michael
- Simon Peyton-Jones