
2008/8/15 Isaac Dupree
So let's figure out how it would work (I have doubts too!) So, within the directory that's a git repo (ghc), we have some other repos, git (testsuite) and darcs (some libraries). Does anyone know how git handles nested repos even natively?
You can explicitly tell Git about nested Git repos using http://www.kernel.org/pub/software/scm/git/docs/git-submodule.html. This essentially associates a particular version of each subrepo with every version of the repo that contains them, so e.g. checking out GHC from 2 weeks ago could check out the libraries from the same point in time. AFAIK, nothing in Git caters for subrepos of a different VCS.
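To make "associates a particular version of each subrepo with every version of the repo that contains them" concrete, here is a toy model in Python. It is only a sketch: the `Commit` class, the `checkout` function and all the identifiers are made up for illustration, and this is not Git's actual data format or API.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Commit:
    """One commit of the containing repo (hypothetical model, not Git's format)."""
    ident: str
    # Path of each nested repo -> the subrepo commit pinned by this commit.
    submodules: Dict[str, str] = field(default_factory=dict)


def checkout(history: Dict[str, Commit], ident: str) -> Dict[str, str]:
    """Return the subrepo versions recorded by the given containing-repo commit."""
    return dict(history[ident].submodules)


# Made-up identifiers standing in for real commit hashes.
history = {
    "ghc-old": Commit("ghc-old", {"libraries/base": "base-old", "testsuite": "tests-old"}),
    "ghc-new": Commit("ghc-new", {"libraries/base": "base-new", "testsuite": "tests-new"}),
}

# Checking out the old GHC commit yields the subrepo versions from that time.
two_weeks_ago = checkout(history, "ghc-old")
```

The point of the model is that the pin lives in the containing repo's history, so asking for an old GHC commit is enough to recover matching versions of everything nested inside it.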
Then, adding complexity, git branches are normally done by switching in-place. So how does this interact with VCS like darcs that doesn't have a concept of in-place switching of branches?
Since we will set up Git to ignore the contents of the Darcs repos, it will simply leave them unmodified. This is exactly like the current situation, where rolling back / patching the GHC repo does not affect the others. If you want Darcs-like behaviour (one branch per repo) you are free to do this in Git as well, in which case, since you never switch branches, the nested Darcs repos should never be inappropriate for your branch. Personally, since I only ever hack GHC and tend to leave the libraries alone, I could still use the in-place branching without difficulty.
(Now, I wouldn't be surprised if git, the monstrosity that it is, has already invented answers for these sorts of questions :-) But we need to figure out the answers for whatever situation we choose for the 6.11 development cycle, and probably document them somewhere on the wiki (that I lazily didn't bother to check again before writing this message).
The situation above is pretty much the whole story, if we are taking the route where we just convert the GHC+testsuite repo to Git. I don't think it's particularly confusing, but maybe that's because I've spent too long thinking about VCSs :-).

This thread has got quite large, and doesn't appear to have made much progress towards a resolution. Let me try and sum up the discussion so far.

There seem to be four stakeholders in this switch:

a) Current GHC developers
b) Future GHC developers
c) People who just contribute to the libraries
d) Maintainers of other compilers GHC shares repos with

And there are at least 5 options for how to proceed:

1) Convert just GHC and Testsuite to Git, leave everything else in Darcs

Pros:
- No change in habits required for stakeholders c, d
- Resolves all Darcs issues discussed at length before, pleasing stakeholders a, b

Cons:
- Requires two VCSs to be installed and learnt (more points of failure, makes source tree less accessible, doesn't solve any of Darcs' build+install problems), affecting stakeholders a and b
- Difficult to check out a consistent version of the source tree (no submodules), affecting stakeholders a and b

2) Wait for Darcs2 to get better

Pros:
- No change in habits required for any stakeholders (though we still have one-off switching cost)
- Potentially resolves all Darcs issues, pleasing stakeholders a, b
- Only option that will not require a workflow change for GHC developers (more topic branches rather than "spontaneous branches" and cherry-picking), pleasing stakeholders a

Cons:
- Darcs will probably continue to be less popular and less well supported than Git (see Debian popcon graphs for the trend difference). Reduced popularity will affect the ability of stakeholders b to contribute (learning barrier), and less support/real world use may potentially lead to a higher incidence of bugs encountered, affecting stakeholders a-d. This point is certainly debatable.
- Apparently somewhat vaporware at the moment

3) Convert all repos to Git

Pros:
- Native Git submodule integration, makes life easier for stakeholders a-b
- Single (popular) command set to learn, single thing to install: makes life better for stakeholder b at least

Cons:
- Significant inconvenience for stakeholders c-d as they have to change their own projects

4) Branch all repos into Git but leave the Darcs repos alone and push Darcs patches into the Git repos automatically. Never push to these Git repos in any other way, similar to Cabal repo currently

Pros:
- As option 3
- Stakeholders c-d do not need to do anything

Cons:
- Makes it harder to hack on the libraries within a GHC checkout, affecting a, b
- Automatic synchronisation will require occasional maintenance by someone

5) Branch all repos into Git and then set up a manual merging / sync process that tries to turn Git commits into Darcs patches and vice-versa

Pros:
- As option 3
- Hack on the libraries in a GHC checkout with ease, pleasing a, b
- Stakeholders c-d do not need to do anything

Cons:
- Synchronisation much more fragile than in 4), will likely require constant maintenance

This summary is probably incomplete and inaccurate. However, if people find it useful for organising the various lines of discussion on this issue, perhaps someone could Wikify it so we can get a complete, clear picture?

My personal preference is for 3), but that's because I'm a stakeholder "a" who isn't a great fan of spontaneous branches! Anyway, there are good arguments on every side, so I don't want to advocate a particular position (and indeed, my opinions quite rightly do not carry any weight! :-). However I'd really like for us to work out what is going on so we have a clear plan for moving away from Darcs 1, which is an inadequate VCS for GHC for reasons that have been discussed to death.

I hope (perhaps naively) that this email can provide a framework for reaching a consensus agreeable to all parties.

All the best,
Max