
* Simon Marlow:
> Thanks for this. I distilled your example into a shell script that uses git, and demonstrates that git gets the merge wrong:
> http://hpaste.org/42953/git_mismerge
> Still, git could get this merge right, it just doesn't (I know there are more complex cases that would be very hard for git to get right). I suspect that in practice this rarely matters, because context-based merging usually does the right thing.

Git will have a very hard time getting this right because it is not that history-aware. It is also unlikely that this will be implemented, because this kind of mismatched change happens only rarely, unless you have a coding style which relies heavily on copy-and-paste. (It has happened in real-world merges, though. It is also easy to construct similar examples involving file renames, I believe.)

I know only one criterion for merge correctness: developers working serially on the code base would end up with the same result. (This is based on the concept of serializability in transaction processing systems.) It is clear that no system can satisfy this in general. For instance, suppose you have a LaTeX document for a one-page flyer. Obviously, there is a very hard requirement that you can have only one page of text. Two parallel edits can each satisfy this constraint, but their automatic merge might not. (Zooko's example is different in that there is an apparently correct solution, so it is not absolutely necessary to bail out, but of course, the authors could likely squeeze their content onto a single page, too.)

Inevitably, you have to make trade-offs. The Git approach seems to suit more developers and codebases than the darcs approach. Git mismerges are much rarer than non-completing darcs merges.

On the other hand, speaking as a non-contributor, the requirement to deal with multiple version control systems seems awkward. But the current sub-tree approach also feels a bit clunky (same as for OpenJDK, by the way).
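
To make the serializability criterion above a bit more concrete, here is a toy sketch in Python. It is not how git or darcs work internally. It assumes that each developer's change can be modelled as a function from a list of lines to a list of lines, and all names in it (`is_serializable`, `duplicate_block`, `fix_last_return`) are made up for illustration. A merge result passes if it equals what the developers would have produced by working one after another, in at least one order.

```python
from itertools import permutations


def is_serializable(base, edits, merged):
    """Return True if `merged` equals the result of applying all `edits`
    one after another, in at least one order, starting from `base`."""
    for order in permutations(edits):
        state = list(base)
        for edit in order:
            state = edit(state)
        if state == merged:
            return True
    return False


# Hypothetical edit 1: insert a copy of the first two-line block
# in front of the existing content (copy-and-paste style).
def duplicate_block(lines):
    return lines[:2] + list(lines)


# Hypothetical edit 2: the developer means the last block in the file
# and changes its return value.
def fix_last_return(lines):
    out = list(lines)
    for i in range(len(out) - 1, -1, -1):
        if out[i] == "  return 1":
            out[i] = "  return 2"
            break
    return out


base = ["f:", "  return 1"]
edits = [duplicate_block, fix_last_return]

# Matches the serial order "duplicate first, then fix".
good_merge = ["f:", "  return 1", "f:", "  return 2"]

# The fix landed on the inserted copy instead; this matches neither
# serial order, so the criterion rejects it.
mismerge = ["f:", "  return 2", "f:", "  return 1"]

print(is_serializable(base, edits, good_merge))
print(is_serializable(base, edits, mismerge))
```

The second result is the sort of outcome a purely context-based merge could plausibly produce when the same lines appear twice. The check also shows why the criterion is hard to use in practice: it needs the intent of each edit as a function, which a tool that only sees textual diffs does not have.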