[ANNOUNCE] git-darcs-import 0.1

Hi,

I'm pleased to announce yet another tool for importing darcs repositories to git. Unlike darcs2git [1] and darcs-to-git [2], it's written in Haskell, on top of the darcs2 source code. The result is a much faster program - it can convert the complete ghc 6.9 branch (without libraries) in less than 15 minutes on my slightly dated machine (Athlon XP 2500+), which is quite fast [3]. Incremental updates work, too.

The program is still rough around the edges, and there's some cosmetic work to do, especially with respect to converting author names. The program should recover from most errors, as long as nobody else modifies the destination repository. Nevertheless, it seems quite usable already.

I hope somebody finds this useful. You can grab the source at

http://int-e.home.tlink.de/haskell/git-darcs-import-0.1.tar.bz2

Look at the README for further information.

Credits go to David Roundy and all contributors for darcs2. The code base is surprisingly pleasant to work with. And of course to Linus Torvalds, Junio Hamano and all other git contributors.

Enjoy,

Bertram

[1] http://repo.or.cz/w/darcs2git.git?a=shortlog
[2] http://git.sanityinc.com/?p=darcs-to-git.git
[3] http://nominolo.blogspot.com/2008/05/thing-that-should-not-be-or-how-to.html

On 1 jun 2008, at 20.44, Bertram Felgenhauer wrote:
Hi,
I'm pleased to announce yet another tool for importing darcs repositories to git. Unlike darcs2git [1] and darcs-to-git [2], it's written in Haskell, on top of the darcs2 source code. The result is a much faster program - it can convert the complete ghc 6.9 branch (without libraries) in less than 15 minutes on my slightly dated machine (Athlon XP 2500+), which is quite fast [3]. Incremental updates work, too.
Nice! Do you happen to also have a darcs (or Git) repository somewhere?

/ Thomas
--
Monkey killing monkey killing monkey over pieces of the ground.
Silly monkeys give them thumbs they forge a blade
And where there's one they're bound to divide it
Right in two

Thomas Schilling wrote:
On 1 jun 2008, at 20.44, Bertram Felgenhauer wrote:
[git-darcs-import]
Nice! Do you happen to also have a darcs (or Git) repository somewhere?
I've uploaded my (git) repo to repo.or.cz, see

http://repo.or.cz/w/git-darcs-import.git

Patches are welcome.

enjoy,

Bertram

On Sun, Jun 1, 2008 at 2:44 PM, Bertram Felgenhauer wrote:
Hi,
I'm pleased to announce yet another tool for importing darcs repositories to git. Unlike darcs2git [1] and darcs-to-git [2], it's written in Haskell, on top of the darcs2 source code. The result is a much faster program - it can convert the complete ghc 6.9 branch (without libraries) in less than 15 minutes on my slightly dated machine (Athlon XP 2500+), which is quite fast [3]. Incremental updates work, too.
What's the appeal of this? I personally love git, but I thought all the cool kids at this school used darcs and that was that.

--
Darrin

2008/6/3 Darrin Thompson:
On Sun, Jun 1, 2008 at 2:44 PM, Bertram Felgenhauer wrote:
Hi,
I'm pleased to announce yet another tool for importing darcs repositories to git. Unlike darcs2git [1] and darcs-to-git [2], it's written in Haskell, on top of the darcs2 source code. The result is a much faster program - it can convert the complete ghc 6.9 branch (without libraries) in less than 15 minutes on my slightly dated machine (Athlon XP 2500+), which is quite fast [3]. Incremental updates work, too.
What's the appeal of this? I personally love git, but I thought all the cool kids at this school used darcs and that was that.
Disclaimer: I'm no expert, this is what I've heard. Anyone please confirm or deny the following?

Basically, git is waaay faster than Darcs on a number of use cases. So, maybe the point of using this converter is when you just cannot use Darcs any more (too old/big project, merging huge branch with loads of conflicts, I don't know).

Another point may be "broadcast-ability": It is possible to expose two repositories: one Darcs, one Git. If I use Git and not Darcs (please don't sue me), it will be simpler for me to get the source from the Git snapshot, provided there is one. Well, if I want to contribute back... maybe I should switch.

I think the True Heresy (and most useful, if practical) would be to convert back and forth between the two version control systems, accepting patches from both :-)

Loup

Loup Vaillant wrote:
2008/6/3 Darrin Thompson:
<--cut-->
What's the appeal of this? I personally love git, but I thought all the cool kids at this school used darcs and that was that.
Disclaimer: I'm no expert, this is what I've heard. Anyone please confirm or deny the following?
Basically, git is waaay faster than Darcs on a number of use cases.
Another reason can be "git rebase". Of course there is a question of how good a practice it is ... but it is being used.

Peter.

On 2008-06-03, Peter Hercek wrote:
Loup Vaillant wrote:
2008/6/3 Darrin Thompson:
<--cut-->
What's the appeal of this? I personally love git, but I thought all the cool kids at this school used darcs and that was that.
Disclaimer: I'm no expert, this is what I've heard. Anyone please confirm or deny the following?
Basically, git is waaay faster than Darcs on a number of use cases.
Other reason can be "git rebase". Of course there is a question how good practice it is ... but it is being used.
Darcs patches are pretty much an implicit rebase.

--
Aaron Denney
-><-

Aaron Denney wrote:
On 2008-06-03, Peter Hercek wrote:
Loup Vaillant wrote:
2008/6/3 Darrin Thompson:
<--cut-->
What's the appeal of this? I personally love git, but I thought all the cool kids at this school used darcs and that was that.
Disclaimer: I'm no expert, this is what I've heard. Anyone please confirm or deny the following?
Basically, git is waaay faster than Darcs on a number of use cases.
Other reason can be "git rebase". Of course there is a question how good practice it is ... but it is being used.
Darcs patches are pretty much an implicit rebase.
You cannot push patch B if it depends on patch A without also pushing A. And darcs currently does not allow you to reorder B before A (which is what git rebase actually does). Git rebase works quite well even in cloned repositories.

See: http://bugs.darcs.net/issue891
Some discussion about it is also here: http://lists.osuosl.org/pipermail/darcs-users/2008-February/011564.html

When the issue is fixed then darcs will be really patch based and will become the ultimate DSCM :-)
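
The dependency rule in this message (pushing B drags A along) can be illustrated with a small Python sketch. This is not darcs code: the patch names and the `depends_on` table are hypothetical, and real darcs derives dependencies from the patch contents rather than from a table.

```python
from typing import Dict, Set

def patches_to_push(selected: Set[str], depends_on: Dict[str, Set[str]]) -> Set[str]:
    """Return the selected patches plus everything they transitively depend on."""
    needed: Set[str] = set()
    stack = list(selected)
    while stack:
        patch = stack.pop()
        if patch in needed:
            continue
        needed.add(patch)
        # Every dependency of this patch has to be pushed as well.
        stack.extend(depends_on.get(patch, set()))
    return needed

# Hypothetical repository: B depends on A, C is independent of both.
deps = {"A": set(), "B": {"A"}, "C": set()}
print(sorted(patches_to_push({"B"}, deps)))  # ['A', 'B']
print(sorted(patches_to_push({"C"}, deps)))  # ['C']
```

In this model, pushing B alone is impossible by construction, while C can travel on its own because nothing ties it to A or B.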

This is drifting off-topic, but...
On 2008-06-03, Peter Hercek wrote:
Aaron Denney wrote:
On 2008-06-03, Peter Hercek wrote:
Loup Vaillant wrote:
2008/6/3 Darrin Thompson:
<--cut-->
What's the appeal of this? I personally love git, but I thought all the cool kids at this school used darcs and that was that.
Disclaimer: I'm no expert, this is what I've heard. Anyone please confirm or deny the following?
Basically, git is waaay faster than Darcs on a number of use cases.
Other reason can be "git rebase". Of course there is a question how good practice it is ... but it is being used.
Darcs patches are pretty much an implicit rebase.
You cannot push patch B if it depends on patch A without also pushing A. And darcs currently does not allow you to reorder B before A
True. This is a *feature* not a bug. You shouldn't be able to do this automatically, because it can't be done right. You need to do this sort of thing manually. If you don't, the heuristics used will bite you at some point. When they do commute, there is no problem.
Git rebase works quite well even in cloned repositories.
Meh. It can, if you're really really lucky.
See: http://bugs.darcs.net/issue891
Some discussion about it is also here: http://lists.osuosl.org/pipermail/darcs-users/2008-February/011564.html
When the issue is fixed then darcs will be really patch based and will become the ultimate DSCM :-)
Rebasing is doable in git as a one-repository operation because each repository has multiple branches. As darcs has one repo per branch, it fundamentally needs to be done in multiple repos.

There are naturally two repos, upstream, and your-feature-development.

your-feature-development has a patch A that you want to rebase.

What you should do is pull upstream into new-tracking, then pull patch A from your-feature-development into new-tracking.

If it applies with no problem, great: mv your-feature-development your-feature-development-old; mv new-tracking your-feature-development. Of course, in this case, you could have just pulled into your-feature-development. If there weren't any other patches to save in the old your-feature-development, you can delete it instead of moving it.

When there is a conflict, then you need to handle it somehow. Neither git nor darcs can do it automatically. You can just record the merge conflict and your resolution. This keeps repos that pulled from you valid, but this won't give you the "clean history" that you presumably want. So you need to combine the merger and cleanup into a new patch with the same log message, etc. It's true that git does make *this* process very nice.

There is one thing that git rebase does easily (and correctly) that darcs doesn't do nicely: rewriting history by merging commits "prior" to the head. I put prior in quotes, because darcs doesn't preserve history in the first place. I don't find that a compelling use, as opposed to maintaining topic branches.

--
Aaron Denney
-><-

Aaron Denney wrote:
This is drifting off-topic, but...
On 2008-06-03, Peter Hercek wrote:
Aaron Denney wrote:
<--- cut --->
Darcs patches are pretty much an implicit rebase.
You cannot push patch B if it depends on patch A without also pushing A. And darcs currently does not allow you to reorder B before A
True. This is a *feature* not a bug. You shouldn't be able to do this automatically, because it can't be done right. You need to do this sort of thing manually. If you don't, the heuristics used will bite you at some point. When they do commute, there is no problem.
Sorry, I did not intend to indicate it should be done without doing the reordering first (by providing manual conflict resolution).
Git rebase works quite well even in cloned repositories.
Meh. It can, if you're really really lucky.
Actually you are probably right, I needed to use a non-complicated workaround once (but I did it only about two times!). I might have been just lucky. I liked though that it did tell me what was wrong, in contrast to mercurial queues which just replicated both original branch and the rebased branch (so I finished with two copies on both sides at the end :-( ).

<--- cut --->
Rebasing is doable in git as a one-repository operation because each repository has multiple branches. As darcs has one repo per branch, it fundamentally needs to be done in multiple repos.
There are naturally two repos, upstream, and your-feature-development.
your-feature-development has a patch A that you want to rebase.
What you should do is pull upstream into new-tracking, then pull patch A from your-feature-development into new-tracking.
If it applies with no problem, great: mv your-feature-development your-feature-development-old; mv new-tracking your-feature-development. Of course, in this case, you could have just pulled into your-feature-development. If there weren't any other patches to save in the old your-feature-development, you can delete it instead of moving it.
When there is a conflict, then you need to handle it somehow. Neither git nor darcs can do it automatically. You can just record the merge conflict and your resolution. This keeps repos that pulled from you valid, but this won't give you the "clean history" that you presumably want. So you need to combine the merger and cleanup into a new patch with the same log message, etc. It's true that git does make *this* process very nice.
Ok, in such a simple case darcs can preserve the message too if the repository is not cloned (and you indicated that it does not really work with cloned repositories in git - I'm not an experienced git user). Just pull to the original repository and use amend-record to resolve the conflict and the message will be preserved. So I would say that for *this* *simple* case darcs is better.

But what about this git rebasing option? How to do it more easily (than the solution I know, described below) in darcs?

Using "git-rebase --onto master next topic" to get from:

    o---o---o---o---o  master
         \
          o---o---o---o---o  next
                           \
                            o---o---o  topic

to:

    o---o---o---o---o  master
        |            \
        |             o'--o'--o'  topic
         \
          o---o---o---o---o  next

This is the reason why I mentioned that reordering dependent patches AB to BA (with manual conflict resolution) would be needed in darcs to support (I believe a better) alternative to git rebase. I do not know how to do this in darcs (without manually adding the "topic" changes with the gnu patch utility in a new darcs repository clone which would not have the "topic" changes (and the "next" changes as well) pulled in, and throwing away the old one at the end).
There is one thing that git rebase does easily (and correctly) that darcs doesn't do nicely: rewriting history by merging commits "prior" to the head. I put prior in quotes, because darcs doesn't preserve history in the first place. I don't find that a compelling use, as opposed to maintaining topic branches.
I do not know what you mean here. Can you point me to some example?

I hope that this is not too off-topic for haskell cafe ... and so far I believe this is not a flame war :-) I just like that Bertram's code exists and I think it (as well as git) should not be dismissed, since AFAIK there is more than performance to git as well as there is more to darcs than it not imposing patch order on us (which is the darcs feature I like).

Peter.
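
What "git-rebase --onto master next topic" does to the commit graph can be sketched in Python. This is only a toy model under stated assumptions: the commit ids (m1..m5, n1..n5, t1..t3) are hypothetical labels for the picture in the message above, history is linear per branch, and conflicts (the hard part discussed in this thread) are ignored entirely.

```python
from typing import Dict, List, Optional

Parents = Dict[str, Optional[str]]  # commit id -> parent id (None for the root)

def ancestors(parents: Parents, tip: str) -> List[str]:
    """Return tip and all of its ancestors, newest first (linear history only)."""
    chain: List[str] = []
    cur: Optional[str] = tip
    while cur is not None:
        chain.append(cur)
        cur = parents[cur]
    return chain

def rebase_onto(parents: Parents, newbase: str, upstream: str, branch: str) -> str:
    """Copy the commits in upstream..branch onto newbase and return the new tip."""
    excluded = set(ancestors(parents, upstream))
    to_replay = [c for c in ancestors(parents, branch) if c not in excluded]
    tip = newbase
    for commit in reversed(to_replay):  # replay oldest first
        copy = commit + "'"             # a rewritten commit is a new commit
        parents[copy] = tip
        tip = copy
    return tip

# Hypothetical ids: master m1..m5, next forks at m2, topic forks at n5.
parents: Parents = {
    "m1": None, "m2": "m1", "m3": "m2", "m4": "m3", "m5": "m4",  # master
    "n1": "m2", "n2": "n1", "n3": "n2", "n4": "n3", "n5": "n4",  # next
    "t1": "n5", "t2": "t1", "t3": "t2",                          # topic
}
new_topic = rebase_onto(parents, newbase="m5", upstream="n5", branch="t3")
print(ancestors(parents, new_topic)[:4])  # ["t3'", "t2'", "t1'", 'm5']
```

The old commits t1..t3 are left in place in this model; only the branch tip moves to the primed copies, which no longer have any commit of "next" among their ancestors.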

Peter Hercek wrote:
But what about this git rebasing option? How to do it more easily (than the solution I know and I described it later) in darcs?
using "git-rebase --onto master next topic" to get from:

    o---o---o---o---o  master
         \
          o---o---o---o---o  next
                           \
                            o---o---o  topic

to:

    o---o---o---o---o  master
        |            \
        |             o'--o'--o'  topic
         \
          o---o---o---o---o  next

This is the reason why I mentioned reordering depending patches AB to BA (with manual conflict resolution) would be needed in darcs to support (I believe a better) alternative to git rebase.
I don't understand (probably because I haven't used either dvcs).

Either the changes in the next->topic path don't depend on the changes in the fork->next path. Then, the patches commute and it's no problem for darcs.

Or the next->topic path relies on features from next that are not present in master. But then, you're screwed anyway and should merge some parts from next into master so as to advance the point where master and next fork.

    o---o---o---o---o  master
         \
          x---x---o---o---o  next
                           \
                            o---o---o  topic

(Of course, you don't actually advance the fork but rather add patches at the end of master. Hm, set of patches semantics seem to be a lot nicer here anyway. To me, the whole point of rebasing seems to be to somehow bring set semantics into the tree semantics.)

Regards,
apfelmus

On 2008-06-04, apfelmus wrote:
Peter Hercek wrote:
But what about this git rebasing option? How to do it more easily (than the solution I know and I described it later) in darcs?
using "git-rebase --onto master next topic" to get from:

    o---o---o---o---o  master
         \
          o---o---o---o---o  next
                           \
                            o---o---o  topic

to:

    o---o---o---o---o  master
        |            \
        |             o'--o'--o'  topic
         \
          o---o---o---o---o  next

This is the reason why I mentioned reordering depending patches AB to BA (with manual conflict resolution) would be needed in darcs to support (I believe a better) alternative to git rebase.
I don't understand (probably because I haven't used either dvcs).
Either the changes in the next->topic path don't depend on the changes in the fork->next path. Then, the patches commute and it's no problem for darcs.
Right. Then

    o---o---o---o---o  master
         \
          o---o---o---o---o  next
                           \
                            o---o---o  topic

is not a good model for what darcs has. What it has is more like

    o---o---o---o---o  master
    |\
    | o---o---o---o---o  next
     \                |
      o---o---o-------+  topic

The patches in "topic" that are in "next" are independent of the ones that aren't in "next", so it's another (virtual) line-of-development, that darcs can lazily construct as needed. These lines-of-development are similar to branches of git that have been merged, but you also have access to the "unmerged" versions until a patch comes in that depends on the merger.

If I commit three new features that don't interact, a darcs repo will essentially look like:

              ---- topicA ----
             /                \
    history ------ topicB -----+--
             \                /
              ---- topicC ----

Where the merger is "virtual". Darcs will implicitly linearize this to any of

    history --- topicA --- topicB --- topicC ---
    history --- topicA --- topicC --- topicB ---
    history --- topicB --- topicA --- topicC ---
    history --- topicB --- topicC --- topicA ---
    history --- topicC --- topicA --- topicB ---
    history --- topicC --- topicB --- topicA ---

/as needed/. git constructs one of these, based on how you did the commits, and gives you ways to alter it to the others.
Or the next->topic path relies on features from next that are not present in master. But then, you're screwed anyway
Yep.
and should merge some parts from next into master so as to advance the point where master and next fork.
That's one solution. Of course, darcs doesn't have semantic dependency, but syntactic dependency. (You can add extra dependencies to model semantic dependencies, but you can't take away the syntactic dependencies.)

Another solution, if there are syntactic, but not semantic dependencies, is to manually use patch and diff to get 90% there, and then clean up and record.

--
Aaron Denney
-><-
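
The six linearizations listed in this message, and what a dependency does to them, can be enumerated with a short Python sketch. It is a toy model, not how darcs works internally: patches are plain names, and the `depends_on` table is a hypothetical stand-in for the syntactic dependencies darcs computes itself.

```python
from itertools import permutations
from typing import Dict, List, Set, Tuple

def linearizations(patches: List[str],
                   depends_on: Dict[str, Set[str]]) -> List[Tuple[str, ...]]:
    """All orderings in which every patch comes after the patches it depends on."""
    valid = []
    for order in permutations(patches):
        position = {p: i for i, p in enumerate(order)}
        if all(position[d] < position[p]
               for p in order
               for d in depends_on.get(p, set())):
            valid.append(order)
    return valid

topics = ["topicA", "topicB", "topicC"]

# Three features that don't interact: every order is a valid history.
print(len(linearizations(topics, {})))  # 6

# Hypothetical dependency: topicB depends on topicA, so A must come first.
for order in linearizations(topics, {"topicB": {"topicA"}}):
    print(" --- ".join(("history",) + order))
```

With the dependency added, only the three orders that place topicA before topicB survive, which is the sense in which dependent patches do not commute.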

Aaron Denney wrote:
On 2008-06-04, apfelmus wrote:
<-- cut -->
Or the next->topic path relies on features from next that are not present in master. But then, you're screwed anyway
Yep.
Well, not really; it depends on what kind of dependency it is. This kind of rebase is useful when "topic" depends only syntactically (as you pointed out later) on "next" or when the semantic dependency is only on a small part of "next". Git rebase allows you to get the syntax or the small part of semantics into the rebased "topic" by asking you for (manual) conflict resolution. This would correspond to commuting darcs patches which depend on each other (again possible by providing manual conflict resolution). Of course this happens only when it was anticipated that the upstream merge of "next" happens before "topic", but then the upstream maintainers decided that "topic" should go upstream first. So, not often.
and should merge some parts from next into master so as to advance the point where master and next fork.
That's one solution. Of course, darcs doesn't have semantic dependency, but syntactic dependency. (You can add extra dependencies to model semantic dependencies, but you can't take away the syntactic dependencies.) Another solution, if there's syntactic, but not semantic dependencies, is to manually use patch and diff to get 90% there, and then cleanup and record.
OK, so I think this is what I expected for such a case.

Thanks for the explanation of the meaning of "merging patches prior head".

Peter.

On 2008-06-04, Peter Hercek wrote:
But what about this git rebasing option? How to do it more easily (than the solution I know and I described it later) in darcs?
using "git-rebase --onto master next topic" to get from:

    o---o---o---o---o  master
         \
          o---o---o---o---o  next
                           \
                            o---o---o  topic

to:

    o---o---o---o---o  master
        |            \
        |             o'--o'--o'  topic
         \
          o---o---o---o---o  next

apfelmus answered this. I might expand on his reply.
There is one thing that git rebase does easily (and correctly) that darcs doesn't do nicely: rewriting history by merging commits "prior" to the head. I put prior in quotes, because darcs doesn't preserve history in the first place. I don't find that a compelling use, as opposed to maintaining topic branches.
I do not know what you mean here. Can you point me to some example?
Letting capitals be commits, and lowercase be trees at the point of these commits. Suppose your history is:

    A -> B -> C -> D
    |    |    |    |
    a    b    c    d

And that B somehow doesn't make sense except with the additional changes in C. You don't want to deal with this, or have anyone see B. All it does is clutter up the history. So you want to expunge it from the history. git rebase can rewrite this to

    A ------> C' -> D'
    |         |     |
    a         c     d

Doing this in darcs would require unrecording B and C, and then rerecording C'. But, if D is in the repo, then it is likely that B and C can't be commuted past it to be unrecorded. (If they can, no problem!) Unrecording D (and possibly E, F, G, etc.) lets you do this, but if you then pull it back from another repo, it will depend on B and C, and pull these in, which are now doppelgangers of C'. Not having used darcs 2, I'm not sure if that's still quite so fatal, but it remains bad news AIUI.

The bottom line is that darcs is a tool for managing sets of always existing patches, and ordering them lazily, as needed. In particular, no history generally exists, unless each patch depends on exactly one previous. It has a "differential" view of software development, in that the changes, and not the sum at each point matter (though of course, the current sum does matter.)

On the other hand, git is a tool for managing (and munging) histories of development in many weird and wacky ways. It has an "integral" view of software development, the changes are lazily derived from the saved state at each point, and are strictly ordered even when they're independent. It can, when needed, work with these changes to accomplish fairly interesting history-altering tasks, but as soon as they're used to construct a new history, they're discarded. (Yes, git uses deltas, but this is "merely" an optimization.)

The two models are dual to each other in many ways.

--
Aaron Denney
-><-
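
The "integral" versus "differential" distinction, and the A, B, C, D example, can be made concrete with a small Python sketch. It is only an illustration under assumptions: trees are dictionaries from file name to contents, a patch records a before/after pair per file, and the file names and contents are hypothetical. Neither git nor darcs stores data this way.

```python
from typing import Dict, Optional, Tuple

Tree = Dict[str, str]                                   # file name -> contents
Patch = Dict[str, Tuple[Optional[str], Optional[str]]]  # file -> (before, after)

def diff(old: Tree, new: Tree) -> Patch:
    """Derive the change between two snapshots (None means 'file absent')."""
    return {f: (old.get(f), new.get(f))
            for f in sorted(set(old) | set(new))
            if old.get(f) != new.get(f)}

def apply_patch(tree: Tree, patch: Patch) -> Tree:
    """Apply a change to a tree, refusing if the 'before' side does not match."""
    result = dict(tree)
    for name, (before, after) in patch.items():
        if result.get(name) != before:
            raise ValueError(f"patch does not apply to {name}")
        if after is None:
            result.pop(name, None)
        else:
            result[name] = after
    return result

# Hypothetical trees a, b, c, d; b is the state that makes no sense on its own.
a = {"README": "v1"}
b = {"README": "v1", "util.py": "half-finished"}
c = {"README": "v1", "util.py": "finished"}
d = {"README": "v2", "util.py": "finished"}

# "Integral" view: keep the snapshots, derive the changes on demand.
snapshots = [a, b, c, d]
B, C, D = (diff(x, y) for x, y in zip(snapshots, snapshots[1:]))

# "Differential" view: keep the changes, sum them up to get the current tree.
state = a
for change in (B, C, D):
    state = apply_patch(state, change)
print(state == d)  # True

# Dropping snapshot b from the history: trees c and d stay the same,
# C' is simply the change from a to c, and D' carries the same change as D.
C2 = diff(a, c)
D2 = diff(c, d)
print(apply_patch(apply_patch(a, C2), D2) == d)  # True
print(D2 == D)  # True
```

In the snapshot model, removing B is just a matter of leaving tree b out and re-deriving the changes, which matches the observation above that git handles this rewrite easily.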

Loup Vaillant wrote:
2008/6/3 Darrin Thompson:
On Sun, Jun 1, 2008 at 2:44 PM, Bertram Felgenhauer wrote:
Hi,
I'm pleased to announce yet another tool for importing darcs repositories to git. Unlike darcs2git [1] and darcs-to-git [2], it's written in Haskell, on top of the darcs2 source code. The result is a much faster program - it can convert the complete ghc 6.9 branch (without libraries) in less than 15 minutes on my slightly dated machine (Athlon XP 2500+), which is quite fast [3]. Incremental updates work, too.
What's the appeal of this? I personally love git, but I thought all the cool kids at this school used darcs and that was that.
Disclaimer: I'm no expert, this is what I've heard. Anyone please confirm or deny the following?
I've never been a cool kid at school, but I switched from Darcs to Git recently. I have not regretted it.

Git has quite a few features Darcs doesn't by now, and there is a little bit (but not much) in the other direction. That and the lack of the idempotent merge bug.

Git's interface has really cleaned up in the last year, and it seems to be well on the way to becoming the de facto DVCS of choice. Maybe next week, when it's picked up the last of the superdelegates, we can say for sure, but of course bzr won't concede anything at this point.... (OK, so we've had mind-numbing election coverage here in the US for too long)

I've blogged about this. http://changelog.complete.org/plugin/tag/git will get you most of the relevant posts.

-- John

Darrin Thompson wrote:
On Sun, Jun 1, 2008 at 2:44 PM, Bertram Felgenhauer wrote:
I'm pleased to announce yet another tool for importing darcs repositories to git. [...]
What's the appeal of this? I personally love git, but I thought all the cool kids at this school used darcs and that was that.
For myself, git-darcs-import itself is an opportunity to learn more about both darcs and git. It wasn't meant to be an argument in the git vs. darcs discussion, although it was inevitable that it would be seen as such.

I really like darcs' concepts, but in my opinion, darcs doesn't get enough power out of the theory of patches to really shine so far. This is a hard problem, and I can't offer solutions. Ideally, you'd have semantic patches which just commute with virtually all other patches because they "know" what they are about. The only thing that darcs offers in that direction - besides handling conflicts, mergers and undos gracefully, which is quite useful in itself - is a keyword substitution patch type.

In the meantime, I prefer git to darcs, mainly because I'm sort of attached to seeing the development history, i.e. I prefer to think of patches as (partially) ordered instead of being a cloud of patches that darcs uses as a model.

Bertram

participants (8)
- Aaron Denney
- apfelmus
- Bertram Felgenhauer
- Darrin Thompson
- John Goerzen
- Loup Vaillant
- Peter Hercek
- Thomas Schilling