
I'm sure this must be a VFAQ, but... There seems to be universal agreement that Darcs is a nice idea, but is unsuitable for "real" projects. Even GHC keeps talking about getting rid of Darcs. Can anybody tell me what the "problems" with Darcs actually are?

On Thu, Apr 21, 2011 at 1:29 PM, Andrew Coppin wrote:
I'm sure this must be a VFAQ, but... There seems to be universal agreement that Darcs is a nice idea, but is unsuitable for "real" projects. Even GHC keeps talking about getting rid of Darcs. Can anybody tell me what the "problems" with Darcs actually are?
It's been documented in the GHC discussions, on reddit, and elsewhere. I would encourage you to look at the darcs-users mailing list archives and the ghc archives. My personal summary is as follows:
* There is religion/fan-boy-ism around git, and in general vcs is subject to network effects.
* Github enables a level of collaboration that is hard to get with darcs. Some people say this as: Github is the best thing about git.
* Performance concerns (for example, darcs annotate needs too much time/memory).
* Conflict merging issues (patch theory has flaws that lead to exponential time merges).
Darcs has some additional flaws that people complain about, but which I don't think are core to the issue:
* Conflict markers are hard to understand
* patches as a set instead of linear history (patch soup complaints)
* It's written in Haskell
* It's not popular enough
* People say they just don't get patch theory
I hope that helps,
Jason

My chief complaint is that it's built on "patch theory", which is ill-defined and doesn't seem particularly useful. The Bazaar/Git/Mercurial DAG model is much easier to understand and work with.
Possibly as a consequence of its shaky foundation, Darcs is much slower than the competition -- this becomes noticeable for even very small repositories, when doing a lot of branching and merging.
I think it's been kept alive in the Haskell community out of pure "eat our dogfood" instinct; IMO if having a VCS written in Haskell is important, it would be better to just write a new implementation of an existing tool. Of course, nobody cares that much about what language their VCS is written in, generally.
Beyond that, the feeling I get of the three major DVCS alternatives is:
git: Used by Linux kernel hackers, and Rails plugin developers who think they're more important than Linux kernel hackers
hg/bzr: Used by people who don't like git's UI, and flipped heads/tails when picking a DVCS (hg and bzr are basically equivalent)

On Thu, Apr 21, 2011 at 7:16 PM, John Millikin wrote:
My chief complaint is that it's built on "patch theory", which is ill-defined and doesn't seem particularly useful. The Bazaar/Git/Mercurial DAG model is much easier to understand and work with.
For me its greatest asset is the patch theory. I find it much easier to understand than git's model. I don't know about hg/bzr, but I guess they have a model similar to git's. So I guess this point is up for debate. =)
Cheers,
-- 
Felipe.

John Millikin wrote:
My chief complaint is that it's built on "patch theory", which is ill-defined and doesn't seem particularly useful. The Bazaar/Git/Mercurial DAG model is much easier to understand and work with.
Possibly as a consequence of its shaky foundation, Darcs is much slower than the competition -- this becomes noticeable for even very small repositories, when doing a lot of branching and merging.
I have two projects: one has about 50k lines of C code that's kept in Bzr and the other has 50k lines of Haskell code that's kept in Darcs. They both have similar sized commit and branch histories. I find the speed of Bzr and Darcs on those two projects to be pretty much the same. Most operations on a local repo take well under 5 seconds. Git may be faster, but if it's under 5 seconds, who cares?
Erik
-- 
Erik de Castro Lopo
http://www.mega-nerd.com/

Um, the patch theory is what makes darcs "just work". There is no need
to understand it any more than you have to know VLSI design to
understand how your computer works. The end result is that darcs
repositories don't get corrupted and the order you integrate patches
doesn't affect things meaning cherrypicking is painless.
I think the main problem with patch theory is with its PR. It is a
super cool algorithm and rightly droundy should be proud of it so he
highlighted it. I think this caused people to think they had to
understand the patch theory rather than just sit back and enjoy it.
Incidentally, I wrote a GitHub-like site based around darcs a few
years ago at codehole.org. It is just used internally by me for
certain projects, but if people were interested, I could resume work
on it and make it public.
John
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

On Thu, 2011-04-21 at 16:16 -0700, John Meacham wrote:
Um, the patch theory is what makes darcs "just work". There is no need to understand it any more than you have to know VLSI design to understand how your computer works. The end result is that darcs repositories don't get corrupted and the order you integrate patches doesn't affect things meaning cherrypicking is painless.
While I appreciate the patch theory, I don't think darcs fits the workflow of at least some people. Assume the following changes:
1. Feature X - file x.hs
2. Feature Y - files y.hs and x.hs
3. Feature Z - files z.hs and x.hs
4. Fix to feature Y (changes x.hs)
5. Fix to feature X (changes x.hs)
Now before pushing I would like to have 3 nice commits. In git I can rewrite history with a single command:
# git rebase -i origin/master
and edit the file to look like
pick 1
fixup 5
pick 2
fixup 4
pick 3
Manually/automatically check everything is ok.
Regards

On Thu, Apr 21, 2011 at 6:32 PM, Maciej Marcin Piechotka wrote:
Assume following changes
1. Feature X - file x.hs
2. Feature Y - file y.hs and x.hs
3. Feature Z - file z.hs and x.hs
4. Fix to feature Y (changes x.hs)
5. Fix to feature X (changes x.hs)
Now before pushing I would like to have 3 nice commits. In git I can rewrite history by single command:
# git rebase -i origin/master
and edit the file to look like
pick 1
fixup 5
pick 2
fixup 4
pick 3
Manually/automatically check everything is ok.
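% darcs unrec -a -p 'Fix to feature X'
Finished unrecording.
% darcs amend -a -p 'Feature X'
Thu Apr 21 19:11:54 CDT 2011 Jake McArthur
  * Feature X
Shall I amend this patch? [yN...], or ? for more options: y
Finished amending patch:
Thu Apr 21 19:14:41 CDT 2011 Jake McArthur
  * Feature X
% darcs unrec -a -p 'Fix to feature Y'
Finished unrecording.
% darcs amend -a -p 'Feature Y'
Thu Apr 21 19:12:12 CDT 2011 Jake McArthur
  * Feature Y
Shall I amend this patch? [yN...], or ? for more options: y
Finished amending patch:
Thu Apr 21 19:14:50 CDT 2011 Jake McArthur
  * Feature Y
- Jake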

On Thu, 2011-04-21 at 19:19 -0500, Jake McArthur wrote:
On Thu, Apr 21, 2011 at 6:32 PM, Maciej Marcin Piechotka wrote:
Assume following changes
1. Feature X - file x.hs
2. Feature Y - file y.hs and x.hs
3. Feature Z - file z.hs and x.hs
4. Fix to feature Y (changes x.hs)
5. Fix to feature X (changes x.hs)
Now before pushing I would like to have 3 nice commits. In git I can rewrite history by single command:
# git rebase -i origin/master
and edit the file to look like
pick 1
fixup 5
pick 2
fixup 4
pick 3
Manually/automatically check everything is ok.
% darcs unrec -a -p 'Fix to feature X'
Finished unrecording.
% darcs amend -a -p 'Feature X'
Thu Apr 21 19:11:54 CDT 2011 Jake McArthur
  * Feature X
Shall I amend this patch? [yN...], or ? for more options: y
Finished amending patch:
Thu Apr 21 19:14:41 CDT 2011 Jake McArthur
  * Feature X
% darcs unrec -a -p 'Fix to feature Y'
Finished unrecording.
% darcs amend -a -p 'Feature Y'
Thu Apr 21 19:12:12 CDT 2011 Jake McArthur
  * Feature Y
Shall I amend this patch? [yN...], or ? for more options: y
Finished amending patch:
Thu Apr 21 19:14:50 CDT 2011 Jake McArthur
  * Feature Y
- Jake
Last time I checked it disallowed me, as 5 depended on 4, which depended
on 3, which depended on 2, which depended on 1, as all changed x.hs:
Fri Apr 22 02:30:03 CEST 2011 Maciej Piechotka

On Thu, Apr 21, 2011 at 7:31 PM, Maciej Marcin Piechotka wrote:
Last time I checked it disallowed me, as 5 depended on 4, which depended on 3, which depended on 2, which depended on 1, as all changed x.hs
Merely changing the same file is not sufficient for that. In order for Darcs to say patch A depends on patch B they must change the same lines.
That said, rebase has its uses. It's due in an upcoming version of Darcs, actually.
- Jake
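As a toy illustration of the rule described above, the sketch below models each patch as a single hunk touching a range of lines in one file, and treats two patches as dependent only when they touch the same file and their ranges overlap. The Hunk class and the depends_on function are hypothetical names invented for this sketch, and line numbers are assumed to stay fixed. Real Darcs decides dependencies by trying to commute patches, which is considerably more subtle than a range check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hunk:
    """A hypothetical, simplified patch: one edit to a line range of a file."""
    name: str
    path: str
    first_line: int  # inclusive
    last_line: int   # inclusive


def depends_on(later: Hunk, earlier: Hunk) -> bool:
    """Toy rule (an assumption of this sketch, not Darcs' algorithm):
    `later` depends on `earlier` only if both touch the same file AND
    their line ranges overlap. Touching the same file is not enough."""
    if later.path != earlier.path:
        return False
    return (later.first_line <= earlier.last_line
            and earlier.first_line <= later.last_line)


feature_x = Hunk("Feature X", "x.hs", 1, 10)
feature_y = Hunk("Feature Y", "x.hs", 20, 30)
fix_x = Hunk("Fix to feature X", "x.hs", 5, 6)

print(depends_on(feature_y, feature_x))  # False: same file, disjoint lines
print(depends_on(fix_x, feature_x))      # True: the fix edits Feature X's lines
print(depends_on(fix_x, feature_y))      # False: the fix can skip over Feature Y
```

Under this simplified rule, "Fix to feature X" can be moved next to "Feature X" even though "Feature Y" also changed x.hs in between.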

On Thu, 2011-04-21 at 19:39 -0500, Jake McArthur wrote:
On Thu, Apr 21, 2011 at 7:31 PM, Maciej Marcin Piechotka wrote:
Last time I checked it disallowed me, as 5 depended on 4, which depended on 3, which depended on 2, which depended on 1, as all changed x.hs
Merely changing the same file is not sufficient for that. In order for Darcs to say patch A depends on patch B they must change the same lines.
Or nearby lines...
That said, rebase has its uses. It's due in an upcoming version of Darcs, actually.
- Jake
Great
Regards

+1 to what you said.
On 4/21/11 4:16 PM, John Meacham wrote:
Incidentally, I wrote a github like site based around darcs a few years ago at codehole.org. It is just used internally by me for certain projects. but if people were interested, I could resume work on it and make it public.
John, please do - darcs folks are longing for a really good hub. You're probably aware of patch-tag and darcsden - perhaps you can exceed or reuse those? Both are good but have maintainers now focussed elsewhere. Running a robust scalable public darcs hub is difficult. I think darcs developers are keen to help anyone working on that. FYI codehole seems access-restricted.
-Simon

Codehole doesn't sound like a good name. Don't lose stuff in codehole!

On 4/21/11 10:33 PM, Simon Michael wrote:
+1 to what you said.
On 4/21/11 4:16 PM, John Meacham wrote:
Incidentally, I wrote a github like site based around darcs a few years ago at codehole.org. It is just used internally by me for certain projects. but if people were interested, I could resume work on it and make it public.
John, please do - darcs folks are longing for a really good hub. You're probably aware of patch-tag and darcsden - perhaps you can exceed or reuse those ? Both are good but have maintainers now focussed elsewhere.
Agreed. I'd love to see a good code host for darcs. Actually, for my uses, the code hosting itself isn't that important ---since darcs can fabulously be hosted from any http server--- but rather, what I'd like is someplace to keep my code which also provides a good bugtracker. Unfortunately, neither darcsden nor patchtag offer bugtrackers (AFAIK) and neither github, bitbucket, nor googlecode offer darcs.
-- 
Live well,
~wren

On 4/22/11 11:39 AM, Simon Michael wrote:
On 4/21/11 10:16 PM, wren ng thornton wrote:
rather, what I'd like is someplace to keep my code which also provides a good bugtracker. Unfortunately, neither darcsden nor patchtag offer
darcsden does include a simple issue tracker now.
Ah, excellent. I'll have to take a look again.
-- 
Live well,
~wren

On Thursday, April 21, 2011 4:16:07 PM UTC-7, John Meacham wrote:
Um, the patch theory is what makes darcs "just work". There is no need to understand it any more than you have to know VLSI design to understand how your computer works. The end result is that darcs repositories don't get corrupted and the order you integrate patches doesn't affect things meaning cherrypicking is painless.
This is how it's *supposed* to work. My chief complaints with PT are:
- Metadata about branches and merges gets lost. This makes later examination of the merge history impossible, or at least unfeasibly difficult.
- Every commit needs --ask-deps, because the automatic dependency detector can only detect automatic changes (and not things like adding a new function in a different module).
- The order patches are integrated still matters (it's impossible for it to not matter), but there's no longer any direct support for ordering them, so large merges become very manual.
- If you ever merge in the wrong order, future merges will begin consuming more and more CPU time until the repository "dies". Undoing this requires using darcs-fastconvert and performing manual surgery on the export files.

On Thu, Apr 21, 2011 at 8:42 PM, John Millikin wrote:
On Thursday, April 21, 2011 4:16:07 PM UTC-7, John Meacham wrote:
Um, the patch theory is what makes darcs "just work". There is no need to understand it any more than you have to know VLSI design to understand how your computer works. The end result is that darcs repositories don't get corrupted and the order you integrate patches doesn't affect things meaning cherrypicking is painless.
This is how it's *supposed* to work. My chief complaints with PT are:
- Metadata about branches and merges gets lost. This makes later examination of the merge history impossible, or at least unfeasibly difficult.
That's not an issue with patch theory though. Darcs could still track that and I believe some people have been playing with the idea.
- Every commit needs --ask-deps , because the automatic dependency detector can only detect automatic changes (and not things like adding a new function in a different module)
You mean it can only detect dependencies that depend on each other with respect to a diff of the changes. Detecting most anything else would be undecidable in the general case. As a divergent data point, I've been using darcs since 2003 and I have yet to use --ask-deps except to learn how it works.
- The order patches are integrated still matters (it's impossible for it to not matter), but there's no longer any direct support for ordering them, so large merges become very manual.
Can you give an example where you need to control the order of the changes in a merge with git/bzr/svn/etc but that it was not possible with darcs? I'm trying to understand what you mean.
- If you ever merge in the wrong order, future merges will begin consuming more and more CPU time until the repository "dies". Undoing this requires using darcs-fastconvert and performing manual surgery on the export files.
Yes, this is true. Exponential merges still exist, although they are relatively rare with a darcs-2 formatted repository.
Jason

Jason Dagit wrote:
* Every commit needs --ask-deps , because the automatic dependency detector can only detect automatic changes (and not things like adding a new function in a different module)
You mean it can only detect dependencies that depend on each other with respect to a diff of the changes. Detecting most anything else would be undecidable in the general case. As a divergent data point, I've been using darcs since 2003 and I have yet to use --ask-deps except to learn how it works.
I think that other version control systems just assume that a new patch depends on _all_ existing patches. Mentally I am still comparing Darcs with Subversion, that is, what my colleagues use. Thus I am even happy with Darcs-1. :-) Sure, sometimes I have to re-record several patches, when I forgot to download latest patches from a repository before recording my own ones. My rule of thumb is, that I must avoid merges at all costs.

On 21/04/2011 11:16 PM, John Millikin wrote:
My chief complaint is that it's built on "patch theory", which is ill-defined and doesn't seem particularly useful. The Bazaar/Git/Mercurial DAG model is much easier to understand and work with.
Possibly as a consequence of its shaky foundation, Darcs is much slower than the competition -- this becomes noticeable for even very small repositories, when doing a lot of branching and merging.
I think it's been kept alive in the Haskell community out of pure "eat our dogfood" instinct; IMO if having a VCS written in Haskell is important, it would be better to just write a new implementation of an existing tool. Of course, nobody cares that much about what language their VCS is written in, generally.
Ah, how silly of me. I should have known a question like this was highly likely to provoke a flamewar.
I had assumed that the way Darcs works is *the definition of* what "distributed version control" is. So it was a bit of a shock to read about how Git works, and discover that it does it totally wrong. So I went and read about Mercurial and all the others, and discovered that they all do it wrong too.
Given that the way Darcs works is so superior to the way everything else works, I was just puzzled as to why even GHC is trying to get rid of it.
It seems the answer is some combination of "performance issues" (I've never seen any) and "reliability issues" (which again I've never come across).

I'm a great fan of darcs, and also have never run into the performance and
reliability issues that GHC has. That said, it's clear that they *have* run
into them, and if something else makes GHC development go more smoothly,
then I'm 100% supportive of their using it.
It is disappointing, though, that (I agree with you here) git and others have
a fundamentally bad model for performing the task. They chose that model
for pragmatic reasons... it's operationally clearer, even if the meaning of
things is a bit more muddled. Making a working and pragmatic version
control system using a darcs-ish model is simply a harder job than doing the
same in the git/hg way. I use darcs whenever I can, and think they have
done an excellent job by and large; but you won't find a single darcs
developer who thinks they have completely accomplished the task.

Andrew Coppin wrote:
I'm sure this must be a VFAQ, but... There seems to be universal agreement that Darcs is a nice idea, but is unsuitable for "real" projects.
I'm not sure what constitutes a "real" project, but I have found Darcs to be more than satisfactory for Ben Lippmeier's DDC compiler project. That's 50k lines of Haskell code with a commit history of 2500+ commits.
I also use other VCSs: in order of frequency, Bzr, SVN, Git, Darcs and Hg. However, my order of preference is Bzr, Darcs, Hg, Git and then SVN. The only reason I slightly prefer Bzr over Darcs is that Bzr has a slightly easier and more intuitive (at least for me) user interface. However, I do find Bzr (written in Python) slightly fragile in that I occasionally get a huge Python backtrace when something blows up.
Erik
-- 
Erik de Castro Lopo
http://www.mega-nerd.com/

On Thu, Apr 21, 2011 at 3:29 PM, Andrew Coppin wrote:
I'm sure this must be a VFAQ, but... There seems to be universal agreement that Darcs is a nice idea, but is unsuitable for "real" projects. Even GHC keeps talking about getting rid of Darcs. Can anybody tell me what the "problems" with Darcs actually are?
Yi, a fairly large and old repository, recently moved to (primarily) Git. Our motivation was not flaws in Darcs, but rather GitHub.
-- 
Jeff Wheeler
Undergraduate, Electrical Engineering
University of Illinois at Urbana-Champaign

On Thu, 2011-04-21 at 21:29 +0100, Andrew Coppin wrote:
I'm sure this must be a VFAQ, but... There seems to be universal agreement that Darcs is a nice idea, but is unsuitable for "real" projects. Even GHC keeps talking about getting rid of Darcs. Can anybody tell me what the "problems" with Darcs actually are?
I believe the biggest problem was (i.e. when migration started) that there is no big-name-hosting supporting darcs. When code.haskell.org went down people were cut off from code. Regards

On 04/22/11 01:34 AM, Maciej Marcin Piechotka wrote:
On Thu, 2011-04-21 at 21:29 +0100, Andrew Coppin wrote:
I'm sure this must be a VFAQ, but... There seems to be universal agreement that Darcs is a nice idea, but is unsuitable for "real" projects. Even GHC keeps talking about getting rid of Darcs. Can anybody tell me what the "problems" with Darcs actually are?
I believe the biggest problem was (i.e. when migration started) that there is no big-name-hosting supporting darcs. When code.haskell.org went down people were cut off from code.
http://www.patch-tag.com/ is not enough for you?
Cheers,
Karel

Maciej> I believe the biggest problem was (i.e. when migration started)
Maciej> that there is no big-name-hosting supporting darcs. When
Maciej> code.haskell.org went down people were cut off from code.
Please forgive me if the answer is obvious: is Darcs storage "backend agnostic", or must it really store things in the local filesystem? AFAIK, the BitBucket team patched mercurial (internally) to allow storage on Amazon S3. I don't want to advertise for a particular key-value storage provider, but most of them are quite reliable and cheap. Designing a hub that stores everything in such a persistent key-value store should not be too hard to get right, reliable and distributed. Hosting it would also be cheap.
-- 
Paul

Good chance you've already read this but if not here is a good post by
Linus about his take on the problems with darcs:
http://markmail.org/message/vk3gf7ap5auxcxnb
I personally think he is right on the money here. The other problem
with Darcs is performance. While it has improved a lot, it's still not
good enough. When GHC was using darcs you couldn't use the annotate
command because it took far too long to run. If you can't use certain
features of your vcs because of performance, it's a big fail. GHC isn't
even really that large a code base; imagine trying to use darcs for,
say, the Linux kernel. I also don't think darcs handles branches and
merging well enough.
Cheers,
David

David Terei wrote:
Good chance you've already read this but if not here is a good post by Linus about his take on the problems with darcs:
I always have to smile at the complaint that something is "academic". :D
You know, like purely functional programming, that's soo academic. It's centered around some academic ideas, like mathematical functions, higher-rank types, monads and zygohistomorphic prepromorphisms, that have absolutely no relevance in real life, and that just don't work in practice. You do *not* want to write whole programs that way. At some point, you need something that works at another level than pure functions. What the *hell* do you do?
I think a better invective would be "amazing".
Best regards,
Heinrich Apfelmus
-- 
http://apfelmus.nfshost.com

On Sat, 23 Apr 2011, Heinrich Apfelmus wrote:
David Terei wrote:
Good chance you've already read this but if not here is a good post by Linus about his take on the problems with darcs:
I always have to smile at the complaint that something is "academic". :D
You know, like purely functional programming, that's soo academic. It's centered around some academic ideas, like mathematical functions, higher-rank types, monads and zygohistomorphic prepromorphisms, that have absolutely no relevance in real life, and that just don't work in practice. You do *not* want to write whole programs that way. At some point, you need something that works at another level than pure functions. What the *hell* do you do?
I also found the introduction about 'darcs' being too academic quite silly. However, at the end of his invited rant Linus proposes a requirement (or may we call it an 'axiom'?) that would be nice to have: an identifier (a 'version') that can be uniquely mapped to a set of files and their contents. In Darcs this is the darcs history and it is usually the largest part of submitted darcs patches.

On Sat, 2011-04-23 at 12:31 +0200, Heinrich Apfelmus wrote:
David Terei wrote:
Good chance you've already read this but if not here is a good post by Linus about his take on the problems with darcs:
I always have to smile at the complaint that something is "academic". :D
You know, like purely functional programming, that's soo academic. It's centered around some academic ideas, like mathematical functions, higher-rank types, monads and zygohistomorphic prepromorphisms, that have absolutely no relevance in real life, and that just don't work in practice. You do *not* want to write whole programs that way. At some point, you need something that works at another level than pure functions. What the *hell* do you do?
I think a better invective would be "amazing".
Best regards, Heinrich Apfelmus
To be fair, he "realize[s] that's a pretty weak flame, and I'm sorry". He gives a reason why he thinks it has no meaning in real life, like: "Fundamental example: somebody has a problem/bug. Tell me how to tell a developer what his exact version is - without creating new tags, and without having to synchronize the archives. Just tell the developer what version he is at."
I'm not a native English speaker, but there are 2 reasons why we may assume that he does not think "academic == irrelevant":
- quotes around a word usually denote a non-literal meaning (in this context)
- IIRC "that" can be used only in defining relative clauses.
Therefore he does not think (or he might not think) that academic ideas in general have no meaning in real life, only those particular ideas. I'm not saying that he's right, but he implied much less than one would assume from the quote <<"academic">>.
Regards

I've discovered something interesting.
Darcs stores history as a partially-ordered set of changes. This is a beautiful and elegant idea. In theory, this lets me apply any combination of changes, possibly generating file "versions" which have never actually existed before. (E.g., the new type checker from GHC 7.0 embedded in the GHC 6.6 codebase - not that I imagine it would compile, but in principle I can do it.)
So I was a little surprised to discover that... Darcs doesn't actually support doing this. Darcs is only really interested in the result of applying *all* changes in a repo. If you want to apply some subset of changes, you need to make a separate repo containing only the changes you want applied.
It seems daft to me that you would design a sophisticated system for splitting history into independent chunks, and then not let me manipulate them independently.
(If you think about it, the difference between, say, GHC 7.0 and GHC 6.6 is which set of changes are applied. Yet because Darcs doesn't support looking at it like this, you must have a completely separate repo for each one...)

On Sun, 24 Apr 2011, Andrew Coppin wrote:
(If you think about it, the difference between, say, GHC 7.0 and GHC 6.6 is which set of changes are applied. Yet because Darcs doesn't support looking at it like this, you must have a completely separate repo for each one...)
But darcs shares the patch files between repositories on the same file system using hard links.
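For readers unfamiliar with hard links, the following sketch shows the operating-system mechanism mentioned above: two directory entries pointing at one file, so the data is stored only once. The directory names, the file name, and the share_patch_file helper are made up for the example and do not reflect Darcs' actual on-disk layout; the sketch assumes a POSIX-style filesystem that supports hard links.

```python
import os
import tempfile


def share_patch_file(src: str, dst: str) -> None:
    """Create `dst` as a hard link to `src` (no file data is copied)."""
    os.makedirs(os.path.dirname(dst), exist_ok=True)
    os.link(src, dst)


with tempfile.TemporaryDirectory() as root:
    # Hypothetical layout: two "repositories" side by side on one filesystem.
    original = os.path.join(root, "repo-a", "patches", "0001.patch")
    shared = os.path.join(root, "repo-b", "patches", "0001.patch")
    os.makedirs(os.path.dirname(original))
    with open(original, "w") as handle:
        handle.write("hunk ./x.hs 1\n")

    share_patch_file(original, shared)

    # Both names refer to the same inode, and the link count reflects that.
    same_inode = os.stat(original).st_ino == os.stat(shared).st_ino
    link_count = os.stat(original).st_nlink
    print(same_inode, link_count)
```

Hard links only work within a single filesystem, which matches the "on the same file system" qualification in the message above.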

On Sun, Apr 24, 2011 at 2:05 AM, Andrew Coppin wrote:
I've discovered something interesting.
Darcs stores history as a partially-ordered set of changes. This is a beautiful and elegant idea. In theory, this lets me apply any combination of changes, possibly generating file "versions" which have never actually existed before. (E.g., the new type checker from GHC 7.0 embedded in the GHC 6.6 codebase - not that I imagine it would compile, but in principle I can do it.)
So I was a little surprised to discover that... Darcs doesn't actually support doing this. Darcs is only really interested in the result of applying *all* changes in a repo. If you want to apply some subset of changes, you need to make a separate repo containing only the changes you want applied.
It seems daft to me that you would design a sophisticated system for splitting history into independent chunks, and then not let me manipulate them independently.
This is because of a deliberate choice that was made by David Roundy. In darcs, you never have multiple branches within a single darcs repository directory tree.
To get the effect you want, you simply create two repositories. One having only the patches for ghc 6.6 and one having the patches of ghc 7.0 and then you pull just the patches you want from 7.0 into 6.6. There are options to 'darcs get' that help you select the right set of patches to help you create the two repositories.
If you're interested in the details of how to do it, I would suggest asking on the darcs-users mailing list.
Jason
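A rough model of what "pull just the patches you want" involves: a patch can only be applied together with everything it depends on, so selecting one patch means taking its whole dependency closure. The patch names, the DEPENDS_ON table, and the closure function below are invented for illustration; Darcs works out dependencies itself rather than reading them from a table like this.

```python
from typing import Dict, Set

# Hypothetical dependency table: patch -> patches it directly depends on.
DEPENDS_ON: Dict[str, Set[str]] = {
    "base": set(),
    "old-typechecker": {"base"},
    "new-typechecker": {"base"},
    "typechecker-fix": {"new-typechecker"},
    "unrelated-docs": {"base"},
}


def closure(wanted: Set[str], depends_on: Dict[str, Set[str]]) -> Set[str]:
    """Return `wanted` plus every patch reachable through dependencies."""
    result: Set[str] = set()
    pending = list(wanted)
    while pending:
        patch = pending.pop()
        if patch in result:
            continue  # already collected via another path
        result.add(patch)
        pending.extend(depends_on[patch])
    return result


to_pull = closure({"typechecker-fix"}, DEPENDS_ON)
print(sorted(to_pull))  # ['base', 'new-typechecker', 'typechecker-fix']
```

In this toy model, asking for "typechecker-fix" drags in "new-typechecker" and "base" but leaves "unrelated-docs" and "old-typechecker" behind, which is the kind of subset a second repository would hold.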

On 24/04/2011 06:33 PM, Jason Dagit wrote:
On Sun, Apr 24, 2011 at 2:05 AM, Andrew Coppin wrote:
So I was a little surprised to discover that... Darcs doesn't actually support doing this. Darcs is only really interested in the result of applying *all* changes in a repo.
It seems daft to me that you would design a sophisticated system for splitting history into independent chunks, and then not let me manipulate them independently.
This is because of a deliberate choice that was made by David Roundy. In darcs, you never have multiple branches within a single darcs repository directory tree.
Yes, this seems clear. I'm just wondering whether or not it's the best design choice.
To get the effect you want, you simply create two repositories. One having only the patches for ghc 6.6 and one having the patches of ghc 7.0 and then you pull just the patches you want from 7.0 into 6.6. There are options to 'darcs get' that help you select the right set of patches to help you create the two repositories.
It does mean that you duplicate information. You have [nearly] the same set of patches stored twice, and you're not really storing the history of the relationship between two branches, only the history of the branch itself.

On 25 Apr 2011, at 11:13, Andrew Coppin wrote:
On 24/04/2011 06:33 PM, Jason Dagit wrote:
This is because of a deliberate choice that was made by David Roundy. In darcs, you never have multiple branches within a single darcs repository directory tree.
Yes, this seems clear. I'm just wondering whether or not it's the best design choice.
It seems to me to be a considerable insight. Branches and repositories are the same thing. There is no need for two separate concepts. The main reason other VCSes have two concepts is because one of them is often more efficiently implemented (internally) than the other. But that's silly - how much better to abstract over the mental clutter, and let the implementation decide how its internals look! So in darcs, two repositories on the same machine share the same files (like a branch), but if they are on different machines, they have separate copies of the files. The difference is a detail that you really don't need to know or care about.
It does mean that you duplicate information. You have [nearly] the same set of patches stored twice,
No, if on the same machine, the patches only appear once, it is just the index that duplicates some information (I think). In fact just as if it were a branch in another VCS. Regards, Malcolm
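For readers unfamiliar with hard links, the following Python sketch shows the general mechanism being described: two directory entries that refer to one stored file. It says nothing about darcs internals; the directory and file names are made up for the example, and whether hard links are available depends on the filesystem.

```python
import os
import tempfile


def share_by_hard_link(source: str, link_name: str) -> bool:
    """Create link_name as a hard link to source and report whether
    both names now refer to the same underlying file."""
    os.link(source, link_name)
    return os.path.samefile(source, link_name)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        # Two hypothetical "repositories" side by side on one filesystem.
        repo_a = os.path.join(root, "repoA")
        repo_b = os.path.join(root, "repoB")
        os.mkdir(repo_a)
        os.mkdir(repo_b)

        patch_a = os.path.join(repo_a, "patch-0001")
        with open(patch_a, "w") as handle:
            handle.write("hunk ./foo.hs 1\n")

        patch_b = os.path.join(repo_b, "patch-0001")
        print(share_by_hard_link(patch_a, patch_b))  # same file under two names
        print(os.stat(patch_a).st_nlink)             # number of names for the file
```

The file contents are stored once; each directory just holds a name for them.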

This is because of a deliberate choice that was made by David Roundy. In darcs, you never have multiple branches within a single darcs repository directory tree.
Yes, this seems clear. I'm just wondering whether or not it's the best design choice.
It seems to me to be a considerable insight.
Presumably David thought the same. I won't deny that there is a certain simplifying elegance to it.
It does mean that you duplicate information. You have [nearly] the same set of patches stored twice,
No, if on the same machine, the patches only appear once, it is just the index that duplicates some information (I think). In fact just as if it were a branch in another VCS.
1. Conceptually, you have the same information twice.
2. I have no idea how to make Darcs do the thing with "hard links" (is that even supported under Windows?) I just copy the whole folder using the normal OS file tools.
Either way, you lose the ability to see how branches are related to each other, which might be useful in some cases.

On 26 April 2011 13:16, Andrew Coppin
2. I have no idea how to make Darcs do the thing with "hard links" (is that even supported under Windows?) I just copy the whole folder using the normal OS file tools.
darcs get path/to/other/local/repo
Either way, you lose the ability to see how branches are related to each other, which might be useful in some cases.
How do you "see" how git branches are related to each other? -- Ivan Lazar Miljenovic Ivan.Miljenovic@gmail.com IvanMiljenovic.wordpress.com

On Tue, Apr 26, 2011 at 6:35 AM, Ivan Lazar Miljenovic < ivan.miljenovic@gmail.com> wrote:
On 26 April 2011 13:16, Andrew Coppin wrote:
2. I have no idea how to make Darcs do the thing with "hard links" (is that even supported under Windows?) I just copy the whole folder using the normal OS file tools.
darcs get path/to/other/local/repo
More specifically than that. This is the workflow I follow with darcs repos. Say, that I want to get the Foo repo:
    mkdir ~/repos/Foo
    cd ~/repos/Foo
    darcs get http://example.com/Foo HEAD
    darcs get HEAD feature-branch
Then I can send the patches from feature-branch to the official Foo repo at any time. I can also merge them back into HEAD doing a darcs pull from feature-branch to HEAD. I think this is quite comparable to the git workflow.
Either way, you lose the ability to see how branches are related to each other, which might be useful in some cases.
How do you "see" how git branches are related to each other?
You can use gitk to see how the histories have interacted. Jason

On Tuesday 26 April 2011 15:35:42, Ivan Lazar Miljenovic wrote:
How do you "see" how git branches are related to each other?
To some extent, you can see such a relation in gitk. For mercurial, hg glog also shows a bit. I suppose there's also something to visualise branches in bazaar, but I've never used that, so I don't know. So, with gitk/glog, you can see that foo branched off bar after commit 0de8793fa1bc..., then checkout/update to that commit [or bar's head], checkout/update to foo's head/tip and compare. Or there could be some feature I've never heard of yet.

On 2011-04-26 15:51 +0200, Daniel Fischer wrote:
On Tuesday 26 April 2011 15:35:42, Ivan Lazar Miljenovic wrote:
How do you "see" how git branches are related to each other?
To some extent, you can see such a relation in gitk. For mercurial, hg glog also shows a bit. I suppose there's also something to visualise branches in bazaar, but I've never used that, so I don't know.
So, with gitk/glog, you can see that foo branched off bar after commit 0de8793fa1bc..., then checkout/update to that commit [or bar's head], checkout/update to foo's head/tip and compare.
No need to do a checkout; gitk can visualize any or all branches of the repository simultaneously. -- Nick Bowler, Elliptic Technologies (http://www.elliptictech.com/)

On Tuesday 26 April 2011 16:04:55, Nick Bowler wrote:
On 2011-04-26 15:51 +0200, Daniel Fischer wrote:
On Tuesday 26 April 2011 15:35:42, Ivan Lazar Miljenovic wrote:
How do you "see" how git branches are related to each other?
To some extent, you can see such a relation in gitk. For mercurial, hg glog also shows a bit. I suppose there's also something to visualise branches in bazaar, but I've never used that, so I don't know.
So, with gitk/glog, you can see that foo branched off bar after commit 0de8793fa1bc..., then checkout/update to that commit [or bar's head], checkout/update to foo's head/tip and compare.
No need to do a checkout; gitk can visualize any or all branches of the repository simultaneously.
Yes, at least if you're only interested in the genealogy. When I think about how branches are related, I think of contents at least as much as of genealogy. Can gitk show the code next to each other? I wouldn't be surprised, but I haven't yet found a way to do it (but I've only taken a couple of short looks, so that doesn't say much).

On Tue, 2011-04-26 at 16:34 +0200, Daniel Fischer wrote:
On Tuesday 26 April 2011 16:04:55, Nick Bowler wrote:
On 2011-04-26 15:51 +0200, Daniel Fischer wrote:
On Tuesday 26 April 2011 15:35:42, Ivan Lazar Miljenovic wrote:
How do you "see" how git branches are related to each other?
To some extent, you can see such a relation in gitk. For mercurial, hg glog also shows a bit. I suppose there's also something to visualise branches in bazaar, but I've never used that, so I don't know.
So, with gitk/glog, you can see that foo branched off bar after commit 0de8793fa1bc..., then checkout/update to that commit [or bar's head], checkout/update to foo's head/tip and compare.
No need to do a checkout; gitk can visualize any or all branches of the repository simultaneously.
Yes, at least if you're only interested in the genealogy. When I think about how branches are related, I think of contents at least as much as of genealogy. Can gitk show the code next to each other? I wouldn't be surprised, but I haven't yet found a way to do it (but I've only taken a couple of short looks, so that doesn't say much).
I cannot say for gitk, but gitview does it for sure: gitview --all --date-order Regards

2. I have no idea how to make Darcs do the thing with "hard links" (is that even supported under Windows?) I just copy the whole folder using the normal OS file tools.
darcs get path/to/other/local/repo
Either way, you lose the ability to see how branches are related to each other, which might be useful in some cases.
How do you "see" how git branches are related to each other?
git show-branch [branches]
It was one of the nicest things in git for me when I started to use it.
    xmms2-devel $ git show-branch
    ! [error-on-implicit] OTHER: wscript: make implicit function declarations an error in C code
     * [master] FEATURE(2184): Update pre-generated cython files.
      ! [missing-protos] OTHER: one more me in AUTHORS
    ---
    +   [error-on-implicit] OTHER: wscript: make implicit function declarations an error in C code
      + [missing-protos] OTHER: one more me in AUTHORS
      + [missing-protos^] OTHER: explicitely declare g_sprintf()
      + [missing-protos~2] OTHER: explicitely declare semtimedop()
    +*+ [master] FEATURE(2184): Update pre-generated cython files.
The simpler things are:
    git [log|diff] from..to
    git [log|diff] from...to
Set of [commits|changes] to be added to another branch. -- Sergei
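For anyone unsure what the two range notations select in `git log`, here is a small Python model over a made-up commit graph. It is a sketch of the set semantics only: `from..to` is taken as "reachable from `to` but not from `from`", and `from...to` as "reachable from exactly one of the two". The commit names and the graph are hypothetical, and `git diff` interprets the notations differently, which this sketch does not cover.

```python
from typing import Dict, List, Set

# Hypothetical history: each commit maps to its list of parents.
PARENTS: Dict[str, List[str]] = {
    "c1": [],
    "c2": ["c1"],
    "m1": ["c2"],  # only on master
    "f1": ["c2"],  # only on feature
    "f2": ["f1"],  # tip of feature
}


def reachable(tip: str, parents: Dict[str, List[str]]) -> Set[str]:
    """All commits reachable from tip by following parent links, tip included."""
    seen: Set[str] = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


def two_dot(frm: str, to: str, parents: Dict[str, List[str]]) -> Set[str]:
    """Model of `git log frm..to`: commits in to's history but not in frm's."""
    return reachable(to, parents) - reachable(frm, parents)


def three_dot(frm: str, to: str, parents: Dict[str, List[str]]) -> Set[str]:
    """Model of `git log frm...to`: commits in exactly one of the two histories."""
    return reachable(frm, parents) ^ reachable(to, parents)


if __name__ == "__main__":
    print(sorted(two_dot("m1", "f2", PARENTS)))    # what feature would add to master
    print(sorted(three_dot("m1", "f2", PARENTS)))  # where the two have diverged
```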

On Tue, Apr 26, 2011 at 3:16 PM, Andrew Coppin
Presumably David thought the same. I won't deny that there is a certain simplifying elegance to it.
It does mean that you duplicate information. You have [nearly] the same set of patches stored twice,
No, if on the same machine, the patches only appear once, it is just the index that duplicates some information (I think). In fact just as if it were a branch in another VCS.
1. Conceptually, you have the same information twice.
2. I have no idea how to make Darcs do the thing with "hard links" (is that even supported under Windows?) I just copy the whole folder using the normal OS file tools.
Either way, you lose the ability to see how branches are related to each other, which might be useful in some cases.
http://wiki.darcs.net/Ideas/Branches I would like to have in-place Darcs branches too, but not because of wasted space. It is nice to keep the complete history of the project, and with the current loose coupling of branches that requires more discipline from me :)

On 26/04/2011 12:17, Malcolm Wallace wrote:
On 25 Apr 2011, at 11:13, Andrew Coppin wrote:
On 24/04/2011 06:33 PM, Jason Dagit wrote:
This is because of a deliberate choice that was made by David Roundy. In darcs, you never have multiple branches within a single darcs repository directory tree.
Yes, this seems clear. I'm just wondering whether or not it's the best design choice.
It seems to me to be a considerable insight. Branches and repositories are the same thing. There is no need for two separate concepts. The main reason other VCSes have two concepts is because one of them is often more efficiently implemented (internally) than the other. But that's silly - how much better to abstract over the mental clutter, and let the implementation decide how its internals look!
So in darcs, two repositories on the same machine share the same files (like a branch), but if they are on different machines, they have separate copies of the files. The difference is a detail that you really don't need to know or care about.
It does mean that you duplicate information. You have [nearly] the same set of patches stored twice,
No, if on the same machine, the patches only appear once, it is just the index that duplicates some information (I think). In fact just as if it were a branch in another VCS.
Unfortunately, I don't think this is quite true, because being able to switch between multiple branches in the same working directory means you can reuse build products when switching branches. Depending on how radical the branch shift is, this can be a substantial win, and it's the main reason that darcs might in future implement in-repo branching of some form. Ganesh

On 04/28/2011 12:19 AM, Ganesh Sittampalam wrote:
On 26/04/2011 12:17, Malcolm Wallace wrote:
On 25 Apr 2011, at 11:13, Andrew Coppin wrote:
On 24/04/2011 06:33 PM, Jason Dagit wrote:
This is because of a deliberate choice that was made by David Roundy. In darcs, you never have multiple branches within a single darcs repository directory tree.
Yes, this seems clear. I'm just wondering whether or not it's the best design choice.
It seems to me to be a considerable insight. Branches and repositories are the same thing. There is no need for two separate concepts. The main reason other VCSes have two concepts is because one of them is often more efficiently implemented (internally) than the other. But that's silly - how much better to abstract over the mental clutter, and let the implementation decide how its internals look!
So in darcs, two repositories on the same machine share the same files (like a branch), but if they are on different machines, they have separate copies of the files. The difference is a detail that you really don't need to know or care about.
It does mean that you duplicate information. You have [nearly] the same set of patches stored twice,
No, if on the same machine, the patches only appear once, it is just the index that duplicates some information (I think). In fact just as if it were a branch in another VCS.
Unfortunately, I don't think this is quite true, because being able to switch between multiple branches in the same working directory means you can reuse build products when switching branches. Depending on how radical the branch shift is, this can be a substantial win, and it's the main reason that darcs might in future implement in-repo branching of some form.
There's also the fact that using in-repo branches means that all the tooling doesn't have to rely on any (fs-specific) conventions for finding branches. As someone who has admin'd a reasonably large Bazaar setup (where branch == directory similarly to Darcs) I can honestly say that this would be a HUGE boon. Cheers,

On Thu, 2011-04-28 at 08:04 +0200, Bardur Arantsson wrote:
There's also the fact that using in-repo branches means that all the tooling doesn't have to rely on any (fs-specific) conventions for finding branches.
As someone who has admin'd a reasonably large Bazaar setup (where branch == directory similarly to Darcs) I can honestly say that this would be a HUGE boon.
Just keep in mind that adding branches within the repository is a massive increase in the conceptual complexity of the system, and it would IMO be very un-darcs-like to adopt something like that into the core mental model you need to use a darcs repository, only because of incidental conveniences (by "incidental" here, I mean that there is nothing wrong with the darcs model; it *is* true that branches and repositories are the same thing -- but it just turns out that, often, developers want several repositories for the same project).
It seems to me the same problems could be solved without the necessary increase in complexity by:
(a) Keeping repositories in sibling directories with names.
(b) Keeping a working directory that you build in as one of these, and switching it to match various other named repositories as needed. Then your build files are still there.
Surely there are things darcs could do to make some of those bits easier to do remotely (ssh to a remote machine in order to darcs-get from one directory to a new one is a pain, for sure). But those can be offered without in-repo branches, at the advantage of not really affecting people that don't use them.
Convention, rather than baking answers into tools, is the right way to solve organizational problems, and that's essentially what we're talking about here. And adding complexity every time someone has an awkward use case will lead (has led, in more systems than I can count) to an unusable result in the end. -- Chris Smith

On 2011-04-28 08:21 -0600, Chris Smith wrote:
It seems to me the same problems could be solved without the necessary increase in complexity by:
(a) Keeping repositories in sibling directories with names.
(b) Keeping a working directory that you build in as one of these, and switching it to match various other named repositories as needed. Then your build files are still there.
Unfortunately, sharing a build directory between separate repositories does not work. After a build from one repository, all the outputs from that build will have modification times more recent than all the files in the other repository. When switching branches, git (and other systems) update the mtimes on all files that changed, which will cause build systems to notice that the outputs are out of date. 'cd' does not do this. Thus, if you have separate repo directories (call them A and B) with different versions of some file, and you share a build directory between them, it is very likely that after building A, a subsequent build of B will fail to work correctly. -- Nick Bowler, Elliptic Technologies (http://www.elliptictech.com/)

Unfortunately, sharing a build directory between separate repositories does not work. After a build from one repository, all the outputs from that build will have modification times more recent than all the files in the other repository.
Then I suggest that your build tools are broken. Rebuilding should not depend on an _ordering_ between modification times of source and object, merely on whether the timestamp of the source file is different to its timestamp the last time we looked. (This requires your build tools to keep a journal/log, yes, but it is the only safe way to do it.) It is relatively common to change source files to have an older timestamp rather than a newer one. This should not cause your build system to ignore them. It can happen for any number of reasons: restoring from backup, switching repository, bisecting the history of a repo, clock skew on different machines, .... Regards, Malcolm
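The journal idea can be sketched in a few lines of Python. This is only an illustration of the rule as stated (rebuild when the source's timestamp differs from the one recorded at the last build), not a description of any particular build tool; the function names and the in-memory dictionary standing in for the journal are assumptions made for the example.

```python
from typing import Dict


def needs_rebuild(source: str, current_mtime: float, journal: Dict[str, float]) -> bool:
    """Rebuild if the source was never built, or if its timestamp differs
    (in either direction) from the one recorded at the last build."""
    return journal.get(source) != current_mtime


def record_build(source: str, current_mtime: float, journal: Dict[str, float]) -> None:
    """Remember the timestamp the source had when it was last built."""
    journal[source] = current_mtime


if __name__ == "__main__":
    journal: Dict[str, float] = {}
    print(needs_rebuild("foo.x", 10.0, journal))  # never built before
    record_build("foo.x", 10.0, journal)
    print(needs_rebuild("foo.x", 10.0, journal))  # unchanged since last build
    print(needs_rebuild("foo.x", 5.0, journal))   # older timestamp still counts as a change
```

Because the check is for inequality rather than "newer than", a source file whose timestamp moves backwards is still noticed.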

On 04/28/2011 05:23 PM, malcolm.wallace wrote:
Unfortunately, sharing a build directory between separate repositories does not work. After a build from one repository, all the outputs from that build will have modification times more recent than all the files in the other repository.
Then I suggest that your build tools are broken. Rebuilding should not depend on an _ordering_ between modification times of source and object, merely on whether the timestamp of the source file is different to its timestamp the last time we looked. (This requires your build tools to keep a journal/log, yes, but it is the only safe way to do it.)
So 'make' is broken (in this regard)? Then - I fear - everyone has to cope with that.

On 2011-04-28 15:23 +0000, malcolm.wallace wrote:
Then I suggest that your build tools are broken. Rebuilding should not depend on an _ordering_ between modification times of source and object, merely on whether the timestamp of the source file is different to its timestamp the last time we looked. (This requires your build tools to keep a journal/log, yes, but it is the only safe way to do it.)
Right. The /order/ of the timestamps is wrong when a build directory is shared between repositories (isn't that what I said?). Try it yourself with cabal: it will fail. Consider two repos, A and B, each with a different version of foo.x, which (when compiled) produces the output foo.y. We store the build in the directory "C". Initially, say A/foo.x has an mtime of 1, and B/foo.x has an mtime of 2. We do a build of A, producing the output file C/foo.y. Say C/foo.y now has an mtime of 3. Now we do a build in B. The build system sees that C/foo.y has an mtime of 3, which is newer than B/foo.x's mtime of 2. The build system therefore does not rebuild C/foo.y.
It is relatively common to change source files to have an older timestamp rather than a newer one. This should not cause your build system to ignore them. It can happen for any number of reasons: restoring from backup, switching repository, bisecting the history of a repo, clock skew on different machines, ....
All of these operations update the mtimes on the files... -- Nick Bowler, Elliptic Technologies (http://www.elliptictech.com/)
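The A/B/C example can be replayed with a tiny Python model, assuming a make-style rule that rebuilds an output only when it is missing or its source is newer. The function name is made up for illustration, and the mtimes 1, 2 and 3 are the ones used in the example.

```python
from typing import Optional


def is_stale(source_mtime: float, output_mtime: Optional[float]) -> bool:
    """Make-style check: rebuild when the output is missing or the source
    is strictly newer than the output."""
    return output_mtime is None or source_mtime > output_mtime


if __name__ == "__main__":
    a_foo_x = 1.0   # mtime of A/foo.x
    b_foo_x = 2.0   # mtime of B/foo.x
    c_foo_y = None  # C/foo.y does not exist yet

    # Build from A: the output is missing, so it is built and stamped 3.
    print(is_stale(a_foo_x, c_foo_y))
    c_foo_y = 3.0

    # Build from B: B/foo.x (2) is older than C/foo.y (3), so the rule
    # reports "up to date" even though the output came from A's source.
    print(is_stale(b_foo_x, c_foo_y))
```

The second check is the failure described in the message: an ordering-based rule has no way to tell that C/foo.y was produced from a different file.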

There seems to be some misunderstanding here. I didn't suggest you share a separate build directory between repositories... I suggested you have a single repository that is the one you are currently building in, and that you synchronize it with various other repositories as you swap "branches". That process should touch the modification times on the files that are changed, such that your partial builds are as trustworthy as partial builds are likely to be (that is, not very... but passable).

On 28/04/2011 03:21 PM, Chris Smith wrote:
On Thu, 2011-04-28 at 08:04 +0200, Bardur Arantsson wrote:
There's also the fact that using in-repo branches means that all the tooling doesn't have to rely on any (fs-specific) conventions for finding branches.
As someone who has admin'd a reasonably large Bazaar setup (where branch == directory similarly to Darcs) I can honestly say that this would be a HUGE boon.
Just keep in mind that adding branches within the repository is a massive increase in the conceptual complexity of the system, and it would IMO be very un-darcs-like to adopt something like that into the core mental model you need to use a darcs repository, only because of incidental conveniences
Convention, rather than baking answers into tools, is the right way to solve organizational problems, and that's essentially what we're talking about here. And adding complexity every time someone has an awkward use case will lead (has led, in more systems than I can count) to an unusable result in the end.
It seems half the people here think that having multiple branches per repo is a fantastic idea, while the other half think it's a stupid idea. Perhaps there is room for more than one revision control system based on change-sets in this world?
participants (29)
- Andrew Coppin
- Bardur Arantsson
- Chris Smith
- Daniel Fischer
- David Leimbach
- David Terei
- Erik de Castro Lopo
- Felipe Almeida Lessa
- Ganesh Sittampalam
- Heinrich Apfelmus
- Henning Thielemann
- Henning Thielemann
- Ivan Lazar Miljenovic
- Jake McArthur
- Jason Dagit
- Jeff Wheeler
- John Meacham
- John Millikin
- Karel Gardas
- Maciej Marcin Piechotka
- Malcolm Wallace
- malcolm.wallace
- Nick Bowler
- Paul R
- Radoslav Dorcik
- Sergei Trofimovich
- Simon Michael
- Steffen Schuldenzucker
- wren ng thornton