
I like it. git branches are nice to work with, and they don't have the
conceptual pain of "creating" a new repository.
Things that make them nice:
* When switching branches, all your files magically update (if they
have not been modified).
* Easy to maintain multiple branches, say "stable" and
"experimental". That helps me avoid getting clobbered by others'
changes to APIs I depend on.
Things that are a pain:
* Comparing commits (patches) between branches. It's hard to tell
what is in one and what is in another.
* When you have modified files, git is super picky about switching branches.
* Once a remote branch is pushed to a public repo, it's scary to
remove it. You don't want to break somebody, but you don't want that
old junk hanging around either.
I don't mean to write about git, but if darcs were to have branches,
that's the kind of stuff I would love to see.
On Tue, Jul 21, 2009 at 2:23 PM, Eric Kow wrote:
Hi everyone,
Max Battcher had an idea that I thought I should post on the mailing list.
The idea is about making branches in darcs. Right now, we take the view that a darcs branch is a darcs repository plain and simple. If you want to create a branch, all you have to do is darcs get (darcs get --lazy to be faster). While this is very simple, a lot of us think that it's inconvenient (one because it's slow, and two because you have to think of where to put the branch).
So darcs users have been asking about in-repo branches for a while. And now, Max has come up with a way to implement them. What's nice about his approach is that it lets us keep the simplicity of darcs, while giving more demanding users a chance to work with branches. It also takes advantage of Petr Ročkai's Summer of Code project to make darcs faster in our daily lives and, for that matter, paves the way for a possible darcs plugin system in the future.
On Max's advice, I'm cross-posting to Haskell Cafe. Haskellers: here's a nice chance for you to get a cool Darcs feature without very much effort or Darcs hacking experience :-)
More info on: http://bugs.darcs.net/issue555
------------------------------------------------------------ Max's write-up ------------------------------------------------------------
Here's a quick primer: Basically, darcs >= 2.0 uses a hashed pristine store that acts as a file object cache. An interesting artifact of the pristine.hashed store, which is being pushed into a useful third-party accessible library named hashed-storage, however, is that it does (for many reasons, most co-evolutionary) resemble the git object store. There are several differences, but one of the key differences that applies to the topic at hand is that darcs generally garbage collects pristine.hashed objects much faster than git.
Darcs is very quick to garbage collect old objects partly because many aren't all that useful, but mostly because the primary representation for a repository state is the patch store (and inventory), so there is only one root pointer in the pristine store. Petr, the author of the hashed-storage library, briefly discusses this in his most recent design post about the future of hashed-storage:
http://mornfall.net/blog/designing_storage_for_darcs.html
Here's where the primer meets the topic at hand: A darcs branch consists of three major components: an inventory store, a patch store, and a pristine store. To store multiple branches "in the same place" you need to take care of: 1) storing the alternate inventories, and 2) if you want it to be relatively fast, storing additional objects in the pristine store. (The patch store will already happily hold more patches than are referenced in the current inventory.) (1) is mostly a matter of naming alternate inventories and swapping between them. As for (2), thanks to the *ahem* git-like nature of pristine.hashed/hashed-storage, darcs could easily archive (many) more pristine objects in pristine.hashed than it does during normal operation, and it may be as simple as storing additional, useful "root pointers" visible to hashed-storage so that it knows not to garbage collect the objects from other branches.
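To make the "root pointers" idea concrete, here is a minimal sketch of a content-addressed object store with mark-and-sweep garbage collection. It is a toy model only: the class `ObjectStore`, its methods, and the root names are all hypothetical and are not the hashed-storage API or the on-disk pristine.hashed format. It just shows that objects belonging to another branch survive collection for as long as an extra root points at them.

```python
import hashlib


class ObjectStore:
    """Toy content-addressed store (hypothetical, NOT the hashed-storage API)."""

    def __init__(self):
        self.objects = {}  # hash -> (data bytes, tuple of child hashes)
        self.roots = {}    # root name -> hash of a tree object

    def put(self, data, children=()):
        """Store an object under the hash of its content and child hashes."""
        children = tuple(children)
        payload = data + b"".join(c.encode() for c in children)
        h = hashlib.sha256(payload).hexdigest()
        self.objects[h] = (data, children)
        return h

    def reachable(self):
        """Mark phase: every object reachable from any root pointer."""
        seen = set()
        stack = list(self.roots.values())
        while stack:
            h = stack.pop()
            if h in seen:
                continue
            seen.add(h)
            stack.extend(self.objects[h][1])
        return seen

    def gc(self):
        """Sweep phase: drop unreachable objects, return how many were removed."""
        live = self.reachable()
        dead = [h for h in self.objects if h not in live]
        for h in dead:
            del self.objects[h]
        return len(dead)


store = ObjectStore()
readme = store.put(b"README v1")
main_v1 = store.put(b"main.hs v1")
main_v2 = store.put(b"main.hs v2")
tree_stable = store.put(b"tree", [readme, main_v1])
tree_exp = store.put(b"tree", [readme, main_v2])

# Two root pointers: the usual single root plus one extra root for a branch.
store.roots["current"] = tree_exp
store.roots["branch:stable"] = tree_stable
removed_with_branch_root = store.gc()

# Without the extra root, the branch-only objects become garbage.
del store.roots["branch:stable"]
removed_without_branch_root = store.gc()
```

With both roots present nothing is collected; once the branch root is dropped, only the objects unique to that branch go away, while the shared README object stays because the current root still reaches it.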
Here's where the fun happens: It seems to me that a branch switching tool, utilizing darcs' existing repository data stores, could be built almost "purely" on top of mostly just the hashed-storage library (which has been designed for reuse), as it exists today or hopefully with only minor tweaking, and with only minimal interaction with darcs itself. That is, in-repo branching could be provided entirely, today or soon, as a second/third-party tool to darcs. (!)
I think this is great from a darcs perspective: darcs itself remains conceptually simple (1 repository == 1 branch), which is something that I for one love about darcs, and doesn't need additional commands in darcs itself. But yet, power users (and git escapees) would have easy access to a ``darcs-branch`` tool that provides simple and powerful in-repo switching. Potentially, such a tool is also a great candidate to be an early adopter of the darcs library support and can help better define and enhance darcs' public API. (It's also interesting in that it mirrors that hg's support for branches is an addon, and that both hg and git have darcs-like patch queues as addons.)
I think this is even better from a hashed-storage perspective: ``darcs-branch`` would be a strong (new) use case for hashed-storage as a public API. The tool would provide good incentive to keep hashed-storage's API clean, and better incentive (than darcs' normal operation) to keep hashed-storage's garbage collection and object compaction strong. (The 'cheap' cost of in-repo branches is primarily a consequence of how well hashed-storage stores the additional objects of multiple branches. As a bonus, normal darcs operations should benefit as well from the gc/compaction optimizations that darcs-branch operations may make more obvious.)
At a high-level, a ``darcs-branch`` tool would provide core commands to:
1) Store the current repository state as a new branch by copying the current inventory and inserting a new pristine root for the branch. (``darcs-branch new`` or ``darcs-branch freeze``, perhaps)
2) Switch to a previously stored branch, by making the branch's inventory the new current inventory and the branch's pristine root the new current pristine root; updating the working directory as necessary. (``darcs-branch switch``)
Additionally, there would be other useful management tools (``darcs-branch list``, ``darcs-branch remove`` (or unfreeze)). I think that these four commands could be done with no darcs interaction at all (unless the branch being switched to has an incomplete/lazy pristine).
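The four commands described above can be sketched as operations on a very small in-memory model. Everything here is an assumption made for illustration: the `Repo` class, its field names, and the function names are hypothetical, patch identifiers and pristine roots are plain strings, and nothing touches a real darcs repository or the working directory.

```python
from dataclasses import dataclass, field


@dataclass
class Repo:
    """Hypothetical model of one repository holding several in-repo branches."""
    inventory: list      # ordered patch identifiers of the current branch
    pristine_root: str   # hash naming the current pristine tree
    branches: dict = field(default_factory=dict)  # name -> (inventory, root)


def branch_new(repo, name):
    """Store the current state as a named branch (copy inventory, record root)."""
    if name in repo.branches:
        raise ValueError("branch already exists: " + name)
    repo.branches[name] = (list(repo.inventory), repo.pristine_root)


def branch_switch(repo, name):
    """Make a stored branch's inventory and pristine root the current ones."""
    inventory, root = repo.branches[name]
    repo.inventory = list(inventory)
    repo.pristine_root = root
    # A real tool would also update the working directory at this point.


def branch_list(repo):
    """Names of all stored branches."""
    return sorted(repo.branches)


def branch_remove(repo, name):
    """Forget a stored branch; its pristine root stops protecting objects."""
    del repo.branches[name]


repo = Repo(inventory=["p1", "p2"], pristine_root="hash-A")
branch_new(repo, "stable")

# Record one more patch, then freeze that state as a second branch.
repo.inventory.append("p3")
repo.pristine_root = "hash-B"
branch_new(repo, "experimental")

branch_switch(repo, "stable")
names_before_remove = branch_list(repo)
branch_remove(repo, "experimental")
names_after_remove = branch_list(repo)
```

In this model, switching discards any current state that has not been stored with `branch_new` first; a real tool would have to decide how to handle that case, and how to handle unrecorded changes in the working directory.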
Useful commands that would need darcs interaction for patch management would be things like ``darcs-branch push`` to push patches between named branches (equivalent at a high level to ``darcs send -o new.dpatch --context branchB.context; darcs-branch switch branchB; darcs apply new.dpatch``), and ``darcs-branch obtain`` to obtain new in-repo local branches from an existing context file, remote/external-local repository, tag, or other matcher (that is, darcs get from one in-repo branch to a new one).
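As a rough illustration of the first half of such a ``darcs-branch push``, the sketch below picks the patches that one branch has and the other branch's context lacks. It is a deliberate simplification: the function name `patches_to_send` is hypothetical, patches are plain strings, and it only selects patches by identity. Actually applying them to the target branch needs darcs itself, since that involves commuting patches and handling dependencies and conflicts.

```python
def patches_to_send(source_inventory, target_context):
    """Patches in the source branch that are missing from the target's context.

    Hypothetical helper: selects by patch identity only and keeps the
    source branch's order; it does not commute or merge anything.
    """
    known = set(target_context)
    return [p for p in source_inventory if p not in known]


branch_a = ["p1", "p2", "p3", "p4"]
branch_b = ["p1", "p3"]

bundle = patches_to_send(branch_a, branch_b)
nothing_to_send = patches_to_send(branch_b, branch_a)
```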
I doubt that a ``darcs-branch get`` to download all of the branches other than "current" (or HEAD, if you prefer, or "main" as I prefer) of a remote repository would need any darcs interaction (downloading the inventories and then many/most/all of the pristine objects). We can bet that darcs' usual lazy patch-getting behavior should work out of the box even for multiple branches.
Well, that's the general idea, at least. I believe that a willing volunteer, with a bit of help from Petr, could build such a tool "relatively quickly", and hopefully it might even work with today's darcs as it is.
-- Eric Kow http://www.nltg.brighton.ac.uk/home/Eric.Kow PGP Key ID: 08AC04F9
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe