
On Thu, Apr 28, 2005 at 10:01:05AM +0100, Simon Marlow wrote:
On 28 April 2005 04:52, John Goerzen wrote:
Is it possible to set up a two-way synch so we can move over to darcs gradually? It's not really practical for us to move over in one go: we've simply accumulated too many dependencies on CVS, and there are lots of people using the repo with CVS. With a two-way synch, we could experiment with darcs non-destructively.
I believe so. I haven't tried that myself yet, but I think tailor.py supports it.

To do that though, we should really identify a permanent home for the canonical fptools darcs repo. I'm not really set up to provide accounts for those that would need write access, and I don't want to be the gatekeeper (I suspect nobody else wants that either <g>). If cvs.haskell.org is up to the task, that would be an ideal location IMHO.

I haven't yet convinced tailor.py to work with the pserver for fptools, so if it can access the repo on the local filesystem, that would be ideal. Plus, one could then cron it to run frequently.

I'll volunteer to do the work to figure out how to do this and get it installed if someone wants to install darcs 1.0.2 on that box and give me a spot to plonk down the darcs repo. It uses about 355M, including the pristine and working trees. _darcs itself is 240M. And, urm, 20,000+ inodes will be needed to be safe :-) (df -i will show those)

My brief look at cvs.h.o shows that /home has plenty of space free.

cvs.h.o has only 256M of RAM. On a repo this size, darcs sometimes uses more than that. However, with the exception of periodic checkpointing, I think we could avoid those RAM-intensive operations on the server side.

The other benefit of having it on cvs.haskell.org is that it can be cronned to run fairly frequently (say, every 15 minutes) to help minimize the possibility of conflicts.
Off the top of my head, a few other things we need before we can even think of switching over:
- split up the full fptools repo into pieces (as we discussed on #haskell).
I gave that some thought before I started. Several things occurred to me:

 * There are quite a few commits in fptools that modify multiple projects (more than anyone estimated at first, I think)

 * The conversion process took a long time, so it may be best to convert it all at once and then split it up (~1 week * n projects == more time than I have to invest)

 * There were several "great renames" in the tree. Tracking the entire history of an individual project across those would be difficult at best.

Now, having said that, I did keep the request in mind. I figure that this big repo can be split up into smaller ones at any time after the CVS mirroring is stopped. For each smaller one, the process would be:

 1. Branch (darcs get) the master repo

 2. Delete all the files that don't apply to the smaller project

 3. Rename the smaller project's files as appropriate

 4. Checkpoint here

Because darcs get hardlinks patches, this wouldn't be as costly to disk as it might seem, and still preserves the history of the smaller project.

I'm about to write a new mail to darcs-users about my observations of darcs' performance on this repo. The summary is that the day-to-day operations are still fairly fast and my ext2 server is holding up far better than I expected.
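
The four-step split described above (get, delete, rename, checkpoint) could be scripted roughly as follows. This is only a sketch: it builds the list of commands rather than running them, the function name `split_plan` and all paths in the usage note are hypothetical, and using `darcs tag` followed by `darcs optimize --checkpoint` for the checkpoint step is an assumption about how that step would be done, not something stated in this thread.

```python
from typing import List, Sequence, Tuple

# One planned step: (directory to run in, argv of the command).
Step = Tuple[str, List[str]]


def split_plan(master: str, dest: str, unwanted: Sequence[str],
               renames: Sequence[Tuple[str, str]], tag: str) -> List[Step]:
    """Return the commands for splitting one project out of the master repo.

    Nothing is executed here; the caller can review the plan and then run it.
    All names passed in are supplied by the caller (hypothetical examples).
    """
    # 1. Branch the master repo.
    plan: List[Step] = [(".", ["darcs", "get", master, dest])]

    # 2. Delete everything that belongs to other projects, then record it.
    if unwanted:
        plan.append((dest, ["rm", "-rf", "--", *unwanted]))
        plan.append((dest, ["darcs", "record", "--all", "-m",
                            "Remove files of other projects"]))

    # 3. Rename the project's files as appropriate, then record it.
    for old, new in renames:
        plan.append((dest, ["darcs", "mv", old, new]))
    if renames:
        plan.append((dest, ["darcs", "record", "--all", "-m",
                            "Rename project files"]))

    # 4. Checkpoint (assumed here: tag, then checkpoint that tag).
    plan.append((dest, ["darcs", "tag", tag]))
    plan.append((dest, ["darcs", "optimize", "--checkpoint"]))
    return plan
```

For example, `split_plan("fptools", "happy", ["ghc", "hslibs"], [("happy/src", "src")], "split-from-fptools")` yields seven steps, the first run from the current directory and the rest inside `happy`. Depending on the darcs version, `record` may still prompt for an author or a long comment, so the flags may need adjusting before the plan is run unattended.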
- a web interface to the repo
If you mean darcs.cgi, that should be trivial enough to set up on whatever the permanent host is. I don't run it on my server because I am very resource-limited there.
- commit mails to the cvs-<blah>@haskell.org lists
I figure a cron job can do this. Every x minutes, run darcs changes -s, and send copies of never-before-seen logs to the list. Should be fairly trivial. I can do this too. (But only after the CVS gateway is disabled; otherwise, you'd get two copies for every commit)

Let me know your thoughts.

-- John
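
A minimal sketch of the commit-mail idea above: run `darcs changes -s`, split the output into per-patch entries, and keep only the ones not seen on a previous run. The function names are hypothetical, and the parsing rests on two assumptions about the output format: each patch entry starts with a non-indented header line, and the newest patch is listed first. Persisting the `seen` set between cron runs and actually sending the mail are left out.

```python
import hashlib
import subprocess
from typing import List, Set


def read_changes(repo: str) -> str:
    """Run `darcs changes -s` inside the repo and return its output."""
    result = subprocess.run(["darcs", "changes", "-s"], cwd=repo,
                            capture_output=True, text=True, check=True)
    return result.stdout


def split_entries(changes_text: str) -> List[str]:
    """Split the output into one string per patch.

    Assumes a patch entry begins with a non-indented line and that all
    other lines of the entry are indented or blank.
    """
    entries: List[str] = []
    current: List[str] = []
    for line in changes_text.splitlines():
        starts_entry = bool(line) and not line[0].isspace()
        if starts_entry and current:
            entries.append("\n".join(current).rstrip())
            current = []
        if line.strip() or current:
            current.append(line)
    if current:
        entries.append("\n".join(current).rstrip())
    return entries


def unseen(entries: List[str], seen: Set[str]) -> List[str]:
    """Return entries not seen before, oldest first, and mark them seen."""
    fresh: List[str] = []
    for entry in reversed(entries):  # assumed: newest entry is listed first
        key = hashlib.sha1(entry.encode("utf-8")).hexdigest()
        if key not in seen:
            seen.add(key)
            fresh.append(entry)
    return fresh
```

A cron job would call `unseen(split_entries(read_changes(repo)), seen)` and mail each returned entry. Hashing the whole entry text is a simplification; two runs that render the same patch differently would be treated as two commits.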