
Daan Leijen wrote:
Maybe we should think about how patches could be easily contributed -- should we move over to darcs for the standard library code? Together with clear instructions on how to send in patches? -- maybe even allow everyone to just commit...
I have some experience (as the person whose butt was on the line for the results) with 'allow everyone to commit' on a large (commercial) library (10^6 LOC of high-level code) with ~40 developers, 15 of them full-time and the rest university researchers.

Our approach was (and still is) to have a HUGE automated test suite, and to have it properly instrumented. In other words, each test case is a pair (test, dependencies), where dependencies records *everything* needed during the run of that test. This allows two things:

1) patchers can run testall -opens file1.hs -opens file2.hs and make sure that everything they've done seems to work (locally)
2) a central repository can (automatically!)
   a) accept a patch and 'install' it on a copy of the local stable version
   b) run testall -opens file1.hs -opens file2.hs on all the patched files
   c) accept the patch into the next stable system if _everything_ was fine, and reject it otherwise (with an email to the author listing the failures encountered)
   d) additionally, run the complete suite nightly, which is required to catch failures due to 'new' dependencies

This level of automation is needed when you have a lot of people working all at once: humans can't keep up. It also means that there is always a stable system available, or at least one as stable as the quality of the test suite allows.

This makes it a first-check-in system: a person who checks in something that works in isolation but breaks on the 'latest' stable system is the one responsible for figuring out why.

The actual system is more sophisticated in that it also tracks resource usage (time, memory) for each test and reports on wide variations of those too. These are not currently reasons for automatic rejection, though I would make them so. I would also add an automatic check that 'new' functionality comes with new automated tests, else it would be rejected (automatically).
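To make the (test, dependencies) pairing and the accept/reject step concrete, here is a minimal Haskell sketch. It is only an illustration under assumptions, not the system described above: the names TestCase, affectedTests, Verdict, gate and exampleSuite are all hypothetical, the example data is made up, and the 'passes' function stands in for actually running a test.

```haskell
module Main where

import qualified Data.Set as Set
import Data.Set (Set)

-- Hypothetical model of one instrumented test: its name plus every
-- file it was recorded to open during a run.
data TestCase = TestCase
  { testName :: String
  , testDeps :: Set FilePath
  } deriving (Show, Eq)

-- Select the tests whose recorded dependencies mention a patched file
-- (the role played by 'testall -opens file1.hs -opens file2.hs').
affectedTests :: [FilePath] -> [TestCase] -> [TestCase]
affectedTests patched = filter touchesPatch
  where
    patchedSet = Set.fromList patched
    touchesPatch t =
      not (Set.null (Set.intersection patchedSet (testDeps t)))

-- Outcome of the automatic gate: accept, or reject with the failing tests.
data Verdict = Accept | Reject [String]
  deriving (Show, Eq)

-- Accept a patch only if every affected test passes. The 'passes'
-- argument is an assumption standing in for actually running a test.
gate :: (TestCase -> Bool) -> [FilePath] -> [TestCase] -> Verdict
gate passes patched suite =
  case [testName t | t <- affectedTests patched suite, not (passes t)] of
    []       -> Accept
    failures -> Reject failures

-- Made-up example data, not taken from any real suite.
exampleSuite :: [TestCase]
exampleSuite =
  [ TestCase "parse"     (Set.fromList ["file1.hs", "file3.hs"])
  , TestCase "render"    (Set.fromList ["file2.hs"])
  , TestCase "roundtrip" (Set.fromList ["file1.hs", "file2.hs"])
  ]

main :: IO ()
main = do
  let patch = ["file1.hs"]
  print (map testName (affectedTests patch exampleSuite))
  print (gate (const True) patch exampleSuite)
  print (gate ((/= "roundtrip") . testName) patch exampleSuite)
```

With this example data, a patch to file1.hs pulls in only "parse" and "roundtrip", so "render" never needs to run. The gate returns Accept when both pass, and Reject ["roundtrip"] when that one fails, which is the list that would go into the email to the author.
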
The 'dependencies' list serves two purposes:

1) tell the patcher how wide an effect their changes have the potential to be, by the size of the test suite they pull in
2) lower the load on the central machine, as a full test suite usually takes hours to run (if it is a decent size)

The above system has successfully been used in a distributed environment for about 7 years now, and seems to be a good balance between total anarchy and central control. Once the developers are trained to always write automated tests, the system can grow very quickly without either being in a broken state or in a 'waiting for review' state all the time.

I would strongly recommend against 'allow everyone to just commit' without the presence of a large automated test suite which is used to (automatically) reject code that breaks a test.

Jacques