
On 01/03/2010 14:53, Thomas Schilling wrote:
On 28 February 2010 05:20, Luke Palmer
wrote: I have seen some proposals around here for SoC projects and other things to try to improve the latency of GHC's garbage collector. I'm currently developing a game in Haskell, and even 100ms pauses are unacceptable for a real-time game. I'm calling out to people who have seen or made such proposals, because I would be willing to contribute funding and/or mentor a project that would contribute to this goal. Also any ideas for reducing this latency in other ways would be very appreciated.
There is a SoC project suggestion to implement Immix's ideas [1] in GHC's GC. Both already use similar overall designs. Both split the heap into regions which may employ different collection strategies. However, Immix does not address real-time issues.
The main difficulty with real-time GC is that, while first-generation collection is usually very fast, eventually you just have to collect the old generation and you have to do it all at once. Sun's new Garbage-First ("G1") [2] collector therefore tracks pointers between regions, as opposed to just pointers from older two newer generations. This allows collecting regions independently (and in parallel). G1 is still stop-the-world, although marking phase is concurrent. Tracking pointers between all regions can result in quite substantial space overheads, however, so G1 uses some heuristics to discover "popular objects" and treats them specially. In a personal conversation Simon Marlow expressed to me that he intends to go further into this direction, but I don't know how high-priority it is. In general I don't think true real-time is the goal in any case, but rather a general effort to keep GC-pauses short.
Truly concurrent garbage collection is a whole different beast. Concurrent marking can be implemented efficiently with a write barrier. I don't know of any fully concurrent GC scheme that gets by without a read barrier and significant space overhead, however. There are certainly no plans from the GC HQ to implement a fully concurrent GC.
A good summary of concurrent GC techniques used successfully in industry was posted to the GC mailing list recently by Erez Petrank: http://permalink.gmane.org/gmane.comp.programming.garbage-collection.general... Several of those we could do in GHC, for example mostly-concurrent collection uses only a write barrier, and piggybacks on the generational write-barrier we already have, but it does require two stop-the-world phases per GC. I think you'd want to do it in conjunction with Immix-style region sweeping in the old generation, so implementing Immix would be a good first step. Cheers, Simon