
On 09/03/2012 11:44, John Meacham wrote:
Out of curiosity, Is the reason you keep track of mutable vs not mutable heap allocations in order to optimize the generational garbage collector? as in, if a non-mutable value is placed in an older generation you don't need to worry about it being updated with a link to a newer one or is there another reason you keep track of it? Is it a pure optimization or needed for correctness?
It's for correctness - this is how we track all the pointers from the old generation to the new generation. There are lots of different ways of doing generational GC write barriers, and indeeed GHC uses a mixture of different methods: a remembered set for updated thunks and mutated IORefs, and a card table for arrays. Mutable arrays normally stay in the remembered set continuously, so that the write barrier for array mutation doesn't have to worry about adding the array to the remembered set, but it does modify the card table. An immutable array can be dropped from the remembered set, but only once all its pointers are safely pointing to the old generation. But then, unsafeThawArray has to put it back in the remembered set again. Once upon a time all the mutable objects were in the remembered set, but gradually we've been moving away from this and adding them to the remembered set only when they get modified. There are a few objects that still use the old method: TVars, for example.
A weakness of jhc right now is its stop the world garbage collector, so far, I have been mitigating it by not creating garbage whenever possible, I do an escape analysis and allocate values on the stack when possible, and recognize linear uses of heap value in some circumstances and re-use heap locations directly (like when a cons cell is deconstructed and another is constructed right away I can reuse the spacue in certain cases) but eventually a full GC needs to run and it blocks the whole runtime which is particularly not good for embedded targets (where jhc seems to be thriving at the moment.). My unsafeFreeze and unsafeThaw are currently NOPs. frozen arrays are just a newtype of non-frozen ones (see http://repetae.net/dw/darcsweb.cgi?r=jhc;a=headblob;f=/lib/jhc-prim/Jhc/Prim...)
Interesting - we do generational GC instead of escape analysis and other "compile-time GC" techniques. IMO, generational GC is a necessity, and once you have it, you don't need to worry so much about short-lived allocation because it all gets recycled in the cache. (try +RTS -G1 sometime to see the difference between generational GC and ordinary 2-space copying). On the other hand, generational GC does give rise to some strange performance characteristics, especially with mutable objects and sometimes laziness. And it doesn't solve the pause time problem, except that the long pauses are less frequent. Improving pause times is high up my list of things we want to do in the GC. Cheers, Simon