
2009/08/06 Jost Berthold
As Malcolm already said, the "mutator" in this text is the/a thread evaluating some Haskell expression.
I want to thank everyone for taking the time to clarify that to me; I'm now much more able to follow discussions of Haskell garbage collection.
1. Garbage collection and mutator running concurrently: while they usually do, they do not _have_ to exclude each other, but not doing so means that the objects they are treating have to be locked.
So this is the part that actually led me here. Say you are implementing a network server, for example -- you don't want to have big spikes in the request latency due to GC. Not that Haskell is so much worse off relative to Java, say; Erlang is the only language I'm aware of that takes concurrent GC seriously. However, it seems that this problem is hard to solve for Haskell:

"Parallel GC is when the whole system stops and performs multi-threaded GC, as opposed to "concurrent GC", which is when the GC runs concurrently with the program. We think concurrent GC is unlikely to be practical in the Haskell setting, due to the extra synchronisation needed in the mutator. However, there may always be clever techniques that we haven't discovered, and synchronisation might become less expensive, so the balance may change in the future."
-- Simon Marlow

So I wonder, to what degree is GC latency controllable in Haskell? It seems that, pending further research, we cannot hope for concurrent GC.
2. About "Blackholing": in the sequential evaluation (where hitting a blackhole indeed means to have a loop), some better performance can be gained by not blackholing a thunk immediately, so this was done in GHC earlier. However, it increases the chance for 2 mutator threads to evaluate the same thunk (double work), and we got better performance by blackholing immediately.
Could blackholing too early result in non-termination ("...hitting a blackhole indeed means to have a loop")? If so, isn't it more than just a matter of performance when we do it?
-- Jason Dusek
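
For readers following along, the sequential case mentioned above ("hitting a blackhole indeed means to have a loop") can be sketched with a thunk that demands its own value. This is only an illustration, not code from the thread: the name `x` is made up, and GHC's loop detection is best-effort, so the `Right` branch is kept in case the runtime does not report the loop.

```haskell
import Control.Exception (NonTermination (..), evaluate, try)

main :: IO ()
main = do
  -- Hypothetical example: 'x' is a thunk whose evaluation demands 'x' itself.
  let x = x + 1 :: Int
  -- Force the thunk; if the runtime notices that the thread has re-entered
  -- a thunk it is already evaluating, it raises NonTermination.
  r <- try (evaluate x) :: IO (Either NonTermination Int)
  case r of
    Left NonTermination -> putStrLn "thunk re-entered while under evaluation"
    Right v             -> print v
```

With a single mutator thread, re-entering a blackholed thunk can only mean the thunk depends on itself. With several mutator threads, the same blackhole may simply mean that another thread is still evaluating the thunk, which is why a thread in that situation has to wait rather than report a loop.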