Chris,
There are a few things here.
- There are different levels of latency sensitivity. The system I work on at Facebook is latency-sensitive, and we have no problem with the GC (after we implemented a few optimisations and did some tuning). But we're OK with pauses of up to 100ms or so, and our average pause time is <50ms with 100MB of live data on large multicore machines. There's probably still scope to reduce that further.
- Thread-local heaps don't fix the pause-time issue. They reduce the pause time for a local collection but have no impact on the global collection: the global heap is still unbounded in size, so the pause for collecting it is unbounded too.
- The issue is not so much maintaining multiple GCs. We already have 3 GCs (one of which is experimental and unsupported). The issue is more that a new kind of GC has non-local implications, because it affects read and write barriers, and a bad tradeoff can penalize the performance of all code. Perhaps you're willing to give up 10% of performance to get guaranteed 10ms pause times, but can we impose that 10% on everyone? If not, are you willing to recompile GHC and all your libraries?
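
If you want to see where your own application sits relative to numbers like the ones above, here is a minimal sketch that reads the RTS counters through GHC.Stats. It assumes a base version that provides getRTSStats and that the program is run with +RTS -T. The workload and the meanPauseMs helper are made up for illustration. Total GC elapsed time divided by the number of collections is only a rough proxy for pause time, and it mixes minor and major collections.

```haskell
import Data.Int (Int64)
import Data.Word (Word32)
import GHC.Stats (RTSStats (..), getRTSStats, getRTSStatsEnabled)

-- | Mean elapsed GC time per collection, in milliseconds.
-- Returns Nothing if no collection has happened yet.
-- (Hypothetical helper, not part of GHC.Stats.)
meanPauseMs :: Int64 -> Word32 -> Maybe Double
meanPauseMs _ 0 = Nothing
meanPauseMs elapsedNs n =
  Just (fromIntegral elapsedNs / 1e6 / fromIntegral n)

main :: IO ()
main = do
  enabled <- getRTSStatsEnabled
  if not enabled
    then putStrLn "RTS stats are disabled; run with +RTS -T"
    else do
      -- Placeholder workload: allocate enough to trigger some collections.
      print (length (filter even (map (* 3) [1 .. 2000000 :: Int])))
      stats <- getRTSStats
      putStrLn ("collections:       " ++ show (gcs stats))
      putStrLn ("major collections: " ++ show (major_gcs stats))
      putStrLn ("max live bytes:    " ++ show (max_live_bytes stats))
      putStrLn ("mean GC time (ms): "
                ++ show (meanPauseMs (gc_elapsed_ns stats) (gcs stats)))
```

The mean hides the worst case, which is what matters for latency. For individual pauses, the per-collection summary printed by +RTS -s is probably a better starting point.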
Cheers
Simon