
On Apr 18, 2006, at 4:05 PM, Ravi Nanavati wrote:
I recently discovered that I'm running into the IORef / garbage collection issue described in the GHC user guide (at the bottom of the following page):
http://www.haskell.org/ghc/docs/6.4/html/users_guide/faster.html
I did a bit of back-and-forth with Simon M. on this when I was fiddling with Data.HashTable. It's especially bad for IOArrays.
Increasing the heap and allocation area size (with -H, -M and -A) improved my runtimes considerably (by preventing all of my execution time from being sucked up by the garbage collector), but I have some questions:
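(For concreteness, an invocation of the sort described might look like the line below; the program name and sizes are invented, and the right values depend entirely on the workload:

  ./myprog +RTS -A64m -H512m -M1g -RTS

-A sets the size of the allocation area (the nursery), -H suggests an overall heap size, and -M caps total heap residency.)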
1. What "goes wrong" with mutable data and generational garbage collection to make this sort of workaround necessary? The issue surprised me because I thought there was plenty of satisfied experience with generational gc for languages with many more destructive updates (e.g. Java) than Haskell. Is there some optimization that the ghc runtime is not implementing?
There are a number of known optimizations. Basically, for an efficient generational GC you need either a read barrier or a write barrier; since a write barrier only costs you when you actually mutate something, it's the usual choice. The barrier lets the collector keep track of exactly those IORefs in old-space which point into new-space. GHC is naive and simply assumes every IORef might point into new space, so it re-scans all of them at every minor GC. But after one GC pass, if you haven't touched an IORef, that assumption is certainly false (if I understand the promotion policy correctly). Note that a read or write barrier is necessary for an efficient implementation of almost *any* GC algorithm except naive "stop and collect the entire world".
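As a toy illustration of the write-barrier idea, here is a minimal sketch assuming a much-simplified runtime model. All of the names (BarrierRef, RememberedSet, writeRef) are hypothetical, and GHC's real barrier lives in the RTS, not in Haskell source:

  import Data.IORef

  -- A mutable cell tagged with an identity so a collector could find it.
  data BarrierRef a = BarrierRef { refId :: Int, cell :: IORef a }

  -- The remembered set: identities of refs written since the last GC.
  -- A generational collector scans only these for old-to-new pointers,
  -- instead of re-scanning every mutable object on the heap.
  type RememberedSet = IORef [Int]

  -- The write barrier: every mutation logs the ref before updating it.
  writeRef :: RememberedSet -> BarrierRef a -> a -> IO ()
  writeRef remembered (BarrierRef i c) x = do
    modifyIORef remembered (i :)   -- log: this ref may now be old-to-new
    writeIORef c x

  main :: IO ()
  main = do
    remembered <- newIORef []
    r <- BarrierRef 0 <$> newIORef (42 :: Int)
    writeRef remembered r 7
    readIORef remembered >>= print  -- [0]: the only ref worth scanning

The point is that a minor GC then visits only the refs logged since the last collection; untouched old IORefs are never revisited, which is exactly the scan GHC was doing unconditionally.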
2. If there is a known fix for this issue, what would it involve (and, if there are any guesses, how much work might it be)?
I had thought this was on the list of things fixed in 6.4. Simon?
3. What is the best workaround for this issue if you *don't* know the maximum amount of memory available for your program? Would it be best to fall back to copying collection if you want your program to consume and release memory "as needed", or is there a cleverer trick?
Hmm, it'd be nice if the allocation area scaled up dynamically based on GC time. But those calculations are awfully hard to get right (you need to build a pretty good cost model of the GC, and the cost must include things like chasing down all those IORefs).

-Jan
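The GC-time fraction Jan is alluding to can nowadays be read off with GHC.Stats, an API added to GHC long after this thread. A minimal sketch, which needs the program run with +RTS -T so the counters are populated:

  import GHC.Stats
  import Control.Monad (unless)

  -- Fraction of wall-clock time spent in the collector so far.
  gcFraction :: IO Double
  gcFraction = do
    enabled <- getRTSStatsEnabled
    unless enabled (fail "run with +RTS -T to enable RTS statistics")
    s <- getRTSStats
    pure (fromIntegral (gc_elapsed_ns s) / fromIntegral (elapsed_ns s))

An adaptive runtime could poll a ratio like this and grow the allocation area when it creeps up; as Jan says, the hard part is the cost model behind that decision.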
Thanks,
- Ravi