[GHC] #8513: Parallel GC increases CPU load while slowing down program

#8513: Parallel GC increases CPU load while slowing down program
------------------------------------+--------------------------------------------
        Reporter:  blitzcode        |             Owner:  simonmar
            Type:  bug              |            Status:  new
        Priority:  normal           |         Milestone:
       Component:  Runtime System   |           Version:  7.6.3
        Keywords:                   |  Operating System:  Unknown/Multiple
    Architecture:  Unknown/Multiple |   Type of failure:  Runtime performance bug
       Test Case:                   |          Blocking:
      Difficulty:  Unknown          |        Blocked By:
                                    |   Related Tickets:
------------------------------------+--------------------------------------------

 I noticed this issue with a lot of my programs. I have no idea if this is
 a widely known issue or if I'm just particularly unlucky and/or unskilled
 when it comes to the GHC GC, but I thought it might be worth reporting as
 a bug.

 Here's a fairly simple program showing the issue:

 https://github.com/blitzcode/haskell-gol/tree/master/vector-glfwb

 (Note the 'GHC.Conc.getNumProcessors >>= setNumCapabilities' call; it needs
 to be removed for testing, since it overrides the -N setting.)

 On my quad core machine, this simple Game-of-Life program (non-parallel,
 with some concurrency for draw & compute) runs as follows:

 +RTS -N1 = ~520G/s, CPU Load ~100%
 +RTS -N2 = ~505G/s, CPU Load ~135%
 +RTS -N3 = ~485G/s, CPU Load ~150%
 +RTS -N4 = ~485G/s, CPU Load ~160%

 Specifying -qg1 caps the CPU load increase at ~135%, and the program won't
 slow down below ~505G/s. The statistics from +RTS -s also suggest a
 decrease in GC time / increase in productivity through using -qg1.

 The program is a bit crummy, but it's the shortest example of this I have
 at hand. I've seen this in many different programs; serial GC just seems
 to be faster for a lot of workloads. I think it might at least be helpful
 to improve the documentation a bit, suggesting some things to try for a GC
 speedup etc.

 Apologies if this is already a well-known issue or if I'm just doing
 something obviously dumb here that makes the GC perform poorly.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8513
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler
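
The GC time and productivity figures that the report reads off +RTS -s can also be sampled from inside a program, which makes it easier to compare runs under different -N and -qg settings. The following is only a minimal sketch, not code from the ticket: it assumes a newer GHC than the 7.6.3 in the report (the `getRTSStats` interface in `GHC.Stats`, available since GHC 8.2), the helper `productivityOf` is a made-up name, and the workload is a hypothetical stand-in for the real computation. The computed ratio should be close to the productivity line of +RTS -s, but is not guaranteed to match it exactly.

```haskell
import Data.Int (Int64)
import GHC.Stats (RTSStats (..), getRTSStats, getRTSStatsEnabled)

-- | Fraction of total CPU time spent in the mutator (that is, not in the GC).
-- Hypothetical helper; arguments are mutator CPU time and total CPU time in ns.
productivityOf :: Int64 -> Int64 -> Double
productivityOf mutatorNs totalNs
  | totalNs <= 0 = 0
  | otherwise    = fromIntegral mutatorNs / fromIntegral totalNs

main :: IO ()
main = do
  -- Statistics are only collected when the program runs with +RTS -T (or -s).
  enabled <- getRTSStatsEnabled
  if not enabled
    then putStrLn "RTS statistics are off; rerun with +RTS -T"
    else do
      -- Stand-in workload (assumption); replace with the real computation.
      print (length (filter even [1 .. 2000000 :: Int]))
      s <- getRTSStats
      putStrLn ("collections:     " ++ show (gcs s))
      putStrLn ("GC CPU (ms):     " ++ show (gc_cpu_ns s `div` 1000000))
      putStrLn ("GC elapsed (ms): " ++ show (gc_elapsed_ns s `div` 1000000))
      putStrLn ("productivity:    "
                ++ show (productivityOf (mutator_cpu_ns s) (cpu_ns s)))
```

Built with `ghc -threaded -rtsopts`, the program can then be run as, for example, `./Main +RTS -N4 -T` and `./Main +RTS -N4 -qg1 -T` to compare the two configurations.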

#8513: Parallel GC increases CPU load while slowing down program
--------------------------------------------+--------------------------------
        Reporter:  blitzcode                |             Owner:  simonmar
            Type:  bug                      |            Status:  closed
        Priority:  normal                   |         Milestone:
       Component:  Runtime System           |           Version:  7.6.3
      Resolution:  worksforme               |          Keywords:
Operating System:  Unknown/Multiple         |      Architecture:  Unknown/Multiple
 Type of failure:  Runtime performance bug  |         Test Case:
      Difficulty:  Unknown                  |          Blocking:
      Blocked By:                           |   Related Tickets:
--------------------------------------------+--------------------------------
Changes (by simonmar):

 * status:  new => closed
 * resolution:   => worksforme

Comment:

 Your results seem to be in line with what I would expect. The parallel GC
 improves performance for (a) parallel programs and (b) sequential programs
 that have a large residency. For (b) you should use `+RTS -qg1`.

 The documentation for `+RTS -qg` already mentions the points above, and
 seems reasonably clear to me:

 http://www.haskell.org/ghc/docs/latest/html/users_guide/runtime-control.html#rts-options-gc

 Your program looks like its main heap structure is a single Vector, which
 is not very parallelisable in the GC. This would explain why you don't
 see much speedup with parallel GC.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8513#comment:1
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler
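
When experimenting with the `-N` and `-qg` settings discussed in this thread, it can help to have the program print which configuration the runtime actually ended up with, especially if the code also calls `setNumCapabilities`. The sketch below is an illustration under assumptions, not part of the ticket: it relies on the `GHC.RTS.Flags` module (base 4.8 and later, so newer than GHC 7.6.3), and `describeParGc` is a hypothetical helper whose wording follows the users guide description of `-qg` (parallel GC in the given generation and higher; no argument turns parallel GC off).

```haskell
import Data.Word (Word32)
import GHC.Conc (getNumCapabilities)
import GHC.RTS.Flags (ParFlags (..), getParFlags)

-- | Human-readable summary of the parallel GC setting.
-- Hypothetical helper: takes the 'parGcEnabled' and 'parGcGen' flag values.
describeParGc :: Bool -> Word32 -> String
describeParGc enabled gen
  | not enabled = "sequential GC (parallel GC off, as with -qg)"
  | gen == 0    = "parallel GC in all generations"
  | otherwise   = "parallel GC in generation " ++ show gen ++ " and higher only"

main :: IO ()
main = do
  caps <- getNumCapabilities   -- reflects -N and any setNumCapabilities call
  par  <- getParFlags          -- parallel-related RTS flags as parsed at startup
  putStrLn ("capabilities: " ++ show caps)
  putStrLn ("GC mode:      " ++ describeParGc (parGcEnabled par) (parGcGen par))
```

Printing this once at startup makes it harder to misattribute a measurement to the wrong configuration when several flag combinations are being compared.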