
#15999: Stabilise nofib runtime measurements
-------------------------------------+-------------------------------------
        Reporter:  sgraf             |                Owner:  (none)
            Type:  task              |               Status:  new
        Priority:  normal            |            Milestone:  ⊥
       Component:  NoFib benchmark   |              Version:  8.6.2
                   suite             |
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
                                     |  Unknown/Multiple
 Type of failure:  None/Unknown      |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:  #5793 #9476       |  Differential Rev(s):  Phab:D5438
                   #15333 #15357     |
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by sgraf):

Replying to [comment:6 simonmar]:
> So as I understand it, the GC "wibbles" you're talking about are caused by the number of GCs we run? Making a small change to the nursery size can make the difference between N and N+1 GC runs, which could be a large difference in runtime.
Yes, that's one source of wibble (in hindsight, that may have been a bad term to use here). But it's not exactly the reason why I'm doing this. Have a look at the numbers in https://ghc.haskell.org/trac/ghc/ticket/9476#comment:55. `./default` had significantly fewer Gen 0 collections and the same number of Gen 1 collections as `./allow-cg` (which produces more garbage but is faster in total). Gen 1 collections were cheaper for `./allow-cg` for some reason. Also note how this correlates with the productivity rate: 10% for `./default` vs 15% for `./allow-cg`. The findings in that thread led me to plot the above curves.
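As a rough illustration of the N versus N+1 effect raised in the quoted question, here is a toy Python model. It assumes that a Gen 0 collection happens exactly once each time the nursery fills up and that every collection costs the same, which is a simplification and not how the GHC RTS actually schedules or prices collections. The function names and all numbers are hypothetical.

```python
MIB = 1024 * 1024


def minor_collections(total_alloc_bytes: int, nursery_bytes: int) -> int:
    """Toy model (assumption): one Gen 0 collection per filled nursery."""
    return total_alloc_bytes // nursery_bytes


def relative_swing(collections: int) -> float:
    """Share of total GC work represented by a single collection,
    assuming (hypothetically) that all collections cost the same."""
    return 1 / collections


total = 12 * MIB  # hypothetical total allocation of a short benchmark

# A 4 KiB change in nursery size removes a whole collection.
print(minor_collections(total, 4 * MIB))         # 3
print(minor_collections(total, 4 * MIB + 4096))  # 2

# One collection more or less matters far more when there are few of them.
print(f"{relative_swing(3):.1%}")    # 33.3%
print(f"{relative_swing(300):.1%}")  # 0.3%
```

Under these assumptions, a run with only a handful of collections can shift by a third of its GC work from one collection more or less, while the same off-by-one is close to noise in a run with hundreds of collections.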
> You're only looking at `-G1`, right? Generational GC often has weird effects based on the timing of when a GC runs. I think there will still be issues when there's an old-gen collection right around the end of the program run - making a small change may mean the difference between running or not running the expensive GC.

This is not `-G1`, and I agree that a single old-gen collection might make the difference. But when we modify the program so that there are ''more'' Gen 1 collections, at more uniformly distributed points in the program, I argue we will have a much easier time comparing nofib numbers. There are multiple ways to achieve this, but I think the simplest one is what I outline above, and it also corresponds more closely to the workload of real applications (e.g. long running time, growing and shrinking working sets).

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15999#comment:7
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler