
Hi Nicolas,
In my opinion we should look at nofib (slow) and make sure that
1) it's at least neutral on average (runtimes and preferably allocations
too),
2) there are some benchmarks that improve significantly (that's why we're
making the change after all), and
3) we can attribute the losses to something other than significantly worse
Core (or at least more programs get better than get worse).
If these 3 hold and the compile times aren't up too much, I think it's a
candidate for being on by default in -O2.
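To make the three checks concrete, here is a minimal sketch in Python of how they could be applied to per-benchmark results. It assumes each benchmark is summarised as the ratio of runtime with the flag to runtime without it. The function names, the thresholds, and the sample numbers are all made up for illustration; none of them come from nofib or from the wiki page.

```python
from math import exp, log


def geometric_mean(ratios):
    """Geometric mean of after/before ratios (1.0 means no change)."""
    return exp(sum(log(r) for r in ratios) / len(ratios))


def check_criteria(ratios, significant=0.95, noise=0.01, tolerance=0.005):
    """Apply the three checks to a dict of name -> after/before ratio.

    All thresholds are illustrative assumptions, not project policy:
      significant: a ratio at or below this counts as a significant win
      noise:       changes smaller than this are treated as no change
      tolerance:   slack allowed when judging "neutral on average"
    """
    values = list(ratios.values())
    better = [name for name, r in ratios.items() if r < 1.0 - noise]
    worse = [name for name, r in ratios.items() if r > 1.0 + noise]
    return {
        # 1) at least neutral on average
        "neutral_on_average": geometric_mean(values) <= 1.0 + tolerance,
        # 2) some benchmarks improve significantly
        "some_significant_wins": any(r <= significant for r in values),
        # 3) fallback form: more programs get better than get worse
        "more_better_than_worse": len(better) > len(worse),
        # the programs whose Core still needs to be looked at by hand
        "needs_investigation": sorted(worse),
    }


# Hypothetical ratios, invented purely to exercise the function.
sample = {"prog_a": 0.85, "prog_b": 1.00, "prog_c": 1.08, "prog_d": 0.97}
report = check_criteria(sample)
```

A script like this can only list which programs got worse. Deciding whether the loss comes from worse Core still needs manual inspection.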
In my mind the key is to understand why the programs that got worse got
worse. For example, when I enabled -funbox-small-strict-fields by default
there were some losers, but the reasons these were losers were more
accidental than due to -funbox-small-strict-fields, so I was happy to turn
it on by default anyway.
-- Johan
On Fri, Aug 30, 2013 at 12:28 PM, Nicolas Frisby wrote:
TO: Performance czars and devs
I pushed a patch yesterday enabling a second demand analysis at the end of the core2core simplification pipeline. The flag is -flate-dmd-anal, and it is off by default.
My question:
What's the protocol for deciding if -O2 should imply it?
See http://ghc.haskell.org/trac/ghc/wiki/LateDmd for context.
In particular, this section includes highlights of some nofib runs I did.
http://ghc.haskell.org/trac/ghc/wiki/LateDmd#Newperformancenumbers
For some tests, it decreases allocation by 10% to 20%. But on the platforms I have tried, it causes a couple of repeatable slowdowns of up to 10%. I've investigated a bit, but haven't found any clear explanations. I'm worried that the cause is something like caching effects.
Any suggestions on how I should proceed with my investigation?
Also: I'd appreciate it if any developer would generously run some benchmarks on whatever platforms they have and add the results to the same section of the wiki page.
http://ghc.haskell.org/trac/ghc/wiki/LateDmd#Newperformancenumbers
NB: It is unfortunately key to build the libraries twice: once with -flate-dmd-anal in GhcLibHcOpts and once without. I have not determined how to do this robustly without a distclean; please let me know if you have a better method.
So I've used
  # one of the following
  #GhcLibHcOpts = -O2
  # both with and without -flate-dmd-anal
  GhcLibHcOpts = -O2 -flate-dmd-anal
  SplitObjs = NO
  DYNAMIC_BY_DEFAULT = NO
  DYNAMIC_GHC_PROGRAMS = NO
The last three aren't necessary, but please record what you use, if you are so generous as to run it :).
Thanks.