TO: Performance czars and devs
I pushed a patch yesterday that enables a second demand analysis at the end of the Core-to-Core simplification pipeline. The flag is -flate-dmd-anal, and it is off by default.
My question:
What's the protocol for deciding whether -O2 should imply it?
For reference, this section of the wiki page has highlights of some nofib runs I did:
http://ghc.haskell.org/trac/ghc/wiki/LateDmd#Newperformancenumbers
For some tests, it decreases allocation by 10% to 20%. But on the platforms I have tried, it also causes a couple of repeatable slowdowns of up to 10%. I've investigated a bit, but haven't found any clear explanation. I'm worried that the slowdowns are due to caching effects, for example.
Any suggestions on how I should proceed with my investigation?
Also: I'd appreciate it if any developers could run some benchmarks on whatever platforms they have available and add the results to the same section of the wiki page.
NB: Unfortunately, it is essential to build the libraries twice: once with -flate-dmd-anal in GhcLibHcOpts and once without. I have not determined how to do this robustly without a distclean. Please let me know if you have a better method.
So I've used these build settings:
# one of the following
#GhcLibHcOpts = -O2 # both with and without -flate-dmd-anal
GhcLibHcOpts = -O2 -flate-dmd-anal
SplitObjs = NO
DYNAMIC_BY_DEFAULT = NO
DYNAMIC_GHC_PROGRAMS = NO
The last three aren't necessary, but please record what you use, if you are so generous as to run it :).
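If it helps when writing up results for the wiki, here is a minimal sketch of how the per-benchmark percentage change between the two builds could be computed. It is written in Python purely for illustration. The function names (percent_change, compare), the benchmark names, and the numbers are all hypothetical placeholders, not real measurements and not part of nofib or GHC.

```python
def percent_change(baseline: float, candidate: float) -> float:
    """Relative change of candidate vs. baseline, in percent.

    Negative means the candidate is lower (e.g. less allocation).
    """
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (candidate - baseline) / baseline * 100.0


def compare(baseline: dict, candidate: dict) -> dict:
    """Percent change for every benchmark present in both runs."""
    shared = sorted(baseline.keys() & candidate.keys())
    return {name: percent_change(baseline[name], candidate[name])
            for name in shared}


if __name__ == "__main__":
    # Placeholder values only: substitute the figures from your own runs.
    without_late_dmd = {"prog-a": 1000.0, "prog-b": 2000.0}
    with_late_dmd = {"prog-a": 850.0, "prog-b": 2000.0}
    for name, delta in compare(without_late_dmd, with_late_dmd).items():
        print(f"{name}: {delta:+.1f}%")
```

The same function works for allocation and for run time. Only benchmarks that appear in both runs are compared, so a test that failed in one build is dropped rather than reported as a change.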
Thanks.