
TO: Performance czars and devs

I pushed a patch yesterday enabling a second demand analysis at the end of the core2core simplification pipeline. The flag is -flate-dmd-anal, and it is off by default.

My question: What's the protocol for deciding if -O2 should imply it?

See http://ghc.haskell.org/trac/ghc/wiki/LateDmd for context. In particular, this section includes highlights of some nofib runs I did.

  http://ghc.haskell.org/trac/ghc/wiki/LateDmd#Newperformancenumbers

For some tests, it decreases allocation by 10% to 20%. But on the platforms I have tried, it causes a couple of repeatable slowdowns, up to 10%. I've investigated a bit, but haven't found any clear explanations. I'm worried that they are caching effects, for example. Any suggestions on how I should proceed with my investigation?

Also: I'd appreciate it if any developer would generously run some benchmarks on whatever platforms they have and add the results to the same section in the wiki page.

  http://ghc.haskell.org/trac/ghc/wiki/LateDmd#Newperformancenumbers

NB: It is unfortunately key to build the libraries twice: once with -flate-dmd-anal in GhcLibHcOpts and once without. I have not determined how to do this robustly without a distclean; please let me know if you have a better method. So I've used

  # one of the following
  #GhcLibHcOpts = -O2 # both with and without -flate-dmd-anal
  GhcLibHcOpts = -O2 -flate-dmd-anal

  SplitObjs = NO
  DYNAMIC_BY_DEFAULT = NO
  DYNAMIC_GHC_PROGRAMS = NO

The last three aren't necessary, but please record what you use, if you are so generous as to run it :).

Thanks.