Re: [GHC] #14208: Performance with O0 is much better than the default or with -O2, runghc performs the best

12 Sep 2017

      #14208: Performance with O0 is much better than the default or with -O2, runghc
performs the best
-------------------------------------+-------------------------------------
        Reporter:  harendra          |                Owner:  (none)
            Type:  bug               |               Status:  new
        Priority:  normal            |            Milestone:
       Component:  Compiler          |              Version:  8.2.1
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
 Type of failure:  Runtime           |  Unknown/Multiple
  performance bug                    |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by harendra):

 My guess is that some function is being inlined too early, which prevents
 list fusion.

 The combination of `-fexpose-all-unfoldings` and
 `-fspecialise-aggressively` is not "exactly" equivalent to putting
 everything in the same module. With O1 and everything in the same module
 the program finishes in 8ms, while with the combination of these two flags
 it finishes in 4ms. So the flags do something more. I guess the added
 effect is that they make everything INLINEABLE.
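
 For illustration only, here is a minimal sketch of how the two flags and an
 explicit pragma can be written in a source file. The function `combineAll`
 is a hypothetical stand-in and is not taken from the ticket's test program;
 in the multi-module setup the flags would go on the modules involved, not on
 a single `Main`. The pragma is spelled `INLINABLE` here; GHC also accepts
 `INLINEABLE`.

```haskell
{-# OPTIONS_GHC -fexpose-all-unfoldings -fspecialise-aggressively #-}
module Main (main) where

-- Hypothetical overloaded helper (an assumption, not code from the ticket).
-- INLINABLE keeps the unfolding in the interface file, so importing
-- modules can specialise the function at concrete Monoid instances.
combineAll :: Monoid m => [m] -> m
combineAll = foldr mappend mempty
{-# INLINABLE combineAll #-}

main :: IO ()
main = putStrLn (combineAll ["list", " ", "fusion"])
```

 Whether the flags really amount to marking everything INLINABLE is the
 guess made above; the sketch only shows where each piece is written.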

 When everything is in the same module and `toList` is marked NOINLINE, it
 takes 14ms (i.e. the worst case), irrespective of whether the monoid
 functions are marked INLINE or not.
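
 A rough sketch of that single-module configuration, assuming a
 difference-list style wrapper. The type `DList` and the helper `singleton`
 are hypothetical and do not reproduce the ticket's actual code; only the
 placement of the pragmas is the point.

```haskell
module Main (main) where

-- Hypothetical list wrapper (an assumption, not the ticket's data type).
newtype DList a = DList ([a] -> [a])

instance Semigroup (DList a) where
  DList f <> DList g = DList (f . g)
  {-# INLINE (<>) #-}

instance Monoid (DList a) where
  mempty = DList id
  {-# INLINE mempty #-}

singleton :: a -> DList a
singleton x = DList (x :)
{-# INLINE singleton #-}

-- NOINLINE keeps toList as an opaque call, so the optimiser cannot see
-- through it and rewrite rules cannot fire across this boundary.
toList :: DList a -> [a]
toList (DList f) = f []
{-# NOINLINE toList #-}

main :: IO ()
main = print (sum (toList (foldMap singleton [1 .. 100000 :: Int])))
```

 Changing the pragma on `toList` and recompiling with the same optimisation
 level is one way to repeat the comparison described above.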

-- 
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14208#comment:17
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler