
#14208: Performance with O0 is much better than the default or with -O2, runghc
performs the best
-------------------------------------+-------------------------------------
        Reporter:  harendra          |                Owner:  osa1
            Type:  bug               |               Status:  new
        Priority:  normal            |            Milestone:
       Component:  Compiler          |              Version:  8.2.1
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
 Type of failure:  Runtime           |  Unknown/Multiple
  performance bug                    |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by simonpj):

> If I change the optimization flags to -O0 for benchmark stanza in cabal file I can get close to ghci performance.

That contradicts what Omer found in comment:27. Nevertheless, if what you say is true, it'd be easier to debug with -O0 than GHCi (which brings the bytecode generator into the picture).

> GHCi is 6x faster than my regular compiled code

This is totally bonkers and we MUST find out what is happening :-). I suggest not getting diverted into speculation about CPS. We have a repro case; let's just dig into it and find out what is going on.

My suggestions

 * In comment:31 Does the same thing happen with -O0 vs -O, or only with GHCi vs -O?

 * In all repros, do the huge differences also show up in the bytes-allocated numbers? (If so, we don't need the Criterion apparatus.)

 * I notice that in comment:27, in the 2-module case, comparing -O0 and -O1:
   * Allocation is about halved in -O1
   * But runtime actually increases
   That is most peculiar.

 * Matthew says in comment:34 "I can reproduce this..". That's great. But what is "this" precisely? Which version of GHC? What timing data? What happened to allocation and GC numbers?

Somehow a 6x increase in execution time ought not to be hard to find!

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14208#comment:38
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler