
#13253: Exponential compilation time with RWST & ReaderT stack with `-02` -------------------------------------+------------------------------------- Reporter: phadej | Owner: bgamari, osa1 Type: bug | Status: new Priority: normal | Milestone: 8.8.1 Component: Compiler | Version: 8.0.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #15630 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by tdammers): I've instrumented GHC to dump more intermediate information during SpecConstr. Specifically, I've added the following logic: In `scExpr`, capture the expression to be specialized before and after specialization; then compare number of terms to detect a blowup (a factor of 2 is enough to weed out the non-blowups), and if this particular incantation does in fact blow up, dump the expression before the blowup. The resulting dump is a bit too large to attach, but it shows a typical pattern: the expressions that blow up all look very much alike; as we progress through the compilation, the "before" size stays at ~6000 terms, while the "after" size progressively increases, and it looks like expressions from earlier incantations get inlined into expressions later in the process. Which fits the hypothesis from the GHC HQ meeting, namely, that the inlining heuristic only looks at the size of the inlinee, but not at the inlining context, so when something gets inlined "top-down", then functions to be inlined (which don't have their own dependencies inlined yet) are all still small, and so the inliner happily copies them many times, and then in the next round, it inlines exponentially more invocations of the inlined functions' dependencies, and so on. For example, given: {{{#!haskell f = g1 + g2 + g3 + g4 + g5 g1 = a1 + a2 + a3 + a4 g2 = b1 + b2 + b3 + b4 ... }}} ...the inlining heuristic will first look at `f`, conclude that each of `g1` through `g5` is small, and can thus be inlined; then after inlining, it will look at f again, and conclude that each of `a1` etc. is small, and inline those; and if those have further dependencies following the same matter, it will happily keep inlining all those small things, not realizing that it is creating a monstrosity. And because all the inlinees involved are different, there will not be any opportunities for optimizations that might shrink things back down, so the resulting program just keeps growing exponentially. One fun thing I'll try now is this: Considering that I already have code in place to detect blowups, I can trivially use this logic to just say "if this blows up, then throw away the specialized Core, and use the original expression instead". So I'll try that, and see what that does to various performance metrics. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13253#comment:41 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler