
#13344: Core string literal patch regresses compiler performance considerably
-------------------------------------+-------------------------------------
        Reporter:  bgamari           |                Owner:  bgamari
            Type:  bug               |               Status:  new
        Priority:  high              |            Milestone:  8.2.1
       Component:  Compiler          |              Version:  8.1
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:  Unknown/Multiple
 Type of failure:  Compile-time      |            Test Case:
  performance bug                    |
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by rwbarton):

Replying to [comment:16 simonpj]:
> Is it ONLY that
> {{{
> x :: Addr#
> x = "foo"#
>
> y :: T
> y = K x
> }}}
> is more expensive than
> {{{
> y :: T
> y = K "foo"#
> }}}
> And if it is more expensive, how much more expensive? And does that cost
> come from simplifying, spitting out an interface file, code generation,
> reading in an interface file?

I think I've found the main source of extra cost from this transformation. When we build a module with `-split-objs`, the code generator emits one assembly file per strongly-connected component of Cmm declarations, and then the driver runs the assembler on each of these files. In the example above, the first program will be translated into two `.o` files using two assembler invocations, while the second will be translated into a single `.o` file with a single assembler invocation. Thus each string literal will, with this change, normally result in an extra assembler invocation (assuming it gets floated to top level, and is only referred to from one place).

The builds on https://perf.haskell.org/ghc are done using something like the default (perf) flavour build settings, so when it built this commit (https://perf.haskell.org/ghc/#revision/d49b2bb21691892ca6ac8f2403e31f2a5e53f...) the libraries were built with `-split-objs`. I ran similar builds myself and found that the number of `.o` files (so, assembler invocations, roughly) increased from 100000 to 133000 with the top-level strings patch. The 33000 extra assembler invocations can plausibly explain the extra ~100 seconds of total build time.

But rather than hand-waving estimates, there's a better way to confirm what happened. In fact, we actually mean to build the libraries ''not'' with `-split-objs`, but with `-split-sections`, a new flag that achieves a similar effect but which requires running the assembler only once per module. However, the top-level strings patch (Jan 18) landed in the window between

 * commit 266a9dc (Jan 10), which accidentally broke detection of `-split-sections` support (https://perf.haskell.org/ghc/#revision/266a9dc4cd34008f1162eb276032c85ef8371...), and
 * commit 283acec (Feb 4), which fixed `-split-sections` (https://perf.haskell.org/ghc/#revision/283acec1d7307fdbd8cd7b3f1d984a036366d...)
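
The "one assembly file per strongly-connected component" behaviour described above can be illustrated with a toy model. This is only a sketch: `Decl`, `floated`, `inlined` and `assemblerInvocations` are hypothetical names invented for illustration, not GHC types or functions, and the model assumes each SCC costs exactly one assembler invocation. It uses `stronglyConnComp` from `Data.Graph` in the `containers` package.

```haskell
import Data.Graph (SCC, stronglyConnComp)

-- | A toy top-level declaration: its name and the top-level names it
-- refers to. Hypothetical stand-in for a Cmm declaration.
data Decl = Decl
  { declName :: String
  , declRefs :: [String]
  }

-- | Group declarations into strongly-connected components, mirroring the
-- "one assembly file per SCC" behaviour described for -split-objs.
declSCCs :: [Decl] -> [SCC String]
declSCCs ds =
  stronglyConnComp [ (declName d, declName d, declRefs d) | d <- ds ]

-- | Assumption of this model: one assembler invocation per SCC.
assemblerInvocations :: [Decl] -> Int
assemblerInvocations = length . declSCCs

-- | First program: the literal is its own top-level binding x,
-- referenced from y.
floated :: [Decl]
floated = [ Decl "x" [], Decl "y" ["x"] ]

-- | Second program: the literal is written inline inside y.
inlined :: [Decl]
inlined = [ Decl "y" [] ]

main :: IO ()
main = do
  putStrLn ("floated: " ++ show (assemblerInvocations floated))  -- prints "floated: 2"
  putStrLn ("inlined: " ++ show (assemblerInvocations inlined))  -- prints "inlined: 1"
```

Because `x` and `y` do not refer to each other cyclically, they fall into separate components, which is why the floated form costs one extra invocation per string literal in this model.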

The first of these two commits (266a9dc) increased build time from 1891s to 2150s, a change of +259s, and the second (283acec) decreased it from 2653s to 2311s, a change of -342s. Let's assume that the increase between these commits (from 2150s to 2653s) is caused mainly by ghc getting slower (e.g., join points core lint), not by ghc getting larger (more modules to compile). The changes of +259s and -342s from these `-split-sections` patches are presumably due mainly to incurring and then saving roughly 100000 and 133000 assembler invocations respectively; and 342/259 = 1.32 is in remarkably close agreement with these numbers (133000/100000 = 1.33).

That suggests that 342-259 = 83 of the 102 seconds by which top-level unboxed strings increased the build time should be attributed to additional invocations of the assembler.

The good news is that we do not have to do anything about those 83 seconds, as we do not use `-split-objs` during the GHC build any more. Effectively they were measurement error, due to the build being misconfigured when the top-level strings patch landed.

This leaves an increase of 102-83 = 19 seconds (out of a total ~2000, so about 1%) from the top-level strings patch, which can plausibly be explained by (1) larger interface files and (2) lack of CSE for top-level strings.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13344#comment:21
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler
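
The attribution arithmetic in the comment can be re-derived mechanically. The sketch below only restates the figures quoted in the comment; the variable names are made up for illustration, and `totalBuild` uses the approximate "~2000" figure, so the resulting percentage is approximate as well.

```haskell
-- Build-time figures quoted in the comment (seconds).
breakDelta, fixDelta, stringsDelta, totalBuild :: Double
breakDelta   = 2150 - 1891  -- commit 266a9dc: build got 259s slower
fixDelta     = 2653 - 2311  -- commit 283acec: build got 342s faster
stringsDelta = 102          -- increase observed when the strings patch landed
totalBuild   = 2000         -- "~2000" in the comment; approximate

-- Object-file counts reported from the author's own builds.
objsBefore, objsAfter :: Double
objsBefore = 100000
objsAfter  = 133000

timeRatio, objRatio, assemblerShare, residual, residualPercent :: Double
timeRatio       = fixDelta / breakDelta        -- 342/259, about 1.32
objRatio        = objsAfter / objsBefore       -- about 1.33
assemblerShare  = fixDelta - breakDelta        -- seconds blamed on the assembler
residual        = stringsDelta - assemblerShare
residualPercent = 100 * residual / totalBuild

main :: IO ()
main = do
  putStrLn ("time ratio:          " ++ show timeRatio)
  putStrLn ("object-file ratio:   " ++ show objRatio)
  putStrLn ("assembler share (s): " ++ show assemblerShare)
  putStrLn ("residual (s):        " ++ show residual)
  putStrLn ("residual (%):        " ++ show residualPercent)
```

The two ratios agree to within about 0.01, which is the "remarkably close agreement" the comment relies on when it attributes 83 of the 102 seconds to extra assembler invocations.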