[GHC] #15703: Significant compilation time blowup when refactoring singletons-heavy code

#15703: Significant compilation time blowup when refactoring singletons-heavy code -------------------------------------+------------------------------------- Reporter: RyanGlScott | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: 8.8.1 Component: Compiler | Version: 8.6.1 Keywords: | Operating System: Unknown/Multiple Architecture: | Type of failure: Compile-time Unknown/Multiple | performance bug Test Case: | Blocked By: Blocking: | Related Tickets: Differential Rev(s): | Wiki Page: -------------------------------------+------------------------------------- At ICFP, I griped to Ben about some code I'm writing (which makes use of `singletons` to the nth degree) that takes absolutely forever to compile with optimizations. I suspected that type families (i.e., coercions) were the culprit, but Ben requested that I put the code on Trac anyways, so here it is. I've attached two modules (`Lib.hs` and `Lib2.hs`). You'll need GHC 8.6.1 or later to compile it. To compile it, run: {{{ $ time /opt/ghc/8.6.1/bin/ghc -O1 -fforce-recomp Lib.hs [1 of 2] Compiling Lib2 ( Lib2.hs, Lib2.o ) [2 of 2] Compiling Lib ( Lib.hs, Lib.o ) real 3m28.367s user 3m28.512s sys 0m0.212s }}} Note that if you compile without optimizations, it'll finish almost immediately: {{{ $ time /opt/ghc/8.6.1/bin/ghc -O0 -fforce-recomp Lib.hs [1 of 2] Compiling Lib2 ( Lib2.hs, Lib2.o ) [2 of 2] Compiling Lib ( Lib.hs, Lib.o ) real 0m0.528s user 0m0.480s sys 0m0.036s }}} Also, if you look at the source code for `Lib.hs`, you'll notice some code which says: {{{#!hs instance SApplicative f => SApplicative (M1 i c f) where sPure x = singFun3 @(.@#@$) (%.) @@ singFun1 @M1Sym0 SM1 @@ (singFun1 @PureSym0 sPure) @@ x -- If I change the implementation of sPure above to be this: -- -- sPure x = SM1 (sPure x) -- -- Then Lib.hs compiles quickly again (< 1 second) with -O1. SM1 f %<*> SM1 x = SM1 (f %<*> x) }}} If you apply this suggested change, then you can see the difference when you compile it with optimizations again: {{{ $ time /opt/ghc/8.6.1/bin/ghc -O1 -fforce-recomp Lib.hs [1 of 2] Compiling Lib2 ( Lib2.hs, Lib2.o ) [2 of 2] Compiling Lib ( Lib.hs, Lib.o ) real 0m0.986s user 0m0.940s sys 0m0.040s }}} Given that this particular change of code causes such a dramatic increase in compilation, I have to wonder if there's more going on here than just the usual type-family slowness. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15703 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15703: Significant compilation time blowup when refactoring singletons-heavy code -------------------------------------+------------------------------------- Reporter: RyanGlScott | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: 8.8.1 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by RyanGlScott): * Attachment "Lib.hs" added. Lib.hs -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15703 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15703: Significant compilation time blowup when refactoring singletons-heavy code -------------------------------------+------------------------------------- Reporter: RyanGlScott | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: 8.8.1 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by RyanGlScott): * Attachment "Lib2.hs" added. Lib2.hs -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15703 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15703: Significant compilation time blowup when refactoring singletons-heavy code -------------------------------------+------------------------------------- Reporter: RyanGlScott | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: 8.8.1 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by RyanGlScott): If you're wondering why I split the code up into two separate modules (instead of combining it into a single module), I observed that combining things into a single module actually made the compilation slowdown less severe (it took about 1 minute to compile, as opposed to the ~3m30s compile time shown above). I suspect that dividing up the code into further modules might exacerbate the compile times even more, since in the original project that I took this code from, it actually takes //longer// than 3m30s to compile the whole thing with `-O1`. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15703#comment:1 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15703: Significant compilation time blowup when refactoring singletons-heavy code -------------------------------------+------------------------------------- Reporter: RyanGlScott | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: 8.8.1 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by RyanGlScott): * Attachment "Lib.hs" added. Lib.hs -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15703 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15703: Significant compilation time blowup when refactoring singletons-heavy code -------------------------------------+------------------------------------- Reporter: RyanGlScott | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: 8.8.1 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by RyanGlScott): I built this code using a profiled GHC 8.6.1. Here are the highlights: {{{ Wed Oct 3 22:18 2018 Time and Allocation Profiling Report (Final) ghc-stage2 +RTS -p -RTS -B/u/rgscott/Software/ghc4/inplace/lib -O1 -fforce-recomp Lib.hs total time = 276.05 secs (276053 ticks @ 1000 us, 1 processor) total alloc = 403,405,509,720 bytes (excludes profiling overheads) COST CENTRE MODULE SRC %time %alloc simplCast-simplCoercion Simplify compiler/simplCore/Simplify.hs:1248:57-77 75.8 76.8 substTyWith TyCoRep compiler/types/TyCoRep.hs:(2213,23)-(2214,50) 23.8 23.0 }}} -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15703#comment:2 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15703: Significant compilation time blowup when refactoring singletons-heavy code -------------------------------------+------------------------------------- Reporter: RyanGlScott | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: 8.8.1 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by RyanGlScott): * Attachment "dshow-passes.txt" added. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15703 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15703: Significant compilation time blowup when refactoring singletons-heavy code -------------------------------------+------------------------------------- Reporter: RyanGlScott | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: 8.8.1 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by RyanGlScott): I attached the results of compiling `Lib.hs` with `-dshow-passes` enabled. Of particular note is: {{{ Result size of Float out(FOS {Lam = Just 0, Consts = True, OverSatApps = False}) = {terms: 950, types: 14,499, coercions: 83,275, joins: 0/12} !!! Float out(FOS {Lam = Just 0, Consts = True, OverSatApps = False}) [Lib]: finished in 28.01 milliseconds, allocated 19.390 megabytes *** Simplifier [Lib]: Result size of Simplifier iteration=1 = {terms: 924, types: 12,141, coercions: 235,672, joins: 0/46} Result size of Simplifier iteration=2 = {terms: 755, types: 5,948, coercions: 312,927, joins: 0/17} Result size of Simplifier iteration=3 = {terms: 715, types: 3,828, coercions: 403,226, joins: 0/2} Result size of Simplifier = {terms: 703, types: 3,759, coercions: 228,342, joins: 0/2} !!! Simplifier [Lib]: finished in 208488.74 milliseconds, allocated 373891.763 megabytes *** Simplifier [Lib]: Result size of Simplifier = {terms: 703, types: 3,759, coercions: 228,342, joins: 0/2} }}} The number of coercions appears to jump between float-out and the first simplifier pass. However, it's a later simplifier pass that takes up the bulk of the time (208488.74 milliseconds of it). -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15703#comment:3 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15703: Significant compilation time blowup when refactoring singletons-heavy code -------------------------------------+------------------------------------- Reporter: RyanGlScott | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: 8.8.1 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by goldfire): It might be worth checking whether `optCoercion` is helping or hurting: you can check the size of the input to the function and the size of the output. The output shouldn't be bigger than the input! -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15703#comment:4 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15703: Significant compilation time blowup when refactoring singletons-heavy code -------------------------------------+------------------------------------- Reporter: RyanGlScott | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: 8.8.1 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by RyanGlScott): I added `{-# OPTIONS_GHC -fprof-auto #-}` to the top of `OptCoercion`, which gives me some slightly more fine-grained information: {{{ Mon Oct 8 13:50 2018 Time and Allocation Profiling Report (Final) ghc-stage2 +RTS -p -RTS -B/u/rgscott/Software/ghc3/inplace/lib -O1 -fforce-recomp Lib.hs total time = 282.87 secs (282872 ticks @ 1000 us, 1 processor) total alloc = 403,927,138,248 bytes (excludes profiling overheads) COST CENTRE MODULE SRC %time %alloc matchAxiom OptCoercion compiler/types/OptCoercion.hs:(903,1)-(916,11) 47.3 43.4 substTyWith TyCoRep compiler/types/TyCoRep.hs:(2213,23)-(2214,50) 23.0 22.9 opt_trans_rule OptCoercion compiler/types/OptCoercion.hs:(512,1)-(681,30) 18.4 21.5 etaAppCo_maybe OptCoercion compiler/types/OptCoercion.hs:(985,1)-(996,11) 6.1 9.2 opt_co4 OptCoercion compiler/types/OptCoercion.hs:(174,1)-(386,71) 3.4 2.2 }}} -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15703#comment:5 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15703: Significant compilation time blowup when refactoring singletons-heavy code -------------------------------------+------------------------------------- Reporter: RyanGlScott | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: 8.8.1 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): Those are staggeringly large coercions decorating relatively tiny types. I wonder: could any of them be very large representations of `Refl`? I note that `OptCoercion` doesn't use `isReflexiveCo` (which tests for equality of the two types) because doing so at every node would be expensive. But you could try testing for `isReflexiveCo`, and spit out the smallest coercion which says `True` to `isReflexiveCo` but is not actually `Refl`. You'd want to make this test for every sub-tree in the coercion. That's a guess. But it's very hard to imagine that you ''really'' need such large proofs. The trick is to find where the redundancy is. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15703#comment:6 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15703: Significant compilation time blowup when refactoring singletons-heavy code -------------------------------------+------------------------------------- Reporter: RyanGlScott | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: 8.8.1 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #15725 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by RyanGlScott): * related: => #15725 Comment: I don't know if this is related to the performance issue witnessed here, but it turns out that `Lib.hs` compiles cleanly with `-dcore-lint -O0`, but crashes with `-dcore-lint -O1`. I've opened #15725 for this. (Interestingly, in the more minimal example I provide in #15725, the Core Lint error happens regardless of optimization level.) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15703#comment:7 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15703: Significant compilation time blowup when refactoring singletons-heavy code -------------------------------------+------------------------------------- Reporter: RyanGlScott | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: 8.8.1 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #15725 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by RyanGlScott): I tried compiling this program using the experimental Phab:D4766 (`Zap coercions when not building with -dcore-lint`) branch: {{{ $ time ~/Software/ghc3/inplace/bin/ghc-stage2 -O0 -fforce-recomp Lib.hs [1 of 2] Compiling Lib2 ( Lib2.hs, Lib2.o ) [2 of 2] Compiling Lib ( Lib.hs, Lib.o ) real 0m0.497s user 0m0.460s sys 0m0.028s $ time ~/Software/ghc3/inplace/bin/ghc-stage2 -O1 -fforce-recomp Lib.hs [1 of 2] Compiling Lib2 ( Lib2.hs, Lib2.o ) [2 of 2] Compiling Lib ( Lib.hs, Lib.o ) real 0m17.164s user 0m16.980s sys 0m0.200s }}} This looks promising! Perhaps this really is a duplicate of #8095 in disguise? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15703#comment:8 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15703: Significant compilation time blowup when refactoring singletons-heavy code -------------------------------------+------------------------------------- Reporter: RyanGlScott | Owner: (none) Type: bug | Status: new Priority: high | Milestone: 8.8.1 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #15725 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by bgamari): * priority: normal => high -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15703#comment:9 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15703: Significant compilation time blowup when refactoring singletons-heavy code -------------------------------------+------------------------------------- Reporter: RyanGlScott | Owner: (none) Type: bug | Status: new Priority: high | Milestone: 8.8.1 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #15725 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): I still think (comment:6) that it may well be that there is massive redundancy here, and if so it'd be a pity to miss out on fixing that, good though #8095 is. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15703#comment:10 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15703: Significant compilation time blowup when refactoring singletons-heavy code -------------------------------------+------------------------------------- Reporter: RyanGlScott | Owner: (none) Type: bug | Status: new Priority: high | Milestone: 8.8.1 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #15725 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by RyanGlScott): Hold on a second... now that I've compiled GHC HEAD, it turns out that HEAD compiles this program //much// faster than 8.6.1 does: {{{ $ time ~/Software/ghc2/inplace/bin/ghc-stage2 -O0 -fforce-recomp Lib.hs [1 of 2] Compiling Lib2 ( Lib2.hs, Lib2.o ) [2 of 2] Compiling Lib ( Lib.hs, Lib.o ) real 0m0.462s user 0m0.404s sys 0m0.048s $ time ~/Software/ghc2/inplace/bin/ghc-stage2 -O1 -fforce-recomp Lib.hs [1 of 2] Compiling Lib2 ( Lib2.hs, Lib2.o ) [2 of 2] Compiling Lib ( Lib.hs, Lib.o ) real 0m22.205s user 0m22.180s sys 0m0.048s }}} In fact, much of the speed gain that I saw in comment:8 would more accurately be attributed to using HEAD, not necessarily Phab:D4766 (although Phab:D4766 does seem to shave off about five seconds, which is nothing to shake a stick at). It's not clear to me what makes HEAD so much faster, but that's quite encouraging! -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15703#comment:11 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15703: Significant compilation time blowup when refactoring singletons-heavy code -------------------------------------+------------------------------------- Reporter: RyanGlScott | Owner: (none) Type: bug | Status: new Priority: high | Milestone: 8.8.1 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #15725 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): Interesting. It's great that HEAD is faster; and it'd be even better to understand why. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15703#comment:12 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15703: Significant compilation time blowup when refactoring singletons-heavy code -------------------------------------+------------------------------------- Reporter: RyanGlScott | Owner: (none) Type: bug | Status: new Priority: high | Milestone: 8.8.1 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #15725 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by RyanGlScott): After some digging, I've found the commit that made compiling this program become so much faster. It's commit 55a3f8552c9dc9b84e204ec6623c698912795347 (`Refactor coercion rule`). In the commit prior, it took this long to build: {{{ $ time ~/Software/ghc2/inplace/bin/ghc-stage2 -O1 -fforce-recomp Lib.hs [1 of 2] Compiling Lib2 ( Lib2.hs, Lib2.o ) [2 of 2] Compiling Lib ( Lib.hs, Lib.o ) real 3m17.647s user 3m17.776s sys 0m0.168s }}} But if I apply that commit, it then takes this long to build: {{{ $ time ~/Software/ghc2/inplace/bin/ghc-stage2 -O1 -fforce-recomp Lib.hs [1 of 2] Compiling Lib2 ( Lib2.hs, Lib2.o ) [2 of 2] Compiling Lib ( Lib.hs, Lib.o ) real 0m26.102s user 0m26.032s sys 0m0.108s }}} This is quite a humorous coincidence, since up to this point I had been under the impression that that commit had made compilation performance slightly //worse// overall. But this shows I was completely wrong! -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15703#comment:13 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15703: Significant compilation time blowup when refactoring singletons-heavy code -------------------------------------+------------------------------------- Reporter: RyanGlScott | Owner: (none) Type: bug | Status: new Priority: high | Milestone: 8.8.1 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #15725 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by goldfire): * cc: xnning (added) Comment: Ha! One of the motivations for that refactoring was increased performance, and we were befuddled why that goal didn't materialize. And now it has. :) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15703#comment:14 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler
participants (1)
-
GHC