
#14737: Improve performance of Simplify.simplCast
-------------------------------------+-------------------------------------
        Reporter:  tdammers          |                Owner:  (none)
            Type:  bug               |               Status:  patch
        Priority:  normal            |            Milestone:
       Component:  Compiler          |              Version:  8.2.2
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
 Type of failure:  Compile-time      |  Unknown/Multiple
  performance bug                    |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:  #11735 #14683     |  Differential Rev(s):  Phab:D4385
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by tdammers):

Replying to [comment:10 simonpj]:
> Try getting rid of the first equation for `pushCoTyArg`
> {{{
> pushCoTyArg co ty
>   | tyL `eqType` tyR
>   = Just (ty, mkRepReflCo (piResultTy tyR ty))
> }}}
> This is another big pile of type-equalities, rather like calling `isReflexiveCo` at the wrong moment.
>
> Claim: if it happens that `tyL` = `tyR`, but we go ahead with all that `mkCoherenceLeftCo` stuff anyway, then the coercion optimiser will get rid of it later. '''Richard''': will it?
>
> But try that change anyway. NO WAY should `pushCoTyArg` take 54% of compile time!
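
To make the trade-off in that suggestion concrete, here is a toy sketch. It is not GHC code: `Ty`, `Co`, `pushGuarded`, `pushUnguarded` and `optCo` are hypothetical stand-ins invented for illustration. The guarded version pays for a full structural comparison on every call; the unguarded version does constant work and leaves any reflexive result to be collapsed once by a later pass. Whether GHC's real coercion optimiser performs that collapse is exactly the open question in the quoted claim.

```haskell
module Main where

-- Toy stand-in for a type; NOT GHC's Type.
data Ty
  = TyCon String
  | FunTy Ty Ty
  deriving (Eq, Show)

-- Toy stand-in for a coercion: it only records the two types it relates.
data Co
  = Refl Ty      -- both sides known to be the same type
  | Step Ty Ty   -- relates a left type to a right type
  deriving (Eq, Show)

-- With the guard: every call walks both types in full (the derived Eq is
-- structural) just to produce Refl early when they coincide.
pushGuarded :: Ty -> Ty -> Co
pushGuarded tyL tyR
  | tyL == tyR = Refl tyR
  | otherwise  = Step tyL tyR

-- Without the guard: constant work per call, but the result may be a Step
-- whose two sides happen to be equal.
pushUnguarded :: Ty -> Ty -> Co
pushUnguarded = Step

-- Hypothetical later pass that collapses reflexive Steps once.
optCo :: Co -> Co
optCo (Step l r) | l == r = Refl r
optCo co = co

-- A right-nested function type with n arrows, to give the comparison
-- something to traverse.
deep :: Int -> Ty
deep n = foldr FunTy (TyCon "Int") (replicate n (TyCon "Int"))

main :: IO ()
main = do
  let t = deep 1000
  -- After the later pass, both variants agree on the reflexive case...
  print (optCo (pushUnguarded t t) == pushGuarded t t)
  -- ...and they already agree when the types differ.
  print (pushUnguarded t (TyCon "Bool") == pushGuarded t (TyCon "Bool"))
```

In this toy model the comparison is paid once per result in `optCo` instead of once per call to the push function, which is the intuition behind dropping the guard.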
Simply removing that case branch gets us down by another 4 seconds:
{{{
        Tue Apr  3 11:09 2018 Time and Allocation Profiling Report  (Final)

           ghc-stage2 +RTS -p -RTS -B/home/tobias/well-typed/devel/ghc/T14737/inplace/lib ./cases/Grammar.hs -o ./a -fforce-recomp

        total time  =        7.86 secs   (7864 ticks @ 1000 us, 1 processor)
        total alloc = 10,150,661,432 bytes  (excludes profiling overheads)

COST CENTRE     MODULE      SRC                                                 %time %alloc

mkInstCo        CoreOpt     compiler/coreSyn/CoreOpt.hs:982:33-84                31.7   40.6
tc_rn_src_decls TcRnDriver  compiler/typecheck/TcRnDriver.hs:(494,4)-(556,7)     20.6   20.4
CoreTidy        HscMain     compiler/main/HscMain.hs:1253:27-67                   7.2    5.5
SimplTopBinds   SimplCore   compiler/simplCore/SimplCore.hs:770:39-74             6.6    4.6
simplCast       Simplify    compiler/simplCore/Simplify.hs:(1213,5)-(1215,37)     3.7    3.5
zonkTopDecls    TcRnDriver  compiler/typecheck/TcRnDriver.hs:(445,16)-(446,43)    3.5    3.1
deSugar         HscMain     compiler/main/HscMain.hs:511:7-44                     2.4    1.9
coercionKind    Coercion    compiler/types/Coercion.hs:1716:3-7                   1.9    4.6
isReflexiveCo   Simplify    compiler/simplCore/Simplify.hs:1260:40-55             1.8    1.4
Parser          HscMain     compiler/main/HscMain.hs:(316,5)-(384,20)             1.8    2.3
StgCmm          HscMain     compiler/main/HscMain.hs:(1428,13)-(1429,62)          1.6    0.7
}}}

I've added a few more SCCs to trace more deeply into `simplCast`, which is why `simplCast` itself has seemingly dropped to 3.7%. That figure is misleading, because `mkInstCo` makes up most of the rest of the `simplCast` call.

So I suggest committing the branch deletion (assuming that it won't break anything). From here, I'm not 100% sure which is more promising: digging into `mkInstCo` to see if we can make it more efficient, or looking at `simplCast` to see if we can make it call `mkInstCo` less often.

Also:
> Note that Phab:D4395 currently removes the `piResultTy` from that case, but it's quite possible that the `eqType` call is what's taking up the time.
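
For readers who want to separate two such costs in a profile themselves, a minimal sketch of the general technique follows: give each call its own cost centre with an `SCC` pragma. The types and functions (`Ty`, `eqTy`, `resultTy`, `pushArg`) and the cost-centre names are toy stand-ins made up for this example; they are not GHC's `eqType` or `piResultTy`, and these are not the annotations that were actually added to the compiler.

```haskell
module Main where

-- Toy stand-in for a type; NOT GHC's Type.
data Ty
  = TyCon String
  | FunTy Ty Ty
  deriving (Eq, Show)

-- Toy analogue of a structural type-equality check.
eqTy :: Ty -> Ty -> Bool
eqTy = (==)

-- Toy analogue of computing the result type of an application.
resultTy :: Ty -> Maybe Ty
resultTy (FunTy _ r) = Just r
resultTy _           = Nothing

-- Each sub-computation gets its own named cost centre, so a profiling
-- report attributes time to the two calls separately rather than to the
-- enclosing function as a whole.
pushArg :: Ty -> Ty -> Maybe Ty
pushArg tyL tyR
  | same      = res
  | otherwise = Nothing
  where
    same = {-# SCC "guard_eqTy" #-} eqTy tyL tyR
    res  = {-# SCC "guard_resultTy" #-} resultTy tyR

main :: IO ()
main = do
  let t = FunTy (TyCon "Int") (TyCon "Bool")
  print (pushArg t t)
  print (pushArg t (TyCon "Int"))
```

The pragmas are ignored in a normal build. When the program is compiled with `-prof` and run with `+RTS -p`, the two names should show up as separate rows in the resulting report.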
The full profile from before the deletion (which, unfortunately, I no longer have around) clearly shows that `eqType` is what consumes all that time, not `piResultTy`.

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/14737#comment:12>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler