
#11735: Optimize coercionKind
-------------------------------------+-------------------------------------
        Reporter:  goldfire          |                Owner:  (none)
            Type:  task              |               Status:  new
        Priority:  normal            |            Milestone:
       Component:  Compiler          |              Version:  7.10.3
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
 Type of failure:  Compile-time      |  Unknown/Multiple
  performance bug                    |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by tdammers):

OK, new profiling result:

{{{
        Thu Jan 25 13:11 2018 Time and Allocation Profiling Report  (Final)

           ghc-stage2 +RTS -p -RTS -B/home/tobias/well-typed/devel/ghc/inplace/lib Grammar.hs -ddump-stg -ddump-simpl -ddump-to-file -fforce-recomp

        total time  =       20.99 secs   (20989 ticks @ 1000 us, 1 processor)
        total alloc = 29,250,375,256 bytes  (excludes profiling overheads)

COST CENTRE            MODULE      SRC                                                 %time  %alloc

CoreTidy               HscMain     compiler/main/HscMain.hs:1253:27-67                  24.2    28.5
Stg2Stg                HscMain     compiler/main/HscMain.hs:1489:12-44                  20.3    24.2
simplCast              Simplify    compiler/simplCore/Simplify.hs:871:62-87             18.7    15.9
tc_rn_src_decls        TcRnDriver  compiler/typecheck/TcRnDriver.hs:(494,4)-(556,7)      9.0     6.9
addCoerce-pushCoTyArg  Simplify    compiler/simplCore/Simplify.hs:(1236,12)-(1237,72)    7.4     6.4
subst_ty               TyCoRep     compiler/types/TyCoRep.hs:2237:28-32                  4.3     5.1
zonkTopDecls           TcRnDriver  compiler/typecheck/TcRnDriver.hs:(445,16)-(446,43)    1.4     1.1
coercionKind           Coercion    compiler/types/Coercion.hs:1725:3-7                   1.3     3.0
simplExprF1-Lam        Simplify    compiler/simplCore/Simplify.hs:896:5-39               1.0     1.1
}}}

This is the same `Grammar.hs`, compiled with the same GHC code as before, but with `NthCo` extended with an extra `Role` field, and the kind calculation from `coercionRole` moved out into `mkNthCo`. I ended up having to make changes in 16 modules, but most of them were straightforward, discarding or forwarding the extra field in a pattern match.
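
To illustrate the shape of that change, here is a minimal sketch on a toy coercion type. Everything in it is an assumption made for illustration: `Co`, `nthRole`, `coSize` and `rebuildCo` are hypothetical names, the rule used to compute the role is deliberately simplistic, and none of this is GHC's actual `Coercion` definition or the patch itself. It only shows the pattern: the smart constructor computes the role once and stores it, `coercionRole` reads the stored field, and other pattern matches either discard or forward it.

```haskell
-- Toy model only: hypothetical simplifications, not GHC's real types.
data Role = Nominal | Representational | Phantom
  deriving (Eq, Show)

data Co
  = Refl Role
  | TyConAppCo Role [Co]
  | NthCo Role Int Co      -- the extra Role field caches the result role
  deriving (Eq, Show)

-- Reading the role of an NthCo is now a field lookup, with no recursion
-- into the sub-coercion.
coercionRole :: Co -> Role
coercionRole (Refl r)         = r
coercionRole (TyConAppCo r _) = r
coercionRole (NthCo r _ _)    = r

-- Smart constructor: the role is computed once, at construction time.
mkNthCo :: Int -> Co -> Co
mkNthCo n co = NthCo (nthRole n co) n co

-- Toy role rule (an assumption, much simpler than GHC's): take the role
-- of the selected argument if there is one, else the role of the whole.
nthRole :: Int -> Co -> Role
nthRole n (TyConAppCo _ args)
  | n >= 0, n < length args = coercionRole (args !! n)
nthRole _ co = coercionRole co

-- "Discarding" the extra field: the match ignores the cached role.
coSize :: Co -> Int
coSize (Refl _)            = 1
coSize (TyConAppCo _ args) = 1 + sum (map coSize args)
coSize (NthCo _ _ co)      = 1 + coSize co

-- "Forwarding" the extra field: a structural rebuild, standing in for a
-- pass that does not change roles, reuses the cached role as-is.
rebuildCo :: Co -> Co
rebuildCo (Refl r)            = Refl r
rebuildCo (TyConAppCo r args) = TyConAppCo r (map rebuildCo args)
rebuildCo (NthCo r n co)      = NthCo r n (rebuildCo co)
```

In this sketch, forwarding is only sound for traversals that preserve the role of the sub-coercion; a pass that could change it would have to go back through `mkNthCo` instead.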

I think this shouldn't have any negative impact, because forwarding the role can only make things better (avoiding future calls to `coercionRole`), and discarding it retains the old status.

Conclusions:

- We're shaving off another 4 seconds of execution time, and allocations remain the same. So this doesn't seem to make things worse for the `Grammar.hs` case.
- We are not actually reducing allocations any further.
- CoreTidy is worth looking into.
- In order to verify that this change really makes an impact for the better, I would still love to test this against source code that would perform badly without it. Test cases very welcome.
- 20 seconds is still awfully long.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11735#comment:27
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler