
#10397: Compiler performance regression 7.6 -> 7.8 in elimCommonBlocks
-------------------------------------+-------------------------------------
        Reporter:  TobyGoodwin       |                   Owner:
            Type:  bug               |                  Status:  merge
        Priority:  normal            |               Milestone:  7.10.2
       Component:  Compiler          |                 Version:  7.8.4
      Resolution:                    |                Keywords:
Operating System:  Unknown/Multiple  |  performance
 Type of failure:  None/Unknown      |            Architecture:
      Blocked By:                    |  Unknown/Multiple
 Related Tickets:                    |               Test Case:  see ticket
                                     |                Blocking:
                                     |  Differential Revisions:  Phab:D892
-------------------------------------+-------------------------------------

Comment (by nomeata):

 <rant>Geez, how do I print these `CmmBlocks`? The lack of ubiquitous
 `Outputable` or `Show` instances can be quite a time waster.... ah `import
 PprCmm ()` helps.</rant>

 Indeed, the hash function is simply not fine-grained enough. Including the
 uniques of local registers yields
 {{{
 ghc-stage2 +RTS -t -p -RTS -B/home/jojo/build/haskell/ghc    | ghc-stage2 +RTS -t -p -RTS -B/home/jojo/build/haskell/gh
                                                              |
 total time  =       13.83 secs   (13831 ticks @ 1000 us, 1 p | total time  =       11.79 secs   (11791 ticks @ 1000 us, 1
 total alloc = 14,684,289,032 bytes  (excludes profiling over | total alloc = 11,894,920,976 bytes  (excludes profiling ove
                                                              |
 COST CENTRE      MODULE          %time %alloc                | COST CENTRE      MODULE          %time %alloc
                                                              |
 elimCommonBlocks CmmPipeline      17.4   21.2                | SimplTopBinds    SimplCore        12.7   12.6
 SimplTopBinds    SimplCore        11.1   10.2                | regLiveness      AsmCodeGen        9.4    8.4
 regLiveness      AsmCodeGen        7.7    6.8                | pprNativeCode    AsmCodeGen        8.3   10.5
 pprNativeCode    AsmCodeGen        7.0    8.5                | RegAlloc         AsmCodeGen        7.1    8.8
 RegAlloc         AsmCodeGen        5.9    7.1                | StgCmm           HscMain           6.9    6.5
 StgCmm           HscMain           5.6    5.3                | sink             CmmPipeline       6.3    5.9
 sink             CmmPipeline       5.3    4.7                | genMachCode      AsmCodeGen        3.9    3.7
 genMachCode      AsmCodeGen        3.5    3.0                | layoutStack      CmmPipeline       3.9    4.0
 layoutStack      CmmPipeline       3.5    3.2                | do_block         Hoopl.Dataflow    3.2    1.9
 do_block         Hoopl.Dataflow    2.7    1.5                | postorderDfs     CmmUtils          2.9    2.4
 NativeCodeGen    CodeOutput        2.4    2.2                | NativeCodeGen    CodeOutput        2.7    2.7
 postorderDfs     CmmUtils          2.3    2.0                | elimCommonBlocks CmmPipeline       2.5    2.8
 sequenceBlocks   AsmCodeGen        2.0    1.8                | sequenceBlocks
 }}}
 which now should be finally sufficient. Will create a DR for validation
 and then push this.

 I also replaced my fancy trie by a plain `Data.Map`. It turned out to be
 not performance critical, so let’s remove the custom code.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10397#comment:25
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler
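
Editorial illustration (not part of the original comment): the remark that the hash function is "not fine-grained enough" can be made concrete with a toy model. The sketch below is not GHC's code. `LocalReg`, `Instr`, `Block`, `coarseHash`, `fineHash`, `bucketBy` and `largestBucket` are all hypothetical names invented here, and bucketing by hash in a `Data.Map` is an assumption about the general shape of the pass. It only shows why a hash that ignores register identity can put many distinct blocks into one bucket, where each pair then needs a full equality check, and why mixing in the registers' uniques spreads them out.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical, heavily simplified stand-ins for Cmm blocks.
-- These are NOT GHC's actual types.
newtype LocalReg = LocalReg Int   -- the Int plays the role of a unique
  deriving (Eq, Ord, Show)

data Instr
  = Assign LocalReg Int           -- reg := literal
  | Jump Int                      -- jump to a label
  deriving (Eq, Ord, Show)

type Block = [Instr]

-- Coarse hash: looks at the shape of each instruction only and
-- ignores which register is assigned.
coarseHash :: Block -> Int
coarseHash = foldr (\i h -> 31 * h + tag i) 7
  where
    tag (Assign _ n) = 2 * n
    tag (Jump l)     = 2 * l + 1

-- Finer hash: additionally mixes in the register's unique.
fineHash :: Block -> Int
fineHash = foldr (\i h -> 31 * h + tag i) 7
  where
    tag (Assign (LocalReg u) n) = 2 * (n + 17 * u)
    tag (Jump l)                = 2 * l + 1

-- Group blocks by hash value. Only blocks that share a bucket would
-- need the expensive structural comparison.
bucketBy :: (Block -> Int) -> [Block] -> Map.Map Int [Block]
bucketBy h bs = Map.fromListWith (++) [ (h b, [b]) | b <- bs ]

-- Size of the largest bucket (assumes a non-empty map).
largestBucket :: Map.Map Int [Block] -> Int
largestBucket = maximum . map length . Map.elems

-- 100 blocks that differ only in the register they assign.
blocks :: [Block]
blocks = [ [Assign (LocalReg u) 0, Jump 1] | u <- [1 .. 100] ]

main :: IO ()
main = do
  print (largestBucket (bucketBy coarseHash blocks))  -- prints 100
  print (largestBucket (bucketBy fineHash blocks))    -- prints 1
```

With the coarse hash all 100 blocks collide, so the number of pairwise comparisons grows quadratically with the bucket size. With the finer hash every bucket holds a single block and no comparisons are needed.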