[GHC] #15103: Speed optimizations for elimCommonBlocks

#15103: Speed optimizations for elimCommonBlocks -------------------------------------+------------------------------------- Reporter: AndreasK | Owner: (none) Type: task | Status: new Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: 8.5 Keywords: | Operating System: Unknown/Multiple Architecture: | Type of failure: None/Unknown Unknown/Multiple | Test Case: | Blocked By: Blocking: | Related Tickets: Differential Rev(s): Phab:D4597 | Wiki Page: -------------------------------------+------------------------------------- == Use toBlockList instead of revPostorder. Block elimination works on a given Cmm graph by: * Getting a list of blocks. * Looking for duplicates in these blocks. * Removing all but one instance of duplicates. There are two (reasonable) ways to get the list of blocks. * The fast way: `toBlockList`: This just flattens the underlying map into a list. * The convenient way: `revPostorder`: Start at the entry label, scan for reachable blocks and return only these. This has the advantage of removing all dead code. If there is dead code the later is better. Work done on unreachable blocks is clearly wasted work. However by the point we run the common block elimination pass the input graph already had all dead code removed. This is done during control flow optimization in CmmContFlowOpt which is our first Cmm pass. This means common block elimination is free to use toBlockList because revPostorder would return the same blocks. (Although in a different order). == Change the triemap used for grouping by a label list from `(TM.ListMap UniqDFM)` to `ListMap (GenMap IntMap)`. * Using GenMap offers leaf compression. Which is a trie optimization described by the Note [Compressed TrieMap] in CoreSyn/TrieMap.hs * Using a IntMap removes the overhead associated with UniqDFM. The reasoning why this is deterministic after the change: * IntMap is deterministic given the same keys. * Labels have a Int representation, so for the same Labels we get the same keys, hence the same result for each run. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15103 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15103: Speed optimizations for elimCommonBlocks -------------------------------------+------------------------------------- Reporter: AndreasK | Owner: AndreasK Type: task | Status: new Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: 8.5 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D4597 Wiki Page: | -------------------------------------+------------------------------------- Changes (by AndreasK): * owner: (none) => AndreasK -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15103#comment:1 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15103: Speed optimizations for elimCommonBlocks -------------------------------------+------------------------------------- Reporter: AndreasK | Owner: AndreasK Type: task | Status: patch Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: 8.5 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D4597 Wiki Page: | -------------------------------------+------------------------------------- Changes (by AndreasK): * status: new => patch -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15103#comment:2 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15103: Speed optimizations for elimCommonBlocks -------------------------------------+------------------------------------- Reporter: AndreasK | Owner: AndreasK Type: task | Status: patch Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: 8.5 Resolution: | Keywords: CodeGen Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D4597 Wiki Page: | -------------------------------------+------------------------------------- Changes (by simonpj): * keywords: => CodeGen -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15103#comment:3 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15103: Speed optimizations for elimCommonBlocks -------------------------------------+------------------------------------- Reporter: AndreasK | Owner: AndreasK Type: task | Status: patch Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: 8.5 Resolution: | Keywords: CodeGen Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D4597 Wiki Page: | -------------------------------------+------------------------------------- Comment (by AndreasK): Replying to [comment:3 simonpj]: Is this tag accurate? After all it doesn't change the code being generated. It's strictly a optimization for making GHC faster. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15103#comment:4 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15103: Speed optimizations for elimCommonBlocks -------------------------------------+------------------------------------- Reporter: AndreasK | Owner: AndreasK Type: task | Status: patch Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: 8.5 Resolution: | Keywords: CodeGen Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D4597 Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): Yes, but it makes it appear on [wiki:Commentary/Compiler/CodeGen this page] which collects backend tickets generally. I'd be entirely happy with a finer-grain categorisation, if you would like to propose one, but I'm finding these topic pages quite helpful. Here's [wiki:Status/SLPJ- Tickets my list of topics]. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15103#comment:5 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15103: Speed optimizations for elimCommonBlocks
-------------------------------------+-------------------------------------
Reporter: AndreasK | Owner: AndreasK
Type: task | Status: patch
Priority: normal | Milestone: 8.6.1
Component: Compiler | Version: 8.5
Resolution: | Keywords: CodeGen
Operating System: Unknown/Multiple | Architecture:
| Unknown/Multiple
Type of failure: None/Unknown | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s): Phab:D4597
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by Ben Gamari

#15103: Speed optimizations for elimCommonBlocks -------------------------------------+------------------------------------- Reporter: AndreasK | Owner: AndreasK Type: task | Status: patch Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: 8.5 Resolution: | Keywords: CodeGen Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D4597 Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): What is the perf impact of this optimisation, if any? (I think the goal is to improve compile times, right? Not run-time.) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15103#comment:7 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15103: Speed optimizations for elimCommonBlocks -------------------------------------+------------------------------------- Reporter: AndreasK | Owner: AndreasK Type: task | Status: patch Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: 8.5 Resolution: | Keywords: CodeGen Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D4597 Wiki Page: | -------------------------------------+------------------------------------- Comment (by AndreasK): Replying to [comment:7 simonpj]:
What is the perf impact of this optimisation, if any?
Generated code is the same. Compiler performance changes for the better. I think the file I looked at was nofib/spectral/simple `ghc-stage2 -fforce-recomp -c -O1 Main.hs +RTS -s -RTS` || build || Allocated (Byte) || measured time || || master || 4,475,812,248 || 3.220 s (3.204 s .. 3.251 s) || || toBlockList || 4,474,452,216 || 3.196 s (3.192 s .. 3.204 s) || || IntMap + blockList || 4,461,727,496 || 3.189 s (3.174 s .. 3.203 s) ||
I think the goal is to improve compile times, right? Not run-time.
Yes it's purely an "make GHC faster" patch. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15103#comment:8 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15103: Speed optimizations for elimCommonBlocks -------------------------------------+------------------------------------- Reporter: AndreasK | Owner: AndreasK Type: task | Status: patch Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: 8.5 Resolution: | Keywords: CodeGen Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D4597 Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): Terrific. Would you like to do a nofib run before-and-after, and dump the `nofib-analyse` summary in here? It's very helpful just to record the benefits of particular changes, for posterity. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15103#comment:9 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

Terrific. Would you like to do a nofib run before-and-after, and dump
#15103: Speed optimizations for elimCommonBlocks -------------------------------------+------------------------------------- Reporter: AndreasK | Owner: AndreasK Type: task | Status: patch Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: 8.5 Resolution: | Keywords: CodeGen Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D4597 Wiki Page: | -------------------------------------+------------------------------------- Comment (by AndreasK): Replying to [comment:9 simonpj]: the `nofib-analyse` summary in here? It's very helpful just to record the benefits of particular changes, for posterity. {{{ -------------------------------------------------------------------------------- Program Size Allocs Runtime Elapsed TotalMem -------------------------------------------------------------------------------- maillist 0.0% 0.0% 0.024 0.025 +6.5% mate -0.0% 0.0% +4.6% +4.6% 0.0% cryptarithm1 0.0% 0.0% -1.7% -1.8% 0.0% -------------------------------------------------------------------------------- Min -0.0% -0.0% -1.7% -1.8% 0.0% Max +0.0% +0.1% +4.6% +4.6% +6.5% Geometric Mean -0.0% +0.0% +0.3% +0.3% +0.1% Compile Times ------------------------------------------------------------------------------- Program logNoOpt logWithOpt ------------------------------------------------------------------------------- -1 s.d. ----- -1.9% +1 s.d. ----- +1.5% Average ----- -0.2% Compile Allocations ------------------------------------------------------------------------------- Program logNoOpt logWithOpt ------------------------------------------------------------------------------- -1 s.d. ----- -0.3% +1 s.d. ----- -0.1% Average ----- -0.2% }}} These runtime results took me by surprise! It turns out code layout depends on uniques. And the final uniques change with this patch. So we get slightly different codelayout. I verified that the code for mate is otherwise the same. Using different starting or increment values for uniques also lead to different results with up to the same performance spread. So I don't think the runtime change is representative. Compiler performance impact is as expected. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15103#comment:10 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15103: Speed optimizations for elimCommonBlocks -------------------------------------+------------------------------------- Reporter: AndreasK | Owner: AndreasK Type: task | Status: closed Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: 8.5 Resolution: fixed | Keywords: CodeGen Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D4597 Wiki Page: | -------------------------------------+------------------------------------- Changes (by AndreasK): * status: patch => closed * resolution: => fixed Comment: This has been merged -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15103#comment:11 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler
participants (1)
-
GHC