[GHC] #9400: poor performance when compiling modules with many Text literals at -O1

#9400: poor performance when compiling modules with many Text literals at -O1 -------------------------------------+------------------------------------- Reporter: rwbarton | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.3 Keywords: | Operating System: Architecture: Unknown/Multiple | Unknown/Multiple Difficulty: Unknown | Type of failure: Compile- Blocked By: | time performance bug Related Tickets: #9370 | Test Case: | Blocking: | Differential Revisions: -------------------------------------+------------------------------------- (Spawned from #9370; see there for original discussion) Unpack the xmlhtml package and edit `src/Text/XmlHtml/HTML/Meta.hs` and remove the `OPTIONS_GHC` line at the top of the file that disables optimizations and build with `cabal buidl`. Then GHC takes ~1.5GB and over a minute to build this single module. Preliminary investigation indicates that Text's fromString and its constituent parts is being inlined repeatedly: {{{ Inlining done: Data.String.fromString Inlining done: Data.Text.$fIsStringText Inlining done: Data.Text.pack Inlining done: Data.Text.Internal.Fusion.unstream Inlining done: Data.Text.Internal.Fusion.Common.map Inlining done: Data.Text.Internal.Fusion.Common.streamList Inlining done: Data.Text.Internal.safe Inlining done: Data.Bits.$fBitsInt_$c.&. Inlining done: Data.Text.Internal.Fusion.Types.$WYield [ repeats ~4000 times ] }}} resulting in a very large intermediate program: {{{ *** Checking old interface for xmlhtml-0.2.3.2:Text.XmlHtml.HTML.Meta: [ 1 of 10] Compiling Text.XmlHtml.HTML.Meta ( src/Text/XmlHtml/HTML/Meta.hs, dist/build/Text/XmlHtml/HTML/Meta.o ) *** Parser: *** Renamer/typechecker: *** Desugar: Result size of Desugar (after optimization) = {terms: 26,260, types: 20,021, coercions: 0} *** Simplifier: Result size of Simplifier iteration=1 = {terms: 1,446,658, types: 953,432, coercions: 314,352} ... }}} -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9400 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9400: poor performance when compiling modules with many Text literals at -O1 -------------------------------------+------------------------------------- Reporter: rwbarton | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.3 Resolution: | Keywords: Operating System: | Architecture: Unknown/Multiple Unknown/Multiple | Difficulty: Unknown Type of failure: Compile- | Blocked By: time performance bug | Related Tickets: #9370 Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Description changed by rwbarton: Old description:
(Spawned from #9370; see there for original discussion)
Unpack the xmlhtml package and edit `src/Text/XmlHtml/HTML/Meta.hs` and remove the `OPTIONS_GHC` line at the top of the file that disables optimizations and build with `cabal buidl`. Then GHC takes ~1.5GB and over a minute to build this single module.
Preliminary investigation indicates that Text's fromString and its constituent parts is being inlined repeatedly: {{{ Inlining done: Data.String.fromString Inlining done: Data.Text.$fIsStringText Inlining done: Data.Text.pack Inlining done: Data.Text.Internal.Fusion.unstream Inlining done: Data.Text.Internal.Fusion.Common.map Inlining done: Data.Text.Internal.Fusion.Common.streamList Inlining done: Data.Text.Internal.safe Inlining done: Data.Bits.$fBitsInt_$c.&. Inlining done: Data.Text.Internal.Fusion.Types.$WYield [ repeats ~4000 times ] }}} resulting in a very large intermediate program: {{{ *** Checking old interface for xmlhtml-0.2.3.2:Text.XmlHtml.HTML.Meta: [ 1 of 10] Compiling Text.XmlHtml.HTML.Meta ( src/Text/XmlHtml/HTML/Meta.hs, dist/build/Text/XmlHtml/HTML/Meta.o ) *** Parser: *** Renamer/typechecker: *** Desugar: Result size of Desugar (after optimization) = {terms: 26,260, types: 20,021, coercions: 0} *** Simplifier: Result size of Simplifier iteration=1 = {terms: 1,446,658, types: 953,432, coercions: 314,352} ... }}}
New description: (Spawned from #9370; see there for original discussion) Unpack the xmlhtml package and edit `src/Text/XmlHtml/HTML/Meta.hs` and remove the `OPTIONS_GHC` line at the top of the file that disables optimizations and build with `cabal buidl`. Then GHC takes ~1.5GB and over a minute to build this single module. Preliminary investigation indicates that Text's fromString and its constituent parts is being inlined repeatedly: {{{ Inlining done: Data.String.fromString Inlining done: Data.Text.$fIsStringText Inlining done: Data.Text.pack Inlining done: Data.Text.Internal.Fusion.unstream Inlining done: Data.Text.Internal.Fusion.Common.map Inlining done: Data.Text.Internal.Fusion.Common.streamList Inlining done: Data.Text.Internal.safe Inlining done: Data.Bits.$fBitsInt_$c.&. Inlining done: Data.Text.Internal.Fusion.Types.$WYield [ repeats ~4000 times ] }}} resulting in a very large intermediate program: {{{ [ 1 of 10] Compiling Text.XmlHtml.HTML.Meta ( src/Text/XmlHtml/HTML/Meta.hs, dist/build/Text/XmlHtml/HTML/Meta.o ) *** Parser: *** Renamer/typechecker: *** Desugar: Result size of Desugar (after optimization) = {terms: 26,806, types: 14,467, coercions: 0} *** Simplifier: Result size of Simplifier iteration=1 = {terms: 31,052, types: 20,719, coercions: 0} Result size of Simplifier = {terms: 31,052, types: 20,719, coercions: 0} *** Specialise: Result size of Specialise = {terms: 32,254, types: 22,696, coercions: 448} *** Float out(FOS {Lam = Just 0, Consts = True, PAPs = False}): Result size of Float out(FOS {Lam = Just 0, Consts = True, PAPs = False}) = {terms: 63,022, types: 59,045, coercions: 448} *** Float inwards: Result size of Float inwards = {terms: 63,022, types: 59,045, coercions: 448} *** Simplifier: Result size of Simplifier iteration=1 = {terms: 28,537, types: 18,902, coercions: 654} Result size of Simplifier iteration=2 = {terms: 28,157, types: 18,257, coercions: 152} Result size of Simplifier iteration=3 = {terms: 28,074, types: 18,128, coercions: 140} Result size of Simplifier iteration=4 = {terms: 28,068, types: 18,096, coercions: 61} Result size of Simplifier = {terms: 28,068, types: 18,096, coercions: 61} *** Simplifier: Result size of Simplifier iteration=1 = {terms: 43,941, types: 38,325, coercions: 61} Result size of Simplifier = {terms: 43,941, types: 38,325, coercions: 61} *** Simplifier: Result size of Simplifier iteration=1 = {terms: 689,670, types: 461,960, coercions: 146,725} ... }}} [Edited: old output was not actually for `-O1` as described, but rather for the sequence described in #9370 of building with `-O0` after building another module with `-O1`.] -- -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9400#comment:2 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9400: poor performance when compiling modules with many Text literals at -O1 -------------------------------------+------------------------------------- Reporter: rwbarton | Owner: Type: bug | Status: closed Priority: normal | Milestone: Component: Compiler | Version: 7.8.3 Resolution: invalid | Keywords: Operating System: | Architecture: Unknown/Multiple Unknown/Multiple | Difficulty: Unknown Type of failure: Compile- | Blocked By: time performance bug | Related Tickets: #9370 Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Changes (by rwbarton): * status: new => closed * resolution: => invalid Comment: OK this is a bit funny. Normally a Text literal `"abc"` gets desugared as {{{ fromString $fIsStringText (unpackCString# "abc"#) }}} Now `fromString $fIsStringText = pack`, and `pack = unstream . S.map safe . S.streamList`, and there is a rule in `Data.Text` {{{ {-# RULES "TEXT literal" forall a. unstream (S.map safe (S.streamList (GHC.unpackCString# a))) = unpackCString# a #-} }}} and `Data.Text.unpackCString#` has a NOINLINE pragma so we end up with the nice small code: `Data.Text.unpackCString# "abc"`. ''But'', a ''single-character'' literal `"a"` is instead desugared as {{{ fromString $fIsStringText (: (C# 'a') ([])) }}} and now there is no rule which matches this pattern. And `unstream` is marked `INLINE [0]`, as Simon predicted; and it's rather large. And most XML entities represent single Unicode characters, so GHC generated around 2000 copies of `unstream`. I don't know why there is an `INLINE` pragma on `unstream`. Perhaps no good reason. But anyways, there is a simple fix to the text package: add another rule to match the pattern `unstream (S.map safe (S.streamList [c]))`. (And similarly for empty string literals.) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9400#comment:3 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9400: poor performance when compiling modules with many Text literals at -O1 -------------------------------------+------------------------------------- Reporter: rwbarton | Owner: Type: bug | Status: closed Priority: normal | Milestone: Component: Compiler | Version: 7.8.3 Resolution: invalid | Keywords: Operating System: | Architecture: Unknown/Multiple Unknown/Multiple | Difficulty: Unknown Type of failure: Compile- | Blocked By: time performance bug | Related Tickets: #9370 Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by simonpj): Another possibility might be to remove the special behaviour of desugaring strings from GHC. Why is it there? I think it's in case you do {{{ case "x" of 'x' : ys -> blah }}} or something like that. But perhaps a better plan would be to make `unpackCString# "foo"` respond to `exprIsConApp_maybe` with "yes, I'm a cons; my head is `'f'` and my tail is `unpackCString# "oo"`. So before making `Text` more complicated, maybe we should explore making GHC less complicated! Simon -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9400#comment:4 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9400: poor performance when compiling modules with many Text literals at -O1 -------------------------------------+------------------------------------- Reporter: rwbarton | Owner: Type: bug | Status: closed Priority: normal | Milestone: Component: Compiler | Version: 7.8.3 Resolution: invalid | Keywords: Operating System: | Architecture: Unknown/Multiple Unknown/Multiple | Difficulty: Unknown Type of failure: Compile- | Blocked By: time performance bug | Related Tickets: #9370 Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by simonpj): Just to add a bit more detail: * The special handling for singleton string literals is in `MkCore.mkStringExprFS`. Omittign the `lengthFS str == 1` case looks very plausible to me. * Notice that this same function sometimes generates `unpackCStringUtf8` rather than `unpackCString`. Rules that recognise only the latter will stumble if the former is produced. I have no clue what to do about this -- Unicode experts may. * The function `CoreSubst.exprIsConApp_maybe` is the place where we say "does this expression look like a constructor application?". Adding an extra case to `go`, below the ones for `dataConWorkId` and `DFunFunfolding` should let us dynamically expand a call to `unpackCString` into a cons cell. And that will actually be ''better'' than now, because it should eliminate {{{ case "foo" of 'f' : 'o' : xs -> .... }}} Would someone care to try? Simon -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9400#comment:5 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9400: poor performance when compiling modules with many Text literals at -O1 -------------------------------------+------------------------------------- Reporter: rwbarton | Owner: Type: bug | Status: closed Priority: normal | Milestone: Component: Compiler | Version: 7.8.3 Resolution: invalid | Keywords: Operating System: | Architecture: Unknown/Multiple Unknown/Multiple | Difficulty: Unknown Type of failure: Compile- | Blocked By: time performance bug | Related Tickets: #9370 Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by rwbarton): Replying to [comment:4 simonpj]:
Another possibility might be to remove the special behaviour of desugaring strings from GHC. Why is it there?
I was guessing that perhaps the generated code is smaller for a one- character list than for a call to `unpackCString#`. But if that's true, it's not a very good reason to put that logic in the desugarer; we could rewrite `unpackCString# "a"` to `'a' : []` in a later optimizer pass, after RULES have fired. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9400#comment:6 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9400: poor performance when compiling modules with many Text literals at -O1 -------------------------------------+------------------------------------- Reporter: rwbarton | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.3 Resolution: | Keywords: Operating System: | Architecture: Unknown/Multiple Unknown/Multiple | Difficulty: Unknown Type of failure: Compile- | Blocked By: time performance bug | Related Tickets: #9370 Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Changes (by simonpj): * status: closed => new * resolution: invalid => Comment: Re-opening because I think we should fix GHC as comment:4 etc. Simon -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9400#comment:7 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9400: poor performance when compiling modules with many Text literals at -O1 -------------------------------------+------------------------------------- Reporter: rwbarton | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.3 Resolution: | Keywords: Operating System: | Architecture: Unknown/Multiple Unknown/Multiple | Difficulty: Unknown Type of failure: Compile- | Blocked By: time performance bug | Related Tickets: #9370 Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by simonpj): Would anyone be willing to try the fix I describe above? I can advise. Simon -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9400#comment:8 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9400: poor performance when compiling modules with many Text literals at -O1 -------------------------------------+------------------------------------- Reporter: rwbarton | Owner: xnyhps Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.3 Resolution: | Keywords: Operating System: | Architecture: Unknown/Multiple Unknown/Multiple | Difficulty: Unknown Type of failure: Compile- | Blocked By: time performance bug | Related Tickets: #9370 Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Changes (by xnyhps): * owner: => xnyhps Comment: This looked like quite a simple bug I could work on, so I decided to have a look. * Removing the `lengthFS str == 1` case in `mkStringExprFS` does not appear to break anything. * I've managed to modify `CoreSubst.exprIsConApp_maybe` to split calls to `unpackCString#` or `unpackCStringUtf8#`. It'll also recognize when it's a single-character string and split it into `(':', [c, nil])`, to avoid ever creating calls to `unpackCString*# ""#`. Looking at the generated Core, I've verified that it now pushes `unpackCString*#` calls through `case`-statements. This means that, even at -O0: {{{ main = case "abc" of (x:xs) -> putStrLn xs }}} compiles to `main = System.IO.putStrLn GHC.CString.unpackCString# "bc"#` * I haven't timed it, but `src/Text/XmlHtml/HTML/Meta.hs` now compiles quickly and with only ~180MB RAM. * `unpackCString#` and `unpackCStringUtf8#` seem very similar, except that `unpackCStringUtf8#` parses UTF8. At least for `mkStringExprFS` it could work to only use `unpackCStringUtf8#`, with a slight run-time penalty (calling `leChar#` for each character in the string), but the only benefit would be making rewrite rules easier. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9400#comment:9 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9400: poor performance when compiling modules with many Text literals at -O1 -------------------------------------+------------------------------------- Reporter: rwbarton | Owner: xnyhps Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.3 Resolution: | Keywords: Operating System: | Architecture: Unknown/Multiple Unknown/Multiple | Difficulty: Unknown Type of failure: Compile- | Blocked By: time performance bug | Related Tickets: #9370 Test Case: | Blocking: | Differential Revisions: Phab:D199 | -------------------------------------+------------------------------------- Changes (by xnyhps): * differential: => Phab:D199 -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9400#comment:10 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9400: poor performance when compiling modules with many Text literals at -O1
-------------------------------------+-------------------------------------
Reporter: rwbarton | Owner: xnyhps
Type: bug | Status: new
Priority: normal | Milestone:
Component: Compiler | Version: 7.8.3
Resolution: | Keywords:
Operating System: | Architecture: Unknown/Multiple
Unknown/Multiple | Difficulty: Unknown
Type of failure: Compile- | Blocked By:
time performance bug | Related Tickets: #9370
Test Case: |
Blocking: |
Differential Revisions: Phab:D199 |
-------------------------------------+-------------------------------------
Comment (by Austin Seipp

#9400: poor performance when compiling modules with many Text literals at -O1 -------------------------------------+------------------------------------- Reporter: rwbarton | Owner: xnyhps Type: bug | Status: closed Priority: normal | Milestone: 7.10.1 Component: Compiler | Version: 7.8.3 Resolution: fixed | Keywords: Operating System: | Architecture: Unknown/Multiple Unknown/Multiple | Difficulty: Unknown Type of failure: Compile- | Blocked By: time performance bug | Related Tickets: #9370 Test Case: | Blocking: | Differential Revisions: Phab:D199 | -------------------------------------+------------------------------------- Changes (by thoughtpolice): * status: new => closed * resolution: => fixed * milestone: => 7.10.1 Comment: Awesome! Merged, thank you! -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9400#comment:12 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9400: poor performance when compiling modules with many Text literals at -O1 -------------------------------------+------------------------------------- Reporter: rwbarton | Owner: xnyhps Type: bug | Status: closed Priority: normal | Milestone: 7.10.1 Component: Compiler | Version: 7.8.3 Resolution: fixed | Keywords: Operating System: | Architecture: Unknown/Multiple Unknown/Multiple | Difficulty: Unknown Type of failure: Compile- | Blocked By: time performance bug | Related Tickets: #9370 Test Case: | Blocking: | Differential Revisions: Phab:D199 | -------------------------------------+------------------------------------- Comment (by rwbarton): By the way, bos has added the rules for Data.Text.fromString on 0- and 1-character literals in text-1.2.0.0, and I confirmed that the original issue of this ticket does not arise with that version. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9400#comment:13 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler
participants (1)
-
GHC