[GHC] #13344: Core string literal patch regresses compiler performance considerably

#13344: Core string literal patch regresses compiler performance considerably -------------------------------------+------------------------------------- Reporter: bgamari | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: 8.2.1 Component: Compiler | Version: 8.0.1 Keywords: | Operating System: Unknown/Multiple Architecture: | Type of failure: None/Unknown Unknown/Multiple | Test Case: | Blocked By: Blocking: | Related Tickets: Differential Rev(s): | Wiki Page: -------------------------------------+------------------------------------- The Core string literals patch regresses compiler performance by nearly 5%. https://perf.haskell.org/ghc/#revision/d49b2bb21691892ca6ac8f2403e31f2a5e53f... We should really understand why this is; this is a huge regression for something that really shouldn't be costly. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13344 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13344: Core string literal patch regresses compiler performance considerably -------------------------------------+------------------------------------- Reporter: bgamari | Owner: bgamari Type: bug | Status: new Priority: high | Milestone: 8.2.1 Component: Compiler | Version: 8.0.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by bgamari): * owner: (none) => bgamari * failure: None/Unknown => Compile-time performance bug * priority: normal => high Old description:
The Core string literals patch regresses compiler performance by nearly 5%. https://perf.haskell.org/ghc/#revision/d49b2bb21691892ca6ac8f2403e31f2a5e53f...
We should really understand why this is; this is a huge regression for something that really shouldn't be costly.
New description: The Core string literals patch [[https://perf.haskell.org/ghc/#revision/d49b2bb21691892ca6ac8f2403e31f2a5e53feb3|regresses]] compiler performance by nearly 5%. We should really understand why this is; this is a huge regression for something that really shouldn't be costly. -- -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13344#comment:1 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13344: Core string literal patch regresses compiler performance considerably -------------------------------------+------------------------------------- Reporter: bgamari | Owner: bgamari Type: bug | Status: new Priority: high | Milestone: 8.2.1 Component: Compiler | Version: 8.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by bgamari): * version: 8.0.1 => 8.1 -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13344#comment:2 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13344: Core string literal patch regresses compiler performance considerably -------------------------------------+------------------------------------- Reporter: bgamari | Owner: bgamari Type: bug | Status: new Priority: high | Milestone: 8.2.1 Component: Compiler | Version: 8.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by bgamari): I asked David look into this a few weeks back; he says that he may have seen duplicated strings in Core. `T5837` looks like it would be a good testcase to evaluate this issue with. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13344#comment:3 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13344: Core string literal patch regresses compiler performance considerably -------------------------------------+------------------------------------- Reporter: bgamari | Owner: bgamari Type: bug | Status: new Priority: high | Milestone: 8.2.1 Component: Compiler | Version: 8.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by akio): * cc: akio (added) Comment: I'm sorry that my patch slowed down GHC by so much. I'll try to investigate. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13344#comment:4 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13344: Core string literal patch regresses compiler performance considerably -------------------------------------+------------------------------------- Reporter: bgamari | Owner: bgamari Type: bug | Status: new Priority: high | Milestone: 8.2.1 Component: Compiler | Version: 8.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by bgamari): I had a quick look. I am seeing significant increases in compiler allocations as early as typecheck/rename. For instance, in the case of `T5837` I see, ||= Phase =||= Before patch (megabytes) =||= After patch (megabytes) =|| || Parser || 1.069 || 1.069 || || Renamer/typechecker || 117.168 || 125.031 || || Desugar || 4.127 || 4.155 || || Simplifier || 0.746 || 0.811 || || CoreTidy || 2.277 || 2.516 || || CorePrep || 0.287 || 0.299 || || CodeGen || 6.804 || 6.845 || The only difference in the simplified core is due to the strings associated with the Typeable bindings. Before the patch we had, {{{#!hs $trModule4 :: GHC.Types.TrName $trModule4 = GHC.Types.TrNameS "T5837"# }}} Now we have, {{{#!hs $trModule3 :: GHC.Prim.Addr# $trModule3 = "T5837"# $trModule4 :: GHC.Types.TrName $trModule4 = GHC.Types.TrNameS $trModule3 }}} While some small fraction of the allocations change after CodePrep is attributable to the fact that StgSyn got slightly larger (since we now have a separate type for top-level bindings), this doesn't explain the difference during renaming/typechecking. I wonder if some of these newly floating string literals have snuck into interface files? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13344#comment:5 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13344: Core string literal patch regresses compiler performance considerably -------------------------------------+------------------------------------- Reporter: bgamari | Owner: bgamari Type: bug | Status: new Priority: high | Milestone: 8.2.1 Component: Compiler | Version: 8.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by bgamari): Hmm, it seems that even `tc-trace` and `rn-trace` output is unchanged by the patch. It looks like I'll need to build some ticky'd compilers. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13344#comment:6 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13344: Core string literal patch regresses compiler performance considerably -------------------------------------+------------------------------------- Reporter: bgamari | Owner: bgamari Type: bug | Status: new Priority: high | Milestone: 8.2.1 Component: Compiler | Version: 8.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj):
It looks like I'll need to build some ticky'd compilers.
Indeed; thank you. This is deeply weird. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13344#comment:7 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13344: Core string literal patch regresses compiler performance considerably -------------------------------------+------------------------------------- Reporter: bgamari | Owner: bgamari Type: bug | Status: new Priority: high | Milestone: 8.2.1 Component: Compiler | Version: 8.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by bgamari): Hmm, indeed ticky suggests that there is some leakage into interface files, {{{ | Change | alloc A | alloc B | name | |-----------|---------|---------|-------------------------------------------------------------------------------------------------| | +417944.0 | 3978840 | 4396784 | containers-0.5.7.1:Data.IntMap.Base.$winsert (:<no module>) | | +249216.0 | 1225560 | 1474776 | $wgo (ghc:BinIface) | | +166144.0 | 821968 | 988112 | go7 (ghc:LoadIface) | | +131256.0 | 1767520 | 1898776 | ghc:Module.$sinsertWith_$s$sgo10 (:<no module>) | | +75680.0 | 566080 | 641760 | mk_supply (ghc:UniqSupply) | | +68208.0 | 454152 | 522360 | unsafeDupableInterleaveIO1 (base:GHC.IO.Unsafe) | | +62016.0 | 1209552 | 1271568 | containers-0.5.7.1:Data.Map.Base.balanceL (:<no module>) | | +52864.0 | 1592272 | 1645136 | ghc:IfaceType.$w$cget6 (:<no module>) | | +45312.0 | 744224 | 789536 | ghc:Binary.$w$cget3 (:<no module>) | | +45312.0 | 226728 | 272040 | ghc:IfaceSyn.$w$cget1 (:<no module>) | | +45312.0 | 238080 | 283392 | ghc:TcRnMonad.$wforkM_maybe (:<no module>) | | +44776.0 | 934504 | 979280 | ghc:FastString.$wmkFastStringWith (:<no module>) | | +31360.0 | 340240 | 371600 | go (base:Data.OldList) | | +30208.0 | 156096 | 186304 | ghc:Module.$WDefiniteUnitId (:<no module>) | | +30208.0 | 156096 | 186304 | ghc:Module.$w$cget2 (:<no module>) | | +29008.0 | 1155896 | 1184904 | base:GHC.Base.++ (:<no module>) | | +27976.0 | 39000 | 66976 | ghc:TysWiredIn.$wisBuiltInOcc_maybe (:<no module>) | | +26432.0 | 130984 | 157416 | $wlvl12 (ghc:HscTypes) | | +26432.0 | 202496 | 228928 | bytestring-0.10.8.1:Data.ByteString.$wcopy (:<no module>) | | +26432.0 | 355440 | 381872 | ghc:NameCache.extendNameCache1 (:<no module>) | | +22768.0 | 71896 | 94664 | lexToken (ghc:Lexer) | | +22656.0 | 137472 | 160128 | ghc:Binary.$w$cget1 (:<no module>) | | +22656.0 | 107040 | 129696 | ghc:Binary.$wlazyGet (:<no module>) | | +22656.0 | 333696 | 356352 | ghc:FastString.$wmkFastStringByteString (:<no module>) | | +20544.0 | 0 | 20544 | ghc:TcRnMonad.emitInsoluble3 (:<no module>) | }}} -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13344#comment:8 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13344: Core string literal patch regresses compiler performance considerably -------------------------------------+------------------------------------- Reporter: bgamari | Owner: bgamari Type: bug | Status: new Priority: high | Milestone: 8.2.1 Component: Compiler | Version: 8.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by bgamari): Ahh, indeed I should have looked at the interface files sooner. There are indeed more bindings here. e.g., previously in `Prelude` we had (in `--show-iface` output), {{{#!hs $trModule :: Module {- HasNoCafRefs, Strictness: m, Unfolding: (Module $trModule2 $trModule1) -} }}} Now we have, {{{#!hs $trModule1 :: TrName {- HasNoCafRefs, Strictness: m1, Unfolding: (TrNameS $trModule2) -} $trModule2 :: Addr# {- HasNoCafRefs, Unfolding: ("Prelude"#) -} }}} -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13344#comment:9 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13344: Core string literal patch regresses compiler performance considerably -------------------------------------+------------------------------------- Reporter: bgamari | Owner: bgamari Type: bug | Status: new Priority: high | Milestone: 8.2.1 Component: Compiler | Version: 8.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by bgamari): Just looking at the interface file sizes between the two trees, it appears that many files increased by a few hundred bytes with a few (e.g. `GHC.Show`, `GHC.Generics`) increasing by some kilobytes. For future reference I performed the comparison with, {{{#!sh echo "delta after before file" paste \ <(ls -l $(find ghc-compare-1/libraries/base/dist-install -iname *.hi) | awk '{print $5,$9}') \ <(ls -l $(find ghc-compare-2/libraries/base/dist-install -iname *.hi) | awk '{print $5,$9}') \ | awk '{ print (($1 - $3)), $1, $3, $2}' | sort -rn | less }}} Which results in {{{ 16461 800034 783573 libraries/base/dist-install/build/GHC/Generics.hi 7673 130677 123004 libraries/base/dist-install/build/GHC/RTS/Flags.hi 7555 969695 962140 libraries/base/dist-install/build/Data/Data.hi 7449 103629 96180 libraries/base/dist-install/build/GHC/IO/Exception.hi 7068 268944 261876 libraries/base/dist-install/build/Data/Foldable.hi 6904 136200 129296 libraries/base/dist-install/build/GHC/Show.hi 4830 211720 206890 libraries/base/dist-install/build/Data/Monoid.hi 4767 458028 453261 libraries/base/dist-install/build/Data/Semigroup.hi 4735 498885 494150 libraries/base/dist-install/build/Foreign/C/Types.hi 4651 59316 54665 libraries/base/dist-install/build/GHC/IO/Handle/Types.hi ... }}} In the case of `GHC.Generics` we now have things like, {{{#!hs $fEqV14 :: [Char] {- Unfolding: (unpackCString# $fEqV15) -} $fEqV15 :: Addr# {- HasNoCafRefs, Unfolding: ("error"#) -} }}} Whereas we previously had, {{{#!hs $fEqV11 :: [Char] {- Unfolding: (unpackCString# "error"#) -} }}} One potential improvement we can make here (at the expense of some constant folding opportunities) is to drop the unfoldings from `Addr#` bindings; afterall, in most cases there is little benefit to inlining the string literal. In the case of the `Typeable` bindings, I'm very tempted to just eliminate unfoldings from all generated `TrName`s. This would likely cut down on interface file sizes dramatically. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13344#comment:10 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13344: Core string literal patch regresses compiler performance considerably -------------------------------------+------------------------------------- Reporter: bgamari | Owner: bgamari Type: bug | Status: new Priority: high | Milestone: 8.2.1 Component: Compiler | Version: 8.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by gridaphobe): * cc: gridaphobe (added) Comment: Replying to [comment:3 bgamari]:
I asked David look into this a few weeks back; he says that he may have seen duplicated strings in Core.
`T5837` looks like it would be a good testcase to evaluate this issue with.
FWIW, I noticed this weekend that Phab:D2605 did not enable CSE on the floated literals. I think the latest 1% decrease in binary sizes in my Phab:D1259 is largely due to enabling CSE on the floated strings. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13344#comment:11 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13344: Core string literal patch regresses compiler performance considerably -------------------------------------+------------------------------------- Reporter: bgamari | Owner: bgamari Type: bug | Status: new Priority: high | Milestone: 8.2.1 Component: Compiler | Version: 8.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by gridaphobe): Actually, I retract my comment, I think I grossly misunderstood what the `exprIsLiteralString` check in `tryForCSE` was doing.. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13344#comment:12 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13344: Core string literal patch regresses compiler performance considerably -------------------------------------+------------------------------------- Reporter: bgamari | Owner: bgamari Type: bug | Status: new Priority: high | Milestone: 8.2.1 Component: Compiler | Version: 8.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by rwbarton): The patch in #8472 allows floating out unboxed string literals to top level in (at least) three places: 1. In the "Floats" section in SimplEnv we float a let out of a let. If the body of the let is a lambda then we can increase the arity of the original binding, which is a good thing. This is the optimization that was allowed in the example of #8472 by letting the unboxed string literal float to the top level. 2. In `prepareRhs` in Simplify we can float, for example, an argument out of a constructor. In general this brings us closer to ANF, but when the argument is a literal it is trivial already; and the data representation we generate for `Ptr "blob"#` is just fine. 3. In the float out pass we can float out an expression to avoid repeated allocation or evaluation. But there is no allocation or evaluation associated to an unboxed literal. In https://github.com/ghc/ghc/commit/8f99cfa49be0fc0767f0c73967ff2472de5cfcd6 I've tried disabling the above items 2 and 3 for unboxed string literals. It still produces the desired code in the example of #8472, but it no longer floats the string literals out of, for example, `$trModule` bindings. I'm waiting to see what perf.haskell.org thinks about it. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13344#comment:13 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13344: Core string literal patch regresses compiler performance considerably -------------------------------------+------------------------------------- Reporter: bgamari | Owner: bgamari Type: bug | Status: new Priority: high | Milestone: 8.2.1 Component: Compiler | Version: 8.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): There is a good reason for (2). Consider {{{ let x = C "foo"# in .x..x..x.....case x of { C s -> ...s...s... }.... }}} We'll do case-of-known-constructor on the case expression, binding `s` to the component. Ah! The component is `"foo"#` so we'll duplicate `"foo"#` at every occurrence of `s`. That is why we do ANF for let-bound constructors. I think we should continue to do so. (Unless, just possibly, the let-binding has a NOINLINE pragma...) For (3) I agree. The only reason to aggressively float literals would be for CSE. ------------- More generally, if it is ''really'' the case that these top level bindings {{{ x = "foo#" y = C x }}} take a lot longer to compile (code generate?) than {{{ y = C "foo"# }}} then I think we should seek insight into ''why''. There is no good reason for it to take longer. Let's not bend ourselves out of shape to accommodate what may possibly be an easily-fixed perf bug in the back end. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13344#comment:14 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13344: Core string literal patch regresses compiler performance considerably -------------------------------------+------------------------------------- Reporter: bgamari | Owner: bgamari Type: bug | Status: new Priority: high | Milestone: 8.2.1 Component: Compiler | Version: 8.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by bgamari): Reid and I had a rather lengthy discussion about this problem. The take- away message is that while there are a variety of things that we could do to avoid moving string literals up to the top-level (and consequently avoid the costs associated with interface file names, as seen in comment:10), it is unfortunately rather tricky to do in a way that doesn't either come at significant implementation cost or risk regressing #12585. Consequently, we came up with a few ideas for instead reducing the cost of the now-larger interface files. Happily, this effort may also have helpful effects beyond string literals. The gist is that we want make interface file deserialisation more lazy. Reid is working on a patch. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13344#comment:15 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13344: Core string literal patch regresses compiler performance considerably -------------------------------------+------------------------------------- Reporter: bgamari | Owner: bgamari Type: bug | Status: new Priority: high | Milestone: 8.2.1 Component: Compiler | Version: 8.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): I'm all for making interface-file deserialisation cheaper. But I don't feel that we have a well-characterised problem. ''What exactly is the problem we are trying to solve here?'' Is it ONLY that {{{ x :: Addr# x = "foo"# y :: T y = K x }}} is more expensive than {{{ y :: T y = K "foo"# }}} And if it is more expensive, how much more expensive? And does that cost come from simplifying, spitting out an interface file, code generation, reading in an interface file? I don't think there is much to do with strings here. The same will happen for {{{ x = [True, False, True, ..., True] }}} Perhaps we just don't want to expose the unfolding for a top level constructor. That would be a bit less good on optimisation, but perhaps not a lot. We could measure it. I feel as if we are at risk of taking actions without data to guide them. I'd love to have data. Simon -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13344#comment:16 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13344: Core string literal patch regresses compiler performance considerably -------------------------------------+------------------------------------- Reporter: bgamari | Owner: bgamari Type: bug | Status: new Priority: high | Milestone: 8.2.1 Component: Compiler | Version: 8.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by rwbarton): The increase in interface file size is definitely not the only factor. Another known problem is that CSE is not working properly for top-level unboxed string literals. I found that I could make it work by changing the `Lit l` case in `use_subst` below, in CSE: {{{#!hs addBinding :: CSEnv -- Includes InId->OutId cloning -> InId -> OutId -> OutExpr -- Processed binding -> (CSEnv, OutId) -- Final env, final bndr -- Extend the CSE env with a mapping [rhs -> out-id] -- unless we can instead just substitute [in-id -> rhs] addBinding env in_id out_id rhs' | noCSE in_id = (env, out_id) | use_subst = (extendCSSubst env in_id rhs', out_id) | otherwise = (extendCSEnv env rhs' id_expr', zapped_id) where id_expr' = varToCoreExpr out_id zapped_id = zapIdUsageInfo out_id -- Putting the Id into the cs_map makes it possible that -- it'll become shared more than it is now, which would -- invalidate (the usage part of) its demand info. -- This caused Trac #100218. -- Easiest thing is to zap the usage info; subsequently -- performing late demand-analysis will restore it. Don't zap -- the strictness info; it's not necessary to do so, and losing -- it is bad for performance if you don't do late demand -- analysis -- Should we use SUBSTITUTE or EXTEND? -- See Note [CSE for bindings] use_subst = case rhs' of Var {} -> True Lit l -> litIsTrivial l _ -> False }}} `litIsTrivial` is false for an unboxed string literal (and for an `Integer` literal, and True for other literals). However I don't really understand what is going on here. Is making `use_subst` return True for unboxed string literals the right fix? I also don't think fixing this is going to make a large change to total build time, since gridaphobe says that Phab:D1259 also fixes CSE for top- level string literals, yet it improved overall build time by less than 1% on my test system. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13344#comment:17 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13344: Core string literal patch regresses compiler performance considerably -------------------------------------+------------------------------------- Reporter: bgamari | Owner: bgamari Type: bug | Status: new Priority: high | Milestone: 8.2.1 Component: Compiler | Version: 8.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): OK so you are pointing out a bug in CSE. Very good! Let's make a ticket (#13367) and fix the bug. I have done that; fix validating now. (The same fix is in gridaphobe's patch, but it's orthogonal, so best done separately.) Back to comment:16. Let's assume that this string-CSE thing makes no difference. Then all my questions in comment:16 remain. I just want to be sure that we are fixing the right problem! -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13344#comment:18 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13344: Core string literal patch regresses compiler performance considerably -------------------------------------+------------------------------------- Reporter: bgamari | Owner: bgamari Type: bug | Status: new Priority: high | Milestone: 8.2.1 Component: Compiler | Version: 8.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by rwbarton):
Consequently, we came up with a few ideas for instead reducing the cost of the now-larger interface files. Happily, this effort may also have helpful effects beyond string literals. The gist is that we want make interface file deserialisation more lazy. Reid is working on a patch.
This patch is Phab:D3269.
Back to comment:16. Let's assume that this string-CSE thing makes no difference. Then all my questions in comment:16 remain. I just want to be sure that we are fixing the right problem!
I think it will make some substantial difference, but I suspect there is still something else going on. When your CSE fix is ready (I assume it is a simple change), I will try comparing * the commit before the top-level string patch, with * the top-level string patch plus your fix to CSE by doing a `validate --build-only` with detailed tracking of where time is being spent. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13344#comment:19 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13344: Core string literal patch regresses compiler performance considerably -------------------------------------+------------------------------------- Reporter: bgamari | Owner: bgamari Type: bug | Status: new Priority: high | Milestone: 8.2.1 Component: Compiler | Version: 8.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj):
When your CSE fix is ready (I assume it is a simple change)
It is, but I got snarled up in something else. Monday I hope -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13344#comment:20 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13344: Core string literal patch regresses compiler performance considerably -------------------------------------+------------------------------------- Reporter: bgamari | Owner: bgamari Type: bug | Status: new Priority: high | Milestone: 8.2.1 Component: Compiler | Version: 8.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by rwbarton): Replying to [comment:16 simonpj]:
Is it ONLY that {{{ x :: Addr# x = "foo"# y :: T y = K x }}} is more expensive than {{{ y :: T y = K "foo"# }}} And if it is more expensive, how much more expensive? And does that cost come from simplifying, spitting out an interface file, code generation, reading in an interface file?
I think I've found the main source of extra cost from this transformation. When we build a module with `-split-objs`, the code generator emits one assembly file per strongly-connected component of Cmm declarations, and then the driver runs the assembler on each of these files. In the example above, the first program will be translated into two `.o` files using two assembler invocations, while the second will be translated into a single `.o` file with a single assembler invocation. Thus each string literal will, with this change, normally result in an extra assembler invocation (assuming it gets floated to top level, and is only referred to from one place). The builds on https://perf.haskell.org/ghc are done using something like the default (perf) flavour build settings, so when it built this commit (https://perf.haskell.org/ghc/#revision/d49b2bb21691892ca6ac8f2403e31f2a5e53f...) the libraries were built with `-split-objs`. I ran similar builds myself and found that the number of `.o` files (so, assembler invocations, roughly) increased from 100000 to 133000 with the top-level strings patch. The 33000 extra assembler invocations can plausibly explain the extra ~100 seconds of total build time. But rather than hand-waving estimates, there's a better way to confirm what happened. In fact, we actually mean to build the libraries ''not'' with `-split- objs`, but with `-split-sections`, a new flag that achieves a similar effect but which requires running the assembler only once per module. However, the top-level strings patch (Jan 18) landed in the window between * commit 266a9dc (Jan 10), which accidentally broke detection of `-split- sections` support (https://perf.haskell.org/ghc/#revision/266a9dc4cd34008f1162eb276032c85ef8371...), and * commit 283acec (Feb 4), which fixed `-split-sections` (https://perf.haskell.org/ghc/#revision/283acec1d7307fdbd8cd7b3f1d984a036366d...) The first of these two commits increased build time from 1891s to 2150s, a change of +259s, and the second decreased it from 2653s to 2311s, a change of -342s. Let's assume that the increase between these commits (from 2150s to 2653s) is caused mainly by ghc getting slower (e.g., join points core lint), not by ghc getting larger (more modules to compile). The changes of +259s and -342s from these `-split-sections` patches are presumably due mainly to saving 100000 and 133000 assembler invocations respectively; and 342/259 = 1.32 is in remarkably close agreement with these numbers. That suggests that 342-259 = 83 of the 102 seconds by which top-level unboxed strings increased the build time should be attributed to additional invocations of the assembler. The good news is that we do not have to do anything about those 83 seconds, as we do not use `-split-objs` during the GHC build any more. Effectively they were measurement error, due to the build being misconfigured when the top-level strings patch landed. This leaves an increase of 102-83 = 19 seconds (out of a total ~2000, so about 1%) from the top-level strings patch, which can plausibly be explained by (1) larger interface files and (2) lack of CSE for top-level strings. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13344#comment:21 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13344: Core string literal patch regresses compiler performance considerably -------------------------------------+------------------------------------- Reporter: bgamari | Owner: bgamari Type: bug | Status: new Priority: high | Milestone: 8.2.1 Component: Compiler | Version: 8.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by gridaphobe): Wow, excellent sleuthing! So is the bottom line is that the increase that you believe to '''actually''' be attributable to the top-level strings patch will be fixed by the combined CSE of string literals and lazier interface file patches? (I seem to recall you mentioning a gain of roughly 1% from one or the other, but it's not in this ticket, perhaps IRC..) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13344#comment:22 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13344: Core string literal patch regresses compiler performance considerably -------------------------------------+------------------------------------- Reporter: bgamari | Owner: bgamari Type: bug | Status: new Priority: high | Milestone: 8.2.1 Component: Compiler | Version: 8.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by rwbarton):
(I seem to recall you mentioning a gain of roughly 1% from one or the other, but it's not in this ticket, perhaps IRC..)
Yes, I think I mentioned on IRC that your D1259 sped up the build by about 1%; likely because it fixes CSE, but not sure. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13344#comment:23 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13344: Core string literal patch regresses compiler performance considerably -------------------------------------+------------------------------------- Reporter: bgamari | Owner: bgamari Type: bug | Status: new Priority: high | Milestone: 8.2.1 Component: Compiler | Version: 8.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): Great stuff. My CSE patch is done but I have to re-validate having caught up with HEAD. In flight... -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13344#comment:24 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13344: Core string literal patch regresses compiler performance considerably
-------------------------------------+-------------------------------------
Reporter: bgamari | Owner: bgamari
Type: bug | Status: new
Priority: high | Milestone: 8.2.1
Component: Compiler | Version: 8.1
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture:
Type of failure: Compile-time | Unknown/Multiple
performance bug | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by simonpj):
OK done
{{{
commit 9304df5230a7a29d3e992916d133e462b854e55f
Author: Simon Peyton Jones

#13344: Core string literal patch regresses compiler performance considerably -------------------------------------+------------------------------------- Reporter: bgamari | Owner: bgamari Type: bug | Status: closed Priority: high | Milestone: 8.2.1 Component: Compiler | Version: 8.1 Resolution: fixed | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by rwbarton): * status: new => closed * resolution: => fixed Comment: As perhaps expected, the above commit sped up the build by 20 seconds: https://perf.haskell.org/ghc/#revision/9304df5230a7a29d3e992916d133e462b854e... (click the "=" at the top right to see the build time) which means that we've accounted for the regressions from the top-level strings patch. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13344#comment:26 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13344: Core string literal patch regresses compiler performance considerably -------------------------------------+------------------------------------- Reporter: bgamari | Owner: bgamari Type: bug | Status: closed Priority: high | Milestone: 8.2.1 Component: Compiler | Version: 8.1 Resolution: fixed | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #8472 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by bgamari): * related: => #8472 Old description:
The Core string literals patch [[https://perf.haskell.org/ghc/#revision/d49b2bb21691892ca6ac8f2403e31f2a5e53feb3|regresses]] compiler performance by nearly 5%.
We should really understand why this is; this is a huge regression for something that really shouldn't be costly.
New description: The Core string literals patch [[https://perf.haskell.org/ghc/#revision/d49b2bb21691892ca6ac8f2403e31f2a5e53feb3|regresses]] (see #8472) compiler performance by nearly 5%. We should really understand why this is; this is a huge regression for something that really shouldn't be costly. -- -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13344#comment:27 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13344: Core string literal patch regresses compiler performance considerably -------------------------------------+------------------------------------- Reporter: bgamari | Owner: bgamari Type: bug | Status: closed Priority: high | Milestone: 8.2.1 Component: Compiler | Version: 8.1 Resolution: fixed | Keywords: strings Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #8472 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by bgamari): * keywords: => strings -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13344#comment:28 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler
participants (1)
-
GHC