[GHC] #15019: Fix performance regressions from #14737

#15019: Fix performance regressions from #14737 -------------------------------------+------------------------------------- Reporter: tdammers | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: Keywords: | Operating System: Unknown/Multiple Architecture: | Type of failure: Compile-time Unknown/Multiple | performance bug Test Case: | Blocked By: Blocking: | Related Tickets: #14737 Differential Rev(s): D4568 | Wiki Page: -------------------------------------+------------------------------------- #14737, while drastically improving performance on some edge cases, introduces performance regressions in 3 test cases: perf/compiler/parsing001.run perf/compiler/T9020.run perf/compiler/T12707.run -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15019 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15019: Fix performance regressions from #14737 -------------------------------------+------------------------------------- Reporter: tdammers | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #14737 | Differential Rev(s): D4568 Wiki Page: | -------------------------------------+------------------------------------- Changes (by bgamari): * owner: (none) => tdammers Comment: Tobias will look into this. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15019#comment:1 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15019: Fix performance regressions from #14737 -------------------------------------+------------------------------------- Reporter: tdammers | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #14737 | Differential Rev(s): phab:D4568 Wiki Page: | -------------------------------------+------------------------------------- Changes (by bgamari): * differential: D4568 => phab:D4568 -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15019#comment:2 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15019: Fix performance regressions from #14737 -------------------------------------+------------------------------------- Reporter: tdammers | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #14737 | Differential Rev(s): phab:D4568 Wiki Page: | -------------------------------------+------------------------------------- Description changed by simonpj: Old description:
#14737, while drastically improving performance on some edge cases, introduces performance regressions in 3 test cases:
perf/compiler/parsing001.run perf/compiler/T9020.run perf/compiler/T12707.run
New description: #14737, while drastically improving performance on some edge cases, introduces performance regressions in 3 test cases: * perf/compiler/parsing001.run * perf/compiler/T9020.run * perf/compiler/T12707.run -- -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15019#comment:3 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15019: Fix performance regressions from #14737 -------------------------------------+------------------------------------- Reporter: tdammers | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #14737 | Differential Rev(s): phab:D4568 Wiki Page: | -------------------------------------+------------------------------------- Comment (by tdammers): Looking at one of the regressions, T9010, I set up a trap to catch only those coercions that are found to be equal by `eqType`, but don't match on `isReflCo`. And they're all the same: `(State# RealWorld -> (# State# RealWorld, () #), State# RealWorld -> (# State# RealWorld, () #))`. So this one looks like it would be relatively easy to detect. Unfortunately, the other one, T12707, paints a different picture; here, the matches look like these: {{{ (forall a. K1 R Bool a -> [String] -> [String], forall a. K1 R Bool a -> [String] -> [String]) }}} {{{ (forall a. M1 S ('MetaSel ('Just "foo3Field08") 'NoSourceUnpackedness 'SourceStrict 'DecidedStrict) (K1 R (Maybe Bool)) a -> [String] -> [String], forall a. M1 S ('MetaSel ('Just "foo3Field08") 'NoSourceUnpackedness 'SourceStrict 'DecidedStrict) (Rec0 (Maybe Bool)) a -> [String] -> [String]) }}} {{{ (forall a. (:*:) (((S1 ('MetaSel ('Just "foo0Field00") 'NoSourceUnpackedness 'SourceStrict 'DecidedStrict) (Rec0 String) :*: S1 ('MetaSel ('Just "foo0Field01") 'NoSourceUnpackedness 'SourceStrict 'DecidedUnpack) (Rec0 Int)) :*: (S1 ('MetaSel ('Just "foo0Field02") 'NoSourceUnpackedness 'SourceStrict 'DecidedStrict) (Rec0 [Int]) :*: (S1 ('MetaSel ('Just "foo0Field03") 'NoSourceUnpackedness 'SourceStrict 'DecidedStrict) (Rec0 (Maybe Bool)) :*: S1 ('MetaSel ('Just "foo0Field04") 'NoSourceUnpackedness 'SourceStrict 'DecidedStrict) (Rec0 Bool)))) :*: ((S1 ('MetaSel ('Just "foo0Field05") 'NoSourceUnpackedness 'SourceStrict 'DecidedStrict) (Rec0 String) :*: S1 ('MetaSel ('Just "foo0Field06") 'NoSourceUnpackedness 'SourceStrict 'DecidedUnpack) (Rec0 Int)) :*: (S1 ('MetaSel ('Just "foo0Field07") 'NoSourceUnpackedness 'SourceStrict 'DecidedStrict) (Rec0 [Int]) :*: (S1 ('MetaSel ('Just "foo0Field08") 'NoSourceUnpackedness 'SourceStrict 'DecidedStrict) (Rec0 (Maybe Bool)) :*: S1 ('MetaSel ('Just "foo0Field09") 'NoSourceUnpackedness 'SourceStrict 'DecidedStrict) (Rec0 Bool))))) (((S1 ('MetaSel ('Just "foo0Field10") 'NoSourceUnpackedness 'SourceStrict 'DecidedStrict) (Rec0 String) :*: S1 ('MetaSel ('Just "foo0Field11") 'NoSourceUnpackedness 'SourceStrict 'DecidedUnpack) (Rec0 Int)) :*: (S1 ('MetaSel ('Just "foo0Field12") 'NoSourceUnpackedness 'SourceStrict 'DecidedStrict) (Rec0 [Int]) :*: (S1 ('MetaSel ('Just "foo0Field13") 'NoSourceUnpackedness 'SourceStrict 'DecidedStrict) (Rec0 (Maybe Bool)) :*: S1 ('MetaSel ('Just "foo0Field14") 'NoSourceUnpackedness 'SourceStrict 'DecidedStrict) (Rec0 Bool)))) :*: ((S1 ('MetaSel ('Just "foo0Field15") 'NoSourceUnpackedness 'SourceStrict 'DecidedStrict) (Rec0 String) :*: S1 ('MetaSel ('Just "foo0Field16") 'NoSourceUnpackedness 'SourceStrict 'DecidedUnpack) (Rec0 Int)) :*: (S1 ('MetaSel ('Just "foo0Field17") 'NoSourceUnpackedness 'SourceStrict 'DecidedStrict) (Rec0 [Int]) :*: (S1 ('MetaSel ('Just "foo0Field18") 'NoSourceUnpackedness 'SourceStrict 'DecidedStrict) (Rec0 (Maybe Bool)) :*: S1 ('MetaSel ('Just "foo0Field19") 'NoSourceUnpackedness 'SourceStrict 'DecidedStrict) (Rec0 Bool))))) a -> [String] -> [String], forall a. (:*:) (((S1 ('MetaSel ('Just "foo0Field00") 'NoSourceUnpackedness 'SourceStrict 'DecidedStrict) (Rec0 String) :*: S1 ('MetaSel ('Just "foo0Field01") 'NoSourceUnpackedness 'SourceStrict 'DecidedUnpack) (Rec0 Int)) :*: (S1 ('MetaSel ('Just "foo0Field02") 'NoSourceUnpackedness 'SourceStrict 'DecidedStrict) (Rec0 [Int]) :*: (S1 ('MetaSel ('Just "foo0Field03") 'NoSourceUnpackedness 'SourceStrict 'DecidedStrict) (Rec0 (Maybe Bool)) :*: S1 ('MetaSel ('Just "foo0Field04") 'NoSourceUnpackedness 'SourceStrict 'DecidedStrict) (Rec0 Bool)))) :*: ((S1 ('MetaSel ('Just "foo0Field05") 'NoSourceUnpackedness 'SourceStrict 'DecidedStrict) (Rec0 String) :*: S1 ('MetaSel ('Just "foo0Field06") 'NoSourceUnpackedness 'SourceStrict 'DecidedUnpack) (Rec0 Int)) :*: (S1 ('MetaSel ('Just "foo0Field07") 'NoSourceUnpackedness 'SourceStrict 'DecidedStrict) (Rec0 [Int]) :*: (S1 ('MetaSel ('Just "foo0Field08") 'NoSourceUnpackedness 'SourceStrict 'DecidedStrict) (Rec0 (Maybe Bool)) :*: S1 ('MetaSel ('Just "foo0Field09") 'NoSourceUnpackedness 'SourceStrict 'DecidedStrict) (Rec0 Bool))))) (((S1 ('MetaSel ('Just "foo0Field10") 'NoSourceUnpackedness 'SourceStrict 'DecidedStrict) (Rec0 String) :*: S1 ('MetaSel ('Just "foo0Field11") 'NoSourceUnpackedness 'SourceStrict 'DecidedUnpack) (Rec0 Int)) :*: (S1 ('MetaSel ('Just "foo0Field12") 'NoSourceUnpackedness 'SourceStrict 'DecidedStrict) (Rec0 [Int]) :*: (S1 ('MetaSel ('Just "foo0Field13") 'NoSourceUnpackedness 'SourceStrict 'DecidedStrict) (Rec0 (Maybe Bool)) :*: S1 ('MetaSel ('Just "foo0Field14") 'NoSourceUnpackedness 'SourceStrict 'DecidedStrict) (Rec0 Bool)))) :*: ((S1 ('MetaSel ('Just "foo0Field15") 'NoSourceUnpackedness 'SourceStrict 'DecidedStrict) (Rec0 String) :*: S1 ('MetaSel ('Just "foo0Field16") 'NoSourceUnpackedness 'SourceStrict 'DecidedUnpack) (Rec0 Int)) :*: (S1 ('MetaSel ('Just "foo0Field17") 'NoSourceUnpackedness 'SourceStrict 'DecidedStrict) (Rec0 [Int]) :*: (S1 ('MetaSel ('Just "foo0Field18") 'NoSourceUnpackedness 'SourceStrict 'DecidedStrict) (Rec0 (Maybe Bool)) :*: S1 ('MetaSel ('Just "foo0Field19") 'NoSourceUnpackedness 'SourceStrict 'DecidedStrict) (Rec0 Bool))))) a -> [String] -> [String]) }}} Further digging, however, reveals that these are all `Sym` coercions, so I think I can figure out a faster way to detect these. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15019#comment:4 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

Looking at one of the regressions, T9010, I set up a trap to catch only
#15019: Fix performance regressions from #14737 -------------------------------------+------------------------------------- Reporter: tdammers | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #14737 | Differential Rev(s): phab:D4568 Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): those coercions that are found to be equal by eqType, but don't match on isReflCo. And they're all the same: ... Eh? The code you give doesn't look like a coercion. It looks like a type, or a pair of types. I must be misunderstanding. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15019#comment:5 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15019: Fix performance regressions from #14737 -------------------------------------+------------------------------------- Reporter: tdammers | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #14737 | Differential Rev(s): phab:D4568 Wiki Page: | -------------------------------------+------------------------------------- Comment (by tdammers): Replying to [comment:5 simonpj]:
Eh? The code you give doesn't look like a coercion. It looks like a type, or a pair of types. I must be misunderstanding.
Sorry about that, didn't realize that this was misleading without the relevant context. So we have this code here: {{{ pushCoTyArg co ty | isReflCo co = Just (ty, Nothing) -- The following is inefficient - don't do `eqType` here, the coercion -- optimizer will take care of it. See Trac #14737. | tyL `eqType` tyR = pprTrace "eqType fired: " (ppr (tyL, tyR)) $ -- <- this is where we're doing the trace logging Just (ty, Nothing) | isForAllTy tyL = ASSERT2( isForAllTy tyR, ppr co $$ ppr ty ) Just (ty `mkCastTy` mkSymCo co1, Just co2) | otherwise = Nothing where Pair tyL tyR = coercionKind co -- co :: tyL ~ tyR -- tyL = forall (a1 :: k1). ty1 -- tyR = forall (a2 :: k2). ty2 }}} So the two types we see in this log are `tyL` and `tyR`, calculated using `coercionKind co`. And our goal is to find a drop-in replacement for `isReflCo` that detects most of the coercions that hit the `eqType` guard right now, but is significantly cheaper than `eqType` (and `coercionKind` / `coercionKindRole`). It may become less confusing if we dump the actual coercion too: {{{ pushCoTyArg co ty | isReflCo co = Just (ty, Nothing) -- The following is inefficient - don't do `eqType` here, the coercion -- optimizer will take care of it. See Trac #14737. | tyL `eqType` tyR = pprTrace "eqType fired: " (ppr (co, tyL, tyR)) $ Just (ty, Nothing) }}} This gives us output like the following: {{{ eqType fired: (Sym (N:GQuu[0] <K1 R Bool>_N) ; N:GQuu[0] <K1 R Bool>_N, forall a. K1 R Bool a -> [String] -> [String], forall a. K1 R Bool a -> [String] -> [String]) }}} -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15019#comment:6 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15019: Fix performance regressions from #14737 -------------------------------------+------------------------------------- Reporter: tdammers | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #14737 | Differential Rev(s): phab:D4568 Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): Ah, much easier to understand. Now my next question is this: who built that coercion? It is actually reflexive, and it's trivial to see that; but it is not `ReflCo`. Usually we build coercions with smart constructors that rapidly smash to `ReflCo`. Can you describe how to reproduce this exact example? I can't get it to trip on `T9020` and `T12707` seems much larger. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15019#comment:7 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15019: Fix performance regressions from #14737 -------------------------------------+------------------------------------- Reporter: tdammers | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #14737 | Differential Rev(s): phab:D4568 Wiki Page: | -------------------------------------+------------------------------------- Comment (by tdammers): OK, sure. Just check out GHC HEAD, and apply this patch: {{{ diff --git a/compiler/coreSyn/CoreOpt.hs b/compiler/coreSyn/CoreOpt.hs index f1ff68d133..6ddab2c94e 100644 --- a/compiler/coreSyn/CoreOpt.hs +++ b/compiler/coreSyn/CoreOpt.hs @@ -958,10 +958,14 @@ pushCoTyArg :: CoercionR -> Type -> Maybe (Type, Maybe CoercionR) -- If the returned coercion is Nothing, then it would have been reflexive; -- it's faster not to compute it, though. pushCoTyArg co ty + | isReflCo co + = Just (ty, Nothing) + -- The following is inefficient - don't do `eqType` here, the coercion -- optimizer will take care of it. See Trac #14737. - -- -- | tyL `eqType` tyR - -- -- = Just (ty, Nothing) + | tyL `eqType` tyR + = pprTrace "eqType fired: " (ppr (co, tyL, tyR)) $ + Just (ty, Nothing) | isForAllTy tyL = ASSERT2( isForAllTy tyR, ppr co $$ ppr ty ) @@ -993,10 +997,15 @@ pushCoValArg :: CoercionR -> Maybe (Coercion, Maybe Coercion) -- If the second returned Coercion is actually Nothing, then no cast is necessary; -- the returned coercion would have been reflexive. pushCoValArg co + -- The "easy" case; this doesn't interest us. + | isReflCo co + = Just (mkRepReflCo arg, Nothing) + -- The following is inefficient - don't do `eqType` here, the coercion -- optimizer will take care of it. See Trac #14737. - -- -- | tyL `eqType` tyR - -- -- = Just (mkRepReflCo arg, Nothing) + | tyL `eqType` tyR + = pprTrace "eqType fired: " (ppr (co, tyL, tyR)) $ + Just (mkRepReflCo arg, Nothing) | isFunTy tyL , (co1, co2) <- decomposeFunCo Representational co }}} Build with the `validate` flavor, and then: {{{ ./inplace/bin/ghc-stage2 -O2 testsuite/tests/perf/compiler/T9020.hs -fforce-recomp }}} -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15019#comment:8 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15019: Fix performance regressions from #14737 -------------------------------------+------------------------------------- Reporter: tdammers | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #14737 | Differential Rev(s): phab:D4568 Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): OK, got it. The culprit is this: {{{ addCoerce co1 (CastIt co2 cont) = addCoerce (mkTransCo co1 co2) cont }}} It's really very common for `co1` and `co2` to cancel each other out, but `mkTransCo` doesn't spot that. Next thing to try: {{{ addCoerce co cont | isReflCo co = return cont addCoerce co1 (CastIt co2 cont) | isReflexiveCo co' = return cont | otherwise = {-#SCC "addCoerce-simple-recursion" #-} addCoerce co' cont where co' = mkTransCo co1 co2 }}} And get rid of the `isReflexiveCo` at the end altogether. The initial `isReflCo` is optional -- just to catch the case where the `simplCorecion` returns Refl. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15019#comment:9 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15019: Fix performance regressions from #14737 -------------------------------------+------------------------------------- Reporter: tdammers | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #14737 | Differential Rev(s): phab:D4568 Wiki Page: | -------------------------------------+------------------------------------- Comment (by tdammers): Replying to [comment:9 simonpj]:
OK, got it. The culprit is this: {{{ addCoerce co1 (CastIt co2 cont) = addCoerce (mkTransCo co1 co2) cont }}} It's really very common for `co1` and `co2` to cancel each other out, but `mkTransCo` doesn't spot that. Next thing to try:
{{{ addCoerce co cont | isReflCo co = return cont
addCoerce co1 (CastIt co2 cont) | isReflexiveCo co' = return cont | otherwise = {-#SCC "addCoerce-simple-recursion" #-} addCoerce co' cont where co' = mkTransCo co1 co2 }}} And get rid of the `isReflexiveCo` at the end altogether.
The initial `isReflCo` is optional -- just to catch the case where the `simplCorecion` returns Refl.
Yes, this does seem to work. Running some tests to verify. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15019#comment:10 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15019: Fix performance regressions from #14737 -------------------------------------+------------------------------------- Reporter: tdammers | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #14737 | Differential Rev(s): phab:D4568 Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): Wow. With this change I see {{{ Unexpected stat failures: /tmp/ghctest-c0sci53m/test spaces/./perf/compiler/T5837.run T5837 [stat too good] (normal) (-7.9%) /tmp/ghctest-c0sci53m/test spaces/./perf/compiler/T5321FD.run T5321FD [stat too good] (normal) (-12.5%) /tmp/ghctest-c0sci53m/test spaces/./perf/compiler/T12425.run T12425 [stat too good] (optasm) (-9.5%) /tmp/ghctest-c0sci53m/test spaces/./perf/compiler/T9020.run T9020 [stat too good] (optasm) (-32.5%) /tmp/ghctest-c0sci53m/test spaces/./perf/compiler/T12707.run T12707 [stat too good] (normal) (-8.7%) /tmp/ghctest-c0sci53m/test spaces/./perf/compiler/T12227.run T12227 [stat not good enough] (normal) (+660%) }}} Clearly something is going pear shaped in `T12227` but the other numbers look fantastic for such a small change! Do you see this too? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15019#comment:11 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15019: Fix performance regressions from #14737 -------------------------------------+------------------------------------- Reporter: tdammers | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #14737 | Differential Rev(s): phab:D4568 Wiki Page: | -------------------------------------+------------------------------------- Comment (by tdammers): Actually, no, I only see "stat too good" in T5321FD, T9020, T12707 and T12425. I just submitted differential Phab:D4635, so I guess we should see what harbormaster has to say on the matter. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15019#comment:12 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15019: Fix performance regressions from #14737 -------------------------------------+------------------------------------- Reporter: tdammers | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #14737 | Differential Rev(s): phab:D4568 Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): OK. Turns out that the T12227 regression was because I removed the `isReflexiveCo` at the end of `addCoerce`. This is embarrassingly delicate. Anyway, could you try this revised patch please? I've adjusted it a bit, and added comments. {{{ diff --git a/compiler/simplCore/Simplify.hs b/compiler/simplCore/Simplify.hs index d92f6d7..f6a86f3 100644 --- a/compiler/simplCore/Simplify.hs +++ b/compiler/simplCore/Simplify.hs @@ -1209,40 +1209,73 @@ rebuild env expr cont ************************************************************************ -} +{- See Note [Optimising reflexivity] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +It's important (for compiler performance) to get rid of reflexivity as soon +as it appears. See Trac #11735, #14737, and #15019. + +In particular, we want to behave well on + + * e |> co1 |> co2 + where the two happent to cancel out entirely. That is quite common; + e.g. a newtype wrapping and unwrapping cancel + + + * (f |> co) @t1 @t2 ... @tn x1 .. xm + Here we wil use pushCoTyArg and pushCoValArg successively, which + build up NthCo stacks. Silly to do that if co is reflexive. + +However, we don't want to call isReflexiveCo too much, because it uses +type equality which is expensive on big types (Trac #14737 comment:7). + +A good compromise (determined experimentally) seems to be to call +isReflexiveCo + * when composing casts, and + * at the end + +In investigating this I saw missed opportunities for on-the-fly +coercion shrinkage. See Trac #15090. +-} + + simplCast :: SimplEnv -> InExpr -> Coercion -> SimplCont -> SimplM (SimplFloats, OutExpr) simplCast env body co0 cont0 = do { co1 <- {-#SCC "simplCast-simplCoercion" #-} simplCoercion env co0 - ; cont1 <- {-#SCC "simplCast-addCoerce" #-} addCoerce co1 cont0 + ; cont1 <- {-#SCC "simplCast-addCoerce" #-} + if isReflCo co1 + then return cont0 -- See Note [Optimising reflexivity] + else addCoerce co1 cont0 ; {-#SCC "simplCast-simplExprF" #-} simplExprF env body cont1 } where -- If the first parameter is Nothing, then simplifying revealed a -- reflexive coercion. Omit. - addCoerce0 :: Maybe OutCoercion -> SimplCont -> SimplM SimplCont - addCoerce0 Nothing cont = return cont - addCoerce0 (Just co) cont = addCoerce co cont + addCoerceM :: Maybe OutCoercion -> SimplCont -> SimplM SimplCont + addCoerceM Nothing cont = return cont + addCoerceM (Just co) cont = addCoerce co cont addCoerce :: OutCoercion -> SimplCont -> SimplM SimplCont - - addCoerce co1 (CastIt co2 cont) - = {-#SCC "addCoerce-simple-recursion" #-} - addCoerce (mkTransCo co1 co2) cont + addCoerce co1 (CastIt co2 cont) -- See Note [Optimising reflexivity] + | isReflexiveCo co' = return cont + | otherwise = addCoerce co' cont + where + co' = mkTransCo co1 co2 addCoerce co cont@(ApplyToTy { sc_arg_ty = arg_ty, sc_cont = tail }) | Just (arg_ty', m_co') <- pushCoTyArg co arg_ty = {-#SCC "addCoerce-pushCoTyArg" #-} - do { tail' <- addCoerce0 m_co' tail + do { tail' <- addCoerceM m_co' tail ; return (cont { sc_arg_ty = arg_ty', sc_cont = tail' }) } addCoerce co cont@(ApplyToVal { sc_arg = arg, sc_env = arg_se - , sc_dup = dup, sc_cont = tail }) + , sc_dup = dup, sc_cont = tail }) | Just (co1, m_co2) <- pushCoValArg co , Pair _ new_ty <- coercionKind co1 - , not (isTypeLevPoly new_ty) -- without this check, we get a lev-poly arg + , not (isTypeLevPoly new_ty) -- Without this check, we get a lev-poly arg -- See Note [Levity polymorphism invariants] in CoreSyn -- test: typecheck/should_run/EtaExpandLevPoly = {-#SCC "addCoerce-pushCoValArg" #-} - do { tail' <- addCoerce0 m_co2 tail + do { tail' <- addCoerceM m_co2 tail ; if isReflCo co1 then return (cont { sc_cont = tail' }) -- Avoid simplifying if possible; @@ -1260,15 +1293,10 @@ simplCast env body co0 cont0 , sc_cont = tail' }) } } addCoerce co cont - | isReflexiveCo co = {-#SCC "addCoerce-reflexive" #-} - return cont - | otherwise = {-#SCC "addCoerce-other" #-} - return (CastIt co cont) - -- It's worth checking isReflexiveCo. - -- For example, in the initial form of a worker - -- we may find (coerce T (coerce S (\x.e))) y - -- and we'd like it to simplify to e[y/x] in one round - -- of simplification + | isReflexiveCo co = return cont -- Having this at the end makes a huge + -- difference in T12227, for some reason + -- See Note [Optimising reflexivity] + | otherwise = return (CastIt co cont) simplArg :: SimplEnv -> DupFlag -> StaticEnv -> CoreExpr -> SimplM (DupFlag, StaticEnv, OutExpr) }}} -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15019#comment:13 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15019: Fix performance regressions from #14737 -------------------------------------+------------------------------------- Reporter: tdammers | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #14737 | Differential Rev(s): phab:D4568 Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): How did the patch in comment:13 work out? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15019#comment:14 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15019: Fix performance regressions from #14737 -------------------------------------+------------------------------------- Reporter: tdammers | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #14737 | Differential Rev(s): phab:D4568 Wiki Page: | -------------------------------------+------------------------------------- Comment (by tdammers): Replying to [comment:14 simonpj]:
How did the patch in comment:13 work out?
Absolutely gorgeous. Phab:D4635 shows improvement on all 3 tests that previously failed, they now even perform about 10% better than pre-#14737. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15019#comment:15 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15019: Fix performance regressions from #14737 -------------------------------------+------------------------------------- Reporter: tdammers | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #14737 | Differential Rev(s): phab:D4568 Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): Great. Just add the data here (and/or in the commit message), commit, close, and move on! -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15019#comment:16 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15019: Fix performance regressions from #14737 -------------------------------------+------------------------------------- Reporter: tdammers | Owner: tdammers Type: bug | Status: new Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #14737 | Differential Rev(s): phab:D4568 Wiki Page: | -------------------------------------+------------------------------------- Comment (by tdammers): Improvements should be evident from the changes in the tests, but here they are: - T5321FD: - Before: 415136648 - After: 367567168 (down 11.5%) - T9020: - Before: 423163832 - #14737: 562206104 (up 32.9%) - After: 392559256 (down 7.2% from Before) - T12425: - Before: 134780272 - #14737: 141952368 (up 5.3%) - After: 130646336 (down 3.1% from Before) None of the other test cases show any regressions, including the test added for #14737. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15019#comment:17 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15019: Fix performance regressions from #14737 -------------------------------------+------------------------------------- Reporter: tdammers | Owner: tdammers Type: bug | Status: patch Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #14737 | Differential Rev(s): phab:D4568 Wiki Page: | -------------------------------------+------------------------------------- Changes (by tdammers): * status: new => patch -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15019#comment:18 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15019: Fix performance regressions from #14737 -------------------------------------+------------------------------------- Reporter: tdammers | Owner: tdammers Type: bug | Status: patch Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #14737 | Differential Rev(s): phab:D4568 Wiki Page: | -------------------------------------+------------------------------------- Comment (by tdammers): {{{ Remove unnecessary check in simplCast The coercion optimizer will take care of it anyway, and the check is prohibitively expensive. See Trac Trac #14737. Reviewers: bgamari Subscribers: simonpj, thomie, carter Differential Revision: https://phabricator.haskell.org/D4568 }}} -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15019#comment:19 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15019: Fix performance regressions from #14737 -------------------------------------+------------------------------------- Reporter: tdammers | Owner: tdammers Type: bug | Status: closed Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: Resolution: fixed | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #14737 | Differential Rev(s): phab:D4568 Wiki Page: | -------------------------------------+------------------------------------- Changes (by bgamari): * status: patch => closed * resolution: => fixed -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15019#comment:20 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15019: Fix performance regressions from #14737 -------------------------------------+------------------------------------- Reporter: tdammers | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #14737 | Differential Rev(s): phab:D4568 Wiki Page: | -------------------------------------+------------------------------------- Changes (by simonpj): * status: closed => new * owner: tdammers => (none) * resolution: fixed => Comment: But where is Phab:D4635? Did that land? comment:17 is all about that patch, but comment:19 is a different patch entirely. Re-opening until we sort this out. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15019#comment:21 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15019: Fix performance regressions from #14737 -------------------------------------+------------------------------------- Reporter: tdammers | Owner: (none) Type: bug | Status: patch Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #14737 | Differential Rev(s): phab:D4635 Wiki Page: | -------------------------------------+------------------------------------- Changes (by tdammers): * status: new => patch * differential: phab:D4568 => phab:D4635 Comment: Figured out what went wrong. Phab:D4568 was listed in the ticket data, but that one actually belongs to #14737, where we fixed the badness in the infamous Grammar.hs (now the T13737 test case), and doctored the performance tests to silence the resulting regressions. This one has landed. The patch that is supposed to close this ticket is Phab:D4635, where we fix the regressions introduced earlier. This one hasn't landed, but I believe it should ASAP. The confusion came from me looking at the status of D4568 earlier today, but thinking I was looking at D4635; seeing that the patch had landed, I concluded that this should close the ticket as well. So, in short: this ticket isn't done yet, but all that's missing is landing Phab:D4635. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15019#comment:22 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15019: Fix performance regressions from #14737
-------------------------------------+-------------------------------------
Reporter: tdammers | Owner: (none)
Type: bug | Status: patch
Priority: normal | Milestone: 8.6.1
Component: Compiler | Version:
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture:
Type of failure: Compile-time | Unknown/Multiple
performance bug | Test Case:
Blocked By: | Blocking:
Related Tickets: #14737 | Differential Rev(s): phab:D4635
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by Ben Gamari

#15019: Fix performance regressions from #14737 -------------------------------------+------------------------------------- Reporter: tdammers | Owner: (none) Type: bug | Status: patch Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #14737 | Differential Rev(s): phab:D4635 Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): The patch in comment:23 (hooray) claims to fix three perf regressions, and get other improvement for free. Yet the patch only changes one perf number in `perf/compiler/all.T`, plus (oddly) one other comment. What's up here? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15019#comment:24 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15019: Fix performance regressions from #14737 -------------------------------------+------------------------------------- Reporter: tdammers | Owner: (none) Type: bug | Status: patch Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #14737 | Differential Rev(s): phab:D4635 Wiki Page: | -------------------------------------+------------------------------------- Comment (by bgamari): Hmm, the comment change was due to my conflict resolution. For better or worse the numbers currently in `all.T` passed validation locally, so I can only guess that some of the numbers are now marginal. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15019#comment:25 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15019: Fix performance regressions from #14737 -------------------------------------+------------------------------------- Reporter: tdammers | Owner: (none) Type: bug | Status: patch Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #14737 | Differential Rev(s): phab:D4635 Wiki Page: | -------------------------------------+------------------------------------- Comment (by tdammers): T12425 at least does not look like it could possibly have become marginal. We went from 141952368 after #14747 (non-marginally up) to 130646336 in #15019 (non-marginally down), and now back up to 150743648 in this patch, which is even higher than the original value post-#14747. So this is somewhat curious IMO. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15019#comment:26 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15019: Fix performance regressions from #14737 -------------------------------------+------------------------------------- Reporter: tdammers | Owner: (none) Type: bug | Status: patch Priority: high | Milestone: 8.10.1 Component: Compiler | Version: Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #14737 | Differential Rev(s): phab:D4635 Wiki Page: | -------------------------------------+------------------------------------- Changes (by bgamari): * priority: normal => high * cc: davide (added) * milestone: 8.8.1 => 8.10.1 Comment: We'll need to sort out what the status of this is. DavidE, can you investigate? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15019#comment:28 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler
participants (1)
-
GHC