
#13207: Performance regressions from removing LNE analysis -------------------------------------+------------------------------------- Reporter: lukemaurer | Owner: lukemaurer Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.1 Resolution: | Keywords: JoinPoints Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: 12988 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Description changed by lukemaurer: @@ -35,2 +35,2 @@ - adapt the occurrence analyzer and the simplifier, or not do that in Core - Tidy. + stop doing that there or adapt the occurrence analyzer and the simplifier + (or `simpleOptPgm`). New description: `CoreToStg`'s LNE analysis has been superseded by the occurrence analyzer's join-point analysis. However, there are a few cases where the occurrence analyzer doesn't find a join point where the old analysis would've found an LNE. Known causes include: 1. The LNE analysis was untyped, which (very) occasionally allows an LNE where a join point would've fallen afoul of the polymorphism rule (see Note [The polymorphism rule of join points] in `CoreSyn.hs`). 2. Sometimes a rewrite rule prevents a binding from being christened a join point; such a binding would've been found by the LNE analysis (which runs after Core Tidy has removed rules from local binders). The biggest loss in NoFib is `spectral/cryptarithm2`, which saw a 0.3% increase in allocations because a rule prevented detection of a join point. Specifically, there's a `go` that gets SpecConstr'd as an `$sgo`. All the calls to `$sgo` happen to be tail calls, but some calls to `go` aren't, so it's only the rule that keeps `$sgo` from becoming a join point. In summary (adapted from the `cryptarithm2` situation): {{{ letrec $sgo = \x y z -> ... go {- non-tail call -} ... {-# RULE "SPEC go" forall x y z. go ((x,y):z) = $sgo x y z #-} go = \xs -> ... go ... in ... go ... $sgo a b c {- tail call -} ... }}} The only thing keeping `$sgo` from becoming a join point is the rule. Possible solutions: 1. Run the occurrence analyzer and the simplifier (or `simpleOptPgm`) after Tidy Core. Seems like the solution needing the least code. Problems: * Tidy Core turns top-level binders into `GlobalId`s. Would need to stop doing that there or adapt the occurrence analyzer and the simplifier (or `simpleOptPgm`). * Only covers case 2. 2. Write a new pass that just throws out all the rules from the local binders in the module, then run it (and then the simplifier) at the end of the Core-to-Core pipeline, just before Core Tidy. (Local rules don't appear in ifaces anyway.) * Needs a whole new traversal, albeit a very simple one. Though it's possible that dropping the rules would lead to other optimizations (`$sgo` would end up pre-inlined in the above example), so it might be helpful anyway. * Also only covers case 2. 3. Write the LNE analysis as an STG-to-STG pass. Only idea that covers both cases. * Needs even more code, and duplicative of the occurrence analyzer. 4. Loosen the join-point invariants somehow. For example, if we could flag a rule as applying only to tail calls, that rule could contain a jump in its RHS even though its LHS isn't a jump. * But this is already problematic in the above example—we also can't make `$sgo` a join point so long as it's mutually recursive with `go`, which it still is so long as the rule is there. In other words, even if the rule doesn't directly keep `$sgo` from being a join point, it's still the only thing keeping it from breaking free of `go`'s recursive group, which it must do to become a join point. * So it doesn't cover case 2 very well and it doesn't cover case 1 at all. -- -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13207#comment:3 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler