
#10069: CPR related performance issue
-------------------------------------+-------------------------------------
        Reporter:  pacak             |                Owner:  (none)
            Type:  bug               |               Status:  new
        Priority:  normal            |            Milestone:
       Component:  Compiler          |              Version:  8.6.2
      Resolution:                    |             Keywords:  DemandAnalysis
Operating System:  Unknown/Multiple  |         Architecture:  Unknown/Multiple
 Type of failure:  Runtime           |            Test Case:
  performance bug                    |
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------
Changes (by sgraf):

 * keywords:  CPRAnalysis, DemandAnalysis => DemandAnalysis

Comment:

Looking at
https://ghc.haskell.org/trac/ghc/attachment/ticket/10069/Blah.dump-simpl#L1668,
I don't think this is related to CPR analysis but to the worker/wrapper
transformation having issues with NOINLINE functions.

What happens here is that `f1` to `f4` can't be inlined (so we don't see the
case on the `A`), but `fa` still gets a strictness signature saying that all
but arguments 2 to 5 are dead. WW will now split `fa` into a wrapper function
that scrutinises the `A` to just project out the 4 arguments that aren't dead
and pass it on to the worker `$wfa` unboxed. So far so good.

Now, WW arranges it so that the worker `$wfa` builds up a new `A` with dummy
values for absent fields (0# for Int#). Normally, this new `A` binding would
cancel out with case matches in `$wfa`, because the strictness signature must
ultimately come from some case expression. These however are hidden in
`NOINLINE` functions, so no cancelling is happening. As a result, we allocate
the dummy `A` for nothing, we could have just passed along the old `A`.
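To make the shape of that split concrete, here is a rough source-level model of it, written by hand. It is only a sketch: the type `T`, the selector `second` and the names `fWrapper`/`fWorker` are made up for illustration and are not taken from the ticket's attachment, and it uses boxed `Int` where the real worker would take an unboxed `Int#`.

```haskell
-- Hypothetical three-field record standing in for the ticket's `A`.
data T = T !Int !Int !Int

-- Opaque consumer: NOINLINE hides its pattern match from every caller,
-- just like `f1` to `f4` in the ticket.
{-# NOINLINE second #-}
second :: T -> Int
second (T _ b _) = b

-- The function as written by the user. Only the second field is live.
f :: T -> Int
f t = second t + second t

-- Hand-written model of the wrapper: scrutinise the box and pass on
-- only the live field.
fWrapper :: T -> Int
fWrapper (T _ b _) = fWorker b

-- Hand-written model of the worker: to call `second` it has to rebuild
-- a `T`, filling the absent fields with dummy values. Because `second`
-- is never inlined, this fresh box never cancels against a case
-- expression, so it is allocated on every call.
fWorker :: Int -> Int
fWorker b = second t + second t
  where
    t = T 0 b 0
```

Loaded into GHCi, `f (T 1 2 3)` and `fWrapper (T 1 2 3)` both evaluate to 4, so the split preserves the result. What differs is that `fWorker` allocates a `T` that `f` could have simply passed along.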
Here's an example demonstrating this in the small:

{{{#!hs
data C = C !Int !Int

{-# NOINLINE c1 #-}
c1 :: C -> Int
c1 (C _ c) = c

{-# NOINLINE fc #-}
fc :: C -> Int
fc c = c1 c + c1 c
}}}

Relevant Core:

{{{#!hs
c1_rP
  = \ (ds_d3af :: C) ->
      case ds_d3af of { C dt_d3PA dt1_d3PB -> GHC.Types.I# dt1_d3PB }

Main.$wfc
  = \ (ww_s7DJ :: GHC.Prim.Int#) ->
      case c1_rP (Main.C 0# ww_s7DJ) of { GHC.Types.I# x_a4kS ->
      GHC.Prim.*# 2# x_a4kS
      }

fc
  = \ (w_s7DF :: C) ->
      case w_s7DF of { C ww1_s7DI ww2_s7DJ ->
      case Main.$wfc ww2_s7DJ of ww3_s7DN { __DEFAULT ->
      GHC.Types.I# ww3_s7DN
      }
      }
}}}

The problem I see here is that we don't WW `c1`, or that we don't inline the
resulting wrapper into `fc` before the hypothetical worker `$wc1` of `c1`
gets inlined back into `c1` because it's so small. If we inlined `$wc1` into
`$wfc`, the case on `C` would cancel out with the dummy `C` and everything
would be well.

So: If we WW `fc`, we should also WW `c1`, otherwise we end up with bad code.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10069#comment:32
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler