
#10980: Deriving Read instance from datatype with N fields leads to N^2 code size growth -------------------------------------+------------------------------------- Reporter: slyfox | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.2 Resolution: | Keywords: deriving-perf Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by rwbarton): Replying to [comment:14 nomeata]:
(In reply to comment:12) Is this a specific `+`, or just the overloaded method from `Num`? I guess the latter, as a strict `+` would yield quite different code.
Right, I meant the latter. It didn't occur to me that this behavior would only occur when `(+)` is not known by GHC to be strict; I thought it would be enough for GHC to have to perform the call to `(+)`, i.e., not inline it.
So you are proposing to add a transformation to STG (or Core) that replaces {{{ let t1 = let t2 = e1 in e2 in e3 }}} with {{{ let t2 = e1 t1 = e2 in e3 }}}
Can we give an easy criterion when this transformation is beneficial?
The problem is that this will increase allocation in the case when `t1` is not called, as it is more space efficient to pack the union of the free variables of `e1` and `e2` into one closure, instead of having one for the free variables of `e1` and one for those of `e2`.
We could say “Do this transformation if the free variables of e2 and e1” are disjoint. Then we’d allocate two extra words (one for the info table of `t2`, and one pointer to that in the closure for `t1`), but have some nice gains if `t1` is indeed called.
But this rule would not fire in this case, because the `Num a` dictionary is also a free variable, which would now have to be copied into both closures!
I guess one could try and see whether real-world programs benefit from
And you're right that it's only a gain when `(+)` is actually going to evaluate `t1`, and a loss when it is not going to. In this case the gain is asymptotic and the loss is a constant factor, but if there were a lot of subexpressions and few free variables then the reverse could be true. this transformation, and how many shared free variables between `e1` and `e2` are ok. Perhaps this discussion should move to a new ticket, since I remembered it is actually not the issue with the derived Read instance after all? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10980#comment:15 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler