Re: [GHC] #10980: Deriving Read instance from datatype with N fields leads to N^2 code size growth

31 Jan 2017

      #10980: Deriving Read instance from datatype with N fields leads to N^2 code size
growth
-------------------------------------+-------------------------------------
        Reporter:  slyfox            |                Owner:
            Type:  bug               |               Status:  new
        Priority:  normal            |            Milestone:
       Component:  Compiler          |              Version:  7.10.2
      Resolution:                    |             Keywords:  deriving-perf
Operating System:  Unknown/Multiple  |         Architecture:
 Type of failure:  Compile-time      |  Unknown/Multiple
  performance bug                    |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by rwbarton):

 Replying to [comment:14 nomeata]:
...
(In reply to comment:12) Is this a specific `+`, or just the overloaded
 method from `Num`? I guess the latter, as a strict `+` would yield quite
 different code.
Right, I meant the latter. It didn't occur to me that this behavior would
 only occur when `(+)` is not known by GHC to be strict; I thought it would
 be enough for GHC to have to perform the call to `(+)`, i.e., not inline
 it.
...
So you are proposing to add a transformation to STG (or Core) that
 replaces
{{{
let t1 = let t2 = e1
         in e2
in e3
}}}
with
{{{
let t2 = e1
    t1 = e2
in e3
}}}
Can we give an easy criterion when this transformation is beneficial?
The problem is that this will increase allocation in the case when `t1`
 is not called, as it is more space efficient to pack the union of the free
 variables of `e1` and `e2` into one closure, instead of having one for the
 free variables of `e1` and one for those of `e2`.
...
We could say “Do this transformation if the free variables of e2 and e1”
 are disjoint. Then we’d allocate two extra words (one for the info table
 of `t2`, and one pointer to that in the closure for `t1`), but have some
 nice gains if `t1` is indeed called.
But this rule would not fire in this case, because the `Num a`
 dictionary is also a free variable, which would now have to be copied into
 both closures!
I guess one could try and see whether real-world programs benefit from
And you're right that it's only a gain when `(+)` is actually going to
 evaluate `t1`, and a loss when it is not going to. In this case the gain
 is asymptotic and the loss is a constant factor, but if there were a lot
 of subexpressions and few free variables then the reverse could be true.

 this transformation, and how many shared free variables between `e1` and
 `e2` are ok.

 Perhaps this discussion should move to a new ticket, since I remembered it
 is actually not the issue with the derived Read instance after all?

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10980#comment:15
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler