
Simon Peyton Jones writes:

> PS: before we go to some effort to optimise X, can we
I briefly characterized this earlier this week. For a module exporting lots of static data of roughly the same shape as TypeRep (e.g. data T = T Addr# Int# Int), the cost scales essentially linearly in the number of static bindings, so I don't think there's any easy non-linearity to fix here. However, compiler allocations roughly double when moving from -O0 to -O1 (from roughly 300 kB per binding to roughly 600 kB per binding); they hardly increase any further with -O2. I just checked, and the C-- looks reasonable.

Admittedly, a mere doubling of allocations isn't so bad given how much it costs to optimize non-trivial code (e.g. an Ord instance). Unfortunately I didn't quantify this effect in my investigation earlier in the week, so I'll do so now: with -O0, deriving even one set of simple Eq and Ord instances increases allocations by nearly 30%, even when compiling the 10,000-static-binding program. This suggests that, compared to "real" code, simplification of the static bindings is relatively cheap. I should have checked this earlier.

Still, this is interesting: when I look back at my measurements comparing the pre- and post-Typeable compilers on `lens`, the largest changes in compiler allocations (as reported by -v) tend to be in demand analysis, called arity analysis, and the specialiser. This clearly needs further investigation.

I've put the testcase here [1] if anyone wants to play with it.

Cheers,

- Ben

[1] https://github.com/bgamari/ghc-static-data-opt-testbench
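For concreteness, here is a sketch of the kind of module being measured (the module and binding names here are my own invention; the actual generator lives in the repository above): a long run of top-level bindings of a small type with unboxed fields, mirroring the TypeRep-like shape described above.

```haskell
{-# LANGUAGE MagicHash #-}

-- Sketch of the benchmarked shape: many top-level static bindings of a
-- small type with unboxed fields, akin to TypeRep's representation.
module StaticData (b0, b1, b2) where

import GHC.Exts (Addr#, Int#)

data T = T Addr# Int# Int

-- The real testbench generates thousands of bindings like these.
b0, b1, b2 :: T
b0 = T "b0"# 0# 0
b1 = T "b1"# 1# 1
b2 = T "b2"# 2# 2
```

Compiling a generated module like this at -O0, -O1, and -O2 and comparing the per-pass allocation totals from -v is the measurement described above.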