
Ok, done, I created https://ghc.haskell.org/trac/ghc/ticket/14964
But first:
On Thu, Mar 22, 2018 at 3:26 AM, Simon Peyton Jones
score  max mb  total mb  prd   derive     lily       perform    ghc
6      72.26   3279.22   0.88  0.79~0.84  0.70~0.74  0.31~0.33  8.0.2
6      76.63   3419.20   0.58  1.45~1.59  1.05~1.07  0.33~0.36  8.4.1
bloom  70.69   2456.14   0.89  1.32~1.36             0.15~0.16  8.0.2
bloom  67.86   2589.97   0.62  1.94~1.99             0.20~0.22  8.4.1
The bytes-allocated number has gone up a bit. (Not too surprising… the compiler is doing more.) But the productivity number is down sharply, and consistently so, which translates directly into longer compile times. Somehow, although residency is not increasing, GC time is greatly increased.
To be clear, this is the performance of generated code, not the performance of the compiler itself. The conclusion from the compiler performance numbers was that non-optimized builds are much faster (yay!) and optimized builds slightly slower. That's reasonable if it's doing more work, but somehow the extra work turns into lower GC productivity in the generated code :(
It’d be good to figure out what’s gone wrong here. Maybe a change in nursery size or something stupid like that?
I don't think so. This is my own code and the invocation is the same across versions, so hopefully only the compiler version has changed. I am tweaking nursery size though; here are the RTS flags: -N -A8m -T. I can try testing again with no RTS flags. I realize this is a bit vague as it stands, so I'll try to narrow things down.
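
For anyone who wants to check a productivity figure like the "prd" column in their own program, here is a minimal sketch. It assumes "prd" means mutator CPU time divided by total CPU time, which is how the RTS summary reports productivity as far as I know; the workload and the helper name `productivity` are placeholders, not anything from my code. It uses `getRTSStats`, which only exists from base 4.10 (GHC 8.2) on, so the 8.0.2 side of a comparison would need the older `getGCStats` API instead.

```haskell
import Data.Int (Int64)
import GHC.Stats (RTSStats (..), getRTSStats, getRTSStatsEnabled)

-- | Hypothetical helper: fraction of CPU time spent in the mutator
-- rather than in the garbage collector. Returns Nothing when no
-- time has been recorded, to avoid dividing by zero.
productivity :: Int64 -> Int64 -> Maybe Double
productivity mutatorNs totalNs
  | totalNs <= 0 = Nothing
  | otherwise = Just (fromIntegral mutatorNs / fromIntegral totalNs)

main :: IO ()
main = do
  -- Stats are only collected when the program runs with +RTS -T
  -- (or -s / -S); otherwise getRTSStats throws.
  enabled <- getRTSStatsEnabled
  if not enabled
    then putStrLn "RTS stats are disabled; rerun with +RTS -T"
    else do
      -- Placeholder workload that allocates a fair amount.
      let total = sum (map fromIntegral [1 .. 2000000 :: Int]) :: Integer
      total `seq` return ()
      stats <- getRTSStats
      putStrLn ("allocated bytes: " ++ show (allocated_bytes stats))
      putStrLn ("max live bytes:  " ++ show (max_live_bytes stats))
      putStrLn ("gc cpu ns:       " ++ show (gc_cpu_ns stats))
      case productivity (mutator_cpu_ns stats) (cpu_ns stats) of
        Nothing -> putStrLn "productivity: n/a (no CPU time recorded)"
        Just p -> putStrLn ("productivity: " ++ show p)
```

Building with something like `ghc -O2 -with-rtsopts=-T Productivity.hs` (the file name is made up) bakes the -T flag in, so the same binary invocation can be compared across compiler versions.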