I'm still seeing different numbers from Johan, so I'm just going to throw up my hands and proceed separately for now.
I'm sharing my Excel spreadsheet. It contains:
* My 8 Feb run (libraries O2, programs O1, 10 iterations)
* Johan's first (comparable) set of 7.6.2 -> HEAD numbers (O2, O1, 5 iterations)
* My 11 Feb run (libraries O2, programs O2, 30 iterations)
* Some analysis indicating that his numbers and mine do not correlate
Info about my runs:
* 7.0.1, 7.0.2, 7.0.3, 7.0.4, 7.2.1, 7.2.2, 7.4.1, 7.4.2, 7.6.1, 7.6.2, HEAD (ec9477b1e51)
* I'm using the NoFib makefiles to build the tests, but not to actually execute them.
Here's why I don't simply use the NoFib makefiles for everything. That would run an entire suite for each version: all of 7.0.1, then all of 7.0.2, etc.
I wanted better time locality, since my test machine, though quite capable (Marlow and SPJ might know the specs), has a few other users and I've been seeing plenty of noise in the measurements. So now I use NoFib's makefiles to build all of the tests and generate the test-running command, but I cache the run command and then interleave the execution of each GHC version's executable. In other words:
using nofib: outer loop = GHC version; inner loop = each program
my rig:      outer loop = each program; inner loop = GHC version
NB I do all the building before entering that nested loop.
So I think I end up with less noise in the data.
I have an additional loop between those two loops that controls the iterations. My rigorous data set (11 Feb) uses 30 iterations.
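The loop nesting described above can be sketched as follows. This is only an illustration of the ordering, not the actual rig: the function names are hypothetical, and nothing is built or executed here.

```python
def interleaved_schedule(programs, versions, iterations):
    """Yield (program, iteration, version) in the interleaved order."""
    for program in programs:                  # outer loop: each program
        for iteration in range(iterations):   # middle loop: repeated runs
            for version in versions:          # inner loop: each GHC version
                yield (program, iteration, version)


def nofib_schedule(programs, versions):
    """Yield (version, program) in the order attributed to NoFib above."""
    for version in versions:                  # outer loop: GHC version
        for program in programs:              # inner loop: each program
            yield (version, program)


if __name__ == "__main__":
    # Two programs and two versions from the tables, purely for illustration.
    programs = ["atom", "primes"]
    versions = ["7.6.2", "HEAD"]
    for step in interleaved_schedule(programs, versions, iterations=2):
        print(step)
```

With this ordering, all versions of one program run back to back within each iteration, so a burst of load from another user tends to hit every version of that program rather than a single version's whole suite.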
Here are the big swings I'm seeing...
Indicated allocation regressions that seem to still affect HEAD:
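NB The 'swing' column is the difference, in percentage points, between HEAD today and the best result any GHC version achieved on that program. The 'concrete' column converts 'swing' into the actual difference in allocation. The 7.0.1 column gives the absolute value; the other version columns are percentage changes relative to it.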
program               7.0.1   7.0.*   7.2.*   7.4.1   7.4.2   7.6.*    HEAD  swing       concrete
atom          1,372,769,928    0.0%   18.5%   18.4%   18.4%   18.4%   18.4%    18%    252,589,667
exp3_8        3,500,231,456    0.0%   68.2%   68.2%   68.2%   68.2%   68.2%    68%  2,387,157,853
integrate     1,092,339,840    0.0%  -12.3%  -13.8%  -57.3%   39.0%  -11.0%    46%    505,753,346
lcss          1,272,240,016    0.0%   22.5%   22.5%   22.5%   22.5%   22.2%    22%    282,437,284
mandel          244,936,399    0.0%   -9.8%   15.9%   24.1%   24.1%   24.0%    34%     82,788,503
primes          861,817,344    0.0%   57.9%   57.9%   57.9%   57.9%   57.9%    58%    498,992,242
rfib                 78,240    4.2%   48.6%   49.7%   48.0%   48.8%   48.2%    48%         37,712
tak                  91,304    3.4%   24.7%   24.2%   18.7%   15.8%   14.8%    15%         13,513
wave4main       151,494,392    0.0%   20.3%   22.5%   22.5%   22.5%   20.3%    20%     30,753,362
wheel-sieve1     24,131,096    0.0%   -1.6%   -1.6%   -1.6%   99.2%   99.2%   101%     24,324,145
7.0 to 7.2 and 7.4 to 7.6 seem to be the major players.
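For reference, here is a minimal sketch of how the 'swing' and 'concrete' columns appear to be derived. The function name is hypothetical, and counting the 7.0.1 baseline as a 0.0% candidate for "best" is an inference from the rows (for atom, for example, the swing equals HEAD's own percentage), not something stated explicitly.

```python
def swing_and_concrete(baseline, pct_changes):
    """Return (swing, concrete) for one program.

    baseline    -- absolute 7.0.1 figure for the program
    pct_changes -- percent changes relative to the baseline, in version
                   order, with HEAD as the last entry
    """
    # Assumption: the baseline itself (0.0%) is a candidate for the best result.
    best = min([0.0] + list(pct_changes))
    swing = pct_changes[-1] - best        # HEAD minus best, in percentage points
    concrete = baseline * swing / 100.0   # swing expressed in absolute terms
    return swing, concrete


if __name__ == "__main__":
    # The 'integrate' row of the allocation table.
    swing, concrete = swing_and_concrete(
        1092339840, [0.0, -12.3, -13.8, -57.3, 39.0, -11.0]
    )
    print(round(swing, 1), round(concrete))
```

Applied to the integrate row, this gives a swing of about 46.3 points and a concrete difference that rounds to 505,753,346, which agrees with the table.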
Indicated runtime regressions that seem to still affect HEAD:
program    7.0.1   7.0.2   7.0.3   7.0.4   7.2.1   7.2.2   7.4.1   7.4.2   7.6.1   7.6.2    HEAD   swing  concrete
ansi        1.66    0.9%    0.9%   -1.0%   -0.3%    0.9%  -14.1%  -16.7%   -1.1%    0.5%    5.6%   22.3%      0.37
atom        2.83    1.2%    1.4%   -0.2%    8.7%    8.0%   11.3%    9.2%   11.5%    8.7%    9.7%    9.9%      0.28
exp3_8      3.40    0.1%    0.1%   -0.9%   19.2%   19.7%   20.6%   23.8%   20.3%   20.4%   19.5%   20.4%      0.69
genfft      0.90    3.1%    3.1%    2.9%    3.1%    2.0%    5.8%    4.5%    3.0%   -2.1%    7.4%    9.5%      0.09
hpg         0.28    0.4%    0.1%  -34.3%    4.0%   -8.6%  -11.8%  -52.1%  -46.0%  -51.2%  -16.8%   35.3%      0.10
integrate   0.74  -12.2%  -12.2%  -20.7%  -17.0%  -14.7%  -14.4%  -66.7%   58.4%   63.4%  -14.5%   52.2%      0.39
knights     1.82    0.2%    0.0%    1.0%    2.8%   -3.6%    1.5%    9.0%   -7.0%    0.9%    5.1%   12.1%      0.22
lcss        2.81   -0.1%   -0.5%   -6.0%    6.1%    6.3%    6.4%   -0.1%    0.0%   -0.4%   24.2%   30.2%      0.85
maillist    0.40   -0.5%    0.7%  -63.1%    9.5%    9.7%  -10.1%  -76.2%  -74.0%  -75.1%  -12.6%   63.6%      0.25
mandel      0.24   -2.6%   -2.4%   -1.8%   -9.7%  -11.3%    3.6%    8.3%   10.9%   14.5%   10.4%   21.7%      0.05
wang        1.00    1.4%    1.2%   -3.3%    1.0%    2.0%    1.9%   -3.7%   -1.3%   -1.5%    7.0%   10.7%      0.11
More noisy, of course :( The major players from allocation show up again. One new difference is 7.6 -> HEAD, where we seem to lose ~50% in hpg and maillist.
I am a bit worried because I see no *huge* swings like Johan reported. Here's an example excerpt from the NoFib makefile's log when building the programs:
HC_OPTS = -H32m -O -O -Rghc-timing -package array -H32m -hisuf hi -hisuf hi -hcsuf hc -osuf o -package-db /home/t-nicof/installs/ghc-7.7.20130206/lib/ghc-7.7.20130206/package.conf.d -O2 -rtsopts
Does anything look particularly suspicious there? (I used -package-db just to be safe; I haven't run multiple GHC versions side by side very often.)
I might investigate some of these myself; it depends on how we allocate my internship time.
Please feel free to look into what's happening in some of these! Especially if you have some of the major compiler changes between versions in your head... :)
Thanks. HTH.
I've been running 7.0.1 all through HEAD. I'd like to compare my setup with Johan's before making any claims.
I just ran nofib on current HEAD and compared it to 7.6.2 on my 64-bit Linux machine. There are some regressions I think we should look into before a release:
It would be interesting to see the numbers compared to 7.6.1, if possible, as I've seen some big performance regressions between 7.6.1 and 7.6.2.