[GHC] #9315: Weird change in allocation numbers of T9203

#9315: Weird change in allocation numbers of T9203 ------------------------------------+------------------------------------- Reporter: nomeata | Owner: simonmar Type: bug | Status: new Priority: normal | Milestone: Component: Runtime System | Version: 7.9 Keywords: | Operating System: Unknown/Multiple Architecture: Unknown/Multiple | Type of failure: None/Unknown Difficulty: Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | ------------------------------------+------------------------------------- While trying to improve our handle on allocation numbers, I’m stuck with the test case T9203. On some machines, it allocates roughly 95747304 bytes (this includes travis and my laptop), on others 42946176 bytes (e.g. on the machine where I monitor benchmark performance). All machines are 64 bit Linux machines. The output of `-ddump-simpl`, `-ddump-stg` and `-ddump-cmm` is identical (up to variable names). Even `-ddump-asm` looks the same, besides some jump target reordering. The binary runs too small to get a heap profile. I’m a bit stuck here: What can be the cause for these differences? (BTW, if have an up-to-date GHC tree, can you report the number you get? Run `make -C testsuite VERBOSE=4 TEST=9203` for that.) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9315 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9315: Weird change in allocation numbers of T9203 -------------------------------------+------------------------------------ Reporter: nomeata | Owner: simonmar Type: bug | Status: new Priority: normal | Milestone: Component: Runtime System | Version: 7.9 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Unknown/Multiple Type of failure: None/Unknown | Difficulty: Unknown Test Case: | Blocked By: Blocking: | Related Tickets: -------------------------------------+------------------------------------ Comment (by nomeata): Phabricator and SPJ also have the higher number. If noone else can reproduce 42946176 I guess I’ll change it back. But I still don’t get it. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9315#comment:1 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9315: Weird change in allocation numbers of T9203 -------------------------------------+------------------------------------ Reporter: nomeata | Owner: simonmar Type: bug | Status: new Priority: normal | Milestone: Component: Runtime System | Version: 7.9 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Unknown/Multiple Type of failure: None/Unknown | Difficulty: Unknown Test Case: | Blocked By: Blocking: | Related Tickets: -------------------------------------+------------------------------------ Comment (by nomeata): Ah, Austin already reverted it in changeset:4690466de2249fd73600567bd90d462ac26b2d1c/ghc. But sure enough, my benchmark builder fails again... And it really is a 64bit executable, in case you think that’s it: {{{ $ file ./ghc-master/testsuite/tests/perf/should_run/T9203 ./ghc-master/testsuite/tests/perf/should_run/T9203: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.24, BuildID[sha1]=0x7672054d3dd125700b11710744581a8ae8502766, not stripped }}} -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9315#comment:2 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9315: Weird change in allocation numbers of T9203 -------------------------------------+------------------------------------ Reporter: nomeata | Owner: simonmar Type: bug | Status: new Priority: normal | Milestone: Component: Runtime System | Version: 7.9 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Unknown/Multiple Type of failure: None/Unknown | Difficulty: Unknown Test Case: | Blocked By: Blocking: | Related Tickets: -------------------------------------+------------------------------------ Comment (by simonpj): Next thing to try is ticky-ticky profiling. Compile with `-ticky` and run with `+RTS -rT9203.ticky` and compare output. Simon -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9315#comment:3 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9315: Weird change in allocation numbers of T9203 -------------------------------------+------------------------------------- Reporter: nomeata | Owner: simonmar Type: bug | Status: new Priority: normal | Milestone: Component: Runtime | Version: 7.9 System | Keywords: Resolution: | Operating System: Unknown/Multiple Differential Revisions: | Type of failure: None/Unknown Architecture: | Test Case: Unknown/Multiple | Blocking: Difficulty: Unknown | Blocked By: | Related Tickets: | -------------------------------------+------------------------------------- Comment (by nomeata): How did I forget about ticky ticky? Thanks for reminding! Ticky output is identical.... so it has to be in the libraries. I recompiled them with `-ticky`, and now I see some difference related to `GHC.Fingerprint.fingerprintData1`. I’m attaching both ticky reports, maybe someone can see a pattern. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9315#comment:4 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9315: Weird change in allocation numbers of T9203 -------------------------------------+------------------------------------- Reporter: nomeata | Owner: simonmar Type: bug | Status: new Priority: normal | Milestone: Component: Runtime | Version: 7.9 System | Keywords: Resolution: | Operating System: Unknown/Multiple Differential Revisions: | Type of failure: None/Unknown Architecture: | Test Case: Unknown/Multiple | Blocking: Difficulty: Unknown | Blocked By: | Related Tickets: | -------------------------------------+------------------------------------- Comment (by nomeata): Sigh. Both `Foreign.Storable` and `GHC.Fingerprint` have (with `-dsuppress-unique`) identical core. Also `Data.Typeable.Internal` where the ticky diffs show the same number of calls to `base:Data.Typeable.Internal.$fTypeableks_$ctypeRep#`, but a different allocation count.... hah! but only when I copy the ghc invocation from one host to the other, not if if I run `make` there, which for some reason passes `-O2` when compiling `Data.Typeable.Internal`.... So the problem is as follows: Invocation of `validate` and the usual `devel2` settings in `mk/build.mk` have `GhcLibHcOpts += -O -dcore- lint`, while the default in the absence of everything is `-O2` (in `in GhcLibHcOpts=-O2`). And sometime that makes a difference... Clearly, this needs to be consolidated. What are the settings used to actually build the releases? `-O` or `-O2`? The appropriate one needs to be used in `validate`, otherwise these tests are not very useful. And then people will have to use the same flags in their `mk/build.mk` or learn to ignore benchmark results from such a tree. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9315#comment:5 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9315: Weird change in allocation numbers of T9203 -------------------------------------+------------------------------------- Reporter: nomeata | Owner: simonmar Type: bug | Status: new Priority: normal | Milestone: Component: Runtime | Version: 7.9 System | Keywords: Resolution: | Operating System: Unknown/Multiple Differential Revisions: | Type of failure: None/Unknown Architecture: | Test Case: Unknown/Multiple | Blocking: Difficulty: Unknown | Blocked By: | Related Tickets: | -------------------------------------+------------------------------------- Comment (by simonpj): As well as your last para, you seem to have found that compiling some particular module (is it `Data.Typeable.Internal`?) with -O2 instance of -O1 makes a factor of 2 difference in allocation for at least this test. Is that right? If so, it'd be worth knowing why, and perhaps putting `{-# OPTIONS_GHC -O2 #-}` with a clear comment in that file. Simon -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9315#comment:6 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9315: Weird change in allocation numbers of T9203 -------------------------------------+------------------------------------- Reporter: nomeata | Owner: simonmar Type: bug | Status: new Priority: normal | Milestone: Component: Runtime | Version: 7.9 System | Keywords: Resolution: | Operating System: Unknown/Multiple Differential Revisions: | Type of failure: None/Unknown Architecture: | Test Case: Unknown/Multiple | Blocking: Difficulty: Unknown | Blocked By: | Related Tickets: | -------------------------------------+------------------------------------- Comment (by nomeata): Right, but that’s a separate issue from getting reproducible test cases, which is what I care about at the moment. The factor of two has so far only turned up in a pathetic test case, hasn’t it? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9315#comment:7 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9315: Weird change in allocation numbers of T9203 -------------------------------------+------------------------------------- Reporter: nomeata | Owner: simonmar Type: bug | Status: new Priority: normal | Milestone: Component: Runtime | Version: 7.9 System | Keywords: Resolution: | Operating System: Unknown/Multiple Differential Revisions: | Type of failure: None/Unknown Architecture: | Test Case: Unknown/Multiple | Blocking: Difficulty: Unknown | Blocked By: | Related Tickets: | -------------------------------------+------------------------------------- Description changed by jrp: Old description:
While trying to improve our handle on allocation numbers, I’m stuck with the test case T9203. On some machines, it allocates roughly 95747304 bytes (this includes travis and my laptop), on others 42946176 bytes (e.g. on the machine where I monitor benchmark performance). All machines are 64 bit Linux machines.
The output of `-ddump-simpl`, `-ddump-stg` and `-ddump-cmm` is identical (up to variable names). Even `-ddump-asm` looks the same, besides some jump target reordering. The binary runs too small to get a heap profile.
I’m a bit stuck here: What can be the cause for these differences?
(BTW, if have an up-to-date GHC tree, can you report the number you get? Run `make -C testsuite VERBOSE=4 TEST=9203` for that.)
New description: While trying to improve our handle on allocation numbers, I’m stuck with the test case T9203. On some machines, it allocates roughly 95747304 bytes (this includes travis and my laptop), on others 42946176 bytes (e.g. on the machine where I monitor benchmark performance). All machines are 64 bit Linux machines. The output of `-ddump-simpl`, `-ddump-stg` and `-ddump-cmm` is identical (up to variable names). Even `-ddump-asm` looks the same, besides some jump target reordering. The binary runs too small to get a heap profile. I’m a bit stuck here: What can be the cause for these differences? (BTW, if have an up-to-date GHC tree, can you report the number you get? Run `make -C testsuite VERBOSE=4 TEST=T9203` for that.) -- -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9315#comment:8 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9315: Weird change in allocation numbers of T9203 -------------------------------------+------------------------------------- Reporter: nomeata | Owner: simonmar Type: bug | Status: new Priority: normal | Milestone: Component: Runtime | Version: 7.9 System | Keywords: Resolution: | Operating System: Unknown/Multiple Differential Revisions: | Type of failure: None/Unknown Architecture: | Test Case: Unknown/Multiple | Blocking: Difficulty: Unknown | Blocked By: | Related Tickets: | -------------------------------------+------------------------------------- Comment (by jrp): I get {{{ =====> T9203(normal) 2035 of 4040 [0, 0, 0] cd ./perf/should_run && '/Users/jrp/Projects/ghc/inplace/bin/ghc-stage2' -fforce-recomp -dcore-lint -dcmm-lint -dno-debug-output -no-user-package- db -rtsopts -fno-ghci-history -o T9203 T9203.hs -O2 >T9203.comp.stderr 2>&1 cd ./perf/should_run && ./T9203 +RTS -V0 -tT9203.stats --machine-readable -RTS T9203.run.stdout 2>T9203.run.stderr bytes allocated value is too low: (If this is because you have improved GHC, please update the test so that GHC doesn't regress again) Expected bytes allocated: 95747304 +/-5% Lower bound bytes allocated: 90959938 Upper bound bytes allocated: 100534670 Actual bytes allocated: 42946176 *** unexpected failure for T9203(normal) }}} with OS X, BuildFlavour = perf and InstallExtraPackages=YES. There are also a few other performance-related failures: {{{ Unexpected results from: TEST="T5611 T9203 T5435_dyn_asm haddock.Cabal T4801 T6048" OVERALL SUMMARY for test run started at Sun Jul 13 17:28:27 2014 BST 2:42:49 spent to go through 4040 total tests, which gave rise to 19417 test cases, of which 15712 were skipped 28 had missing libraries 3629 expected passes 42 expected failures 0 caused framework failures 0 unexpected passes 6 unexpected failures Unexpected failures: concurrent/should_run T5611 [bad stderr] (normal) perf/compiler T4801 [stat too good] (normal) perf/compiler T6048 [stat not good enough] (optasm) perf/haddock haddock.Cabal [stat not good enough] (normal) perf/should_run T9203 [stat too good] (normal) rts T5435_dyn_asm [bad stdout] (normal) }}} -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9315#comment:9 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9315: Weird change in allocation numbers of T9203 -------------------------------------+------------------------------------- Reporter: nomeata | Owner: simonmar Type: bug | Status: new Priority: normal | Milestone: Component: Runtime | Version: 7.9 System | Keywords: Resolution: | Operating System: Unknown/Multiple Differential Revisions: | Type of failure: None/Unknown Architecture: | Test Case: Unknown/Multiple | Blocking: Difficulty: Unknown | Blocked By: | Related Tickets: | -------------------------------------+------------------------------------- Comment (by simonmar): The perf tests don't guarantee to pass unless you use the validate settings, I think that's unavoidable. Maybe it's possible to just skip them unless the correct settings are being used. We don't use -O2 in validate because validate is supposed to be fast. But that doesn't render the perf tests useless, they're just testing -O and not -O2 (and individual tests can use -O2 if we want). Should we force -O2 for Data.Typeable.Internal? A tricky one. I generally don't like forcing optimisation settings for individual modules like this, because it prevents us from seeing the real effect of global optimisation settings. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9315#comment:10 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9315: Weird change in allocation numbers of T9203 -------------------------------------+------------------------------------- Reporter: nomeata | Owner: simonmar Type: bug | Status: new Priority: normal | Milestone: Component: Runtime | Version: 7.9 System | Keywords: Resolution: | Operating System: Unknown/Multiple Differential Revisions: | Type of failure: None/Unknown Architecture: | Test Case: Unknown/Multiple | Blocking: Difficulty: Unknown | Blocked By: | Related Tickets: | -------------------------------------+------------------------------------- Comment (by nomeata):
The perf tests don't guarantee to pass unless you use the validate settings, I think that's unavoidable.
True; but it seems that we would want to validate the settings used for release, aren’t we? Otherwise there might be a regression affecting our releases, and we won’t notice.... -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9315#comment:11 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9315: Weird change in allocation numbers of T9203 -------------------------------------+------------------------------------- Reporter: nomeata | Owner: simonmar Type: bug | Status: new Priority: normal | Milestone: Component: Runtime | Version: 7.9 System | Keywords: Resolution: | Operating System: Unknown/Multiple Differential Revisions: | Type of failure: None/Unknown Architecture: | Test Case: Unknown/Multiple | Blocking: Difficulty: Unknown | Blocked By: | Related Tickets: | -------------------------------------+------------------------------------- Comment (by simonmar): Replying to [comment:11 nomeata]:
The perf tests don't guarantee to pass unless you use the validate settings, I think that's unavoidable.
True; but it seems that we would want to validate the settings used for release, aren’t we? Otherwise there might be a regression affecting our releases, and we won’t notice....
There are *lots* of things that validate doesn't test for, and lots of regressions that it won't catch. It's a compromise between coverage and speed, deliberately. For instance, it doesn't test profiling at all - yet that's an important feature of our releases. We need to make the full testsuite runs more visible to developers, and focus on fixing regressions that they find. I suspect that adding -O2 to libraries and compiler (we would have to do both) would make validate slower, but I don't know how much; feel free to measure it. I do think validate is too slow already, though, and we should be finding ways to speed it up. The full testsuite run should use -O2 (I think I used to do that when I did the nightly builds), and the perf tests should have a separate set of figures for when libraries are built with -O2 where necessary. Probably only this one test will need to have different results. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9315#comment:12 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9315: Weird change in allocation numbers of T9203 -------------------------------------+------------------------------------- Reporter: nomeata | Owner: simonmar Type: bug | Status: new Priority: normal | Milestone: Component: Runtime | Version: 7.9 System | Keywords: Resolution: | Operating System: Unknown/Multiple Differential Revisions: | Type of failure: None/Unknown Architecture: | Test Case: Unknown/Multiple | Blocking: Difficulty: Unknown | Blocked By: | Related Tickets: | -------------------------------------+------------------------------------- Comment (by jrp): Running with the validate settings (ie, running validate) I get all the performance tests to pass other than: {{{ =====> T4801(normal) 1987 of 4045 [0, 0, 0] cd ./perf/compiler && '/Users/jrp/Projects/haskell/ghc/inplace/bin/ghc- stage2' -fforce-recomp -dno-debug-output -no-user-package-db -rtsopts -fno-ghci-history -c T4801.hs +RTS -V0 -tT4801.comp.stats --machine- readable -RTS -static >T4801.comp.stderr 2>&1 max_bytes_used value is too low: (If this is because you have improved GHC, please update the test so that GHC doesn't regress again) Expected T4801(normal) max_bytes_used: 25145320 +/-5% Lower bound T4801(normal) max_bytes_used: 23888054 Upper bound T4801(normal) max_bytes_used: 26402586 Actual T4801(normal) max_bytes_used: 23720120 Deviation T4801(normal) max_bytes_used: -5.7 % peak_megabytes_allocated value is too low: (If this is because you have improved GHC, please update the test so that GHC doesn't regress again) Expected T4801(normal) peak_megabytes_allocated: 70 +/-1% Lower bound T4801(normal) peak_megabytes_allocated: 69 Upper bound T4801(normal) peak_megabytes_allocated: 71 Actual T4801(normal) peak_megabytes_allocated: 68 Deviation T4801(normal) peak_megabytes_allocated: -2.9 % bytes allocated value is too low: (If this is because you have improved GHC, please update the test so that GHC doesn't regress again) Expected T4801(normal) bytes allocated: 464872776 +/-5% Lower bound T4801(normal) bytes allocated: 441629137 Upper bound T4801(normal) bytes allocated: 488116415 Actual T4801(normal) bytes allocated: 415645496 Deviation T4801(normal) bytes allocated: -10.6 % *** unexpected failure for T4801(normal) }}} This is on OS X with the master HEAD -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9315#comment:13 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9315: Weird change in allocation numbers of T9203 -------------------------------------+------------------------------------- Reporter: nomeata | Owner: simonmar Type: bug | Status: new Priority: normal | Milestone: Component: Runtime | Version: 7.9 System | Keywords: Resolution: | Operating System: Unknown/Multiple Differential Revisions: | Type of failure: None/Unknown Architecture: | Test Case: Unknown/Multiple | Blocking: Difficulty: Unknown | Blocked By: | Related Tickets: | -------------------------------------+------------------------------------- Comment (by nomeata):
I suspect that adding -O2 to libraries and compiler (we would have to do both) would make validate lower, but I don't know how much; feel free to measure it. I do think validate is too slow already, though, and we should be finding ways to speed it up.
I changed the builder for ghcspeed to set `Validating=YES`, and the time to run `make` was reduced by ¼. I didn’t measure the time for a whole validate run, I guess I’ll do that separately. But setting that changes more than just `GhcLibHcOpts`, e.g. it disables split objects (increasing binary sizes by a factor of 3.5). Allocation of nofib benchmarks wobble a bit, but significat changes are only on the increasing side (fish, +5%), so for nofib, the release settings seem to be better than the validate setting. (good!) The allocation number of #4801’s test case actually decreases by 4.7%. In total, 7 test cases now get *better* allocation numbers, while only `haddock.base` degrades by more than 1%. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9315#comment:14 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9315: Weird change in allocation numbers of T9203 -------------------------------------+------------------------------------- Reporter: nomeata | Owner: simonmar Type: bug | Status: new Priority: normal | Milestone: Component: Runtime | Version: 7.9 System | Keywords: Resolution: | Operating System: Unknown/Multiple Differential Revisions: | Type of failure: None/Unknown Architecture: | Test Case: Unknown/Multiple | Blocking: Difficulty: Unknown | Blocked By: | Related Tickets: | -------------------------------------+------------------------------------- Comment (by nomeata): I just timed two plain `./validate` runs. The first with the current settings (`-O`): {{{ $ cat /tmp/validate-O.txt Command exited with non-zero status 1 3573.46user 204.47system 36:13.68elapsed 173%CPU (0avgtext+0avgdata 1534340maxresident)k 76944inputs+17408472outputs (255major+249188896minor)pagefaults 0swaps }}} The second one with `GhcLibHcOpts += -O2 -dcore-lint` in `k/validate- settings.mk`: {{{ Command exited with non-zero status 1 3675.64user 209.70system 37:10.83elapsed 174%CPU (0avgtext+0avgdata 1535452maxresident)k 1048inputs+18010496outputs (18major+249644503minor)pagefaults 0swaps }}} Yes, it does take longer. But hardly significantly. Hence I’m in favour of setting `GhcLibHcOpts += -O2` for validate runs, bringing our validation closer to what we release, and also making a `make -C testsuite` in non-validate non-customized tree to yield the same performance numbers (at least for non-compiler-performance-tests). -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9315#comment:15 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9315: Weird change in allocation numbers of T9203 -------------------------------------+------------------------------------- Reporter: nomeata | Owner: simonmar Type: bug | Status: new Priority: normal | Milestone: Component: Runtime | Version: 7.9 System | Keywords: Resolution: | Operating System: Unknown/Multiple Differential Revisions: | Type of failure: None/Unknown Architecture: | Test Case: Unknown/Multiple | Blocking: Difficulty: Unknown | Blocked By: | Related Tickets: | -------------------------------------+------------------------------------- Comment (by jrp): It may also help to run validate with a default -j8 or some such. The tests and compilations are all more or less single core and most developers will have at least 8 cores. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9315#comment:16 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9315: Weird change in allocation numbers of T9203 -------------------------------------+------------------------------------- Reporter: nomeata | Owner: simonmar Type: bug | Status: new Priority: normal | Milestone: Component: Runtime | Version: 7.9 System | Keywords: Resolution: | Operating System: Unknown/Multiple Differential Revisions: | Type of failure: None/Unknown Architecture: | Test Case: Unknown/Multiple | Blocking: Difficulty: Unknown | Blocked By: | Related Tickets: | -------------------------------------+------------------------------------- Comment (by nomeata): The current default is `-j2`, I guess it could be updated. But this is digressing from what this ticket is about (making `validate` settings and default settings agree more, if possible). -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9315#comment:17 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9315: Weird change in allocation numbers of T9203 -------------------------------------+------------------------------------- Reporter: nomeata | Owner: simonmar Type: bug | Status: new Priority: normal | Milestone: Component: Runtime | Version: 7.9 System | Keywords: Resolution: | Operating System: Unknown/Multiple Differential Revisions: | Type of failure: None/Unknown Architecture: | Test Case: Unknown/Multiple | Blocking: Difficulty: Unknown | Blocked By: | Related Tickets: | -------------------------------------+------------------------------------- Comment (by jrp): Understood, but I thought that there was some hesitation about using more expensive validate settings because of the running time. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9315#comment:18 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9315: Weird change in allocation numbers of T9203 -------------------------------------+------------------------------------- Reporter: nomeata | Owner: simonmar Type: bug | Status: new Priority: normal | Milestone: Component: Runtime | Version: 7.9 System | Keywords: Resolution: | Operating System: Unknown/Multiple Differential Revisions: | Type of failure: None/Unknown Architecture: | Test Case: Unknown/Multiple | Blocking: Difficulty: Unknown | Blocked By: | Related Tickets: | -------------------------------------+------------------------------------- Comment (by nomeata): Of course improving validate time is desirable, I put your suggestion at #9341. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9315#comment:19 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9315: Weird change in allocation numbers of T9203 -------------------------------------+------------------------------------- Reporter: nomeata | Owner: simonmar Type: bug | Status: new Priority: normal | Milestone: Component: Runtime | Version: 7.9 System | Keywords: Resolution: | Operating System: Unknown/Multiple Differential Revisions: | Type of failure: None/Unknown Architecture: | Test Case: Unknown/Multiple | Blocking: Difficulty: Unknown | Blocked By: | Related Tickets: | -------------------------------------+------------------------------------- Comment (by simonmar): If we change validate to use GhcLibHcOpts=-O2 and update the tests to match, then * validate takes a minute longer * the tests now fail with a `quick` build (which we recommend for development, and I generally use) * the compiler is still not built with -O2 So I'm not convinced this is a win. BTW, on the question of -j setting, the usual way is to say `CPUS=8 ./validate` if you have an 8-core machine. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9315#comment:20 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9315: Weird change in allocation numbers of T9203 -------------------------------------+------------------------------------- Reporter: nomeata | Owner: simonmar Type: bug | Status: new Priority: normal | Milestone: Component: Runtime | Version: 7.9 System | Keywords: Resolution: | Operating System: Unknown/Multiple Differential Revisions: | Type of failure: None/Unknown Architecture: | Test Case: Unknown/Multiple | Blocking: Difficulty: Unknown | Blocked By: | Related Tickets: | -------------------------------------+------------------------------------- Comment (by nomeata):
validate takes a minute longer
Out of more than half an hour. It’s not like validate is fast enough so that people can actually sit and watch it. And no matter whether its 36 or 37 minutes, I guess most will stop running validate themselves and rather leave that to Phabricator or other CI stuff, precisely because validate takes so long and its convenient to let someone else do it for you.
the tests now fail with a quick build (which we recommend for development, and I generally use)
True. But development trees are inherently unusable, as people might have all kinds of settings there (`-O`, `-DDEBUG` etc.) that invalidate performance numbers. I never paid attention to them for that reason, unless I was doing a clean validate run in a separate checkout. Isn’t running validate in the development tree discouraged anyways?
the compiler is still not built with -O2
True, but one step at a time. (I’ll time this change later, probably over lunch.) You say `quick` is recommended; I thought `devel2` is recommended and that’s what I use and recommend. And there will be much more variation in people’s development trees settings, it’s hard to aim for them. What we can aim for are our two well-defined settings: The default (i.e. what we release) and validate. I’m arguing that we should unify these two as much as possible. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9315#comment:21 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9315: Weird change in allocation numbers of T9203 -------------------------------------+------------------------------------- Reporter: nomeata | Owner: simonmar Type: bug | Status: new Priority: normal | Milestone: Component: Runtime | Version: 7.9 System | Keywords: Resolution: | Operating System: Unknown/Multiple Differential Revisions: | Type of failure: None/Unknown Architecture: | Test Case: Unknown/Multiple | Blocking: Difficulty: Unknown | Blocked By: | Related Tickets: | -------------------------------------+------------------------------------- Comment (by nomeata):
the compiler is still not built with -O2 True, but one step at a time. (I’ll time this change later, probably over lunch.)
{{{ 4311.12user 223.56system 42:58.26elapsed 175%CPU (0avgtext+0avgdata 1557124maxresident)k 256824inputs+18215896outputs (472major+251942192minor)pagefaults 0swaps }}} Another +5 minutes, so I won’t argue for changing that at this point. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9315#comment:22 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

validate takes a minute longer
Out of more than half an hour. It’s not like validate is fast enough so
#9315: Weird change in allocation numbers of T9203 -------------------------------------+------------------------------------- Reporter: nomeata | Owner: simonmar Type: bug | Status: new Priority: normal | Milestone: Component: Runtime | Version: 7.9 System | Keywords: Resolution: | Operating System: Unknown/Multiple Differential Revisions: | Type of failure: None/Unknown Architecture: | Test Case: Unknown/Multiple | Blocking: Difficulty: Unknown | Blocked By: | Related Tickets: | -------------------------------------+------------------------------------- Comment (by simonmar): Replying to [comment:21 nomeata]: that people can actually sit and watch it. And no matter whether its 36 or 37 minutes, I guess most will stop running validate themselves and rather leave that to Phabricator or other CI stuff, precisely because validate takes so long and its convenient to let someone else do it for you. But then where do you stop? By your arguments we should be making validate run the full testsuite, which takes 3 hours or so (IIRC). When we first made validate it took 15 mins, now it takes more than double that, and we arrived here not by huge leaps, but by adding one minute at a time. So every time I see this happening, I like to question (vigorously!) whether that extra minute is paying for itself. In this case I'm not at all convinced. The point of validate is to catch bugs that will block other developers if they get committed: build breakage or serious bugs, whereas everything else we catch with the nightly builds, If there's a feeling that we should change this, then let's take that discussion to the mailing list. It's not at all clear-cut (which is why we're having this discussion!). What I'd like to see instead is one of the following: * Skip the test unless the libraries were built with -O2 * Skip the test *if* the libraries were built with -O2 * Have different results according to the optimisation levels All of which fix the perceived problem without introducing the regression in validate time. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9315#comment:23 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9315: Weird change in allocation numbers of T9203 -------------------------------------+------------------------------------- Reporter: nomeata | Owner: simonmar Type: bug | Status: new Priority: normal | Milestone: Component: Runtime | Version: 7.9 System | Keywords: Resolution: | Operating System: Unknown/Multiple Differential Revisions: | Type of failure: None/Unknown Architecture: | Test Case: Unknown/Multiple | Blocking: Difficulty: Unknown | Blocked By: | Related Tickets: | -------------------------------------+------------------------------------- Comment (by nomeata): I tried to bring it up on the mailing list, but nobody cared...
What I'd like to see instead is one of the following:
* Skip the test unless the libraries were built with -O2 * Skip the test *if* the libraries were built with -O2 * Have different results according to the optimisation levels
All of which fix the perceived problem without introducing the regression in validate time.
While the third option is the most complete, I don’t think it is maintainable. Given the aim of validate is to prevent breakage, and performance regressions are not breakage, I think it is ok to skip the tests there (even makes it faster!), and run the tests only when we are doing a release build. So how about this: * The expected numbers are adjusted to reflect the release settings. * `validate` in modes fast and normal pass `SKIP_PERF_TESTS` to the test suite * `validate` in slow mode uses the release settings (`-O2` to libraries and stage2), and does not `SKIP_PERF_TESTS` * Optionally: The test suite, when invoked manually, ignores performance tests if the flags are not as expected¹ Note that the test suite already skips compiler performance tests if `-DDEBUG` is set, so this has precedent. ¹ Is that possible in a reliable and not-too-complicate way, and without duplicating the “expected flags”? Maybe a warning “You have a custom `mk/build.mk`; failures in performance tests might be irrelevant.” would be enough -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9315#comment:24 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9315: Weird change in allocation numbers of T9203 -------------------------------------+------------------------------------- Reporter: nomeata | Owner: simonmar Type: bug | Status: new Priority: normal | Milestone: Component: Runtime System | Version: 7.9 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: #8096 | Differential Revisions: -------------------------------------+------------------------------------- Changes (by thomie): * related: => #8096 Comment: See also ezyang's proposal for a fudge-factor for performance tests run on non-validate builds: #8096. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9315#comment:25 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler
participants (1)
-
GHC