
When I run `sh validate -legacy`, towards the end I see stuff like this:

Performance Metrics (test environment: local):
Conversions(normal) runtime/bytes allocated 107696.000
DeriveNull(normal) runtime/bytes allocated 112050960.000
InlineArrayAlloc(normal) runtime/bytes allocated 1600041088.000
InlineByteArrayAlloc(normal) runtime/bytes allocated 1440041088.000
InlineCloneArrayAlloc(normal) runtime/bytes allocated 1600041248.000
ManyAlternatives(normal) compile_time/bytes allocated 840985728.000
ManyConstructors(normal) compile_time/bytes allocated 4540766560.000
MethSharing(normal) runtime/peak_megabytes_allocated 2.000
MethSharing(normal) runtime/bytes allocated 480098136.000
MultiLayerModules(normal) compile_time/bytes allocated 5856970504.000
...

It is intermingled with other apparently unrelated output.

What should I conclude from this? Is it good or bad? By what amount have these figures changed, and relative to what?

How can I run a single perf test?

Thanks

Simon

Simon Peyton Jones via ghc-devs writes:
When I run `sh validate -legacy`, towards the end I see stuff like this
Performance Metrics (test environment: local):
Conversions(normal) runtime/bytes allocated 107696.000
DeriveNull(normal) runtime/bytes allocated 112050960.000
InlineArrayAlloc(normal) runtime/bytes allocated 1600041088.000
InlineByteArrayAlloc(normal) runtime/bytes allocated 1440041088.000
InlineCloneArrayAlloc(normal) runtime/bytes allocated 1600041248.000
ManyAlternatives(normal) compile_time/bytes allocated 840985728.000
ManyConstructors(normal) compile_time/bytes allocated 4540766560.000
MethSharing(normal) runtime/peak_megabytes_allocated 2.000
MethSharing(normal) runtime/bytes allocated 480098136.000
MultiLayerModules(normal) compile_time/bytes allocated 5856970504.000
... It is intermingled with other apparently unrelated output.
Hmm, it really should be a distinct block of output. Are you saying that you are seeing lines of unrelated output interspersed in the performance metrics table?
What should I conclude from this? Is it good or bad? By what amount have these figures changed, and relative to what?
Below the performance metrics table you should see a blurb of text like the following:

  Missing Baseline Metrics
  these metrics trivially pass because a baseline (expected value) cannot be
  recovered from previous git commits. This may be due to HEAD having new
  tests or having expected changes, the presence of expected changes since
  the last run of the tests, and/or the latest test run being too old.
    MultiLayerModules
    ...
  If the tests exist on the previous commit (and are configured to run with
  the same ways), then check out that commit and run the tests to generate
  the missing metrics. Alternatively, a baseline may be recovered from ci
  results once fetched:
    git fetch https://gitlab.haskell.org/ghc/ghc-performance-notes.git refs/notes/perf:refs/notes/ci/perf

The suggested command will fetch up-to-date performance metrics from the metrics repository (populated by CI). If you then run the testsuite again you will see output comparing each test's output to the baseline from CI. For instance,

  Performance Metrics (test environment: local):
  MultiLayerModules(normal) compile_time/bytes allocated 5710260920.000 (baseline @ HEAD~1) 5726848340.000 [unchanged 0.3%]

The "unchanged" here refers to the fact that the +0.3% observed change is within the indicated acceptance window of the test.
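Concretely, the two steps look roughly like this (a sketch; the second command is the `validate` invocation from the original message and may differ for your setup):

  # fetch the performance baselines recorded by CI into local git notes
  git fetch https://gitlab.haskell.org/ghc/ghc-performance-notes.git \
      refs/notes/perf:refs/notes/ci/perf

  # re-run the testsuite; perf tests are now compared against the fetched baselines
  sh validate -legacy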
How can I run a single perf test?
Perf tests are treated like any other test. A single test can be run under Hadrian with the following:

  ./hadrian/build.cabal.sh test --build-root=_validatebuild \
      --flavour=Validate --only=MultiLayerModules

Cheers,

- Ben
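For the make-based build system that `validate -legacy` drives, a rough equivalent is sketched below; the test directory is an assumption about where MultiLayerModules lives, and a completed build is presumed:

  # hypothetical location of the MultiLayerModules perf test
  cd testsuite/tests/perf/compiler
  # run only this test via the make-based testsuite driver
  make TEST=MultiLayerModules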

| Hmm, it really should be a distinct block of output. Are you saying
| that you are seeing lines of unrelated output interspersed in the
| performance metrics table?
Yes, just so. I attach the tail of the validate run.
| The suggested command will fetch up-to-date performance metrics from
| the metrics repository (populated by CI). If you then run the testsuite
| again you will see output comparing each test's output to the baseline
| from CI. For instance,
Aha, I'll try that. I missed those words. But actually the words don't say "run validate again to see the differences". They say "a baseline may be recovered from ci results once fetched" which is a deeply cryptic utterance.
Also, does it make sense to spit out all these numbers if they are useless? Better, perhaps, in the overall SUMMARY (at the end of validate) to:
- Have a section for perf tests: along with "Unexpected passes" and "Unexpected failure" we can have "Unexpected perf failures".
- In that section, if there is no baseline data, put the words that explain how to get it.
Simon

Simon Peyton Jones via ghc-devs writes:
| Hmm, it really should be a distinct block of output. Are you saying
| that you are seeing lines of unrelated output interspersed in the
| performance metrics table?
Yes, just so. I attach the tail of the validate run.
Ahh, I see what is happening here. This is due to interleaving of stdout (where the performance metrics are printed) and stderr (where exceptions are printed). This is indeed quite unfortunate. I'm not entirely sure
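In the meantime, one way to keep the metrics table contiguous in a local run is to send the two streams to different places, for example (a sketch; the log file name is arbitrary):

  # exceptions on stderr go to a file, so the metrics table on stdout is not interleaved
  sh validate -legacy 2> validate-stderr.log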
| The suggested command will fetch up-to-date performance metrics from
| the metrics repository (populated by CI). If you then run the testsuite
| again you will see output comparing each test's output to the baseline
| from CI. For instance,
Aha, I'll try that. I missed those words. But actually the words don't say "run validate again to see the differences". They say "a baseline may be recovered from ci results once fetched" which is a deeply cryptic utterance.
Also, does it make sense to spit out all these numbers if they are useless? Better, perhaps, in the overall SUMMARY (at the end of validate) to:
We started emitting these metrics at the end of the log because they can be quite handy when diagnosing performance changes in CI. However we have recently started dumping the metrics to a file which is uploaded as an artifact from the CI job, so perhaps this is no longer necessary.
- Have a section for perf tests: along with "Unexpected passes" and "Unexpected failure" we can have "Unexpected perf failures".
- In that section, if there is no baseline data, put the words that explain how to get it.
A fair suggestion. I will try to get to this soon.

Cheers,

- Ben