
Ben, David

I'm still baffled by how to reliably get GHC perf metrics on my local machine. The wiki page https://gitlab.haskell.org/ghc/ghc/wikis/building/running-tests/performance-... helps, but not enough!

* There are two things going on:
    * CI perf measurements
    * Local machine perf measurements
  I think that they are somehow handled differently (why?) but they are all muddled up on the wiki page.
* My goal is this:
    * Start with a master commit, say from Dec 2019.
    * Implement some change, on a branch.
    * sh validate --legacy (or something else if you like)
    * Look at perf regressions.
* I believe I have first to utter the incantation
    $ git fetch https://gitlab.haskell.org/ghc/ghc-performance-notes.git refs/notes/perf:refs/notes/ci/perf
* But then:
    * How do I ensure that the baseline perf numbers I get relate to the master commit I started from, back in Dec 2019? I don't want numbers from Jan 2020.
    * If I rebase my branch on top of HEAD, say, how do I update the perf baseline numbers to be for HEAD?
    * Generally, how can I tell the commit to which the baseline numbers relate?
* Also, in my tree I have a series of incremental changes; I want to see if any of them have perf regressions. How do I do that?

Thanks

Simon

Hi Simon,
* There are two things going on:
1. CI perf measurements
2. Local machine perf measurements
I think that they are somehow handled differently (why?) but they are all muddled up on the wiki page.
They are handled differently because we do not want to compare local metrics with CI metrics. The exception is when local metrics don't exist, then we fall back to CI metrics as a baseline (see How baseline metrics are calculated https://gitlab.haskell.org/ghc/ghc/wikis/building/running-tests/performance-...).
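The fallback rule described above can be illustrated with a small, simplified model. This is only a sketch: the in-memory `Metrics` dictionary, the function name `pick_baseline`, and the `"ci"` environment label are hypothetical, and the real test driver's search order and depth limit may differ. Only the "prefer local, else fall back to CI" idea is taken from the explanation above.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical in-memory model: commit -> {test_env -> metric value}.
Metrics = Dict[str, Dict[str, float]]


def pick_baseline(ancestors: List[str], metrics: Metrics,
                  ci_env: str = "ci") -> Optional[Tuple[str, str, float]]:
    """Return (commit, env, value) for the nearest ancestor that has a metric.

    `ancestors` is ordered nearest-first (HEAD~1, HEAD~2, ...).
    On any given commit a local metric wins over a CI metric.
    `ci_env` is a placeholder label, not a real CI environment name.
    """
    for commit in ancestors:
        envs = metrics.get(commit, {})
        if "local" in envs:
            return (commit, "local", envs["local"])
        if ci_env in envs:
            return (commit, ci_env, envs[ci_env])
    return None  # no baseline found in the searched range
```

In this model a commit that only has CI numbers still serves as a baseline, which matches the exception described above; local and CI numbers are never mixed for the same commit.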
* My goal is this:
o Start with a master commit, say from Dec 2019.
o Implement some change, on a branch.
o sh validate --legacy (or something else if you like)
o Look at perf regressions.
Getting to the *raw data* should be easy:

1. Check out the <baseline> commit.
2. Use `git status` to double check git sees a clean working tree.
3. Run the performance tests.
4. Check out your <target> branch.
5. Use `git status` to double check git sees a clean working tree (else commit any changes).
6. Run the performance tests.
7. Compare metrics (filtering for `local` metrics and outputting a chart):

   python3 testsuite/driver/perf_notes.py --chart chart.html --test-env local <baseline> <target>

   See `python3 testsuite/driver/perf_notes.py --help` for more filtering options.

This doesn't detect regressions automatically; it only shows you the raw data. Ideally we'd add an option to the test runner to let you specify a baseline commit manually. I suspect that would be close to what you're looking for.
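As a rough sketch, the seven steps can be written down as a command plan. The helper name `comparison_plan` is hypothetical, and the command that actually runs the performance tests is deliberately left as a caller-supplied placeholder because it depends on your build system. The function only builds the commands; it does not execute them.

```python
from typing import List


def comparison_plan(baseline: str, target: str,
                    run_perf_tests: List[str]) -> List[List[str]]:
    """Build the command sequence for steps 1-7 without running anything."""
    plan: List[List[str]] = []
    for rev in (baseline, target):
        plan.append(["git", "checkout", rev])
        # Empty output from --porcelain means git sees a clean working tree.
        plan.append(["git", "status", "--porcelain"])
        plan.append(list(run_perf_tests))
    # Step 7: chart the local metrics of the two commits side by side.
    plan.append(["python3", "testsuite/driver/perf_notes.py",
                 "--chart", "chart.html", "--test-env", "local",
                 baseline, target])
    return plan


# PLACEHOLDER arguments: substitute real commits and your perf-test command.
for argv in comparison_plan("<baseline>", "<target>", ["<run-perf-tests>"]):
    print(" ".join(argv))
```

Printing the plan first makes it easy to check each step by hand before automating it.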
* I believe I have first to utter the incantation
$ git fetch https://gitlab.haskell.org/ghc/ghc-performance-notes.git refs/notes/perf:refs/notes/ci/perf
Yes, this fetches the latest CI metrics into your git notes.
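To check whether a particular commit actually received CI metrics from that fetch, one option is to list the notes stored under the local ref named in the refspec. The following is a minimal sketch: the helper names are hypothetical, and it relies only on `git notes list` printing one `<note-object> <annotated-object>` pair per line.

```python
import subprocess
from typing import Set

# Local ref name taken from the right-hand side of the fetch refspec above.
CI_NOTES_REF = "refs/notes/ci/perf"


def annotated_commits(notes_list_output: str) -> Set[str]:
    """Parse `git notes list` output into the set of annotated object ids."""
    commits = set()
    for line in notes_list_output.splitlines():
        parts = line.split()
        if len(parts) == 2:  # skip blank or malformed lines
            commits.add(parts[1])
    return commits


def has_ci_metrics(commit_sha: str, repo: str = ".") -> bool:
    """True if the full SHA `commit_sha` has a note under the CI perf ref."""
    out = subprocess.run(
        ["git", "-C", repo, "notes", f"--ref={CI_NOTES_REF}", "list"],
        check=True, capture_output=True, text=True).stdout
    return commit_sha in annotated_commits(out)
```

`has_ci_metrics` expects a full commit SHA (for example the output of `git rev-parse <commit>`), since that is what `git notes list` prints.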
* But then:
o How do I ensure that the baseline perf numbers I get relate to the master commit I started from, back in Dec 2019? I don’t want numbers from Jan 2020.
see above.
o If I rebase my branch on top of HEAD, say, how do I update the perf baseline numbers to be for HEAD?
The test runner should use HEAD's metrics automatically (see How baseline metrics are calculated https://gitlab.haskell.org/ghc/ghc/wikis/building/running-tests/performance-...), though you will need to fetch CI metrics or run the perf tests locally on HEAD to get the relevant metrics.
o Generally, how can I tell the commit to which the baseline numbers relate?
The test runner will output (per test) which baseline commit is used e.g. "... from local baseline @ HEAD~2" says the baseline was a local run from 2 commits ago.
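If you want to collect those baseline annotations from a test log, a small regular expression may be enough. This sketch is inferred only from the single example quoted above; the exact wording of the test runner's output is an assumption and may vary between versions, and the name `parse_baseline` is hypothetical.

```python
import re
from typing import Optional, Tuple

# Pattern inferred from the one example "... from local baseline @ HEAD~2".
BASELINE_RE = re.compile(r"from (\S+) baseline @ (\S+)")


def parse_baseline(line: str) -> Optional[Tuple[str, str]]:
    """Return (source, commit_ref), e.g. ("local", "HEAD~2"), or None."""
    m = BASELINE_RE.search(line)
    return (m.group(1), m.group(2)) if m else None
```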
* Also, in my tree I have a series of incremental changes; I want to see if any of them have perf regressions. How do I do that?
You can run the perf tests on each commit *in commit order*, and the previous commit will always be used as the baseline. You can also then chart the results:

    python3 testsuite/driver/perf_notes.py --chart chart.html --test-env local <oldestCommit>..<newestCommit>

Sorry if this is a bit suboptimal, but I hope that helps.

- David E

--
David Eichmann, Haskell Consultant
Well-Typed LLP, http://www.well-typed.com
Registered in England & Wales, OC335890
118 Wymering Mansions, Wymering Road, London W9 2NF, England
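The per-commit procedure described in the reply above could be sketched as follows. The helper names are hypothetical, and the perf-test command is again a caller-supplied placeholder; only `git rev-list --reverse` and `git checkout` are assumed.

```python
import subprocess
from typing import List


def commits_oldest_first(oldest: str, newest: str, repo: str = ".") -> List[str]:
    """List the commits in oldest..newest, oldest first (excludes `oldest`)."""
    out = subprocess.run(
        ["git", "-C", repo, "rev-list", "--reverse", f"{oldest}..{newest}"],
        check=True, capture_output=True, text=True).stdout
    return out.split()


def per_commit_plan(commits: List[str],
                    run_perf_tests: List[str]) -> List[List[str]]:
    """Build (checkout, run perf tests) command pairs in commit order."""
    plan: List[List[str]] = []
    for sha in commits:
        plan.append(["git", "checkout", sha])
        plan.append(list(run_perf_tests))
    return plan
```

Note that a range `A..B` excludes `A` itself, so under this sketch the first commit in the range only has a local baseline if the perf tests were also run on `A`.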

David
Thanks. Concerning this:
1. Check out the <baseline> commit.
2. Use `git status` to double check git sees a clean working tree.
3. Run the performance tests.
4. Check out your <target> branch.
5. Use `git status` to double check git sees a clean working tree (else commit any changes)
6. Run the performance tests.
7. Compare metrics (filtering for `local` metrics and outputting a chart):
python3 testsuite/driver/perf_notes.py --chart chart.html --test-env local <baseline> <target>
I believe that
* This compares two local builds
* It does not require fetching CI perf data; in fact it is 100% independent of the CI system
* It does require two separate build trees (that is fine)
Is that right? If so, two questions:
* In that Python command line (step 7) is "<baseline>" the path to the root of the baseline tree, or to some file within that tree?
* Is this process (and what it does) written up on some wiki page somewhere? Where? Rather than replying to me individually, it'd be better to use this conversation to produce better guidance for everyone.
Thanks
Simon
Participants (2): David Eichmann, Simon Peyton Jones