[GHC] #12758: Bring sanity to our performance testsuite

#12758: Bring sanity to our performance testsuite -------------------------------------+------------------------------------- Reporter: bgamari | Owner: Type: task | Status: new Priority: normal | Milestone: 8.2.1 Component: Test Suite | Version: 8.0.1 Keywords: | Operating System: Unknown/Multiple Architecture: | Type of failure: None/Unknown Unknown/Multiple | Test Case: | Blocked By: Blocking: | Related Tickets: Differential Rev(s): | Wiki Page: -------------------------------------+------------------------------------- The GHC testsuite's performance tests tends to be a constant source of busy-work as it requires that contributors manually bump performance numbers and propagate these changes across platforms. Moreover, the testsuite is poor at noticing performance regressions due to false positive failures (due to spurious environmental differences) and false negatives (due to the rather generous acceptance windows that many tests have). Joachim and I briefly discussed this at Hac Phi and came up with the following proposal: * We rip expected performance numbers out of the `.T` files * -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12758 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12758: Bring sanity to our performance testsuite -------------------------------------+------------------------------------- Reporter: bgamari | Owner: Type: task | Status: new Priority: normal | Milestone: 8.2.1 Component: Test Suite | Version: 8.0.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Description changed by bgamari: @@ -13,1 +13,5 @@ - * + * We replace the existing `compiler_stats_num_field` test modifier with + an implementation that dumps performance metrics to a CSV file + * Introduce a tool to compare this CSV file to metrics associated with a + ancestor commit via `git notes` + * The tool would also be able to add notes New description: The GHC testsuite's performance tests tends to be a constant source of busy-work as it requires that contributors manually bump performance numbers and propagate these changes across platforms. Moreover, the testsuite is poor at noticing performance regressions due to false positive failures (due to spurious environmental differences) and false negatives (due to the rather generous acceptance windows that many tests have). Joachim and I briefly discussed this at Hac Phi and came up with the following proposal: * We rip expected performance numbers out of the `.T` files * We replace the existing `compiler_stats_num_field` test modifier with an implementation that dumps performance metrics to a CSV file * Introduce a tool to compare this CSV file to metrics associated with a ancestor commit via `git notes` * The tool would also be able to add notes -- -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12758#comment:1 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12758: Bring sanity to our performance testsuite -------------------------------------+------------------------------------- Reporter: bgamari | Owner: Type: task | Status: new Priority: normal | Milestone: 8.2.1 Component: Test Suite | Version: 8.0.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by bgamari): * cc: nomeata (added) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12758#comment:2 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12758: Bring sanity to our performance testsuite -------------------------------------+------------------------------------- Reporter: bgamari | Owner: Type: task | Status: new Priority: normal | Milestone: 8.2.1 Component: Test Suite | Version: 8.0.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): I'm all for eliminating busy-work, but my experience is that we have too few, not too many, performance tests. Mostly, when a perf-test fails, there really is something to investigate, annoying though I often find it. The exception is `max_bytes_used` which wobbles around unpredicatably. I'm not sure how your new approach would make things better. Would you like to elaborate? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12758#comment:3 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12758: Bring sanity to our performance testsuite -------------------------------------+------------------------------------- Reporter: bgamari | Owner: Type: task | Status: new Priority: normal | Milestone: 8.2.1 Component: Test Suite | Version: 8.0.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by michalt): * cc: michalt (added) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12758#comment:4 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12758: Bring sanity to our performance testsuite
-------------------------------------+-------------------------------------
Reporter: bgamari | Owner:
Type: task | Status: new
Priority: normal | Milestone: 8.2.1
Component: Test Suite | Version: 8.0.1
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture:
| Unknown/Multiple
Type of failure: None/Unknown | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by Ben Gamari

#12758: Bring sanity to our performance testsuite
-------------------------------------+-------------------------------------
Reporter: bgamari | Owner:
Type: task | Status: new
Priority: normal | Milestone: 8.2.1
Component: Test Suite | Version: 8.0.1
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture:
| Unknown/Multiple
Type of failure: None/Unknown | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by Ben Gamari

#12758: Bring sanity to our performance testsuite -------------------------------------+------------------------------------- Reporter: bgamari | Owner: (none) Type: task | Status: new Priority: normal | Milestone: 8.4.1 Component: Test Suite | Version: 8.0.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by bgamari): * milestone: 8.2.1 => 8.4.1 Comment: There is general agreement that we want to do something along these lines. However, it will probably only happen for 8.4. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12758#comment:7 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12758: Bring sanity to our performance testsuite
-------------------------------------+-------------------------------------
Reporter: bgamari | Owner: (none)
Type: task | Status: new
Priority: normal | Milestone: 8.4.1
Component: Test Suite | Version: 8.0.1
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture:
| Unknown/Multiple
Type of failure: None/Unknown | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by Ben Gamari

#12758: Bring sanity to our performance testsuite -------------------------------------+------------------------------------- Reporter: bgamari | Owner: (none) Type: task | Status: new Priority: normal | Milestone: 8.6.1 Component: Test Suite | Version: 8.0.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by bgamari): A quick update: Last summer Jared Weakly implemented the proposal of this ticket. The result is Phab:D3758. At the time the plan was to switch to Jenkins in the fall (see #13716) and incorporate his work at that point since it requires a bit of integration with CI. However, shortly before ICFP we started a discussion which culminated in the decision to instead move CI to CircleCI. This process is still on-going (see wiki:ContinuousIntegration) but is nearly an end, at which point we will be able to move ahead with Phab:D3758. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12758#comment:10 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12758: Bring sanity to our performance testsuite -------------------------------------+------------------------------------- Reporter: bgamari | Owner: (none) Type: task | Status: new Priority: high | Milestone: 8.6.1 Component: Test Suite | Version: 8.0.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by bgamari): * priority: normal => high -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12758#comment:11 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12758: Bring sanity to our performance testsuite
-------------------------------------+-------------------------------------
Reporter: bgamari | Owner: (none)
Type: task | Status: new
Priority: high | Milestone: 8.6.1
Component: Test Suite | Version: 8.0.1
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture:
| Unknown/Multiple
Type of failure: None/Unknown | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by Ben Gamari

#12758: Bring sanity to our performance testsuite -------------------------------------+------------------------------------- Reporter: bgamari | Owner: (none) Type: task | Status: new Priority: high | Milestone: 8.6.1 Component: Test Suite | Version: 8.0.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D3758 Wiki Page: | -------------------------------------+------------------------------------- Changes (by bgamari): * differential: => Phab:D3758 -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12758#comment:13 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12758: Bring sanity to our performance testsuite -------------------------------------+------------------------------------- Reporter: bgamari | Owner: (none) Type: task | Status: new Priority: high | Milestone: 8.8.1 Component: Test Suite | Version: 8.0.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D3758 Wiki Page: | -------------------------------------+------------------------------------- Comment (by domenkozar): Some insight how I've implemented a similar setting for Snabb (open source high performance networking drivers). Using Nix, we build a matrix of Snabb (multiple git branches, different versions of software dependencies like Qemu, etc) and plot all of those graphs. There is no smart automation to abort of fail for contributions, but if you run these tests before releases - and even as a cronjob every X days, you can spot regressions easily and then bisect the commit. Nix tags what derivations are benchmarks and those run on specialized hardware 1 at the time. Example report: https://hydra.snabb.co/build/3641949/download/2/report.html -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12758#comment:15 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12758: Bring sanity to our performance testsuite -------------------------------------+------------------------------------- Reporter: bgamari | Owner: (none) Type: task | Status: new Priority: high | Milestone: 8.8.1 Component: Test Suite | Version: 8.0.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D3758 Wiki Page: | -------------------------------------+------------------------------------- Comment (by davide): In the case of expected changes in performance metrics, we still need the tests to pass. This requires some **convenient** mechanism by which the author can explicitly inform the CI of that change. The current plan is for the author to indicate such changes in the commit message in the following form: {{{ Metric (Increase | Decrease) ['metric' | \['metrics',..\]] [\((test_env|way)='abc',...\)]: TestName01, TestName02, ... }}} e.g. {{{ # All metrics decreased for test T1234. Metric Decrease: T1234 # max_bytes_used decreased for test T1234. Metric Decrease 'max_bytes_used': T1234 # bytes allocated and peak_megabytes_allocated increased for tests T005 and T006. Metric Increase ['bytes allocated', 'peak_megabytes_allocated']: T005, T00 # All metrics decreased for test T200 in the x86_linux environment Metric Decrease (test_env = 'x86_linux'): T200 }}} -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12758#comment:16 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12758: Bring sanity to our performance testsuite -------------------------------------+------------------------------------- Reporter: bgamari | Owner: (none) Type: task | Status: new Priority: high | Milestone: 8.8.1 Component: Test Suite | Version: 8.0.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D3758, Wiki Page: | Phab:D5059 -------------------------------------+------------------------------------- Changes (by osa1): * differential: Phab:D3758 => Phab:D3758, Phab:D5059 -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12758#comment:17 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12758: Bring sanity to our performance testsuite -------------------------------------+------------------------------------- Reporter: bgamari | Owner: (none) Type: task | Status: new Priority: high | Milestone: 8.8.1 Component: Test Suite | Version: 8.0.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D3758, Wiki Page: | Phab:D5059 -------------------------------------+------------------------------------- Comment (by davide): I've added a wiki page for the proposed change: https://ghc.haskell.org/trac/ghc/wiki/Performance/Tests. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12758#comment:18 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12758: Bring sanity to our performance testsuite
-------------------------------------+-------------------------------------
Reporter: bgamari | Owner: (none)
Type: task | Status: new
Priority: high | Milestone: 8.8.1
Component: Test Suite | Version: 8.0.1
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture:
| Unknown/Multiple
Type of failure: None/Unknown | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s): Phab:D3758,
Wiki Page: | Phab:D5059
-------------------------------------+-------------------------------------
Comment (by Ben Gamari

#12758: Bring sanity to our performance testsuite -------------------------------------+------------------------------------- Reporter: bgamari | Owner: (none) Type: task | Status: new Priority: high | Milestone: 8.8.1 Component: Test Suite | Version: 8.0.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D3758, Wiki Page: | Phab:D5059 -------------------------------------+------------------------------------- Comment (by osa1): What needs to be done here? An updated summary would be helpful. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12758#comment:20 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12758: Bring sanity to our performance testsuite -------------------------------------+------------------------------------- Reporter: bgamari | Owner: (none) Type: task | Status: new Priority: high | Milestone: 8.8.1 Component: Test Suite | Version: 8.0.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: #15936 | Differential Rev(s): Phab:D3758, Wiki Page: | Phab:D5059 -------------------------------------+------------------------------------- Changes (by bgamari): * related: => #15936 -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12758#comment:21 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12758: Bring sanity to our performance testsuite -------------------------------------+------------------------------------- Reporter: bgamari | Owner: (none) Type: task | Status: new Priority: high | Milestone: 8.8.1 Component: Test Suite | Version: 8.0.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: #15936 | Differential Rev(s): Phab:D3758, Wiki Page: | Phab:D5059 -------------------------------------+------------------------------------- Comment (by davide): I've been working on some changes recently, which is currently under review here [https://gitlab.haskell.org/ghc/ghc/merge_requests/22 !22]. To Summarize: * Gitlab CI pushes performance results to the [https://gitlab.haskell.org/ghc/ghc-performance-notes GHC Performance Notes] repo. * This is important for logging performance metrics. * When running performance tests, establish baseline (expected) performance values from older commits (not just the previous one) and even use performance numbers pushed from gitlab CI. * This should greatly improve usability: for many common workflows this will succeed in establishing a baseline where the previous method would fail. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12758#comment:22 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler
participants (1)
-
GHC