
What about the case where the rebase *lessens* the improvement? That is, you're expecting these 10 cases to improve, but after a rebase, only 1 improves. That's news! But a blanket "accept improvements" won't tell you.
I don't think that scenario currently triggers a CI failure, so this wouldn't really change. As I understand it, the current logic is:
* Run the tests.
* Check if any cross the metric thresholds set in the test.
* If so, check if that test is allowed to cross the threshold.
I believe we don't check that all benchmarks listed with an expected increase or decrease actually show one. It would also be hard to do so reasonably without making it even harder to push MRs through CI.
Andreas
On 24/03/2021 at 13:08, Richard Eisenberg wrote:
What about the case where the rebase *lessens* the improvement? That is, you're expecting these 10 cases to improve, but after a rebase, only 1 improves. That's news! But a blanket "accept improvements" won't tell you.
I'm not hard against this proposal, because I know precise tracking has its own costs. Just wanted to bring up another scenario that might be factored in.
Richard
On Mar 24, 2021, at 7:44 AM, Andreas Klebinger wrote:
After the idea of letting marge accept unexpected perf improvements came up, I looked at https://gitlab.haskell.org/ghc/ghc/-/merge_requests/4759, which failed CI after rebasing because a single test, for a single build flavour, crossed the improvement threshold. That made me wonder:
When would accepting an unexpected perf improvement ever backfire?
In practice, either I have a patch that I expect to improve performance for some things, so I want to accept whatever gains I get, or I don't expect improvements, so it's *maybe* worth failing CI in case I optimized away some code I shouldn't have, or something of that sort.
How could this be actionable? Perhaps having a set of indicators for CI, such as "Accept allocation decreases" and "Accept residency decreases", would be saner. I have personally *never* gotten value out of the requirement to list the individual tests that improve. Usually a whole lot of them do. Some cross the threshold, so I add them. If I'm unlucky, I have to rebase and a new one might make it across the threshold.
Being able to accept improvements (but not regressions) wholesale might be a reasonable alternative.
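To make the idea concrete, here is a rough sketch of how such a check might look. It is not the actual testsuite driver code: the names `Measurement`, `check`, `allowed_changes` and `accept_decreases` are all made up for illustration, the metric names are placeholders, and the handling of explicitly listed tests is simplified (it ignores the expected direction). It assumes a positive baseline for every metric.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    PASS = "pass"
    FAIL = "fail"


@dataclass(frozen=True)
class Measurement:
    # All field names are hypothetical, not taken from the real driver.
    test: str
    metric: str           # e.g. "allocations" or "residency"
    baseline: float       # assumed to be > 0
    actual: float
    tolerance_pct: float  # threshold set in the test


def relative_change_pct(m: Measurement) -> float:
    """Signed change relative to the baseline, in percent."""
    return 100.0 * (m.actual - m.baseline) / m.baseline


def check(m: Measurement,
          allowed_changes: set[tuple[str, str]],
          accept_decreases: set[str]) -> Verdict:
    """Decide whether one measurement should fail CI.

    allowed_changes:  (test, metric) pairs explicitly listed as expected
                      to cross the threshold (the current mechanism).
    accept_decreases: metric names for which any decrease is accepted
                      wholesale (the proposed indicators).
    """
    change = relative_change_pct(m)
    if abs(change) <= m.tolerance_pct:
        return Verdict.PASS   # within the threshold set in the test
    if (m.test, m.metric) in allowed_changes:
        return Verdict.PASS   # explicitly allowed to cross the threshold
    if change < 0 and m.metric in accept_decreases:
        return Verdict.PASS   # unexpected improvement, accepted wholesale
    return Verdict.FAIL       # unexpected regression or unlisted improvement
```

With `accept_decreases = {"allocations"}`, a test whose allocations drop by 10% against a 5% threshold would pass without being listed, while a 10% increase on the same test would still fail.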
Opinions?
_______________________________________________
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs