
Yes, this is exactly one of the issues that marge might run into as well:
the aggregate ends up performing differently from the individual patches. Now
we have marge to ensure that at least the aggregate builds together, which
is the whole point of these merge trains: not to end up in a situation
where two patches that are fine on their own produce a merged state that
doesn't build anymore.
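(To make that failure mode concrete, here is a contrived Haskell sketch, with
made-up names, of two patches that each build fine against master but break
once merged:)

    module MergeTrainExample where

    type Env = [(String, Int)]

    -- Master defines:
    lookupId :: Env -> String -> Maybe Int
    lookupId env n = lookup n env

    -- Patch A renames `lookupId` to `lookupUnique` and fixes every caller
    -- that exists on master.
    -- Patch B, written against master, adds a brand-new caller of the old
    -- name:
    newCaller :: Env -> String -> Maybe Int
    newCaller env n = lookupId env n

    -- Each patch compiles on its own. Apply both, and `lookupId` is gone
    -- while `newCaller` still mentions it: the merged state no longer
    -- builds, even though neither patch was broken in isolation. That is
    -- exactly the state the merge train tests.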
Now we have marge to ensure every commit is buildable. Next we should run
regression tests on all commits on master (and that includes each and
every one that marge brings into master). Then we have visualisation that
tells us how performance metrics go up/down over time, and we can drill
down into commits if they yield interesting results either way.
Now let's say you had a commit that should have made GHC 50% faster across
the board, but somehow after being aggregated with other patches this didn't
happen anymore. We'd still expect this to show up somehow in each of the
individual commits on master, right?
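(A rough sketch of what that per-commit drill-down could look like, assuming
metrics are available as plain key/value maps per commit; the types and names
here are made up, not GHC's actual perf tooling:)

    import qualified Data.Map.Strict as M

    type Commit  = String
    type Test    = String
    type Metrics = M.Map Test Double   -- e.g. bytes allocated per test

    -- Relative change of each metric between two adjacent commits.
    relChange :: Metrics -> Metrics -> M.Map Test Double
    relChange old new = M.intersectionWith (\o n -> (n - o) / o) old new

    -- Commits worth drilling into: any test moved more than the
    -- threshold, in either direction (improvements are interesting too).
    interesting :: Double -> [(Commit, Metrics)] -> [(Commit, M.Map Test Double)]
    interesting threshold history =
      [ (commit, big)
      | ((_, prev), (commit, cur)) <- zip history (drop 1 history)
      , let big = M.filter (\d -> abs d > threshold) (relChange prev cur)
      , not (M.null big)
      ]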
On Wed, Mar 24, 2021 at 8:09 PM Richard Eisenberg wrote:
What about the case where the rebase *lessens* the improvement? That is, you're expecting these 10 cases to improve, but after a rebase, only 1 improves. That's news! But a blanket "accept improvements" won't tell you.
I'm not hard against this proposal, because I know precise tracking has its own costs. Just wanted to bring up another scenario that might be factored in.
Richard
On Mar 24, 2021, at 7:44 AM, Andreas Klebinger wrote:
After the idea of letting marge accept unexpected perf improvements, and after looking at https://gitlab.haskell.org/ghc/ghc/-/merge_requests/4759 (which failed because a single test, for a single build flavour, crossed the improvement threshold after rebasing), I wondered:
When would accepting an unexpected perf improvement ever backfire?
In practice I either have a patch that I expect to improve performance for some things, so I want to accept whatever gains I get. Or I don't expect improvements, so it's *maybe* worth failing CI over, in case I optimized away some code I shouldn't have, or something of that sort.
How could this be actionable? Perhaps having a set of indicators for CI, such as "Accept allocation decreases" or "Accept residency decreases", would be saner. I have personally *never* gotten value out of the requirement to list the individual tests that improve. Usually a whole lot of them do. Some cross the threshold, so I add them. If I'm unlucky I have to rebase, and a new one might make it across the threshold.
Being able to accept improvements (but not regressions) wholesale might be a reasonable alternative.
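(Concretely, "accept improvements, fail regressions" could boil down to a per-metric verdict like this sketch; it's hypothetical, and in Haskell rather than the testsuite's actual Python driver, just to pin down the semantics:)

    -- Verdict for a single perf metric where lower is better
    -- (allocations, residency, ...).
    data Verdict = Pass | AcceptImprovement | FailRegression
      deriving Show

    judge :: Double  -- tolerance, e.g. 0.02 for a +/-2% window
          -> Double  -- relative change: (new - baseline) / baseline
          -> Verdict
    judge tol delta
      | delta >  tol = FailRegression     -- worse beyond tolerance: fail CI
      | delta < -tol = AcceptImprovement  -- better beyond tolerance: accept
                                          -- wholesale, but still log it, so
                                          -- improvements that later shrink
                                          -- or vanish remain visible
      | otherwise    = Pass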
Opinions?
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs