Could margebot squash?

Hi, as far as I understand, the expected workflow for MRs is that when they are ready, the developer manually squashes the chronological commit history of the MR into a logical one with polished commit messages, typically consisting of a single commit, but could be multiple ones, and then assigns the MR to margebot, which will rebase that sequence of commits onto the staging branch and eventually merges that into master. One downside of this approach is that it requires destructive changes to work-in-progres branches: I might think the MR is ready, squash the commit sequence into a single commit, but then more work is ready. Now it’s hard to revert individual patches, or collaborate with others, because the git history was disrupted. Another is that the commit message itself isn’t very easily visible to reviewers. In other similarly sized projects (e.g. mathlib) I often see a mode where the actual commits of the MR are ignored (so they can represent the true git history of the branch, including merges and all that grit, which is good for collaboration and for reviewers to understand what has happened, without requiring developers to spend cosmetics effort on them), and upon merging by margebot/bors/mergify/whatever, the MR is merged as a single commit with the description taken from the MR description (which encourages developers to keep the MR description up to date as the MR develops, reviewers can easily see that). A downside of this that you’ll always get one commit on master per MR. If you like to submit a curated list of logical commits within one MR, then this would not work, and you’d have to use multiple MRs. Has this been considered? (I don’t want to cause unnecessary disruption with a presumptious call for action here; take it as a comment to weigh in in if and when this part of our infrastructure is about to change anyways, or maybe a careful probe if my sentiment may be shared widely.) Cheers, Joachim -- Joachim Breitner mail@joachim-breitner.de http://www.joachim-breitner.de/

Both the current workflow and the one Joachim proposes here make sense to me, with different pros and cons. But I think now is not the time for this debate: what we have currently isn't working well enough to consider design changes. That is, CI frequently has spurious failures, and it remains (for me, at least) mostly a hope that margebot works when assigned. (There are various conditions that stop margebot from working, though without reporting any error messages.) My understanding is that we were trying to get sufficient support within GitLab so that merge trains didn't rely on margebot. I thus think that CI reliability and maintainability should be our focus in this area; reorienting to a new design would, I fear, distract our limited energy away from that focus. Richard
On Apr 2, 2022, at 7:28 AM, Joachim Breitner
wrote: Hi,
as far as I understand, the expected workflow for MRs is that when they are ready, the developer manually squashes the chronological commit history of the MR into a logical one with polished commit messages, typically consisting of a single commit, but could be multiple ones, and then assigns the MR to margebot, which will rebase that sequence of commits onto the staging branch and eventually merges that into master.
One downside of this approach is that it requires destructive changes to work-in-progres branches: I might think the MR is ready, squash the commit sequence into a single commit, but then more work is ready. Now it’s hard to revert individual patches, or collaborate with others, because the git history was disrupted.
Another is that the commit message itself isn’t very easily visible to reviewers.
In other similarly sized projects (e.g. mathlib) I often see a mode where the actual commits of the MR are ignored (so they can represent the true git history of the branch, including merges and all that grit, which is good for collaboration and for reviewers to understand what has happened, without requiring developers to spend cosmetics effort on them), and upon merging by margebot/bors/mergify/whatever, the MR is merged as a single commit with the description taken from the MR description (which encourages developers to keep the MR description up to date as the MR develops, reviewers can easily see that).
A downside of this that you’ll always get one commit on master per MR. If you like to submit a curated list of logical commits within one MR, then this would not work, and you’d have to use multiple MRs.
Has this been considered?
(I don’t want to cause unnecessary disruption with a presumptious call for action here; take it as a comment to weigh in in if and when this part of our infrastructure is about to change anyways, or maybe a careful probe if my sentiment may be shared widely.)
Cheers, Joachim
-- Joachim Breitner mail@joachim-breitner.de http://www.joachim-breitner.de/
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

Joachim Breitner
Hi,
as far as I understand, the expected workflow for MRs is that when they are ready, the developer manually squashes the chronological commit history of the MR into a logical one with polished commit messages, typically consisting of a single commit, but could be multiple ones, and then assigns the MR to margebot, which will rebase that sequence of commits onto the staging branch and eventually merges that into master.
One downside of this approach is that it requires destructive changes to work-in-progres branches: I might think the MR is ready, squash the commit sequence into a single commit, but then more work is ready. Now it’s hard to revert individual patches, or collaborate with others, because the git history was disrupted.
Another is that the commit message itself isn’t very easily visible to reviewers.
In other similarly sized projects (e.g. mathlib) I often see a mode where the actual commits of the MR are ignored (so they can represent the true git history of the branch, including merges and all that grit, which is good for collaboration and for reviewers to understand what has happened, without requiring developers to spend cosmetics effort on them), and upon merging by margebot/bors/mergify/whatever, the MR is merged as a single commit with the description taken from the MR description (which encourages developers to keep the MR description up to date as the MR develops, reviewers can easily see that).
A downside of this that you’ll always get one commit on master per MR. If you like to submit a curated list of logical commits within one MR, then this would not work, and you’d have to use multiple MRs.
We would certainly want this to be optional. I, for one, do try when possible to maintain fine-grained histories. Requiring one MR per commit would make this significantly more labor-intensive.
Has this been considered?
(I don’t want to cause unnecessary disruption with a presumptious call for action here; take it as a comment to weigh in in if and when this part of our infrastructure is about to change anyways, or maybe a careful probe if my sentiment may be shared widely.)
Indeed we have considered this and, if we didn't need to use marge-bot, GitLab itself has quite good support for optionally squashing. Sadly though, we do need to use Marge for the reasons described in #19046. In the past we have been rather conservative about what features of Marge we have used since experience has shown it lacking in robustness. In general I'd very much like to move away from Marge but this will require help from GitLab upstream. I have made our need of this feature clear to upstream (and thankfully several other FOSS projects have made similar requests) but progress has been quite slow. Thankfully there appears [1] to be a recent uptick in activity; here's to hoping that it will happen in the next few releases. Cheers, - Ben [1] https://gitlab.com/groups/gitlab-org/-/epics/4911

One downside of this approach is that it requires destructive changes
to work-in-progres branches: I might think the MR is ready, squash the
commit sequence into a single commit, but then more work is ready. Now
it’s hard to revert individual patches, or collaborate with others,
because the git history was disrupted.
Another is that the commit message itself isn’t very easily visible to
reviewers.
I couldn't parse this. What does "but then more work is ready" mean?
Why is it hard to collaborate with others? Which commit message "itself
isn’t very easily visible to reviewers."?
I regard squashing as a positive bonus. I take a long series of commits
with messages like "bugfix" and "fix comments" and put them into one or
more logical commits, each doing (so far as poss) as single thing, each
with a comprehensible commit message. That makes it *easier* to
collaborate, and easier to review subsequently.
(Agreed, there is a moment when I need to hold the token, but that's seldom
a problem.)
TL;DR: I don't yet understand the problem you are trying to solve, still
less the solution.
Simon
On Sat, 2 Apr 2022 at 12:30, Joachim Breitner
Hi,
as far as I understand, the expected workflow for MRs is that when they are ready, the developer manually squashes the chronological commit history of the MR into a logical one with polished commit messages, typically consisting of a single commit, but could be multiple ones, and then assigns the MR to margebot, which will rebase that sequence of commits onto the staging branch and eventually merges that into master.
One downside of this approach is that it requires destructive changes to work-in-progres branches: I might think the MR is ready, squash the commit sequence into a single commit, but then more work is ready. Now it’s hard to revert individual patches, or collaborate with others, because the git history was disrupted.
Another is that the commit message itself isn’t very easily visible to reviewers.
In other similarly sized projects (e.g. mathlib) I often see a mode where the actual commits of the MR are ignored (so they can represent the true git history of the branch, including merges and all that grit, which is good for collaboration and for reviewers to understand what has happened, without requiring developers to spend cosmetics effort on them), and upon merging by margebot/bors/mergify/whatever, the MR is merged as a single commit with the description taken from the MR description (which encourages developers to keep the MR description up to date as the MR develops, reviewers can easily see that).
A downside of this that you’ll always get one commit on master per MR. If you like to submit a curated list of logical commits within one MR, then this would not work, and you’d have to use multiple MRs.
Has this been considered?
(I don’t want to cause unnecessary disruption with a presumptious call for action here; take it as a comment to weigh in in if and when this part of our infrastructure is about to change anyways, or maybe a careful probe if my sentiment may be shared widely.)
Cheers, Joachim
-- Joachim Breitner mail@joachim-breitner.de http://www.joachim-breitner.de/
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

Hi, as Richard rightfully says, this is not an aspect of our workflow that we should change right now, so consider this thread now a leisurely coffee machine chat full of hypotheticals, not a concrete call for action. But I’m happy to elaborate the technical details here. Am Montag, dem 04.04.2022 um 09:04 +0100 schrieb Simon Peyton Jones:
One downside of this approach is that it requires destructive changes to work-in-progres branches: I might think the MR is ready, squash the commit sequence into a single commit, but then more work is ready. Now it’s hard to revert individual patches, or collaborate with others, because the git history was disrupted.
Another is that the commit message itself isn’t very easily visible to reviewers.
I couldn't parse this. What does "but then more work is ready" mean? Why is it hard to collaborate with others?
I think I meant “more work is needed”. Consider the no-app-invariant- MR. I asked Sebastian to collaborate with me on the branch. It could have happened that he was making anther refinement to the DmdAnal while I was preparing the branch for merging. If I made final adjustments, squashed the commits and force-pushed the branch while he was working on it, then he can’t push his branch anymore and has a hell of a time of figuring out what went wrong, because by squashing and force-pushing (the “force” is telling) git lost the history it needs to cleanly merge my changes into his branch to push it again. In “my” workflow, you never have to force-push a feature-branch, so this problem does not occur.
Which commit message "itself isn’t very easily visible to reviewers."?
Again, consider the no-app-invariant MR. You approved it, without even knowing what the final commit message would be, because I didn’t squash yet. So the commit message that will end up on master was not visible to you as a reviewer. In “my” workflow, the final commit message will be taken from the MR description, very visible to a reviewer and easily editable by all. Maybe a minor point (commit messages are not as important as, say, good Notes) but still. Also, I often see that MR descriptions are not kept up-to-date as the MR changes; this workflow creates more incentives to keep the MR description good.
I regard squashing as a positive bonus. I take a long series of commits with messages like "bugfix" and "fix comments" and put them into one or more logical commits, each doing (so far as poss) as single thing, each with a comprehensible commit message.
I do too! There is still squashing happening, but it is done by margebot, not you manually, and only “in transit” to master – your feature branch is left alone in the process. (This means there would only be _one_ logical commit per MR, not multiple, which may be the biggest downside of this. I thought I disliked the workflow for that reason as well, until I worked with it in a few projects, and I no longer missed it. Multiple logical commits didn’t seem that useful after all, especially since they are no longer individually CI-checked, etc.) Anyways, probably these things become easier and prettier (or simply different) once we can use Gitlab’s native merge train. And unfortunately every workflow is a compromise in one way or another with git… Cheers, Joachim -- Joachim Breitner mail@joachim-breitner.de http://www.joachim-breitner.de/
participants (4)
-
Ben Gamari
-
Joachim Breitner
-
Richard Eisenberg
-
Simon Peyton Jones