Re: Gitlab workflow

11 Jul 2019

Bryan Richter writes:
...
On 7/7/19 7:53 PM, Sven Panne wrote:
> On Sun, 7 July 2019 at 17:06, Bryan Richter wrote:
...
...
How does the scaling argument reconcile with the massive scope
of the Linux kernel, the project for which git was created? I
can find some middle ground with the more specific points
you made in your email, but I have yet to understand how the
scaling argument holds water when Linux trucks along with "4000
developers, 450 different companies, and 200 new developers each
release"[1]. What makes Linux special in this regard? Is there
some second inflection point?
Well, somehow I saw that example coming... :-D IMHO the main
reason why things work for Linux is the number of highly
specialized, high-quality maintainers, i.e. the people who pick
patches into (the parts of) the releases they maintain, and who do
it as their main (sole?) job. In addition, they have a brutal review
system plus an army of people continuously testing *and* they have
Linus.
:D
I would add to your argument that they appear to use git primarily
to *keep a record of merges*. Incoming patches have no history
whatsoever; they're just individual patches. I guess that could be
considered a simpler-to-use version of the fast-forward-only strategy!
Perhaps Linux isn't such a great counterexample after all....
Once they have committed patches to some particular history, though,
they don't rebase, since that would rewrite important audit history.
...
I would very much like to turn the question around: I never fully
understood why some people like merge-based workflows so much. OK,
you can see that e.g. commits A, B, and C together implement feature
X, but to be honest: After the feature X landed, probably nobody
really cares about the feature's history anymore, you normally care
much more about: Which commit broke feature Y? Which commit slowed
things down? Which commit introduced a space leak/race condition?
What I *don't* like is rewriting history, for all the reasons I don't
like mutable state. As you say, what you're generally interested in is
commits. When references to commits (in emails etc.) get invalidated,
it adds confusion and extra work. Seeing this happen is what led me to
wonder why people even prefer this strategy.
I would reiterate this. In my experience when I'm looking back at GHC's
history I'm probably doing so for one of a few possible reasons:

 * I want to know which patch broke something
 * I want to know which patch made something slower
 * I want to know which patch added something

In all of these cases I (personally) find a linear history makes
reasoning about the progression of changes much easier. Bisection,
blame, and performance analysis tools are all much easier when you have
only one "past" to worry about.
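
To illustrate the bisection point, here is a minimal sketch of why a single
linear "past" makes this easy: finding the first bad commit is just a binary
search. This is not how `git bisect` is implemented; the `first_bad_commit`
helper, the commit names, and the `is_bad` test are all hypothetical, and the
sketch assumes that once a regression is introduced every later commit has it.

```python
from typing import Callable, Sequence


def first_bad_commit(history: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest commit in a linear history for which is_bad holds.

    Assumes history is ordered oldest to newest, that the newest commit is
    bad, and that every commit after the first bad one is also bad.
    """
    lo, hi = 0, len(history) - 1  # invariant: history[hi] is bad
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(history[mid]):
            hi = mid      # the first bad commit is at mid or earlier
        else:
            lo = mid + 1  # the first bad commit is after mid
    return history[lo]


# Hypothetical linear history c1..c8; the regression is assumed to land in c5.
history = [f"c{i}" for i in range(1, 9)]
tested = []


def is_bad(commit: str) -> bool:
    tested.append(commit)  # record which commits had to be built and tested
    return int(commit[1:]) >= 5


culprit = first_bad_commit(history, is_bad)
print(culprit, tested)
```

With eight commits only three need to be tested. With merges in the history
there is no single ordering to search over, so the tooling (and the person
reading its output) has to reason about several parents at once.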
...
On top of that, many of the problems people have with merges actually
seem to be problems with bad commits, as you yourself hinted. Other
concerns seem to be based in unfamiliarity with git's features, or an
irrational desire for "pure history". (Merges *are* history!)
One final thing I like about merges is conflict resolution. Resolving
conflicts via rebase is something I get wrong 40% of the time. It's
hard. Even resolving a conflict during a merge is hard, but it's
easier.
I strongly disagree here. In my experience, resolving conflicts via
rebase is much easier than doing so via merge (which is one of the
reasons why I personally use a rebase workflow even outside of GHC).

The difference is that during a rebase workflow I can reason about the
changes made by each commit individually. I can look at the diff of the
original commit (which is generally small, if history was constructed
well), refer to the relevant subset of changes from the new commits I'm
rebasing on top of, and adapt my changes needing only this "local"
state.

By contrast, during a merge I need to keep in my head both the entirety
of my branch and every new commit on the branch I'm merging with. Not
only is this often plain infeasible (e.g. I can't imagine trying to do
this with the recent concurrent GC patches), but you end up with an
incoherent result, since changes that were likely relevant to your
feature branch commits end up recorded in the merge commit.
...
Plus, the eventual merge commit keeps a record of the
resolution! (I only learned this recently, since `git log` doesn't
show it by default.) Keeping a public record of how a conflict was
resolved seems like a huge benefit.
I'm not sure I see the value in this. To me it seems like the merge
resolution is just another step in the *development* of the patch. We
generally don't preserve such steps in history. We only care about the
fully-consistent state of the patch when it is merged.

Cheers,

- Ben

Ben Gamari