Proposal: accept pull requests on GitHub

Hello all, my arguments against Phabricator are here: https://ghc.haskell.org/trac/ghc/wiki/WhyNotPhabricator. Some quotes from #ghc to pique your curiosity (there are some 50 more): * "is arc broken today?" * "arc is a frickin' mystery." * "i have a theory that i've managed to create a revision that phab can't handle." * "Diffs just seem to be too expensive to create ... I can't blame contributors for not wanting to do this for every atomic change" * "but seriously, we can't require this for contributing to GHC... the entry barrier is already high enough" GitHub has side-by-side diffs https://github.com/blog/1884-introducing-split-diffs nowadays, and Travis-CI can run `./validate --fast` comfortably https://travis-ci.org/ghc/ghc/builds. *Proposal: accept pull requests from contributors on https://github.com/ghc/ghc https://github.com/ghc/ghc.* Details: * use Travis-CI to validate pull requests. * keep using the Trac issue tracker (contributors are encouraged to put a link to their pull-request in the 'Differential Revisions' field). * keep using the Trac wiki. * in discussions on GitHub, use https://ghc.haskell.org/ticket/1234 to refer to Trac ticket 1234. The shortcut #1234 only works on Trac itself. * keep pushing to git.haskell.org, where the existing Git receive hooks can do their job keeping tabs, trailing whitespace and dangling submodule references out, notify Trac and send emails. Committers close pull-requests manually, just like they do Trac tickets. * keep running Phabricator for as long as necessary. * mention that pull requests are accepted on https://ghc.haskell.org/trac/ghc/wiki/WorkingConventions/FixingBugs. My expectation is that the majority of patches will start coming in via pull requests, the number of contributions will go up, commits will be smaller, and there will be more of them per pull request (contributors will be able to put style changes and refactorings into separate commits, without jumping through a bunch of hoops). Reviewers will get many more emails. Other arguments against GitHub are here: https://ghc.haskell.org/trac/ghc/wiki/WhyNotGitHub. I probably missed a few things, so fire away. Thanks, Thomas

Hi, I would recommend against moving code reviews to Github. I like it and use it all the time for my own projects, but for a large project like GHC, its code reviews are too basic (comments get lost in multi-round reviews), and its customisation an process enforcement is too weak; but that has all been mentioned already on the https://ghc.haskell.org/trac/ghc/wiki/WhyNotGitHub page you linked. I do however recommend accepting pull requests via Github. This is already the case for simple changes: In the past I asked Austin "can you pull this from my branch on Github called XXX", and it went in without problems and without me having to use arc locally. But this process could be more automated: For Ganeti (cluster manager made by Google, written largely in Haskell) I built a tool (https://github.com/google/pull-request-mailer) that listens for pull requests and sends them to the mailing list (Ganeti's preferred way of accepting patches and doing reviews). We built it because some people (me included) liked the Github workflow (push branch, click button) more than `git format-patch`+`git send-email`. You can see an example at https://github.com/ganeti/ganeti/pull/22. The tool then replies on Github that discussion of the change please be held on the mailing list. That has worked so far. It can also handle force-pushes when a PR gets updated based on feedback. Writing it and setting it up only took a few days. I think it wouldn't be too difficult to do the same for GHC: A small tool that imports Github PRs into Phabricator. I don't like the arc user experience. It's modeled in the same way as ReviewBoard, and just pushing a branch is easier in my opinion. However, Phabricator is quite good as a review tool. Its inability to review multiple commits is nasty, but I guess that'll be fixed at some point. If not, such an import tool I suggest could to the squashing for you. Unfortunately there is currently no open source review tool that can handle reviewing entire branches AND multiple revisions of such branches. It's possible to build them though, some companies have internal review tools that do it and they work extremely well. I believe that a simple automated import setup could address many of the points in https://ghc.haskell.org/trac/ghc/wiki/WhyNotPhabricator. Niklas On 01/09/15 20:34, Thomas Miedema wrote:
Hello all,
my arguments against Phabricator are here: https://ghc.haskell.org/trac/ghc/wiki/WhyNotPhabricator.
Some quotes from #ghc to pique your curiosity (there are some 50 more): * "is arc broken today?" * "arc is a frickin' mystery." * "i have a theory that i've managed to create a revision that phab can't handle." * "Diffs just seem to be too expensive to create ... I can't blame contributors for not wanting to do this for every atomic change" * "but seriously, we can't require this for contributing to GHC... the entry barrier is already high enough"
GitHub has side-by-side diffs https://github.com/blog/1884-introducing-split-diffs nowadays, and Travis-CI can run `./validate --fast` comfortably https://travis-ci.org/ghc/ghc/builds.
*Proposal: accept pull requests from contributors on https://github.com/ghc/ghc.*
Details: * use Travis-CI to validate pull requests. * keep using the Trac issue tracker (contributors are encouraged to put a link to their pull-request in the 'Differential Revisions' field). * keep using the Trac wiki. * in discussions on GitHub, use https://ghc.haskell.org/ticket/1234 to refer to Trac ticket 1234. The shortcut #1234 only works on Trac itself. * keep pushing to git.haskell.org http://git.haskell.org, where the existing Git receive hooks can do their job keeping tabs, trailing whitespace and dangling submodule references out, notify Trac and send emails. Committers close pull-requests manually, just like they do Trac tickets. * keep running Phabricator for as long as necessary. * mention that pull requests are accepted on https://ghc.haskell.org/trac/ghc/wiki/WorkingConventions/FixingBugs.
My expectation is that the majority of patches will start coming in via pull requests, the number of contributions will go up, commits will be smaller, and there will be more of them per pull request (contributors will be able to put style changes and refactorings into separate commits, without jumping through a bunch of hoops).
Reviewers will get many more emails. Other arguments against GitHub are here: https://ghc.haskell.org/trac/ghc/wiki/WhyNotGitHub.
I probably missed a few things, so fire away.
Thanks, Thomas
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

I like Niklas's suggestion of a middle-ground approach. There are benefits
to using phabricator (and arc), but there should be a lowered-bar approach
where people can start contributing through github (even though they may be
forced to do the code review on phabricator).
On Tue, Sep 1, 2015 at 1:42 PM, Niklas Hambüchen
Hi,
I would recommend against moving code reviews to Github. I like it and use it all the time for my own projects, but for a large project like GHC, its code reviews are too basic (comments get lost in multi-round reviews), and its customisation an process enforcement is too weak; but that has all been mentioned already on the https://ghc.haskell.org/trac/ghc/wiki/WhyNotGitHub page you linked.
I do however recommend accepting pull requests via Github.
This is already the case for simple changes: In the past I asked Austin "can you pull this from my branch on Github called XXX", and it went in without problems and without me having to use arc locally.
But this process could be more automated:
For Ganeti (cluster manager made by Google, written largely in Haskell) I built a tool (https://github.com/google/pull-request-mailer) that listens for pull requests and sends them to the mailing list (Ganeti's preferred way of accepting patches and doing reviews). We built it because some people (me included) liked the Github workflow (push branch, click button) more than `git format-patch`+`git send-email`. You can see an example at https://github.com/ganeti/ganeti/pull/22. The tool then replies on Github that discussion of the change please be held on the mailing list. That has worked so far. It can also handle force-pushes when a PR gets updated based on feedback. Writing it and setting it up only took a few days.
I think it wouldn't be too difficult to do the same for GHC: A small tool that imports Github PRs into Phabricator.
I don't like the arc user experience. It's modeled in the same way as ReviewBoard, and just pushing a branch is easier in my opinion.
However, Phabricator is quite good as a review tool. Its inability to review multiple commits is nasty, but I guess that'll be fixed at some point. If not, such an import tool I suggest could to the squashing for you.
Unfortunately there is currently no open source review tool that can handle reviewing entire branches AND multiple revisions of such branches. It's possible to build them though, some companies have internal review tools that do it and they work extremely well.
I believe that a simple automated import setup could address many of the points in https://ghc.haskell.org/trac/ghc/wiki/WhyNotPhabricator.
Niklas
On 01/09/15 20:34, Thomas Miedema wrote:
Hello all,
my arguments against Phabricator are here: https://ghc.haskell.org/trac/ghc/wiki/WhyNotPhabricator.
Some quotes from #ghc to pique your curiosity (there are some 50 more): * "is arc broken today?" * "arc is a frickin' mystery." * "i have a theory that i've managed to create a revision that phab can't handle." * "Diffs just seem to be too expensive to create ... I can't blame contributors for not wanting to do this for every atomic change" * "but seriously, we can't require this for contributing to GHC... the entry barrier is already high enough"
GitHub has side-by-side diffs https://github.com/blog/1884-introducing-split-diffs nowadays, and Travis-CI can run `./validate --fast` comfortably https://travis-ci.org/ghc/ghc/builds.
*Proposal: accept pull requests from contributors on https://github.com/ghc/ghc.*
Details: * use Travis-CI to validate pull requests. * keep using the Trac issue tracker (contributors are encouraged to put a link to their pull-request in the 'Differential Revisions' field). * keep using the Trac wiki. * in discussions on GitHub, use https://ghc.haskell.org/ticket/1234 to refer to Trac ticket 1234. The shortcut #1234 only works on Trac itself. * keep pushing to git.haskell.org http://git.haskell.org, where the existing Git receive hooks can do their job keeping tabs, trailing whitespace and dangling submodule references out, notify Trac and send emails. Committers close pull-requests manually, just like they do Trac tickets. * keep running Phabricator for as long as necessary. * mention that pull requests are accepted on https://ghc.haskell.org/trac/ghc/wiki/WorkingConventions/FixingBugs.
My expectation is that the majority of patches will start coming in via pull requests, the number of contributions will go up, commits will be smaller, and there will be more of them per pull request (contributors will be able to put style changes and refactorings into separate commits, without jumping through a bunch of hoops).
Reviewers will get many more emails. Other arguments against GitHub are here: https://ghc.haskell.org/trac/ghc/wiki/WhyNotGitHub.
I probably missed a few things, so fire away.
Thanks, Thomas
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

I would recommend against moving code reviews to Github. I like it and use it all the time for my own projects, but for a large project like GHC, its code reviews are too basic (comments get lost in multi-round reviews), and its customisation an process enforcement is too weak; but that has all been mentioned already on the https://ghc.haskell.org/trac/ghc/wiki/WhyNotGitHub page you linked.
At least you're able to submit comments! I just had my second interaction with arc/phab and I have to say that I really hate both now. The former is not flexible enough and lacks documentation, the latter is just plain confusing. I was trying to update an existing phab diff (D1334), but I had no idea what would be submitted, and I'm still not sure whether I did that okay or not. Oh, slyfox tells me that I overwrote my previous changes, nice! In the process, I also created a new revision by mistake. The web UI didn't help either since there's so much stuff: diffs, revisions, ids. Is it okay to have multiple diffs in a single phab differential after updating? No idea. After that I was struggling to reply to rwbarton. I hit "Done" and added my comment, but both things were marked as "Unsubmitted" (or something). After a while I decided to click on the button at the bottom of the page. Looks like it did the trick, but I have no idea whether it's the right way or not. Not that I'm saying that GitHub is perfect, but at least it works instead of messing up with the work I carefully tested.

Hello Nikita, Phabricator has a model where you can make all of your comments in a batch (unsubmitted), and then submit them at once. TBH, I've never had a workflow where I didn't want my intermediate comments to be posted immediately, but it also hasn't been too much of a bother to make my comments, and then scroll to the bottom and submit. Edward Excerpts from Nikita Karetnikov's message of 2015-10-28 14:30:56 -0700:
I would recommend against moving code reviews to Github. I like it and use it all the time for my own projects, but for a large project like GHC, its code reviews are too basic (comments get lost in multi-round reviews), and its customisation an process enforcement is too weak; but that has all been mentioned already on the https://ghc.haskell.org/trac/ghc/wiki/WhyNotGitHub page you linked.
At least you're able to submit comments! I just had my second interaction with arc/phab and I have to say that I really hate both now. The former is not flexible enough and lacks documentation, the latter is just plain confusing.
I was trying to update an existing phab diff (D1334), but I had no idea what would be submitted, and I'm still not sure whether I did that okay or not. Oh, slyfox tells me that I overwrote my previous changes, nice! In the process, I also created a new revision by mistake. The web UI didn't help either since there's so much stuff: diffs, revisions, ids. Is it okay to have multiple diffs in a single phab differential after updating? No idea.
After that I was struggling to reply to rwbarton. I hit "Done" and added my comment, but both things were marked as "Unsubmitted" (or something). After a while I decided to click on the button at the bottom of the page. Looks like it did the trick, but I have no idea whether it's the right way or not.
Not that I'm saying that GitHub is perfect, but at least it works instead of messing up with the work I carefully tested.

On 28/10/2015 14:30, Nikita Karetnikov wrote:
I would recommend against moving code reviews to Github. I like it and use it all the time for my own projects, but for a large project like GHC, its code reviews are too basic (comments get lost in multi-round reviews), and its customisation an process enforcement is too weak; but that has all been mentioned already on the https://ghc.haskell.org/trac/ghc/wiki/WhyNotGitHub page you linked.
At least you're able to submit comments! I just had my second interaction with arc/phab and I have to say that I really hate both now. The former is not flexible enough and lacks documentation, the latter is just plain confusing.
I was trying to update an existing phab diff (D1334), but I had no idea what would be submitted, and I'm still not sure whether I did that okay or not. Oh, slyfox tells me that I overwrote my previous changes, nice! In the process, I also created a new revision by mistake. The web UI didn't help either since there's so much stuff: diffs, revisions, ids. Is it okay to have multiple diffs in a single phab differential after updating? No idea.
After that I was struggling to reply to rwbarton. I hit "Done" and added my comment, but both things were marked as "Unsubmitted" (or something). After a while I decided to click on the button at the bottom of the page. Looks like it did the trick, but I have no idea whether it's the right way or not.
Not that I'm saying that GitHub is perfect, but at least it works instead of messing up with the work I carefully tested.
My guess is that the problems here were mostly because you were unfamiliar with the workflow. Did you look at the wiki page? https://ghc.haskell.org/trac/ghc/wiki/Phabricator At a high level it works like this: - create a diff (arc diff) - update it (arc diff), possibly multiple times - land it (arc land) each time you update it, Phabricator creates a new "revision", attached to the same diff, and you can look at the differences between an arbitrary pair of revisions in the UI. So at any point we can see the whole diff, or just what you changed in the latest revision. As Edward mentioned, in Phabricator you make multiple comments and then submit them all together (which sends a single email to the reviewers). To get to the submit button quickly, hit 'z'. Cheers Simon

I would like to add that even though I'm also skeptical about and intimidated by phabricator by all the reason already discussed, I am very happy with documentation on the wiki. Submitting my first patch didn't feel any harder to me than doing a github pull request. Just to let you know not everybody out there is unhappy with phab. (-: cheers, m. On Sun, Nov 01, 2015 at 06:11:23PM -0800, Simon Marlow wrote:
Date: Sun, 1 Nov 2015 18:11:23 -0800 From: Simon Marlow
To: Nikita Karetnikov , Niklas Hambüchen Cc: "ghc-devs@haskell.org" Subject: Re: Proposal: accept pull requests on GitHub On 28/10/2015 14:30, Nikita Karetnikov wrote:
I would recommend against moving code reviews to Github. I like it and use it all the time for my own projects, but for a large project like GHC, its code reviews are too basic (comments get lost in multi-round reviews), and its customisation an process enforcement is too weak; but that has all been mentioned already on the https://ghc.haskell.org/trac/ghc/wiki/WhyNotGitHub page you linked.
At least you're able to submit comments! I just had my second interaction with arc/phab and I have to say that I really hate both now. The former is not flexible enough and lacks documentation, the latter is just plain confusing.
I was trying to update an existing phab diff (D1334), but I had no idea what would be submitted, and I'm still not sure whether I did that okay or not. Oh, slyfox tells me that I overwrote my previous changes, nice! In the process, I also created a new revision by mistake. The web UI didn't help either since there's so much stuff: diffs, revisions, ids. Is it okay to have multiple diffs in a single phab differential after updating? No idea.
After that I was struggling to reply to rwbarton. I hit "Done" and added my comment, but both things were marked as "Unsubmitted" (or something). After a while I decided to click on the button at the bottom of the page. Looks like it did the trick, but I have no idea whether it's the right way or not.
Not that I'm saying that GitHub is perfect, but at least it works instead of messing up with the work I carefully tested.
My guess is that the problems here were mostly because you were unfamiliar with the workflow. Did you look at the wiki page? https://ghc.haskell.org/trac/ghc/wiki/Phabricator
At a high level it works like this: - create a diff (arc diff) - update it (arc diff), possibly multiple times - land it (arc land)
each time you update it, Phabricator creates a new "revision", attached to the same diff, and you can look at the differences between an arbitrary pair of revisions in the UI. So at any point we can see the whole diff, or just what you changed in the latest revision.
As Edward mentioned, in Phabricator you make multiple comments and then submit them all together (which sends a single email to the reviewers). To get to the submit button quickly, hit 'z'.
Cheers Simon _______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

On 01/09/2015 11:34, Thomas Miedema wrote:
Hello all,
my arguments against Phabricator are here: https://ghc.haskell.org/trac/ghc/wiki/WhyNotPhabricator.
Thanks for taking the time to summarize all the issues. Personally, I think github's support for code reviews is too weak to recommend it over Phabricator. The multiple-email problem is a killer all by itself. We can improve the workflow for Phabricator to address some of the issues you raise are fixable, such as fixing the base revision to use, and ignoring untracked files (these are local settings, I believe). Stacks of commits are hard to reviewers to follow, so making them easier might have a detrimental effect on our processes. It might feel better for the author, but discovering what changed between two branches of multiple commits on github is almost impossible. Instead the recommended workflow seems to be to add more commits, which makes the history harder to read later. I have only had to update my arc once. Is that a big problem? Cheers Simon
Some quotes from #ghc to pique your curiosity (there are some 50 more): * "is arc broken today?" * "arc is a frickin' mystery." * "i have a theory that i've managed to create a revision that phab can't handle." * "Diffs just seem to be too expensive to create ... I can't blame contributors for not wanting to do this for every atomic change" * "but seriously, we can't require this for contributing to GHC... the entry barrier is already high enough"
GitHub has side-by-side diffs https://github.com/blog/1884-introducing-split-diffs nowadays, and Travis-CI can run `./validate --fast` comfortably https://travis-ci.org/ghc/ghc/builds.
*Proposal: accept pull requests from contributors on https://github.com/ghc/ghc.*
Details: * use Travis-CI to validate pull requests. * keep using the Trac issue tracker (contributors are encouraged to put a link to their pull-request in the 'Differential Revisions' field). * keep using the Trac wiki. * in discussions on GitHub, use https://ghc.haskell.org/ticket/1234 to refer to Trac ticket 1234. The shortcut #1234 only works on Trac itself. * keep pushing to git.haskell.org http://git.haskell.org, where the existing Git receive hooks can do their job keeping tabs, trailing whitespace and dangling submodule references out, notify Trac and send emails. Committers close pull-requests manually, just like they do Trac tickets. * keep running Phabricator for as long as necessary. * mention that pull requests are accepted on https://ghc.haskell.org/trac/ghc/wiki/WorkingConventions/FixingBugs.
My expectation is that the majority of patches will start coming in via pull requests, the number of contributions will go up, commits will be smaller, and there will be more of them per pull request (contributors will be able to put style changes and refactorings into separate commits, without jumping through a bunch of hoops).
Reviewers will get many more emails. Other arguments against GitHub are here: https://ghc.haskell.org/trac/ghc/wiki/WhyNotGitHub.
I probably missed a few things, so fire away.
Thanks, Thomas
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

On Wed, Sep 2, 2015 at 8:51 PM, Simon Marlow
Stacks of commits are hard to reviewers to follow, so making them easier might have a detrimental effect on our processes. It might feel better for the author, but discovering what changed between two branches of multiple commits on github is almost impossible. Instead the recommended workflow seems to be to add more commits, which makes the history harder to read later.
I've reviewed+merged various big diffs in the form of branches published as pull requests (on and off GitHub), and being able to see each change separately with its own commit message was way easier than one big diff with a summarized message. If Phabricator would use merge commits, reading multi-commit history, especially what commits got merged together (aka what branch was integrated), is easy. Also, bisecting is more precise without collapsed diffs. Therefore, I wouldn't say the single-commit collapsed view is the right choice for all diffs.

Simon Marlow
On 01/09/2015 11:34, Thomas Miedema wrote:
Hello all,
my arguments against Phabricator are here: https://ghc.haskell.org/trac/ghc/wiki/WhyNotPhabricator.
Thanks for taking the time to summarize all the issues.
Personally, I think github's support for code reviews is too weak to recommend it over Phabricator. The multiple-email problem is a killer all by itself.
As a wild idea -- did anyone look at /Gitlab/ instead? I didn't look into its review functionality to any meaninful degree, but: - it largely tries to replicate the Github workflow - Gitlab CE is open source - it evolves fairly quickly -- с уважениeм / respectfully, Косырев Серёга -- “And those who were seen dancing were thought to be insane by those who could not hear the music.” – Friedrich Wilhelm Nietzsche

On 02/09/15 22:42, Kosyrev Serge wrote:
As a wild idea -- did anyone look at /Gitlab/ instead?
Hi, yes. It does not currently have a sufficient review functionality (cannot handle multiple revisions easily). On 02/09/15 20:51, Simon Marlow wrote:
It might feel better for the author, but discovering what changed between two branches of multiple commits on github is almost impossible.
I disagree with the first part of this: When the UI of the review tool is good, it is easy to follow. But there's no open-source implementation of that around. I agree that it is not easy to follow on Github.

(JFYI: I hate to announce my return with a giant novel of negative-nancy-ness about a proposal that just came up. I'm sorry about this!) TL;DR: I'm strongly -1 on this, because I think it introduces a lot of associated costs for everyone, the benefits aren't really clear, and I think it obscures the real core issue about "how do we get more contributors" and how to make that happen. Needless to say, GitHub does not magically solve both of these AFAICS. As is probably already widely known, I'm fairly against GitHub because I think at best its tools are mediocre and inappropriate for GHC - but I also don't think this proposal or the alternatives stemming from it are very good, and that it reduces visibility of the real, core complaints about what is wrong. Some of those problems may be with Phabricator, but it's hard to sort the wheat from the chaff, so to speak. For one, having two code review tools of any form is completely bonkers, TBQH. This is my biggest 'obvious' blocker. If we're going to switch, we should just switch. Having to have people decide how to contribute with two tools is as crazy as having two VCSs and just a way of asking people to get *more* confused, and have us answer more questions. That's something we need to avoid. For the same reason, I'm also not a fan of 'use third party thing to augment other thing to remove its deficiencies making it OK', because the problem is _it adds surface area_ and other problems in other cases. It is a solution that should be considered a last resort, because it is a logical solution that applies to everything. If we have a bot that moves GH PRs into Phab and then review them there, the surface area of what we have to maintain and explain has suddenly exploded: because now instead of 1 thing we have 3 things (GH, Phab, bot) and the 3 interactions between them, for a multiplier of *six* things we have to deal with. And then we use reviewable,io, because GH reviews are terrible, adding a 4th mechanism? It's rube goldberg-ian. We can logically 'automate' everything in all ways to make all contributors happy, but there's a real *cognitive* overhead to this and humans don't scale as well as computers do. It is not truly 'automated away' if the cognitive burden is still there. I also find it extremely strange to tell people "By the way, this method in which you've contributed, as was requested by community members, is actually a complete proxy for the real method of contributing, you can find all your imported code here". How is this supposed to make contribution *easier* as opposed to just more confusing? Now you've got the impression you're using "the real thing" when in reality it's shoved off somewhere else to have the nitpicking done. Just using Phabricator would be less complicated, IMO, and much more direct. The same thing goes for reviewable.io. Adding it as a layer over GitHub just makes the surface area larger, and puts less under our control. And is it going to exist in the same form in 2 or 3 years? Will it continue to offer the same tools, the same workflows that we "like", and what happens when we hit a wall? It's easy to say "probably" or "sure" to all this, until we hit something we dislike and have no possibility of fixing. And once you do all this, BTW, you can 'never go back'. It seems so easy to just say 'submit pull requests' once and nothing else, right? Wrong. Once you commit to that infrastructure, it is *there* and simply taking it out from under the feet of those using it is not only unfortunate, it is *a huge timesink to undo it all*. Which amounts to it never happening. Oh, but you can import everything elsewhere! The problem is you *can't* import everything, but more importantly you can't *import my memories in another way*, so it's a huge blow to contributors to ask them about these mental time sinks, then to forget them all. And as your project grows, this becomes more of a memory as you made a first and last choice to begin with. Phabricator was 'lucky' here because it had the gateway into being the first review tool for us. But that wasn't because it was *better* than GitHub. It was because we were already using it, and it did not interact badly with our other tools or force us to compromise things - so the *cost* was low. The cost is immeasurably higher by default against GitHub because of this, at least to me. That's just how it is sometimes. Keep in mind there is a cost to everything and how you fix it. GitHub is not a simple patch to add a GHC feature. It is a question that fundamentally concerns itself with the future of the project for a long time. The costs must be analyzed more aggressively. Again, Phabricator had 'first child' preferential treatment. That's not something we can undo now. I know this sounds like a lot of ad hoc mumbo jumbo, but please bear with me: we need to identify the *root issue* here to fix it. Otherwise we will pay for the costs of an improper fix for a long time, and we are going to keep having this conversation over, and over again. And we need to weigh in the cost of fixing it, which is why I mention that so much. So with all this in mind, you're back to just using GitHub. But again GitHub is quite mediocre at best. So what is the point of all this? It's hinted at here:
the number of contributions will go up, commits will be smaller, and there will be more of them per pull request (contributors will be able to put style changes and refactorings into separate commits, without jumping through a bunch of hoops).
The real hint is that "the number of contributions will go up". That's
a noble goal and I think it's at the heart of this proposal.
Here's the meat of it question: what is the cost of achieving this
goal? That is, what amount of work is sufficient to make this goal
realizable, and finally - why is GitHub *the best use of our time for
achieving this?* That's one aspect of the cost - that it's the best
use of the time. I feel like this is fundamentally why I always seem
to never 'get' this argument, and I'm sure it's very frustrating on
behalf of the people who have talked to me about it and like GitHub.
But I feel like I've never gotten a straight answer for GHC.
If the goal is actually "make more people contribute", that's pretty
broad. I can make that very easy: give everyone who ever submits a
patch push access. This is a legitimate way to run large projects that
has worked. People will almost certainly be more willing to commit,
especially when overhead on patch submission is reduced so much. Why
not just do that instead? It's not like we even mandate code review,
although we could. You could reasonably trust CI to catch and revert
things a lot of the time for people who commit directly to master. We
all do it sometimes.
I'm being serious about this. I can start doing that tomorrow because
the *cost is low*, both now and reasonably speaking into some
foreseeable future. It is one of many solutions to raw heart of the
proposal. GitHub is not a low cost move, but also, it is a *long term
cost* because of the technical deficiencies it won't aim to address
(merge commits are ugly, branch reviews are weak, ticket/PR namespace
overlaps with Trac, etc etc) or that we'll have to work around.
That means that if we want GitHub to fix the "give us more
contributors" problem, and it has a high cost, it not only has _to fix
the problem_, it also has to do that well enough to offset its cost. I
don't think it's clear that is the case right now, among a lot of
other solutions.
I don't think the root issue is "We _need_ GitHub to get more
contributors". It sounds like the complaint is more "I don't like how
Phabricator works right now". That's an important distinction, because
the latter is not only more specific, it's more actionable:
- Things like Arcanist can be tracked as a Git submodule. There is
little to no pain in this, it's low cost, and it can always be
synchronized with Phabricator. This eliminates the "Must clone
arcanist" and "need to upgrade arcanist" points.
- Similarly when Phabricator sometimes kills a lot of builds, it's
because I do an upgrade. That's mostly an error on my part and I can
simply schedule upgrades regularly, barring hotfixes or somesuch. That
should basically eliminate these. The other build issues are from
picking the wrong base commit from the revision, I think, which I
believe should be fixable upstream (I need to get a solid example of
one that isn't a mega ultra patch.)
- If Harbormaster is not building dependent patches as mentioned in
WhyNotPhabricator, that is a bug, and I have not been aware of it.
Please make me aware of it so I can file bugs! I seriously don't look
at _every_ patch, I need to know this. That could have probably been
fixed ASAP otherwise.
- We can get rid of the awkwardness of squashes etc by using
Phabricator's "immutable" history, although it introduces merge
commits. Whether this is acceptable is up to debate (I dislike merge
commits, but could live with it).
- I do not understand point #3, about answering questions. Here's
the reality: every single one of those cases is *almost always an
error*. That's not a joke. Forgetting to commit a file, amending
changes in the working tree, and specifying a reviewer are all total
errors as it stands today. Why is this a minus? It catches a useful
class of 'interaction bugs'. If it's because sometimes Phabricator
yells about build arifacts in the tree, those should be .gitignore'd.
If it's because you have to 'git stash' sometimes, this is fairly
trivial IMO. Finally, specifying reviewers IS inconvenient, but
currently needed. We could easily assign a '#reviewers' tag that would
add default reviewers.
- In the future, Phabricator will hopefully be able to
automatically assign the right reviewers to every single incoming
patch, based on the source file paths in the tree, using the Owners
tool. Technically, we could do that today if we wanted, it's just a
little more effort to add more Herald rules. This will be far, far
more robust than anything GitHub can offer, and eliminates point #3.
- Styling, linting etc errors being included, because reviews are
hard to create: This is tangential IMO. We need to just bite the
bullet on this and settle on some lint and coding styles, and apply
them to the tree uniformly. The reality is *nobody ever does style
changes on their own*, and they are always accompanied by a diff, and
they always have to redo the work of pulling them out, Phab or not.
Literally 99% of the time we ask for this, it happens this way.
Perhaps instead we should just eliminate this class of work by just
running linters over all of the source code at once, and being happy
with it.
Doing this in fact has other benefits: like `arc lint` will always
_correctly_ report when linting errors are violated. And we can reject
patches that violate them, because they will always be accurate.
- As for some of the quotes, some of them are funny, but the real
message lies in the context. :) In particular, there have been several
cases (such as the DWARF work) where the idea was "write 30 commits
and put them on Phabricator". News flash: *this is bad*, no matter
whether you're using Phabricator or not, because it makes reviewing
the whole thing immensely difficult from a reviewer perspective. The
point here is that we can clear this up by being more communicative
about what we expect of authors of large patches, and communicating
your intent ASAP so we can get patches in as fast as possible. Writing
a patch is the easiest part of the work.
And more:
- Clean up the documentation, it's a mess. It feels nice that
everything has clear, lucid explanations on the wiki, but the wiki is
ridiculously massive and we have a tendancy for 'link creep' where we
spread things out. The contributors docs could probably stand to be
streamlined. We would have to do this anyway, moving to GitHub or not.
- Improve the homepage, directly linking to this aforementioned page.
- Make it clear what we expect of contributors. I feel like a lot of
this could be explained by having a 5 minute drive-by guide for
patches, and then a longer 10-minute guide about A) How to style
things, B) How to format your patches if you're going to contribute
regularly, C) Why it is this way, and D) finally links to all the
other things you need to know. People going into Phabricator expecting
it to behave like GitHub is a problem (more a cultural problem IMO but
that's another story), and if this can't be directly fixed, the best
thing to do is make it clear why it isn't.
Those are just some of the things OTTOMH, but this email is already
way too long. This is what I mean though: fixing most of these is
going to have *seriously smaller cost* than moving to GitHub. It does
not account for "The GitHub factor" of people contributing "just
because it's on GitHub", but again, that value has to outweigh the
other costs. I'm not seriously convinced it does.
I know it's work to fix these things. But GitHub doesn't really
magically make a lot of our needs go away, and it's not going to
magically fix things like style or lint errors, the fact Travis-CI is
still pretty insufficient for us in the long term (and Harbormaster is
faster, on our own hardware, too), or that it will cause needlessly
higher amounts of spam through Trac and GitHub itself. I don't think
settling on it as - what seems to be - a first resort, is a really
good idea.
On Wed, Sep 2, 2015 at 4:09 PM, Niklas Hambüchen
On 02/09/15 22:42, Kosyrev Serge wrote:
As a wild idea -- did anyone look at /Gitlab/ instead?
Hi, yes. It does not currently have a sufficient review functionality (cannot handle multiple revisions easily).
On 02/09/15 20:51, Simon Marlow wrote:
It might feel better for the author, but discovering what changed between two branches of multiple commits on github is almost impossible.
I disagree with the first part of this: When the UI of the review tool is good, it is easy to follow. But there's no open-source implementation of that around.
I agree that it is not easy to follow on Github. _______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
-- Regards, Austin Seipp, Haskell Consultant Well-Typed LLP, http://www.well-typed.com/

On Wed, Sep 2, 2015 at 9:41 PM, Austin Seipp
- Make it clear what we expect of contributors. I feel like a lot of this could be explained by having a 5 minute drive-by guide for patches, and then a longer 10-minute guide about A) How to style things, B) How to format your patches if you're going to contribute regularly, C) Why it is this way, and D) finally links to all the other things you need to know. People going into Phabricator expecting it to behave like GitHub is a problem (more a cultural problem IMO but that's another story), and if this can't be directly fixed, the best thing to do is make it clear why it isn't.
This is tangential to the issue of the code review system, and I don't want to derail the discussion here, but if you're talking about a drive-by guide for patches, I'd add E) straightforward instructions on how to get GHC building *fast* for development. A potential contributor won't even reach the patch submission stage if they can't get the build system set up properly, and the current documentation here is spread out and somewhat intimidating for a newcomer.

On Thu, Sep 3, 2015 at 6:41 AM, Austin Seipp
(JFYI: I hate to announce my return with a giant novel of negative-nancy-ness about a proposal that just came up. I'm sorry about this!)
TL;DR: I'm strongly -1 on this, because I think it introduces a lot of associated costs for everyone, the benefits aren't really clear, and I think it obscures the real core issue about "how do we get more contributors" and how to make that happen. Needless to say, GitHub does not magically solve both of these AFAICS.
Let me start off by saying I'm not arguing for GitHub or anything else to replace Phabricator. I'm merely trying to understand the problems with merge commits and patch sets.
- We can get rid of the awkwardness of squashes etc by using Phabricator's "immutable" history, although it introduces merge commits. Whether this is acceptable is up to debate (I dislike merge commits, but could live with it).
I'm genuinely curious about the need to avoid merge commits. I do avoid merge-master-to-topic-branch commits in submitted diffs, but unless you always only merge a single cumulative commit for each diff, merge commits are very useful for vcs history.
- As for some of the quotes, some of them are funny, but the real message lies in the context. :) In particular, there have been several cases (such as the DWARF work) where the idea was "write 30 commits and put them on Phabricator". News flash: *this is bad*, no matter whether you're using Phabricator or not, because it makes reviewing the whole thing immensely difficult from a reviewer perspective. The point here is that we can clear this up by being more communicative about what we expect of authors of large patches, and communicating your intent ASAP so we can get patches in as fast as possible. Writing a patch is the easiest part of the work.
I would also like to understand why reviewing a single commit is easier than the steps (commits) that led to the whole diff. Maybe I review stuff differently, but, as I wrote yesterday, I've always found it easier to follow the changes when it's split into proper commits. And instead of "big patch" I should have written "non-trivial patch". A 100-line unified diff can be equally hard to follow as a 1000-line diff, unless each diff hunk is accompanied with code comments. But comments don't always make sense in the code, and often enough it's best to keep it in the commit message only. Hence the need for splitting the work, and ideally committing as you work on it, with a final cleanup of rearranging commits into a proper set of commits. I'm repeating myself, but git-bisect is much more precise with relevant changes split up as they happened.

I am very much at the ignorant end of this debate: I'll just use whatever I'm told to use. But I do resonate with this observation from Austin:
| For one, having two code review tools of any form is completely
| bonkers, TBQH. This is my biggest 'obvious' blocker. If we're going to
| switch, we should just switch. Having to have people decide how to
| contribute with two tools is as crazy as having two VCSs and just a
| way of asking people to get *more* confused, and have us answer more
| questions. That's something we need to avoid.
As a code contributor and reviewer, this is awkward. As a contributor, how do I choose? As a reviewer I'm presumably forced to learn both tools.
But I'll go with the flow... I do not have a well-informed opinion about the tradeoffs.
(I'm tempted naively to ask: is there an automated way to go from a GitHub PR to a Phab ticket? Then we could convert the former (if someone wants to submit that way) into the latter.)
Simon
| -----Original Message-----
| From: ghc-devs [mailto:ghc-devs-bounces@haskell.org] On Behalf Of
| Austin Seipp
| Sent: 03 September 2015 05:42
| To: Niklas Hambüchen
| Cc: Simon Marlow; ghc-devs@haskell.org
| Subject: Re: Proposal: accept pull requests on GitHub
|
| (JFYI: I hate to announce my return with a giant novel of negative-
| nancy-ness about a proposal that just came up. I'm sorry about this!)
|
| TL;DR: I'm strongly -1 on this, because I think it introduces a lot of
| associated costs for everyone, the benefits aren't really clear, and I
| think it obscures the real core issue about "how do we get more
| contributors" and how to make that happen. Needless to say, GitHub
| does not magically solve both of these AFAICS.
|
| As is probably already widely known, I'm fairly against GitHub because
| I think at best its tools are mediocre and inappropriate for GHC - but
| I also don't think this proposal or the alternatives stemming from it
| are very good, and that it reduces visibility of the real, core
| complaints about what is wrong. Some of those problems may be with
| Phabricator, but it's hard to sort the wheat from the chaff, so to
| speak.
|
| For one, having two code review tools of any form is completely
| bonkers, TBQH. This is my biggest 'obvious' blocker. If we're going to
| switch, we should just switch. Having to have people decide how to
| contribute with two tools is as crazy as having two VCSs and just a
| way of asking people to get *more* confused, and have us answer more
| questions. That's something we need to avoid.
|
| For the same reason, I'm also not a fan of 'use third party thing to
| augment other thing to remove its deficiencies making it OK', because
| the problem is _it adds surface area_ and other problems in other
| cases. It is a solution that should be considered a last resort,
| because it is a logical solution that applies to everything. If we
| have a bot that moves GH PRs into Phab and then review them there, the
| surface area of what we have to maintain and explain has suddenly
| exploded: because now instead of 1 thing we have 3 things (GH, Phab,
| bot) and the 3 interactions between them, for a multiplier of *six*
| things we have to deal with. And then we use reviewable,io, because GH
| reviews are terrible, adding a 4th mechanism? It's rube goldberg-ian.
| We can logically 'automate' everything in all ways to make all
| contributors happy, but there's a real *cognitive* overhead to this
| and humans don't scale as well as computers do. It is not truly
| 'automated away' if the cognitive burden is still there.
|
| I also find it extremely strange to tell people "By the way, this
| method in which you've contributed, as was requested by community
| members, is actually a complete proxy for the real method of
| contributing, you can find all your imported code here". How is this
| supposed to make contribution *easier* as opposed to just more
| confusing? Now you've got the impression you're using "the real thing"
| when in reality it's shoved off somewhere else to have the nitpicking
| done. Just using Phabricator would be less complicated, IMO, and much
| more direct.
|
| The same thing goes for reviewable.io. Adding it as a layer over
| GitHub just makes the surface area larger, and puts less under our
| control. And is it going to exist in the same form in 2 or 3 years?
| Will it continue to offer the same tools, the same workflows that we
| "like", and what happens when we hit a wall? It's easy to say
| "probably" or "sure" to all this, until we hit something we dislike
| and have no possibility of fixing.
|
| And once you do all this, BTW, you can 'never go back'. It seems so
| easy to just say 'submit pull requests' once and nothing else, right?
| Wrong. Once you commit to that infrastructure, it is *there* and
| simply taking it out from under the feet of those using it is not only
| unfortunate, it is *a huge timesink to undo it all*. Which amounts to
| it never happening. Oh, but you can import everything elsewhere! The
| problem is you *can't* import everything, but more importantly you
| can't *import my memories in another way*, so it's a huge blow to
| contributors to ask them about these mental time sinks, then to forget
| them all. And as your project grows, this becomes more of a memory as
| you made a first and last choice to begin with.
|
| Phabricator was 'lucky' here because it had the gateway into being the
| first review tool for us. But that wasn't because it was *better* than
| GitHub. It was because we were already using it, and it did not
| interact badly with our other tools or force us to compromise things -
| so the *cost* was low. The cost is immeasurably higher by default
| against GitHub because of this, at least to me. That's just how it is
| sometimes.
|
| Keep in mind there is a cost to everything and how you fix it. GitHub
| is not a simple patch to add a GHC feature. It is a question that
| fundamentally concerns itself with the future of the project for a
| long time. The costs must be analyzed more aggressively. Again,
| Phabricator had 'first child' preferential treatment. That's not
| something we can undo now.
|
| I know this sounds like a lot of ad hoc mumbo jumbo, but please bear
| with me: we need to identify the *root issue* here to fix it.
| Otherwise we will pay for the costs of an improper fix for a long
| time, and we are going to keep having this conversation over, and over
| again. And we need to weigh in the cost of fixing it, which is why I
| mention that so much.
|
| So with all this in mind, you're back to just using GitHub. But again
| GitHub is quite mediocre at best. So what is the point of all this?
| It's hinted at here:
|
| > the number of contributions will go up, commits will be smaller, and
| there will be more of them per pull request (contributors will be able
| to put style changes and refactorings into separate commits, without
| jumping through a bunch of hoops).
|
| The real hint is that "the number of contributions will go up". That's
| a noble goal and I think it's at the heart of this proposal.
|
| Here's the meat of it question: what is the cost of achieving this
| goal? That is, what amount of work is sufficient to make this goal
| realizable, and finally - why is GitHub *the best use of our time for
| achieving this?* That's one aspect of the cost - that it's the best
| use of the time. I feel like this is fundamentally why I always seem
| to never 'get' this argument, and I'm sure it's very frustrating on
| behalf of the people who have talked to me about it and like GitHub.
| But I feel like I've never gotten a straight answer for GHC.
|
| If the goal is actually "make more people contribute", that's pretty
| broad. I can make that very easy: give everyone who ever submits a
| patch push access. This is a legitimate way to run large projects that
| has worked. People will almost certainly be more willing to commit,
| especially when overhead on patch submission is reduced so much. Why
| not just do that instead? It's not like we even mandate code review,
| although we could. You could reasonably trust CI to catch and revert
| things a lot of the time for people who commit directly to master. We
| all do it sometimes.
|
| I'm being serious about this. I can start doing that tomorrow because
| the *cost is low*, both now and reasonably speaking into some
| foreseeable future. It is one of many solutions to raw heart of the
| proposal. GitHub is not a low cost move, but also, it is a *long term
| cost* because of the technical deficiencies it won't aim to address
| (merge commits are ugly, branch reviews are weak, ticket/PR namespace
| overlaps with Trac, etc etc) or that we'll have to work around.
|
| That means that if we want GitHub to fix the "give us more
| contributors" problem, and it has a high cost, it not only has _to fix
| the problem_, it also has to do that well enough to offset its cost. I
| don't think it's clear that is the case right now, among a lot of
| other solutions.
|
| I don't think the root issue is "We _need_ GitHub to get more
| contributors". It sounds like the complaint is more "I don't like how
| Phabricator works right now". That's an important distinction, because
| the latter is not only more specific, it's more actionable:
|
| - Things like Arcanist can be tracked as a Git submodule. There is
| little to no pain in this, it's low cost, and it can always be
| synchronized with Phabricator. This eliminates the "Must clone
| arcanist" and "need to upgrade arcanist" points.
|
| - Similarly when Phabricator sometimes kills a lot of builds, it's
| because I do an upgrade. That's mostly an error on my part and I can
| simply schedule upgrades regularly, barring hotfixes or somesuch. That
| should basically eliminate these. The other build issues are from
| picking the wrong base commit from the revision, I think, which I
| believe should be fixable upstream (I need to get a solid example of
| one that isn't a mega ultra patch.)
|
| - If Harbormaster is not building dependent patches as mentioned in
| WhyNotPhabricator, that is a bug, and I have not been aware of it.
| Please make me aware of it so I can file bugs! I seriously don't look
| at _every_ patch, I need to know this. That could have probably been
| fixed ASAP otherwise.
|
| - We can get rid of the awkwardness of squashes etc by using
| Phabricator's "immutable" history, although it introduces merge
| commits. Whether this is acceptable is up to debate (I dislike merge
| commits, but could live with it).
|
| - I do not understand point #3, about answering questions. Here's
| the reality: every single one of those cases is *almost always an
| error*. That's not a joke. Forgetting to commit a file, amending
| changes in the working tree, and specifying a reviewer are all total
| errors as it stands today. Why is this a minus? It catches a useful
| class of 'interaction bugs'. If it's because sometimes Phabricator
| yells about build arifacts in the tree, those should be .gitignore'd.
| If it's because you have to 'git stash' sometimes, this is fairly
| trivial IMO. Finally, specifying reviewers IS inconvenient, but
| currently needed. We could easily assign a '#reviewers' tag that would
| add default reviewers.
| - In the future, Phabricator will hopefully be able to
| automatically assign the right reviewers to every single incoming
| patch, based on the source file paths in the tree, using the Owners
| tool. Technically, we could do that today if we wanted, it's just a
| little more effort to add more Herald rules. This will be far, far
| more robust than anything GitHub can offer, and eliminates point #3.
|
| - Styling, linting etc errors being included, because reviews are
| hard to create: This is tangential IMO. We need to just bite the
| bullet on this and settle on some lint and coding styles, and apply
| them to the tree uniformly. The reality is *nobody ever does style
| changes on their own*, and they are always accompanied by a diff, and
| they always have to redo the work of pulling them out, Phab or not.
| Literally 99% of the time we ask for this, it happens this way.
| Perhaps instead we should just eliminate this class of work by just
| running linters over all of the source code at once, and being happy
| with it.
|
| Doing this in fact has other benefits: like `arc lint` will always
| _correctly_ report when linting errors are violated. And we can reject
| patches that violate them, because they will always be accurate.
|
| - As for some of the quotes, some of them are funny, but the real
| message lies in the context. :) In particular, there have been several
| cases (such as the DWARF work) where the idea was "write 30 commits
| and put them on Phabricator". News flash: *this is bad*, no matter
| whether you're using Phabricator or not, because it makes reviewing
| the whole thing immensely difficult from a reviewer perspective. The
| point here is that we can clear this up by being more communicative
| about what we expect of authors of large patches, and communicating
| your intent ASAP so we can get patches in as fast as possible. Writing
| a patch is the easiest part of the work.
|
| And more:
|
| - Clean up the documentation, it's a mess. It feels nice that
| everything has clear, lucid explanations on the wiki, but the wiki is
| ridiculously massive and we have a tendancy for 'link creep' where we
| spread things out. The contributors docs could probably stand to be
| streamlined. We would have to do this anyway, moving to GitHub or not.
|
| - Improve the homepage, directly linking to this aforementioned
| page.
|
| - Make it clear what we expect of contributors. I feel like a lot of
| this could be explained by having a 5 minute drive-by guide for
| patches, and then a longer 10-minute guide about A) How to style
| things, B) How to format your patches if you're going to contribute
| regularly, C) Why it is this way, and D) finally links to all the
| other things you need to know. People going into Phabricator expecting
| it to behave like GitHub is a problem (more a cultural problem IMO but
| that's another story), and if this can't be directly fixed, the best
| thing to do is make it clear why it isn't.
|
| Those are just some of the things OTTOMH, but this email is already
| way too long. This is what I mean though: fixing most of these is
| going to have *seriously smaller cost* than moving to GitHub. It does
| not account for "The GitHub factor" of people contributing "just
| because it's on GitHub", but again, that value has to outweigh the
| other costs. I'm not seriously convinced it does.
|
| I know it's work to fix these things. But GitHub doesn't really
| magically make a lot of our needs go away, and it's not going to
| magically fix things like style or lint errors, the fact Travis-CI is
| still pretty insufficient for us in the long term (and Harbormaster is
| faster, on our own hardware, too), or that it will cause needlessly
| higher amounts of spam through Trac and GitHub itself. I don't think
| settling on it as - what seems to be - a first resort, is a really
| good idea.
|
|
| On Wed, Sep 2, 2015 at 4:09 PM, Niklas Hambüchen

(I'm tempted naively to ask: is there an automated way to go from a GitHub PR to a Phab ticket? Then we could convert the former (if someone wants to submit that way) into the latter.)
yes, a github PR is just a branch. Thanks for bringing the discussion back
on track to a productive approach.
This suggestion is what some of us were getting at and it would be better
to just limit the discussion to this idea.
On Mon, Sep 7, 2015 at 6:47 AM, Simon Peyton Jones
I am very much at the ignorant end of this debate: I'll just use whatever I'm told to use. But I do resonate with this observation from Austin:
| For one, having two code review tools of any form is completely | bonkers, TBQH. This is my biggest 'obvious' blocker. If we're going to | switch, we should just switch. Having to have people decide how to | contribute with two tools is as crazy as having two VCSs and just a | way of asking people to get *more* confused, and have us answer more | questions. That's something we need to avoid.
As a code contributor and reviewer, this is awkward. As a contributor, how do I choose? As a reviewer I'm presumably forced to learn both tools.
But I'll go with the flow... I do not have a well-informed opinion about the tradeoffs.
(I'm tempted naively to ask: is there an automated way to go from a GitHub PR to a Phab ticket? Then we could convert the former (if someone wants to submit that way) into the latter.)
Simon
| -----Original Message----- | From: ghc-devs [mailto:ghc-devs-bounces@haskell.org] On Behalf Of | Austin Seipp | Sent: 03 September 2015 05:42 | To: Niklas Hambüchen | Cc: Simon Marlow; ghc-devs@haskell.org | Subject: Re: Proposal: accept pull requests on GitHub | | (JFYI: I hate to announce my return with a giant novel of negative- | nancy-ness about a proposal that just came up. I'm sorry about this!) | | TL;DR: I'm strongly -1 on this, because I think it introduces a lot of | associated costs for everyone, the benefits aren't really clear, and I | think it obscures the real core issue about "how do we get more | contributors" and how to make that happen. Needless to say, GitHub | does not magically solve both of these AFAICS. | | As is probably already widely known, I'm fairly against GitHub because | I think at best its tools are mediocre and inappropriate for GHC - but | I also don't think this proposal or the alternatives stemming from it | are very good, and that it reduces visibility of the real, core | complaints about what is wrong. Some of those problems may be with | Phabricator, but it's hard to sort the wheat from the chaff, so to | speak. | | For one, having two code review tools of any form is completely | bonkers, TBQH. This is my biggest 'obvious' blocker. If we're going to | switch, we should just switch. Having to have people decide how to | contribute with two tools is as crazy as having two VCSs and just a | way of asking people to get *more* confused, and have us answer more | questions. That's something we need to avoid. | | For the same reason, I'm also not a fan of 'use third party thing to | augment other thing to remove its deficiencies making it OK', because | the problem is _it adds surface area_ and other problems in other | cases. It is a solution that should be considered a last resort, | because it is a logical solution that applies to everything. If we | have a bot that moves GH PRs into Phab and then review them there, the | surface area of what we have to maintain and explain has suddenly | exploded: because now instead of 1 thing we have 3 things (GH, Phab, | bot) and the 3 interactions between them, for a multiplier of *six* | things we have to deal with. And then we use reviewable,io, because GH | reviews are terrible, adding a 4th mechanism? It's rube goldberg-ian. | We can logically 'automate' everything in all ways to make all | contributors happy, but there's a real *cognitive* overhead to this | and humans don't scale as well as computers do. It is not truly | 'automated away' if the cognitive burden is still there. | | I also find it extremely strange to tell people "By the way, this | method in which you've contributed, as was requested by community | members, is actually a complete proxy for the real method of | contributing, you can find all your imported code here". How is this | supposed to make contribution *easier* as opposed to just more | confusing? Now you've got the impression you're using "the real thing" | when in reality it's shoved off somewhere else to have the nitpicking | done. Just using Phabricator would be less complicated, IMO, and much | more direct. | | The same thing goes for reviewable.io. Adding it as a layer over | GitHub just makes the surface area larger, and puts less under our | control. And is it going to exist in the same form in 2 or 3 years? | Will it continue to offer the same tools, the same workflows that we | "like", and what happens when we hit a wall? It's easy to say | "probably" or "sure" to all this, until we hit something we dislike | and have no possibility of fixing. | | And once you do all this, BTW, you can 'never go back'. It seems so | easy to just say 'submit pull requests' once and nothing else, right? | Wrong. Once you commit to that infrastructure, it is *there* and | simply taking it out from under the feet of those using it is not only | unfortunate, it is *a huge timesink to undo it all*. Which amounts to | it never happening. Oh, but you can import everything elsewhere! The | problem is you *can't* import everything, but more importantly you | can't *import my memories in another way*, so it's a huge blow to | contributors to ask them about these mental time sinks, then to forget | them all. And as your project grows, this becomes more of a memory as | you made a first and last choice to begin with. | | Phabricator was 'lucky' here because it had the gateway into being the | first review tool for us. But that wasn't because it was *better* than | GitHub. It was because we were already using it, and it did not | interact badly with our other tools or force us to compromise things - | so the *cost* was low. The cost is immeasurably higher by default | against GitHub because of this, at least to me. That's just how it is | sometimes. | | Keep in mind there is a cost to everything and how you fix it. GitHub | is not a simple patch to add a GHC feature. It is a question that | fundamentally concerns itself with the future of the project for a | long time. The costs must be analyzed more aggressively. Again, | Phabricator had 'first child' preferential treatment. That's not | something we can undo now. | | I know this sounds like a lot of ad hoc mumbo jumbo, but please bear | with me: we need to identify the *root issue* here to fix it. | Otherwise we will pay for the costs of an improper fix for a long | time, and we are going to keep having this conversation over, and over | again. And we need to weigh in the cost of fixing it, which is why I | mention that so much. | | So with all this in mind, you're back to just using GitHub. But again | GitHub is quite mediocre at best. So what is the point of all this? | It's hinted at here: | | > the number of contributions will go up, commits will be smaller, and | there will be more of them per pull request (contributors will be able | to put style changes and refactorings into separate commits, without | jumping through a bunch of hoops). | | The real hint is that "the number of contributions will go up". That's | a noble goal and I think it's at the heart of this proposal. | | Here's the meat of it question: what is the cost of achieving this | goal? That is, what amount of work is sufficient to make this goal | realizable, and finally - why is GitHub *the best use of our time for | achieving this?* That's one aspect of the cost - that it's the best | use of the time. I feel like this is fundamentally why I always seem | to never 'get' this argument, and I'm sure it's very frustrating on | behalf of the people who have talked to me about it and like GitHub. | But I feel like I've never gotten a straight answer for GHC. | | If the goal is actually "make more people contribute", that's pretty | broad. I can make that very easy: give everyone who ever submits a | patch push access. This is a legitimate way to run large projects that | has worked. People will almost certainly be more willing to commit, | especially when overhead on patch submission is reduced so much. Why | not just do that instead? It's not like we even mandate code review, | although we could. You could reasonably trust CI to catch and revert | things a lot of the time for people who commit directly to master. We | all do it sometimes. | | I'm being serious about this. I can start doing that tomorrow because | the *cost is low*, both now and reasonably speaking into some | foreseeable future. It is one of many solutions to raw heart of the | proposal. GitHub is not a low cost move, but also, it is a *long term | cost* because of the technical deficiencies it won't aim to address | (merge commits are ugly, branch reviews are weak, ticket/PR namespace | overlaps with Trac, etc etc) or that we'll have to work around. | | That means that if we want GitHub to fix the "give us more | contributors" problem, and it has a high cost, it not only has _to fix | the problem_, it also has to do that well enough to offset its cost. I | don't think it's clear that is the case right now, among a lot of | other solutions. | | I don't think the root issue is "We _need_ GitHub to get more | contributors". It sounds like the complaint is more "I don't like how | Phabricator works right now". That's an important distinction, because | the latter is not only more specific, it's more actionable: | | - Things like Arcanist can be tracked as a Git submodule. There is | little to no pain in this, it's low cost, and it can always be | synchronized with Phabricator. This eliminates the "Must clone | arcanist" and "need to upgrade arcanist" points. | | - Similarly when Phabricator sometimes kills a lot of builds, it's | because I do an upgrade. That's mostly an error on my part and I can | simply schedule upgrades regularly, barring hotfixes or somesuch. That | should basically eliminate these. The other build issues are from | picking the wrong base commit from the revision, I think, which I | believe should be fixable upstream (I need to get a solid example of | one that isn't a mega ultra patch.) | | - If Harbormaster is not building dependent patches as mentioned in | WhyNotPhabricator, that is a bug, and I have not been aware of it. | Please make me aware of it so I can file bugs! I seriously don't look | at _every_ patch, I need to know this. That could have probably been | fixed ASAP otherwise. | | - We can get rid of the awkwardness of squashes etc by using | Phabricator's "immutable" history, although it introduces merge | commits. Whether this is acceptable is up to debate (I dislike merge | commits, but could live with it). | | - I do not understand point #3, about answering questions. Here's | the reality: every single one of those cases is *almost always an | error*. That's not a joke. Forgetting to commit a file, amending | changes in the working tree, and specifying a reviewer are all total | errors as it stands today. Why is this a minus? It catches a useful | class of 'interaction bugs'. If it's because sometimes Phabricator | yells about build arifacts in the tree, those should be .gitignore'd. | If it's because you have to 'git stash' sometimes, this is fairly | trivial IMO. Finally, specifying reviewers IS inconvenient, but | currently needed. We could easily assign a '#reviewers' tag that would | add default reviewers. | - In the future, Phabricator will hopefully be able to | automatically assign the right reviewers to every single incoming | patch, based on the source file paths in the tree, using the Owners | tool. Technically, we could do that today if we wanted, it's just a | little more effort to add more Herald rules. This will be far, far | more robust than anything GitHub can offer, and eliminates point #3. | | - Styling, linting etc errors being included, because reviews are | hard to create: This is tangential IMO. We need to just bite the | bullet on this and settle on some lint and coding styles, and apply | them to the tree uniformly. The reality is *nobody ever does style | changes on their own*, and they are always accompanied by a diff, and | they always have to redo the work of pulling them out, Phab or not. | Literally 99% of the time we ask for this, it happens this way. | Perhaps instead we should just eliminate this class of work by just | running linters over all of the source code at once, and being happy | with it. | | Doing this in fact has other benefits: like `arc lint` will always | _correctly_ report when linting errors are violated. And we can reject | patches that violate them, because they will always be accurate. | | - As for some of the quotes, some of them are funny, but the real | message lies in the context. :) In particular, there have been several | cases (such as the DWARF work) where the idea was "write 30 commits | and put them on Phabricator". News flash: *this is bad*, no matter | whether you're using Phabricator or not, because it makes reviewing | the whole thing immensely difficult from a reviewer perspective. The | point here is that we can clear this up by being more communicative | about what we expect of authors of large patches, and communicating | your intent ASAP so we can get patches in as fast as possible. Writing | a patch is the easiest part of the work. | | And more: | | - Clean up the documentation, it's a mess. It feels nice that | everything has clear, lucid explanations on the wiki, but the wiki is | ridiculously massive and we have a tendancy for 'link creep' where we | spread things out. The contributors docs could probably stand to be | streamlined. We would have to do this anyway, moving to GitHub or not. | | - Improve the homepage, directly linking to this aforementioned | page. | | - Make it clear what we expect of contributors. I feel like a lot of | this could be explained by having a 5 minute drive-by guide for | patches, and then a longer 10-minute guide about A) How to style | things, B) How to format your patches if you're going to contribute | regularly, C) Why it is this way, and D) finally links to all the | other things you need to know. People going into Phabricator expecting | it to behave like GitHub is a problem (more a cultural problem IMO but | that's another story), and if this can't be directly fixed, the best | thing to do is make it clear why it isn't. | | Those are just some of the things OTTOMH, but this email is already | way too long. This is what I mean though: fixing most of these is | going to have *seriously smaller cost* than moving to GitHub. It does | not account for "The GitHub factor" of people contributing "just | because it's on GitHub", but again, that value has to outweigh the | other costs. I'm not seriously convinced it does. | | I know it's work to fix these things. But GitHub doesn't really | magically make a lot of our needs go away, and it's not going to | magically fix things like style or lint errors, the fact Travis-CI is | still pretty insufficient for us in the long term (and Harbormaster is | faster, on our own hardware, too), or that it will cause needlessly | higher amounts of spam through Trac and GitHub itself. I don't think | settling on it as - what seems to be - a first resort, is a really | good idea. | | | On Wed, Sep 2, 2015 at 4:09 PM, Niklas Hambüchen
wrote: | > On 02/09/15 22:42, Kosyrev Serge wrote: | >> As a wild idea -- did anyone look at /Gitlab/ instead? | > | > Hi, yes. It does not currently have a sufficient review | functionality | > (cannot handle multiple revisions easily). | > | > On 02/09/15 20:51, Simon Marlow wrote: | >> It might feel better | >> for the author, but discovering what changed between two branches | of | >> multiple commits on github is almost impossible. | > | > I disagree with the first part of this: When the UI of the review | tool | > is good, it is easy to follow. But there's no open-source | > implementation of that around. | > | > I agree that it is not easy to follow on Github. | > _______________________________________________ | > ghc-devs mailing list | > ghc-devs@haskell.org | > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs | > | | | | -- | Regards, | | Austin Seipp, Haskell Consultant | Well-Typed LLP, http://www.well-typed.com/ | _______________________________________________ | ghc-devs mailing list | ghc-devs@haskell.org | http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs _______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

(I'm tempted naively to ask: is there an automated way to go from a GitHub PR to a Phab ticket? Then we could convert the former (if someone wants to submit that way) into the latter.)
Or: is there a way contributors can create a Phab differential off a GitHub branch? This bypasses the GitHub PR but still provides a similar ease-of-use. This one seems rather easy to imagine. Richard

As a junior ghc contributor, I have to comment that git push -u my-fork my-branch and arc diff are about of the same “cognitive load”. Yes, one must have arc on the machine, but if the right version could live as a submodule in GHC tree: even better. And a bit tangental comment: I can imagine some isolated component to be developed by responsible people in the way they find the most productive, and then pushed to the central repository, in smaller or larger chunks. So not single central repository, but more like a tree. Yet Austin/GHCHQ would need to care only about the root. For example someone with push accept could say “I’ll review and accept GitHub PR touching only base-library in the GHC tree, and submit Phab differentials upstream”. The con, is that the communication distance between original contributor and the end reviewer in Phabricator will increase, but it could still work if the person in between is committed enough. Yet base is a bit of the bad example, as almost every change is first discussed on the libraries-list. In some sense, this is “Phab differential off a GitHub branch”, but with a real person in between, not the script. And maybe someone does something like that already, but not publicly . Git is distributed system after all. Oleg G.
On 09 Sep 2015, at 15:17, Richard Eisenberg
wrote: (I'm tempted naively to ask: is there an automated way to go from a GitHub PR to a Phab ticket? Then we could convert the former (if someone wants to submit that way) into the latter.)
Or: is there a way contributors can create a Phab differential off a GitHub branch? This bypasses the GitHub PR but still provides a similar ease-of-use. This one seems rather easy to imagine.
Richard
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

Joe Hillenbrand
As a wild idea -- did anyone look at /Gitlab/ instead?
My personal experience with Gitlab at a previous job is that it is extremely unstable. I'd say even more unstable than trac and phabricator. It's especially bad when dealing with long files.
Curiously, for the nearly three years that we've been dealing with it, I couldn't have pointed at a single instability (or even just a bug), despite using a moderately loaded instance of Gitlab. Also, not being a huge enterprise yet, Gitlab folks /might/ potentially be more responsive to feature requests from a prominent open-source project.. -- с уважениeм / respectfully, Косырев Серёга

On 09/03/2015 09:18 AM, Joe Hillenbrand wrote:
As a wild idea -- did anyone look at /Gitlab/ instead?
My personal experience with Gitlab at a previous job is that it is extremely unstable. I'd say even more unstable than trac and phabricator. It's especially bad when dealing with long files.
If we're talking alternative systems, then I can personally recommend Gerrit (https://www.gerritcodereview.com/) which, while it *looks* pretty basic, it works really well with the general Git workflow. For example, it tracks commits in individual reviews, but tracks dependencies between those commits. So when e.g. you push a new series of commits implementing a feature, all those reviews just get a new "version" and you can diff between different versions of each individual commit -- this often cuts down drastically on how much you have to re-review when a new version is submitted. You can also specify auto-merge when a review gets +2 (or +1, or whatever), including rebase-before-merge-and-ff instead of having merge commits which just clutter the history needlessly. You can set up various rules using a predicate-based rules engine, for example about a review needing two approvals and/or always needing approval from an (external) build system, etc. The only setup it needs in a git hook... which it will tell you exactly how to install with a single command when you push your first review. (It's some scp command, I seem to recall.) Caveat: I haven't tried using it on Windows. Regards,
participants (16)
-
Austin Seipp
-
Bardur Arantsson
-
Edward Z. Yang
-
Greg Weber
-
Joe Hillenbrand
-
Kosyrev Serge
-
Matthias Fischmann
-
Michael Smith
-
Nikita Karetnikov
-
Niklas Hambüchen
-
Oleg Grenrus
-
Richard Eisenberg
-
Simon Marlow
-
Simon Peyton Jones
-
Thomas Miedema
-
Tuncer Ayaz