Re: Trac to Phabricator (Maniphest) migration prototype

I wondered if someone would ask me about this. In principle I don't
see why not but I don't immediately know how to get the correct label.
In fact, this is an interesting example as "comment:21" refers to a
commit comment (https://ghc.haskell.org/trac/ghc/ticket/10547#comment:21)
which I have filtered out whilst doing the conversion and replaced by
actual links to the commits in a more idiomatic style. Thus
"comment:21" should actually point to #178376
(http://ec2-52-213-249-242.eu-west-1.compute.amazonaws.com/T10547#178376).
On Wed, Dec 21, 2016 at 1:54 PM, Sylvain Henry wrote:
Nice work!
Would it be possible to convert comment references too? For instance in http://ec2-52-213-249-242.eu-west-1.compute.amazonaws.com/T10547#182793 "comment:21" should be a link to the label #178747
If we do the transfer, we should redirect: https://ghc.haskell.org/trac/ghc/ticket/{NN}#comment:{CC} to phabricator.haskell.org/T{NN}#{tracToPhabComment(NN,CC)} where "tracToPhabComment" function remains to be written ;-)
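A rough sketch of what such a redirect mapping might look like, in Python for brevity. Everything here is an assumption for illustration: `COMMENT_ANCHORS` is a hypothetical lookup table that the migration would have to record, seeded with the one pair named in this thread (ticket 10547, comment 21, anchor 178376 per Matthew's reply), and `trac_to_phab_comment` simply follows the "tracToPhabComment" placeholder name.

```python
import re
from typing import Optional

# Hypothetical table recorded during the migration:
# (trac ticket, trac comment number) -> Phabricator anchor id.
COMMENT_ANCHORS = {(10547, 21): 178376}

TRAC_URL = re.compile(
    r"^https://ghc\.haskell\.org/trac/ghc/ticket/(\d+)(?:#comment:(\d+))?$"
)


def trac_to_phab_comment(ticket: int, comment: int) -> Optional[int]:
    """Look up the Phabricator anchor for a Trac comment, if one is known."""
    return COMMENT_ANCHORS.get((ticket, comment))


def redirect(url: str, base: str = "https://phabricator.haskell.org") -> Optional[str]:
    """Map a Trac ticket URL to the corresponding Phabricator task URL."""
    m = TRAC_URL.match(url)
    if m is None:
        return None
    ticket = int(m.group(1))
    target = f"{base}/T{ticket}"
    if m.group(2) is not None:
        anchor = trac_to_phab_comment(ticket, int(m.group(2)))
        if anchor is not None:
            # Unknown comments fall back to the task page itself.
            target += f"#{anchor}"
    return target
```

One caveat: the `#comment:CC` part is a URL fragment, which browsers do not send to the server, so a mapping like this would probably have to run client-side or be applied to links at conversion time rather than in a plain server redirect.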
Thanks, Sylvain
On 21/12/2016 11:12, Matthew Pickering wrote:
Dear devs,
I have completed writing a migration which moves tickets from trac to phabricator. The conversion is essentially lossless. The trac transaction history is replayed which means all events are transferred with their original authors and timestamps. I welcome comments on the work I have done so far, especially bugs as I have definitely not looked at all 12000 tickets.
http://ec2-52-213-249-242.eu-west-1.compute.amazonaws.com
All the user accounts are automatically generated. If you want to see the tracker from your perspective then send me an email or ping me on IRC and I can set the password of the relevant account.
NOTE: This is not a decision. The prototype exists to show that the migration is feasible in a satisfactory way and to remove hypothetical arguments from the discussion.
I must also thank Dan Palmer and Herbert who helped me along the way. Dan was responsible for the first implementation and setting up much of the infrastructure at the Haskell Exchange hackathon in October. We extensively used the API bindings which Herbert had been working on.
Further information below!
Matt
=====================================================================
Reasons
======
Why this change? The main argument is consolidation. Having many different services is confusing for new and old contributors. Phabricator has proved effective as a code review tool. It is modern and actively developed with a powerful feature set which we currently only use a small fraction of.
Trac is showing signs of its age. It is old and slow, and users regularly lose comments by accidentally refreshing their browser. Further to this, the integration with other services is quite poor. Commits do not close tickets which mention them and the only link to commits is a comment. Querying the tickets is also quite difficult; I usually resort to using Google search or my emails to find the relevant ticket.
Why is Phabricator better?
====================
Through learning more about Phabricator, there are many small things that I think it does better which will improve the usability of the issue tracker. I will list a few but I urge you to try it out.
* Commits which mention ticket numbers are currently posted as Trac comments. There is better integration in Phabricator as linking to commits has first-class support.
* Links with differentials are also more direct than the current custom field, which means you must update two places when posting a differential.
* Fields are verified so that misspelling user names is not possible (see #12623 where Ben misspelled his name for example).
* This is also true for projects and other fields. Inspecting these fields on Trac you will find that the formatting on each ticket is often quite different.
* Keywords are much more useful as the set of used keywords is discoverable.
* Related tickets are much more substantial as the status of related tickets is reflected to the parent ticket. (http://ec2-52-213-249-242.eu-west-1.compute.amazonaws.com/T7724)
Implementation
============
Keywords are implemented as projects. A project is a combination of a tag which can be used with any Phabricator object, a workboard to organise tasks and a group of people who care about the topic. Not all keywords are migrated. Only keywords with at least 5 tickets were added to avoid lots of useless projects. The state of keywords is still a bit unsatisfactory but I wanted to take this chance to clean them up.
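The keyword threshold described above could be sketched like this. It is only an illustration and not the migration's actual code: the input shape (a mapping from ticket number to its list of keywords) and the name `keywords_to_migrate` are assumptions.

```python
from collections import Counter
from typing import Dict, Iterable, List


def keywords_to_migrate(
    ticket_keywords: Dict[int, Iterable[str]], min_tickets: int = 5
) -> List[str]:
    """Return the keywords that appear on at least `min_tickets` tickets."""
    counts: Counter = Counter()
    for keywords in ticket_keywords.values():
        # Count each keyword at most once per ticket.
        counts.update(set(keywords))
    return sorted(k for k, n in counts.items() if n >= min_tickets)
```

Keywords below the threshold would simply not get a project; whether they should be dropped or folded into a related project is a separate clean-up question.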
Custom fields such as architecture and OS are replaced by *projects* just like keywords. This has the same advantage as other projects. Users can be subscribed to projects and receive emails when new tickets are tagged with a project. The large majority of tickets have very little additional metadata set. I also implemented these as custom fields but found the result to be less satisfactory.
Some users who have trac accounts do not have phab accounts. Fortunately it is easy to create new user accounts for these users which have empty passwords which can be recovered by the appropriate email address. This means tickets can be properly attributed in the migration.
The ticket numbers are maintained. I still advocate moving the infrastructure tickets in order to maintain this mapping, especially as there has been little activity in the last year.
Tickets are linked to the relevant commits, differentials and other tickets. There are 3000 dummy differentials which are used to test that the linking works correctly. Of course with real data, the proper differential would be linked. (http://ec2-52-213-249-242.eu-west-1.compute.amazonaws.com/T11044)
There are a couple of issues currently with the migration. There are a few issues in the parser which converts Trac markup to Remarkup. Most comments are very simple, with just paragraphs and code blocks, but complex items like lists are sometimes parsed incorrectly. Definition lists are converted to tables as there is no equivalent in Remarkup. Trac ticket links are converted to Phab ticket links.
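To give a flavour of the simple cases, here is a minimal line-based sketch of two of the conversions mentioned above: Trac `{{{ ... }}}` code blocks become fenced Remarkup blocks, and `#NNNN` ticket links become `TNNNN`. It is not the parser used in the prototype; the function name and the exact link pattern are assumptions, and lists, definition lists and inline markup are deliberately ignored.

```python
import re

# Remarkup code fence, built here to avoid a literal fence inside the example.
FENCE = "`" * 3

# A Trac ticket link such as "#10547", not preceded by a word character.
TICKET_LINK = re.compile(r"(?<![\w&])#(\d+)\b")


def trac_to_remarkup(text: str) -> str:
    """Convert code-block delimiters and ticket links, line by line."""
    out = []
    in_code = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped == "{{{" and not in_code:
            in_code = True
            out.append(FENCE)
        elif stripped == "}}}" and in_code:
            in_code = False
            out.append(FENCE)
        elif in_code:
            # Leave code untouched, including anything that looks like a link.
            out.append(line)
        else:
            out.append(TICKET_LINK.sub(r"T\1", line))
    return "\n".join(out)
```

Tracking whether we are inside a code block matters because code often contains `#` followed by digits that must not be rewritten.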
The ideal time to migrate is before the end of January. The busiest time for the issue tracker is before and after a new major release. With 8.2 planned for around April this gives the transition a few months to settle. We can close the Trac issue tracker and continue to serve it or preferably redirect users to the new ticket. I don't plan to migrate the wiki at this stage as I do not feel that the parser is robust enough although there are now few other technical challenges blocking this direction.
_______________________________________________
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

I regularly use comment references on Trac, and I know others do, too. While I'm not saying they need to be supported in a prototype before we elect to go ahead with this route, I would say that preserving comment references is a necessary part of this migration. Along similar lines, the comment numbers in Trac are useful. Does Phab support human-readable comment numbers? Or only those hashes? (I consider a 6-digit number too long to be human-readable.) Having nice comment numbers isn't a necessary feature for me, but losing them would be a small loss that might have to be balanced out by other gains.
Thanks, Matthew, for doing this!
For the record, this email does not express an opinion about the overall merit of this move, just a few technical points. I do not have a considered position on overall merit.
Richard

I was interested to see how many times the raw comment syntax was used as I don't use it myself. Here are the three queries I ran.
-- Occurrences of the piece of syntax
SELECT COUNT(*) FROM ticket_change WHERE field='comment' AND newvalue LIKE '%comment:%';
3783
-- Instances of the syntax from using the reply button
SELECT COUNT(*) FROM ticket_change WHERE field='comment' AND newvalue LIKE '%[comment:%';
2957
-- Total comments
SELECT COUNT(*) FROM ticket_change WHERE field='comment';
75967
So the raw syntax (3783 - 2957 = 826 comments once the reply-button instances are excluded) is only used in about 1% of all comments.
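For reference, the ratios behind these counts can be checked with a few lines of Python. The variable names are made up for the example, and reading the "about 1%" figure as the hand-written references (all matches minus reply-button matches) is an assumption about what was meant; it also assumes every `[comment:` match comes from the reply button.

```python
total_comments = 75967   # all comments
any_reference = 3783     # comments containing "comment:"
reply_button = 2957      # comments containing "[comment:"


def percent(part: int, whole: int) -> float:
    """Share of `part` in `whole`, as a percentage rounded to one decimal."""
    return round(100 * part / whole, 1)


# Comments that mention "comment:" without the bracketed reply form.
hand_written = any_reference - reply_button

print(percent(any_reference, total_comments))  # any comment reference
print(percent(reply_button, total_comments))   # reply-button references
print(percent(hand_written, total_comments))   # hand-written references
```

On these numbers, roughly 5% of comments contain some comment reference, about 3.9% come from the reply button, and about 1.1% are written by hand.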
Then looking at the culprits for some fun:
(simonpj,192)
(goldfire,123)
(bgamari,116)
(thomie,102)
(nomeata,30)
(rwbarton,28)
(RyanGlScott,19)
(simonmar,18)
were the most frequent comment referencers.
I don't think keeping these internal references would be too
difficult. I have now worked out where the number comes from and it is
easy to get.
Matt

On Dec 21, 2016, at 10:02 AM, Matthew Pickering wrote:
Then looking at the culprits for some fun:
(simonpj,192) (goldfire,123) (bgamari,116) (thomie,102) (nomeata,30) (rwbarton,28) (RyanGlScott,19) (simonmar,18)
Ha. I also use the long form comment:XX:ticket:YY sometimes, for cross-ticket and from-wiki comment referencing. These are somewhat rare, but I doubt it would be hard to preserve this form, if you're aware of it.
Thanks for looking into it!
Richard
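As a small illustration of handling both forms, the following Python sketch picks out `comment:XX` and `comment:XX:ticket:YY` references from a piece of text. The names `CommentRef` and `find_comment_refs` are invented for the example, and other Trac link forms are not covered.

```python
import re
from typing import List, NamedTuple, Optional


class CommentRef(NamedTuple):
    comment: int
    ticket: Optional[int]  # None means "the ticket the text appears on"


# Matches "comment:21" and the long form "comment:21:ticket:10547".
COMMENT_REF = re.compile(r"\bcomment:(\d+)(?::ticket:(\d+))?")


def find_comment_refs(text: str) -> List[CommentRef]:
    """Return every comment reference found in `text`, in order."""
    return [
        CommentRef(int(comment), int(ticket) if ticket else None)
        for comment, ticket in COMMENT_REF.findall(text)
    ]
```

A converter could then resolve each reference to the migrated comment, using the enclosing ticket when `ticket` is `None`.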
Participants (2): Matthew Pickering, Richard Eisenberg