[core-workflow] GSoC idea: bug.python.org improvements

Thu Mar 26 13:42:56 CET 2015

On Thu, Mar 26, 2015 at 12:37 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On 26 March 2015 at 19:05, Ezio Melotti <ezio.melotti at gmail.com> wrote:
>> This is actually something that I wanted to bring up on python-dev.
>> Currently our workflow is mostly patch-based, but support for
>> pull requests from BitBucket/GitHub is one of the things that we are
>> considering adding during GSoC.  In addition, the GSoC students will
>> work on a separate clone of the tracker, likely hosted on BitBucket.
>> While we could still use the patch-based approach for GSoC, this would
>> be a good chance to experiment with a more modern approach and also
>> test on the meta-tracker both the new workflow and the code that will
>> enable it.  If successful, the same approach can also be adopted for
>> CPython.
>
> I managed to forget about the idea of allowing roundup patch
> generation direct from GitHub/BitBucket, so in that case it definitely
> makes sense to pursue some of this for the current bugs.python.org
> workflow.

Note that there are 3 distinct but related features here:
  1) given a link to a separate clone/branch (e.g. on
BitBucket/GitHub), compute the differences and create a diff file
(this is already implemented for hg thanks to MvL);
  2) add a button to commit patches directly from the bug tracker
(this might happen only once we switch to Kallithea/Phabricator);
  3) automatically detect/update pull requests against the official
BitBucket/GitHub CPython mirrors.

There are also two major approaches:
  1) diff-based (we get a diff and apply/commit it ourselves, we are the
"user" in the cs);
  2) changeset-based (we get a changeset and merge it, the contributor
is the "user" in the cs);
(user is the name of the metadata field used by HG to store the
committer in the changeset.)

For the second approach, the changeset can be provided either as a
patch created with hg export, or a URL of a clone.  In both cases I
can either import/pull on my local clone and push to the main repo, or
Roundup/Kallithea/Phabricator could do the same (possibly directly on
the main repo).
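
To make the "user" field concrete, here is a minimal sketch of how a tool
could read it from a patch created with hg export.  It assumes the usual
export header (a "# HG changeset patch" line followed by "# User", "# Date"
and "# Node ID" lines); the function name, the sample patch and the node
hashes are made up for illustration, and this is not code that exists in
Roundup.

```python
# Header prefixes written by `hg export`, mapped to short field names.
HEADER_PREFIXES = {
    "# User ": "user",
    "# Date ": "date",
    "# Node ID ": "node",
}


def parse_hg_export_header(patch_text):
    """Return the header fields found at the top of an `hg export` patch."""
    lines = patch_text.splitlines()
    if not lines or lines[0].strip() != "# HG changeset patch":
        raise ValueError("not an 'hg export' patch")
    fields = {}
    for line in lines[1:]:
        if not line.startswith("#"):
            break  # the header ends at the first non-comment line
        for prefix, name in HEADER_PREFIXES.items():
            if line.startswith(prefix):
                fields[name] = line[len(prefix):].strip()
    return fields


# Hypothetical patch; the hashes below are placeholders, not real changesets.
SAMPLE_PATCH = """\
# HG changeset patch
# User Pat Chauthor <pat at chauthor.invalid>
# Date 1427373776 -3600
# Node ID 0123456789abcdef0123456789abcdef01234567
# Parent  89abcdef0123456789abcdef0123456789abcdef
Apples are no longer counted as oranges

diff --git a/fruit.py b/fruit.py
"""

header = parse_hg_export_header(SAMPLE_PATCH)
```

With the changeset-based approach, header["user"] is the contributor who
created the changeset, and importing the patch preserves that value.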

Switching to the second approach changes the semantics of the user
field, since it will start representing the contributor, not the
core-dev/committer.  This will affect hg blame/annotate, and will also
make it harder to find out how many patches a given core-dev has
committed.
If we decide not to change the semantics, we will have to edit the
user field (assuming that's possible) and save the contributor name
somewhere else.
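
A small sketch of the counting problem, using an invented history of three
changesets (the names and addresses are placeholders).  Counting by the
"user" field gives the core-dev all three commits under the diff-based
approach, but only one under the changeset-based approach:

```python
from collections import Counter

# Placeholder identities, for illustration only.
CORE_DEV = "Core Dev <core at example.invalid>"
CONTRIBUTOR = "Pat Chauthor <pat at chauthor.invalid>"

# Diff-based: the core-dev commits every patch, so "user" is always the core-dev.
diff_based = [CORE_DEV, CORE_DEV, CORE_DEV]

# Changeset-based: two of the three changesets keep the contributor as "user".
changeset_based = [CORE_DEV, CONTRIBUTOR, CONTRIBUTOR]


def commits_per_user(user_fields):
    """Count changesets per value of the "user" field."""
    return Counter(user_fields)


diff_counts = commits_per_user(diff_based)
changeset_counts = commits_per_user(changeset_based)
```

Here diff_counts[CORE_DEV] is 3 while changeset_counts[CORE_DEV] is 1, even
though the same core-dev handled all three patches.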

> It especially makes sense as both forge.python.org proposals
> target the support repos first (devguide, peps, etc), so regardless of
> which of those moves forward, it's going to be a long time before it
> impacts the CPython workflows. By contrast, improvements to Roundup's
> integration with other services can start being helpful as soon as
> they get deployed.
>

Note that nothing currently prevents us from switching to the second
approach, and some patches have already been imported with the
contributor name as "user".

>> One of the main issues is, as you mentioned, how to track both the
>> committer and the author.  This is both a technical and a
>> "philosophical" issue -- that's why I wanted to bring it up on
>> python-dev.
>
> I think there's plenty of precedent from the Git/Gerrit world here,
> but agree there should be a discussion to check for any disagreement
> with us following that precedent.
>

Git provides both fields, but Mercurial does not (unless we
use the extension you linked).  This has already been discussed a few
times in the past, see e.g.:
https://mail.python.org/pipermail/python-dev/2011-November/114540.html

>> Another issue is establishing a policy regarding branches and rebases
>> (rebasements?).
>
> In terms of actually *making* changes, I think we'd still want changes
> to effectively involve applying patches until we're able to adopt a more
> capable repo hosting service with integrated review management.
>

While this is certainly the easiest way, it could also be possible to
pull/import some changesets, tweak a few things (add Misc/ACKS and
Misc/NEWS) and make a new commit (and possibly collapse/rebase as
well).  This doesn't need any new tool.

>> These issues might eventually be solved by Kallithea/Phabricator, but
>> I expect a transition period where different workflows will be used. I
>> would like our repo to be still intact by the time we settle on the
>> new workflow.
>
> Agreed.
>
>> IIUC hgcommitter adds extra metadata to the changeset, and if we go
>> down this route we might also consider adding metadata for the issue
>> number and tweak hgweb to display both.
>
> The other possible direction to go is the direction Gerrit and the
> Linux kernel go, which is to just have a footer on commit messages for
> relevant key:value data fields, rather than using VCS metadata. This
> has the benefit of being easy to port to new VCS's later, since
> "committer & commit message" sets the "what metadata does the
> VCS track?" bar quite low.
>
> So your commit message might look like:
>
>     Apples are no longer counted as oranges
>
>     Apples were being counted as oranges in some cases. They're now
>     always counted as apples.
>
>     Issue: 12345
>     Contributed-by: Pat Chauthor <pat at chauthor.invalid>
>

This is also an option, but it requires everyone to be consistent.
Having these in the commit message will also make them less accessible
(including for the tool we will use to convert the repo to the next
quantum VCS).
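
To illustrate the consistency requirement, here is a rough sketch of how a
tool might extract such a footer.  It assumes the footer is the last
paragraph of the message and that every line in it has the form
"Key: value"; the function name and the regular expression are assumptions
made for this example, not an agreed format.

```python
import re

# One footer line: a key made of letters, digits and hyphens, then ": value".
FOOTER_LINE = re.compile(r"^([A-Za-z][A-Za-z0-9-]*):\s*(.+)$")


def parse_footer(message):
    """Return the key/value pairs from the last paragraph of a commit message.

    Returns an empty dict when the message has no separate last paragraph
    or when any line of that paragraph is not of the form "Key: value".
    """
    paragraphs = [p for p in message.strip().split("\n\n") if p.strip()]
    if len(paragraphs) < 2:
        return {}
    fields = {}
    for line in paragraphs[-1].splitlines():
        match = FOOTER_LINE.match(line.strip())
        if match is None:
            return {}  # a single inconsistent line hides the whole footer
        fields[match.group(1)] = match.group(2).strip()
    return fields


MESSAGE = """\
Apples are no longer counted as oranges

Apples were being counted as oranges in some cases. They're now
always counted as apples.

Issue: 12345
Contributed-by: Pat Chauthor <pat at chauthor.invalid>
"""

footer = parse_footer(MESSAGE)
```

A message that ends with free-form text such as "fixes issue 12345, thanks
Pat" yields an empty dict, so the data is only recoverable when committers
stick to the convention.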

Best Regards,
Ezio Melotti

> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia