[Pandas-dev] Thoughts on adopting a 1-PR-1-commit policy?

Wes McKinney wesmckinn at gmail.com
Sun Jan 17 08:34:07 EST 2016


hey Jan,

I'm adding the mailing list back. Several comments inline

On Sun, Jan 17, 2016 at 1:09 AM, Jan Schulz <jasc at gmx.net> wrote:
> Hi,
>
> Just a different opinion: I like having commits do one logical thing
> and not squash multiple "logical complete"things together (this means
> that commit is a logical step, not the PR and that the commits should
> be clean and not contain "fixup", "typo" style commits). During the
> categorical work, I found that a few times I regretted that I couldn't
> go back to look up the specific change in a commit and look up what
> and why that commit was done because it was all mixed up with the rest
> of the squashed commits in that PR.
>

I'm not suggesting that you should have to squash your commits inside
the PR. This only concerns how *patches are applied to pandas's master
branch". Ideally, no squashing occurs inside the developer branch (so
the "story", so to speak, about the patch is preserved), but what the
Apache patch tool does is

- Turns a multi-commit PR into a single-commit patch
- Puts the individual commit hashes in the commit message; so you can
always visit the original commits
- Puts the description from the PR into the commit message
- Cherry-picks instead of merging, so you can observe evolution of
pandas/master in a clear and linear way

> My feeling was always that squashes are performed because the rebases
> are so hard because of multiple PRs fixing stuff in the same files at
> the same time. If this refactorings come through (both better
> separation of backend-frontend specific code and the suggested split
> up of frame.py, etc), I think this is not so much a problem anymore.
>
> I'm not sure where the commit history with merges is a problem: during
> balme (in github, never used git itself), I don't see any merges?!
>

Here is something that would be very hard: create release notes given
the commit history.

> So my suggestion would go in a different direction: better commits in
> the PRs with proper commit messages (not only the headline but also
> explanations in the message.
>
> Jan
> --
> Jan Schulz
> mail: jasc at gmx.net
> web: http://www.katzien.de
>
>
> On 17 January 2016 at 07:29, Wes McKinney <wesmckinn at gmail.com> wrote:
>> The script performs the squashing automatically, so users can squash
>> manually if they wish or let us do it.
>>
>> On Sat, Jan 16, 2016 at 8:11 PM, Jeff Reback <jeffreback at gmail.com> wrote:
>>> seems find to use the cherry picking script
>>>
>>> though I don't think should relax users from squashing
>>>
>>>> On Jan 16, 2016, at 6:46 PM, Wes McKinney <wesmckinn at gmail.com> wrote:
>>>>
>>>> Copying the mailing list. Indeed makes rebasing unnecessary if there
>>>> are no cherry-pick conflicts.
>>>>
>>>> On Sat, Jan 16, 2016 at 3:34 PM, Tom Augspurger
>>>> <tom.augspurger88 at gmail.com> wrote:
>>>>> Rebasing can be tough for new contributors, so for that alone I'd say let's try it.
>>>>>
>>>>> -Tom
>>>>>
>>>>>> On Jan 16, 2016, at 5:20 PM, Wes McKinney <wesmckinn at gmail.com> wrote:
>>>>>>
>>>>>> I've grown very fond of the PR cherry-picking style used in many
>>>>>> Apache projects.
>>>>>>
>>>>>> Here's an example of a very large commit to Apache Spark that was
>>>>>> performed in this fashion:
>>>>>>
>>>>>> https://github.com/apache/spark/commit/2fe0a1aaeebbf7f60bd4130847d738c29f1e3d53#diff-e1e1d3d40573127e9ee0480caf1283d6
>>>>>>
>>>>>> If you compare pandas's commit history with a project like this,
>>>>>> you'll see it is much easier to follow because there is one commit for
>>>>>> each patch to the project, rather than a merge commit plus 1 or more
>>>>>> merged commits (depending on whether the person merging the PR did an
>>>>>> interactive rebase).
>>>>>>
>>>>>> The script to do this is not too complex, and is even less complex for
>>>>>> pandas because we do not use JIRA:
>>>>>>
>>>>>> https://github.com/apache/spark/blob/master/dev/merge_spark_pr.py
>>>>>>
>>>>>> I've been using a pared down version of the script in Ibis:
>>>>>>
>>>>>> https://github.com/cloudera/ibis/blob/master/dev/merge-pr.py
>>>>>>
>>>>>> Here is an example of what a merge commit with multiple subcommits
>>>>>> looks like using this tool:
>>>>>>
>>>>>> https://github.com/cloudera/ibis/commit/eafabe060dcaaea0a6076342eaa374929b91cf47
>>>>>>
>>>>>> It's pretty easy to use: run the script and enter the PR # you are
>>>>>> merging. It automatically squashes and closes the merged PR.
>>>>>>
>>>>>> Let me know if this is something that would interest the team. I know
>>>>>> there are varying opinions on the GitHub Green Button =)
>>>>>>
>>>>>> - Wes
>>>>>> _______________________________________________
>>>>>> Pandas-dev mailing list
>>>>>> Pandas-dev at python.org
>>>>>> https://mail.python.org/mailman/listinfo/pandas-dev
>>>> _______________________________________________
>>>> Pandas-dev mailing list
>>>> Pandas-dev at python.org
>>>> https://mail.python.org/mailman/listinfo/pandas-dev
>> _______________________________________________
>> Pandas-dev mailing list
>> Pandas-dev at python.org
>> https://mail.python.org/mailman/listinfo/pandas-dev


More information about the Pandas-dev mailing list