[Pandas-dev] Thoughts on adopting a 1-PR-1-commit policy?

Wes McKinney wesmckinn at gmail.com
Wed Jun 8 18:11:23 EDT 2016


+1 -- For small patches I think the squash-merge-cherry-pick button is
fine. For larger issues (or where there is some useful discussion of
the motivation for the patch in the PR description), it might be
easier / better to use the patch tool.

On Mon, Jun 6, 2016 at 6:58 AM, Jeff Reback <jeffreback at gmail.com> wrote:
> I c, ok then that seems reasonable for small ones.
>
> Occasionally I have to fix a PR, so i pull it down, fix and then merge. It
> ends up looking very similar. Again these
> are usually very small / limited changes, but author just doesn't fix
> things.
>
> I usually have to manually add the issue reference in that case as the PR
> header text is not included.
>
> But that's why I review the commit message anyhow.
>
> So ok by me.
>
> Jeff
>
> On Mon, Jun 6, 2016 at 9:46 AM, Joris Van den Bossche
> <jorisvandenbossche at gmail.com> wrote:
>>
>> I tried out the button yesterday to see what it does with a small patch:
>> https://github.com/pydata/pandas/pull/13369 (and the actual commit:
>> https://github.com/pydata/pandas/commit/e90d411714e7deac73e3e6b763ba9dccd3549871).
>>
>> So you can see I also edited the commit message (as the PR title didn't
>> reflect the actual change anymore).
>> So in general the button can capture a similar workflow (certainly for
>> small patches). One difference is that you do not get links to the
>> individual commits (only a list of the messages), but I am not sure this is
>> essential (you can also see those in the PR).
>>
>> Joris
>>
>> 2016-06-06 15:25 GMT+02:00 Jeff Reback <jeffreback at gmail.com>:
>>>
>>> This IS currently enabled (merge squashing only). Though I haven't
>>> clicked the button so not exactly sure what it actually outputs. So I
>>> suppose it gives basically the same output then it would be ok.
>>>
>>> Using the script, I generally edit the commit to remove the
>>>
>>>  - [ ] closes #xxxx
>>>  - [ ] tests added / passed
>>>  - [ ] passes ``git diff upstream/master | flake8 --diff``
>>>  - [ ] whatsnew entry
>>>
>>> which is what our PR template gives (and unless the user removes it it
>>> will propogate to the log)
>>>
>>> I *mostly* make them look like this, which is concise, yet retains all of
>>> the pertinent info.
>>>
>>> commit 6edf4471d1e55ce6b587a7e37dd540d787716413
>>> Author: Mike Graham <mikegraham at gmail.com>
>>> Date:   Mon Jun 6 08:10:10 2016 -0400
>>>
>>>     DOC: Fix wording/grammar for rolling's win_type argument.
>>>
>>>     Author: Mike Graham <mikegraham at gmail.com>
>>>
>>>     Closes #13376 from mikegraham/master and squashes the following
>>> commits:
>>>
>>>     ec0c88e [Mike Graham] DOC: Fix wording/grammar for rolling's win_type
>>> argument.
>>>
>>> So all for using the button if it can capture a similar workflow (just
>>> haven't tried it out).
>>>
>>> The biggest reason NOT to use the button is that it gives you a chance to
>>> do a final check / test on your local machine with the actual merged
>>> content. More that a few times I have found issues and backed out of the
>>> process. Though for simpler changes (like a doc-typo that's not necessary).
>>>
>>> Finally merging locally DOES allow me to edit the whatsnew for
>>> consistency and such (or other files). Generally these are trivial things
>>> like spacing / doc-style and such, where a back-and-forth is not necessary.
>>>
>>> Jeff
>>>
>>>
>>> On Mon, Jun 6, 2016 at 8:56 AM, Joris Van den Bossche
>>> <jorisvandenbossche at gmail.com> wrote:
>>>>
>>>> Triggered by a recent tweet of Wes
>>>> (https://twitter.com/wesmckinn/status/738111494852775936), I thought it
>>>> would maybe a good time to evaluate the change in merging practices in
>>>> pandas.
>>>> I am certainly positive on the change in general (my only concern is
>>>> that we shouldn't do additional changes in the middle of using the merge
>>>> script, so you can still look at the 'Files changed' on github to see what
>>>> actually changed).
>>>>
>>>> But, the main thing I was wondering: in the meantime github added the
>>>> feature to squash on merge through the green merge button
>>>> (https://github.com/blog/2141-squash-your-commits). Are there still
>>>> advantages of using our own custom script over the green button?
>>>>
>>>> - As far as I see, the main difference is that the github button does
>>>> not include the text of the first comment in the squashed commit (but
>>>> certainly for smaller PRs, this is not really a problem IMO, and the commit
>>>> message gets a link to the PR)
>>>> - On the plus side, it is easier (certainly for quickly merging a small
>>>> PR, using the merge script is some overhead IMO), and you get the visual
>>>> indication on github that the PR is 'merged' instead of 'closed' (but
>>>> actually merged).
>>>>
>>>> Jeff, would you be OK with starting to use the green button again (with
>>>> the squash on merge option, but you already disabled the merge commit
>>>> option), at least for the smaller PRs?
>>>>
>>>> Regards,
>>>> Joris
>>>>
>>>>
>>>> 2016-01-18 23:59 GMT+01:00 Wes McKinney <wesmckinn at gmail.com>:
>>>>>
>>>>> On Sun, Jan 17, 2016 at 3:34 PM, Wes McKinney <wesmckinn at gmail.com>
>>>>> wrote:
>>>>> > Yeah, the point of this issue is not "there shall be no more than 1
>>>>> > commit per PR" but rather that smaller patches (i.e., most patches)
>>>>> > should not degrade the signal-to-noise ratio of our commit history.
>>>>> > Further, we should avoid merging commits that don't stand on their
>>>>> > own. Lastly, merge commits generally only serve to degrade the SnR
>>>>> > ratio.
>>>>> >
>>>>> > Let's look at a sample of yesterday's commits:
>>>>> >
>>>>> >
>>>>> > https://www.dropbox.com/s/mp5yfp76h6h8z3y/commit-log-20160116.png?dl=0
>>>>> >
>>>>> > No mistakes were made here, except that our current process (which
>>>>> > Jeff has been following diligently) is resulting in a commit history
>>>>> > that is less useful than it could be.
>>>>> >
>>>>> > My preference is:
>>>>> >
>>>>> > - Use the patch tool for smaller patches and large patches that
>>>>> > haven't been split out into a series of incremental, standalone
>>>>> > patches
>>>>> > - For large patches that make sense as multiple incremental commits,
>>>>> > none of which breaks the build, merge with --ff-only (rebasing as
>>>>> > needed). I expect this to be rare.
>>>>> >
>>>>> > I really like to avoid "edge-case driven development" -- we are bound
>>>>> > to have patches where this guidance doesn't feel right, and we
>>>>> > definitely don't have to dogmatically follow it.
>>>>> >
>>>>>
>>>>> For the record -- I spent some time reviewing the major category dtype
>>>>> pull requests that were merged in 2014, and given the sprawling nature
>>>>> of those changes and the huge amount of collaboration that took place,
>>>>> I agree it would have been preferable to fast-forward merge the
>>>>> incremental commits instead of squashing them into a couple of
>>>>> monolithic commits.
>>>>>
>>>>>
>>>>> https://github.com/pydata/pandas/commit/0f62d3fc62f317538044ed3d349bfb89fb7ee9de
>>>>>
>>>>> https://github.com/pydata/pandas/commit/ea0a13c172761348d08285a19ebf731cdabb2db3
>>>>>
>>>>> > - Wes
>>>>
>>>>
>>>> _______________________________________________
>>>> Pandas-dev mailing list
>>>> Pandas-dev at python.org
>>>> https://mail.python.org/mailman/listinfo/pandas-dev
>>>>
>>>
>>
>>
>> _______________________________________________
>> Pandas-dev mailing list
>> Pandas-dev at python.org
>> https://mail.python.org/mailman/listinfo/pandas-dev
>>
>
>
> _______________________________________________
> Pandas-dev mailing list
> Pandas-dev at python.org
> https://mail.python.org/mailman/listinfo/pandas-dev
>


More information about the Pandas-dev mailing list