[Pandas-dev] Thoughts on adopting a 1-PR-1-commit policy?

Sun Jan 17 13:37:02 EST 2016

On Sun, Jan 17, 2016 at 9:04 AM, Jan Schulz <jasc at gmx.net> wrote:
> Hi,
>
> On 17 January 2016 at 14:34, Wes McKinney <wesmckinn at gmail.com> wrote:
>> I'm adding the mailing list back. Several comments inline
>
> Oups, sorry!
>
>> I'm not suggesting that you should have to squash your commits inside
>> the PR. This only concerns how *patches are applied to pandas's master
>> branch". Ideally, no squashing occurs inside the developer branch (so
>> the "story", so to speak, about the patch is preserved), but what the
>> Apache patch tool does is
>
> I would still argue against this: the master branch is what is used in
> blame and figuring out why something was done in that way is much
> harder if you always have to get back to some commits in obscure
> branches which might even be removed from the repo.

These issues should be addressed during the code review process. It is
worse, in my opinion, to have a mix of intermediate (possibly broken)
and verified commits in master as opposed to atomic, verified commits.

Additionally, lack of "atomicity" with patches has more issues:

- Difficult to revert patches
- Difficult to port patches into maintenance branches

Requiring patches to be atomic is common practice in large software
teams because otherwise codebase maintenance is a nightmare with >
5-10 developers working in parallel.

> IMO, all this squashing is an incentive not to write good commit
> messages as these are in the end more or less discarded as they are
> all mixed up :-(

Squash or no squash, the only way to have good commit messages is to
expect a certain level of professionalism from pandas contributors. If
the commit / PR description is inadequate, this is the responsibility
of the code reviewer to address with the developer proposing the
patch.

> At least that what happened with me: I tried to write "unit of change"
> commits (`rebase -i` all "typo" and "fixup" commits + good commits
> messages) but then these got squashed and I stopped writing such
> messages because it felt that this was simply wasted and not
> appreciated.
>
> The result was that some decisions which I took are not explained in
> the commits and were lost when the topic were revisited half a year
> latter.
>
>
>> Here is something that would be very hard: create release notes given
>> the commit history.
>
> I don't think this gets any easier as long as some manual things are
> done. There will still be simple commits which correct a typo and
> which should not show up in the release notes. Whatever technical
> solution is used for this, some discipline has to be taken to make
> that successfull.
>
> If release notes generation is what is this all about, then there
> could be several solutions:
>
> * only generate release notes from merge commits (remove headline,
> only use message)
> * only generate release notes from commits which include a tag
> * only generate release notes from merge commits which include a tag
>

No, the release notes aren't a major factor, but one of many issues
caused by a non-atomic, non-linear commit history.

> Jan
> _______________________________________________
> Pandas-dev mailing list
> Pandas-dev at python.org
> https://mail.python.org/mailman/listinfo/pandas-dev