[Pandas-dev] On bug-fix releases and maintenance branches

Wes McKinney wesmckinn at gmail.com
Tue Feb 23 14:21:37 EST 2016


hi Joris,

I'm sorry it's taken a couple weeks to write a reply -- been really
busy and wanted to put some thought into this.

This is a really important discussion given how important pandas has
become to so many people. Thank you for bringing it up.

On Tue, Feb 9, 2016 at 4:59 PM, Joris Van den Bossche
<jorisvandenbossche at gmail.com> wrote:
> Hi all,
>
> I wanted to stir up some discussion on pandas' policy on bug-fix releases and
> upgrading pains. First some context:
>
> Context part 1: Currently we do not use maintenance branches for bugfix
> releases, and we actually also do not really do bugfix releases. We just
> develop further on master, and try not to merge breaking changes in the first
> weeks/months, so we can do a minor kind of bug-fix release (but usually also
> with a lot of new features).
> But we don't, for example, backport fixes of regressions if they are fixed
> after master is pointing to the next major release.

I think in general it would be a good idea to tilt development away
from new feature development and toward bug fixes and stability. Given
that we are contemplating making some breaking changes in a 1.x
development branch (like removing the Panel classes), we should decide
at some point to create a 0.X.Y maintenance line where we can backport
bug fixes only, so that "legacy pandas" users can have a "LTS" (in
Ubuntu parlance) maintenance branch. This introduces some development
overhead but it seems worth it.

>
> Context part 2: pandas is not yet that stable, in the sense that there are
> still quite a few breaking changes in each release. I am not arguing against
> making these breaking changes, as some of them are really needed to
> clean up the API (although there are also arguments against them, but I think
> that is another discussion). This has the consequence that updating your
> pandas version is not always that pleasant.

Over the years I've heard many horror stories from companies who have
created and maintained internal 0.7.x, 0.8.x, or 0.9.x pandas forks
because of the API breakage issues. This is definitely an anti-pattern
that we should try to avoid in the future, but API breakages
in many cases are the inevitable price of progress.

Some of the API breakage has resulted from experiences accumulated
over a long period of time -- I made a lot of decisions early on in
the project that ended up not being the right ones (e.g. resample
default arguments changed at one point). There wasn't enough community
engagement at that point to have a thorough design process to
potentially come up with the "right" design first. In other cases, the
"right" choice was perhaps more ambiguous.

API changes are most painful for users who do not write tests for
their code that depends on pandas. That problem is probably not
fixable =)

I think having stable releases with backported fixes for serious
correctness bugs helps mitigate this problem, while keeping API changes
between major releases modest. I would also be in favor of having point
releases contain only bug fixes, rather than the current system of point
releases being a stable snapshot of trunk.

Since Jeff is the most affected by this on a day-to-day basis as de
facto steward of the PR queue, I would be curious what process he feels
would be the most helpful.

- Wes

>
> Sidenote: I do not have that much experience with using pandas in a larger
> company or in larger codebases that need to be upgraded, only with my
> own code for my PhD. So it is difficult for me to judge how much this is
> a problem or whether bug-fix releases would help.
>
> Questions:
>
> What are other people's experiences with upgrading pandas? And would more
> bug-fix releases actually ease the upgrading?
> Do we want to do more bug-fix releases?
> Having a maintenance branch and backporting fixes is extra work. Would we be
> able to handle this? Would it be worth the effort?
>
> (It has been mentioned before, but I think the main point raised was lack of
> manpower to maintain separate branches)
>
> To put it another way: our whatsnew notice says "We recommend that
> all users upgrade to this version", but I am actually not sure we should
> recommend that. I personally do not always recommend upgrading
> without careful consideration.
>
> Regards,
> Joris

