[Pandas-dev] Version Policy following 1.0 (Marc Garcia)

Tue Jul 23 17:44:31 EDT 2019

All:

Let me add some perspective on version numbering from my years as a
product manager in the software industry, where we had a lot of
discussions on these issues.

We didn't tie deprecation events to "major" or "minor" releases.  When we
decided to deprecate something, we said it would be deprecated in a
"future" release, which typically meant 3 releases later, whether those
releases were major or minor.

At ILOG, the decision to have a major release (incrementing the number
before the decimal point) was solely a marketing decision. If we felt that
the changes in the new release were significant, then we would bump up the
major release number.  At IBM, there were some "rules" that said whether we
could have a major release.  So we had a lot of minor releases (from a
version 12.0 up to a version 12.9 today), and we still counted each "minor"
release toward the deprecation count.

So a policy could be:

1. Announce "will be deprecated in the future" in release X.
2. Announce "will be deprecated in 2 releases" in release X+1.
3. Announce "will be deprecated in next release" in release X+2 and include
   warning messages in code.
4. Announce "has been deprecated" in release X+3 with the code removed.

So it doesn't matter whether X=1.0, X+1=1.1, X+2=1.2, and X+3=1.3, or
X=1.0, X+1=1.1, X+2=2.0, and X+3=2.1.
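
To make the counting concrete, here is a minimal Python sketch of the idea,
not an existing pandas facility. The names `STAGES` and `stage_for` and the
two release histories are hypothetical, chosen only to mirror the examples
above. The stage is derived from the position of a release in the history,
never from its version number.

```python
from typing import List

# Stage labels copied from the four announcements in the policy above.
STAGES = [
    "will be deprecated in the future",    # release X
    "will be deprecated in 2 releases",    # release X+1
    "will be deprecated in next release",  # release X+2, warnings added in code
    "has been deprecated",                 # release X+3, code removed
]


def stage_for(releases: List[str], announced_in: str, current: str) -> str:
    """Return the announcement for `current`, counting releases by position only."""
    offset = releases.index(current) - releases.index(announced_in)
    if offset < 0:
        raise ValueError(f"{current} precedes the announcing release {announced_in}")
    # After X+3 the code is gone, so the last stage applies from then on.
    return STAGES[min(offset, len(STAGES) - 1)]


# Two hypothetical release histories matching the examples in the text.
minor_only = ["1.0", "1.1", "1.2", "1.3"]
with_major = ["1.0", "1.1", "2.0", "2.1"]

for history in (minor_only, with_major):
    for release in history:
        print(release, "->", stage_for(history, "1.0", release))
```

Both histories walk through the same four stages, because only the number
of releases since the announcement is counted.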

Don't let the numbering decide the deprecation policy.  Just call each
"major" or "minor" release a "release" and use something like the "3
release" policy stated above.

And if you do decide on the "3 release" policy, have some way to keep track
of when each thing is going to be deprecated.  I'm not sure that exists
today for pandas.
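
One possible shape for such tracking, as a rough Python sketch under stated
assumptions: `Deprecation`, `DeprecationRegistry`, and the `planned` release
list are all hypothetical names, and nothing here is claimed to exist in
pandas. The `DataFrame.get_dtype_counts` entry and its replacement are taken
from Tom's example message quoted below.

```python
import warnings
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Deprecation:
    name: str          # API being deprecated
    announced_in: str  # release X in which the deprecation was announced
    replacement: str   # what users should call instead


class DeprecationRegistry:
    """Hypothetical tracker: removal happens `cycle` releases after announcement."""

    def __init__(self, releases: List[str], cycle: int = 3) -> None:
        self.releases = releases  # ordered list of past and planned releases
        self.cycle = cycle
        self._items: Dict[str, Deprecation] = {}

    def register(self, dep: Deprecation) -> None:
        self._items[dep.name] = dep

    def removal_release(self, name: str) -> str:
        """Release in which `name` is scheduled to be removed."""
        dep = self._items[name]
        idx = self.releases.index(dep.announced_in) + self.cycle
        if idx >= len(self.releases):
            raise LookupError(f"removal release for {name} is not planned yet")
        return self.releases[idx]

    def due_for_removal(self, current: str) -> List[str]:
        """Names whose deprecation cycle is complete as of `current`."""
        now = self.releases.index(current)
        return sorted(
            name
            for name, dep in self._items.items()
            if now - self.releases.index(dep.announced_in) >= self.cycle
        )

    def warn(self, name: str) -> None:
        """Emit a warning that names the removal release."""
        dep = self._items[name]
        warnings.warn(
            f"{dep.name} is deprecated and will be removed in "
            f"{self.removal_release(name)}. Use {dep.replacement} instead.",
            FutureWarning,
            stacklevel=2,
        )


# Hypothetical release plan; the version strings are illustrative only.
planned = ["1.0", "1.1", "1.2", "1.3", "1.4"]
registry = DeprecationRegistry(planned)
registry.register(
    Deprecation(
        "DataFrame.get_dtype_counts", "1.0", "DataFrame.dtypes.value_counts()"
    )
)
print(registry.removal_release("DataFrame.get_dtype_counts"))
print(registry.due_for_removal("1.2"))
print(registry.due_for_removal("1.3"))
```

Keeping the schedule in one place means a release manager can ask what is
due for removal in a given release instead of searching the code base for
old warnings.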

   -Irv Lustig  (Dr-Irv)

On Tue, Jul 23, 2019 at 12:00 PM <pandas-dev-request at python.org> wrote:

> Today's Topics:
>
>    1. Re: Version Policy following 1.0 (Marc Garcia)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Mon, 22 Jul 2019 17:25:17 +0100
> From: Marc Garcia <garcia.marc at gmail.com>
> To: Tom Augspurger <tom.augspurger88 at gmail.com>
> Cc: Matthew Rocklin <mrocklin at gmail.com>, pandas-dev
>         <pandas-dev at python.org>
> Subject: Re: [Pandas-dev] Version Policy following 1.0
> Message-ID:
>         <CAEk5N5sNDOFnG+3-JAvANi7pGFvLH5sSvtUknNdEx0zOquaoOQ at mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> My comment was about end users, not developers of other packages. I would
> say that a company with a large code base and many dependencies, including
> pandas, would prefer to keep updating pandas within 1.* without worrying
> much about breaking anything, and to plan a 1.* to 2.* migration carefully.
>
> I'm also unsure whether removing deprecated things in a rolling way will
> lead to faster progress. I'm more inclined to forget about removing stuff
> most of the time, and for major versions just remove everything. I think it
> would make our life easier to not have the overhead of
> https://github.com/pandas-dev/pandas/issues/6581
>
> My feeling is that everything will be simpler for everyone with SemVer. We
> are the ones deciding when we release a major version, so if the main
> advantage of rolling deprecations is removing things faster, we can still
> keep deprecated stuff for exactly as long as we want. But maybe in practice
> there are other factors that I'm not considering. If everyone else thinks
> rolling deprecations will be better, I'm OK with it; I may be wrong.
>
> On Mon, Jul 22, 2019 at 4:54 PM Tom Augspurger
> <tom.augspurger88 at gmail.com> wrote:
>
> > Thanks Matt,
> >
> > Marc, do you have thoughts on that? As you say, SemVer is more popular in
> > absolute terms. But within our little community (NumPy, pandas,
> > scikit-learn), rolling deprecations seem to be the preferred approach.
> > I think there's some value in being consistent with those libraries.
> >
> > Joris / Wes, do you know what Arrow's policy will be after its 1.0?
> >
> > Tom
> >
> > On Sun, Jul 21, 2019 at 10:10 AM Matthew Rocklin <mrocklin at gmail.com>
> > wrote:
> >
> >> Hi All,
> >>
> >> I hope you don't mind the intrusion of a non-pandas dev here.  In my
> >> opinion SemVer makes more sense for libraries with a well defined and
> >> narrowly scoped API, and less sense for an API as vast as the Pandas API.
> >>
> >> My ardent hope as a user is that you all will clean up and improve the
> >> Pandas API continuously.  While doing this work I fully expect small bits
> >> of the API to break on pretty much every release (I think that it would be
> >> hard to avoid this).  My guess is that if this community adopted SemVer
> >> then devs would be far more cautious about tidying things, which I think
> >> would be unfortunate.  As someone who is very sensitive to changes in the
> >> Pandas API I'm fully in support of the devs breaking things regularly if it
> >> means faster progress.
> >>
> >> Best,
> >> -matt
> >>
> >> On Sun, Jul 21, 2019 at 2:24 AM Marc Garcia <garcia.marc at gmail.com>
> >> wrote:
> >>
> >>> Personally, I think it will make things easier for users if we use SemVer.
> >>> Mainly because as a user I think it's somewhat intuitive that I can
> >>> upgrade, for example, from 1.1.0 to 1.8.0 without having to edit code.
> >>> But I'd expect to have to take a closer look and check for
> >>> incompatibilities when I migrate from 1.* to 2.*.
> >>>
> >>> Personally I don't usually know the deprecation policies of packages,
> >>> and I don't expect most pandas users to know ours even now. So, the
> >>> simpler the better.
> >>>
> >>> Also, for ourselves, I think it's easier and more efficient to forget
> >>> about removing code in minors, and to do all the removals for majors.
> >>>
> >>> I agree this comes at the cost of bigger changes in major releases,
> >>> compared to a rolling policy. But IMHO it's worth it.
> >>>
> >>>
> >>> On Fri, 19 Jul 2019, 21:58 Tom Augspurger,
> >>> <tom.augspurger88 at gmail.com> wrote:
> >>>
> >>>> On Wed, Jul 17, 2019 at 4:42 PM Brock Mendel <jbrockmendel at gmail.com>
> >>>> wrote:
> >>>>
> >>>>> Do we anticipate the rate of deprecations decreasing significantly?
> >>>>> i.e. if right now we deprecated everything on which there is a
> >>>>> consensus in GH, would we be done for a while?
> >>>>>
> >>>>> If not, then I think we're better off sticking with zero-dot-*, or
> >>>>> else we'll be bumping major versions really frequently.
> >>>>>
> >>>>
> >>>> I think this is why I prefer sticking with rolling. If every release
> >>>> bumps the major version number, then no release is a major release.
> >>>>
> >>>> So my preference would be
> >>>>
> >>>> 1. Formally adopt and document that we are using a rolling deprecation
> >>>>    cycle
> >>>> 2. State that deprecations will be around for `N` major releases (3?)
> >>>> 3. Require that every new deprecation includes the version it'll be
> >>>>    enforced in. e.g.
> >>>>
> >>>> ```
> >>>> DataFrame.get_dtype_counts is deprecated, and will be removed in pandas 1.3.0.
> >>>> Use DataFrame.dtypes.value_counts() instead.
> >>>> ```
> >>>>
> >>>> Tom
> >>>>
> >>>>> On Wed, Jul 17, 2019 at 1:49 PM Tom Augspurger
> >>>>> <tom.augspurger88 at gmail.com> wrote:
> >>>>>
> >>>>>> Split from
> >>>>>> https://mail.python.org/pipermail/pandas-dev/2019-July/001030.html
> >>>>>>
> >>>>>> Following 1.0, I think we stop outright breaking APIs. I think that
> >>>>>> stability will be welcome to users.
> >>>>>>
> >>>>>> We still have to decide how we deprecate APIs. The two options are
> >>>>>>
> >>>>>> 1. Rolling deprecations: Essentially what we do today: An API is
> >>>>>> deprecated in release 1.1.0 and can be removed in (say) 1.4.0.
> >>>>>> 2. SemVer: An API may be deprecated in 1.x.0. It can be removed in
> >>>>>> 2.0.0
> >>>>>>
> >>>>>> Do people have preferences between the two? The (dis?)advantage of
> >>>>>> SemVer is that all the API-breaking changes are restricted to a single
> >>>>>> release. With rolling deprecations, the upgrades from any 1.x to 1.y
> >>>>>> should be smoother than 1.x to 2.x.
> >>>>>>
> >>>>>> Once we choose a strategy, we may want to formalize release schedules
> >>>>>> around it.
> >>>>>>
> >>>>>> Tom
> >>>>>> _______________________________________________
> >>>>>> Pandas-dev mailing list
> >>>>>> Pandas-dev at python.org
> >>>>>> https://mail.python.org/mailman/listinfo/pandas-dev
> >>>>>>
> End of Pandas-dev Digest, Vol 74, Issue 16