[Python-Dev] Guidance regarding what counts as breaking backwards compatibility

Brett Cannon brett at python.org
Mon Feb 3 18:01:11 CET 2014


On Sat, Feb 1, 2014 at 9:14 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:

> On 2 February 2014 11:06, Steven D'Aprano <steve at pearwood.info> wrote:
> > Hi all,
> >
> > Over on the Python-ideas list, there's a thread about the new statistics
> > module, and as the author of that module, I'm looking for a bit of
> > guidance regarding backwards compatibility. Specifically two issues:
> >
> >
> > (1) With numeric code, what happens if the module becomes more[1]
> > accurate in the future? Does that count as breaking backwards
> > compatibility?
> >
> > E.g. Currently I use a particular algorithm for calculating variance.
> > Suppose that for a particular data set, this algorithm is accurate to
> > (say) seven decimal places:
> >
> > # Python 3.4
> > variance(some_data) == 1.23456700001
> >
> > Later, I find a better algorithm, which improves the accuracy of the
> > result:
> >
> > # Python 3.5 or 3.6
> > variance(some_data) == 1.23456789001
> >
> >
> > Would this count as breaking backwards compatibility? If so, how should
> > I handle this? I don't claim that the current implementation of the
> > statistics module is optimal, as far as precision and accuracy is
> > concerned. It may improve in the future.
>
> For this kind of case, we tend to cover it in the "Porting to Python
> X.Y" section of the What's New guide. User code *shouldn't* care about
> this kind of change, but it *might*, so we split the difference and
> say "It's OK in a feature release, but not in a maintenance release".
> There have been multiple changes along these lines in our floating
> point handling as Tim Peters, Mark Dickinson et al have made various
> improvements to reduce platform-dependent behaviour (especially around
> overflow handling, numeric precision, infinity and NaN handling, etc.).
>

I agree with Nick that it's a feature release change, not a bugfix one.

I think the key to rationalizing this particular case is the use of the
words "better" and "improves". Notice you never said "broken" or "wrong",
just that you were making the estimate better. Since the previous behaviour
was not fundamentally broken and was not going to cause errors in correct
code, the change should only go in a feature release.
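
To make that concrete (this is purely illustrative and is not the algorithm
the statistics module actually uses), two mathematically equivalent ways of
computing a population variance can disagree badly once floating-point
rounding enters the picture:

    from math import fsum

    def var_naive(data):
        # "Computational" formula E[x**2] - E[x]**2: exact on paper, but
        # prone to catastrophic cancellation when the mean is large.
        n = len(data)
        return (fsum(x * x for x in data) - fsum(data) ** 2 / n) / n

    def var_two_pass(data):
        # Two-pass formula: subtract the mean first, which is far better
        # conditioned numerically.
        n = len(data)
        mean = fsum(data) / n
        return fsum((x - mean) ** 2 for x in data) / n

    data = [1e9 + x for x in (4.0, 7.0, 13.0, 16.0)]
    print(var_two_pass(data))  # 22.5, the exact population variance
    print(var_naive(data))     # badly wrong here (catastrophic cancellation)

Both are defensible implementations, so swapping one for the other is an
accuracy improvement rather than a bug fix, and a feature release is the
natural place for it.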

-Brett


>
> However, we also sometimes have module specific disclaimers - the
> decimal module, for example, has an explicit caveat that updates to
> the General Decimal Arithmetic Specification will be treated as bug
> fixes, even if they would normally not be allowed in maintenance
> releases.
>
> For a non-math related example, a comment from Michael Foord at the
> PyCon US 2013 sprints made me realise that the implementation of
> setting the __wrapped__ attribute in functools was just flat out
> broken - when applied multiple times it was supposed to create a chain
> of references that eventually terminated in a callable without the
> attribute set, but due to the bug every layer actually referred
> directly to the innermost callable (the one without the attribute
> set). Unfortunately, the docs I wrote for it were also ambiguous, so a
> lot of folks (including Michael) assumed it was working as intended. I
> have fixed the bug in 3.4, but there *is* a chance it will break
> introspection code that assumed the old behaviour was intentional and
> doesn't correctly unravel __wrapped__ chains.
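
As an illustration (mine, not from the thread): with the 3.4 fix, each
functools.wraps layer refers to the callable immediately beneath it, so
introspection code has to walk the chain rather than assume a single hop
reaches the innermost function (3.4 also added inspect.unwrap() for this):

    import functools

    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            return func(*args, **kwargs)
        return wrapper

    def base():
        pass

    once = decorate(base)
    twice = decorate(once)

    # With the fixed 3.4 behaviour, twice.__wrapped__ is once and
    # once.__wrapped__ is base, so the chain is unwound one step at a time:
    f = twice
    while hasattr(f, '__wrapped__'):
        f = f.__wrapped__
    assert f is base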
>
> > Or would that count as a bug-fix? "Variance function was inaccurate, now
> > less wrong", perhaps.
> >
> > I suppose the math module has the same issue, except that it just wraps
> > the C libraries, which are mature and stable and unlikely to change.
>
> They may look that way *now*, but that's only after Tim, Mark et al
> did a lot of work on avoiding platform-specific issues and
> inconsistencies.
>
> > The random module has a similar issue:
> >
> > http://docs.python.org/3/library/random.html#notes-on-reproducibility
>
> I think a disclaimer in the statistics module similar to the ones in
> the math module and this one in the random module would be appropriate
> - one of the key purposes of the library/language reference is to let
> us distinguish between "guaranteed behaviour user code can rely on"
> and "implementation details that user code should not assume will
> remain unchanged forever".
>
> In this case, it would likely be appropriate to point out that the
> algorithms used internally may change over time, thus potentially
> changing the error bounds in the module output.
>
> > (2) Mappings[2] are iterable. That means that functions which expect
> > sequences or iterators may also operate on mappings by accident. For
> > example, sum({1: 100, 2: 200}) returns 3. If one wanted to reserve the
> > opportunity to handle mappings specifically in the future, without being
> > locked in by backwards-compatibility, how should one handle it?
> >
> > a) document that behaviour with mappings is unsupported and may
> >    change in the future;
> >
> > b) raise a warning when passed a mapping, but still iterate over it;
> >
> > c) raise an exception and refuse to iterate over the mapping;
> >
> > d) something else?
> >
> >
> > Question (2) is of course a specific example of a more general
> > question, to what degree is the library author responsible for keeping
> > backwards compatibility under circumstances which are not part of the
> > intended API, but just work by accident?
>
> In this particular case, I consider having a single API treat mappings
> differently from other iterables to be a surprising anti-pattern;
> providing a separate API specifically for mappings is clearer (cf.
> format() vs format_map()).
>
> However, if you want to preserve maximum flexibility, the best near-term
> option is typically (c) (just disallow entirely the input you haven't
> decided how to handle yet), but either (a) or (b) would also be an
> acceptable way of achieving the same end (they're just less
> user-friendly, since they let people do something that you're already
> considering changing in the future).
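
For option (c), a small hypothetical guard (not part of the statistics
module, just a sketch of the idea) is enough to keep the door open:

    import collections.abc

    def _coerce_data(data):
        # Hypothetical helper illustrating option (c): refuse mappings
        # outright so their meaning can be decided in a later release.
        if isinstance(data, collections.abc.Mapping):
            raise TypeError('mapping input is not supported; '
                            'pass an iterable of values instead')
        return list(data)

    _coerce_data([100, 200])        # [100, 200]
    _coerce_data({1: 100, 2: 200})  # raises TypeError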
>
> In the case where you allow an API to escape into the wild without
> even a documented caveat that it isn't covered by the normal standard
> library backwards compatibility guarantees, then you're pretty much
> stuck. This is why you tend to see newer stdlib APIs exposing
> functions that return private object types, rather than exposing the
> object type itself: exposing a function just promises a callable()
> API, while exposing a class directly promises a lot more in terms of
> supporting inheritance, isinstance(), issubclass(), etc., which can
> make future evolution of that API substantially more difficult.
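
A minimal sketch of that pattern (hypothetical names, just to illustrate
the point):

    class _Result:
        # Private: callers are promised an object with a .value attribute,
        # not this particular class.
        def __init__(self, value):
            self.value = value

    def compute(x):
        # Public API: a plain factory function. Because _Result is never
        # exposed, it can later be replaced (e.g. by a namedtuple) without
        # breaking isinstance() checks or subclasses, since none can exist.
        return _Result(x * 2)

    print(compute(3).value)  # 6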
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
>