[Python-Dev] Guidance regarding what counts as breaking backwards compatibility

Terry Reedy tjreedy at udel.edu
Sun Feb 2 03:11:22 CET 2014


On 2/1/2014 8:06 PM, Steven D'Aprano wrote:
> Hi all,
>
> Over on the Python-ideas list, there's a thread about the new statistics
> module, and as the author of that module, I'm looking for a bit of
> guidance regarding backwards compatibility. Specifically two issues:
>
>
> (1) With numeric code, what happens if the module becomes more[1]
> accurate in the future? Does that count as breaking backwards
> compatibility?
>
> E.g. Currently I use a particular algorithm for calculating variance.
> Suppose that for a particular data set, this algorithm is accurate to
> (say) seven decimal places:
>
> # Python 3.4
> variance(some_data) == 1.23456700001
>
> Later, I find a better algorithm, which improves the accuracy of the
> result:
>
> # Python 3.5 or 3.6
> variance(some_data) == 1.23456789001
>
>
> Would this count as breaking backwards compatibility? If so, how should
> I handle this? I don't claim that the current implementation of the
> statistics module is optimal, as far as precision and accuracy is
> concerned. It may improve in the future.
>
> Or would that count as a bug-fix? "Variance function was inaccurate, now
> less wrong", perhaps.

That is my inclination.
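
To illustrate the general point, here is a sketch (not the module's actual
algorithm) of how two textbook variance formulas can disagree in the
low-order digits -- exactly the kind of change an improved implementation
would make:

    def naive_variance(data):
        # one-pass sum-of-squares formula; loses precision to cancellation
        n = len(data)
        s = sum(data)
        ss = sum(x * x for x in data)
        return (ss - s * s / n) / (n - 1)

    def two_pass_variance(data):
        # two-pass formula centred on the mean; much better behaved
        n = len(data)
        m = sum(data) / n
        return sum((x - m) ** 2 for x in data) / (n - 1)

    data = [1e9 + x for x in (4, 7, 13, 16)]   # large offset exposes the difference
    print(naive_variance(data))     # badly wrong, possibly even negative
    print(two_pass_variance(data))  # 30.0, the exact sample variance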

> I suppose the math module has the same issue, except that it just wraps
> the C libraries, which are mature and stable and unlikely to change.

Because C math libraries differ across platforms, math module results can 
differ even within the same Python version, so they can certainly change 
(hopefully improve) in future versions. I think the better analogy is 
cmath, which I believe is more than just a wrapper.

> The random module has a similar issue:
>
> http://docs.python.org/3/library/random.html#notes-on-reproducibility
>
>
> (2) Mappings[2] are iterable. That means that functions which expect
> sequences or iterators may also operate on mappings by accident.

I think 'accident' is the key. (Working with sets is not an accident.) 
Anyone who really wants the mean of keys should be explicit:
    mean(d.keys())
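
A quick demonstration (the dict values here are arbitrary): iterating over 
a mapping yields its keys, so a function that simply iterates its argument 
silently computes a statistic of the keys.

    from statistics import mean

    d = {1: 100, 2: 200}
    print(sum(d))            # 3   -- iterating a dict yields its keys
    print(mean(d))           # 1.5 -- mean of the keys, probably an accident
    print(mean(d.keys()))    # 1.5 -- the explicit spelling of the same thing
    print(mean(d.values()))  # 150 -- likely what the caller actually meant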

> For example, sum({1: 100, 2: 200}) returns 3. If one wanted to reserve the
> opportunity to handle mappings specifically in the future, without being
> locked in by backwards-compatibility, how should one handle it?
>
> a) document that behaviour with mappings is unsupported and may
>     change in the future;

I think the doc should in any case specify the proper domain. In this 
case, I think it should exclude mappings: 'non-empty non-mapping 
iterable of numbers', or 'an iterable of numbers that is neither empty 
nor a mapping'. That makes the behavior at best undefined and subject to 
change. There should also be a caveat about mixing types, especially 
Decimals, if not one already. Perhaps rewrite the above as 'an iterable, 
neither empty nor a mapping, of numbers that are mutually summable'.
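
As a small illustration of why the Decimal caveat matters (this shows 
ordinary arithmetic, not whatever summation strategy the module uses 
internally): Decimal mixes with int but not with float, so 'mutually 
summable' excludes such mixtures.

    from decimal import Decimal

    print(Decimal("1.5") + 2)      # Decimal('3.5') -- ints mix fine
    try:
        Decimal("1.5") + 2.5       # floats do not
    except TypeError as exc:
        print(exc)                 # unsupported operand type(s) for +: ...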

> b) raise a warning when passed a mapping, but still iterate over it;
>
> c) raise an exception and refuse to iterate over the mapping;

This, if possible. An empty iterable will raise at '/ 0'. Most anything 
that is not an iterable of numbers will eventually raise at '/ n'. 
Testing both that an exception is raised and that it is the one we want 
is why unittest has assertRaises.
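
A minimal sketch of option (c), using a hypothetical wrapper name 
(checked_mean) rather than anything in the actual module, together with 
the kind of test described above:

    import unittest
    from collections.abc import Mapping
    from statistics import mean

    def checked_mean(data):
        # hypothetical: refuse to iterate over a mapping (option c)
        if isinstance(data, Mapping):
            raise TypeError("data is a mapping; pass data.keys() or "
                            "data.values() explicitly")
        return mean(data)

    class TestCheckedMean(unittest.TestCase):
        def test_rejects_mapping(self):
            # check both that an exception is raised and that it is the one we want
            with self.assertRaises(TypeError):
                checked_mean({1: 100, 2: 200})

    if __name__ == "__main__":
        unittest.main()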

> Question (2) is of course a specific example of a more general
> question, to what degree is the library author responsible for keeping
> backwards compatibility under circumstances which are not part of the
> intended API, but just work by accident?

> [1] Or, for the sake of the argument, less accurate.
>
> [2] And sets.

-- 
Terry Jan Reedy


