[Python-ideas] statistics module in Python3.4

Andrew Barnert abarnert at yahoo.com
Fri Jan 31 04:47:54 CET 2014


On Jan 30, 2014, at 17:32, Chris Angelico <rosuav at gmail.com> wrote:

> On Fri, Jan 31, 2014 at 12:07 PM, Steven D'Aprano <steve at pearwood.info> wrote:
>> One of my aims is to avoid raising TypeError unnecessarily. The
>> statistics module is aimed at casual users who may not understand, or
>> care about, the subtleties of numeric coercions, they just want to take
>> the average of two values regardless of what sort of number they are.
>> But having said that, I realise that mixed-type arithmetic is difficult,
>> and I've avoided documenting the fact that the module will work on mixed
>> types.
> 
> Based on the current docs and common sense, I would expect that
> Fraction and Decimal should normally be there exclusively, and that
> the only type coercions would be int->float->complex (because it makes
> natural sense to write a list of "floats" as [1.4, 2, 3.7], but it
> doesn't make sense to write a list of Fractions as [Fraction(1,2),
> 7.8, Fraction(12,35)]). Any mishandling of Fraction or Decimal with
> the other three types can be answered with "Well, you should be using
> the same type everywhere". (Though it might be useful to allow
> int->anything coercion, since that one's easy and safe.)

Except that int->float isn't entirely safe either: large enough int values lose information, and even larger ones raise an exception:

    >>> float(pow(3, 50)) == pow(3, 50)
    False
    >>> float(1<<2000)
    Traceback (most recent call last):
      ...
    OverflowError: int too large to convert to float

And that first one is the reason why statistics needs a custom sum in the first place.
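
To make that concrete, here is a minimal sketch of the difference between summing through float and summing exactly. The names naive_float_sum and exact_sum are made up for illustration, and exact_sum is only an assumption about the general idea (accumulate in exact rational arithmetic), not a copy of what the statistics module actually does internally:

```python
from fractions import Fraction

def naive_float_sum(values):
    # Hypothetical helper: coerce everything to float first, then add.
    total = 0.0
    for v in values:
        total += float(v)
    return total

def exact_sum(values):
    # Hypothetical helper: accumulate as exact rationals, so no
    # precision is lost along the way.
    total = Fraction(0)
    for v in values:
        total += Fraction(v)
    return total

big = pow(3, 50)          # too large for a float to represent exactly
data = [big, 1, -big]     # the true sum is 1

print(naive_float_sum(data))  # the 1 is absorbed by the huge float
print(exact_sum(data))        # the 1 survives
```

With floats the small value vanishes next to the big one, so the naive version returns 0.0 while the exact version returns 1.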

When there are only 2 types involved in the sequence, you get the answer you wanted. The only problem raised by the examples in this thread is that with 3 or more types that aren't all mutually coercible but do have a path through them, you can sometimes get imprecise answers and other times get exceptions, and you might come to rely on one or the other.
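
As a rough illustration of "not all mutually coercible but with a path through them", the sketch below checks which pairs of the builtin numeric types can be added with the plain + operator. The helper can_add is hypothetical, and this only shows what the ordinary operators do; it says nothing about the coercion order the statistics module itself uses:

```python
from decimal import Decimal
from fractions import Fraction
from itertools import combinations

def can_add(a, b):
    # Hypothetical helper: True if a + b works with the plain operator.
    try:
        a + b
    except TypeError:
        return False
    return True

samples = {
    "int": 1,
    "float": 0.5,
    "Fraction": Fraction(1, 2),
    "Decimal": Decimal("0.5"),
}

# Record, for every pair of types, whether the plain + operator accepts it.
addable = {
    (m, n): can_add(samples[m], samples[n])
    for m, n in combinations(samples, 2)
}

for (m, n), ok in addable.items():
    print(m, "+", n, "->", "ok" if ok else "TypeError")
```

int combines with everything, but float + Decimal and Fraction + Decimal both raise TypeError, so a sequence holding, say, int, float and Decimal values has exactly the kind of indirect path described above.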

So, rather than throwing out Steven's carefully crafted and clearly worded rules and trying to come up with new ones, why not (for 3.4) just say that the order of coercions given values of 3 or more types is not documented and subject to change in the future (maybe even giving the examples from the initial email)?

