[issue39218] Assertion failure when calling statistics.variance() on a float32 Numpy array

Reed report at bugs.python.org
Sun Jan 5 15:37:33 EST 2020


Reed <readuw at gmail.com> added the comment:

Thank you all for the comments! Either using (x-c)*(x-c), or removing the assertion and changing the final line to `return (U, total)`, seems reasonable. I slightly prefer the latter, given Mark's comments about x*x being faster and simpler than x**2. But I am not an expert on this.

> I am inclined to have the stdev of float32 return a float32 if possible. What do you think?

Agreed.

> OTOH, (x-c)*(x-c) repeats the subtraction unnecessarily, but perhaps assignment expressions could rescue us?

Yeah, we should avoid repeating the subtraction. Another method of doing so is to define a square function. For example:

    def square(y):
        return y*y
    sum(square(x-c) for x in data)
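For comparison, here is a minimal runnable sketch of both ways to avoid repeating the subtraction: the helper function above and an assignment expression (Python 3.8+). The names ss_helper and ss_walrus and the sample data are made up for illustration and are not taken from statistics.py.

```python
def square(y):
    return y * y


def ss_helper(data, c):
    # Sum of squared deviations from c; the subtraction runs once per
    # element because its result is passed to square().
    return sum(square(x - c) for x in data)


def ss_walrus(data, c):
    # Same computation using an assignment expression (Python 3.8+):
    # d is bound by the left operand, then reused as the right operand.
    return sum((d := x - c) * d for x in data)


data = [1.0, 2.0, 4.0]
c = 2.0
print(ss_helper(data, c), ss_walrus(data, c))
```

Both versions do one subtraction and one multiplication per element. The helper costs an extra function call per element, while the assignment expression requires Python 3.8 or later.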

> Would that also imply intermediate calculations being performed only with float32, or would intermediate calculations be performed with a more precise type?

Currently, statistics.py computes sums exactly, in effect with infinite precision (https://github.com/python/cpython/blob/422ed16fb846eec0b5b2a4eb3a978c9862615665/Lib/statistics.py#L123), for any type. The multiplications (or exponentiation, if we go that route) would still be done in float32.
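To illustrate what summing exactly means here, the sketch below adds floats as exact fractions and rounds only once at the end. It is a simplified illustration using a hypothetical exact_sum helper, not the code in statistics.py. Any rounding in the individual terms (for example, float32 squares) would already have happened before the sum sees them.

```python
from fractions import Fraction


def exact_sum(values):
    # Fraction(float) converts the binary value exactly, so the running
    # total accumulates no rounding error.
    total = Fraction(0)
    for v in values:
        total += Fraction(v)
    return total


values = [0.1] * 10
naive = sum(values)        # ordinary float addition, rounds at every step
exact = exact_sum(values)  # exact rational total of the stored values
print(naive, float(exact))
```

The final float(exact) is the only rounding step in the exact path, so the result does not depend on the order of the inputs.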

----------

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue39218>
_______________________________________

