[issue38382] statistics.harmonic_mean fails to raise error with negative input that follows a 0

Warren Weckesser report at bugs.python.org
Sun Oct 6 17:31:16 EDT 2019


Warren Weckesser <warren.weckesser at gmail.com> added the comment:

I find it hard to accept the first option.  It seems to let a detail of the current implementation take precedence over API design.  I don't see why we would want an API in which harmonic_mean([1, 0, -1]) returns 0 but harmonic_mean([-1, 0, 1]) raises an error.  The harmonic mean should be invariant to permutations of the input (well, at least to within normal floating point precision when the input is floating point).  I wouldn't expect such a radical change in behavior based on how I ordered my input to harmonic_mean.
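
To make the ordering point concrete, here is a minimal sketch of an order-independent version. It is not the code in the statistics module: harmonic_mean_sketch and outcome are hypothetical names, and ValueError stands in for whatever exception the real function raises. The only idea it illustrates is that every value is validated before any zero is allowed to short-circuit the result.

```python
from itertools import permutations


def harmonic_mean_sketch(data):
    """Hypothetical order-independent harmonic mean (not the stdlib code)."""
    values = list(data)
    if not values:
        raise ValueError("harmonic mean requires at least one data point")
    # Check every value first, so a zero earlier in the input cannot
    # hide a negative value that appears later.
    if any(x < 0 for x in values):
        raise ValueError("harmonic mean does not support negative values")
    # Only after validation may a zero short-circuit the result.
    if any(x == 0 for x in values):
        return 0
    return len(values) / sum(1 / x for x in values)


def outcome(data):
    """Return the computed mean, or the string "error" if the input is rejected."""
    try:
        return harmonic_mean_sketch(data)
    except ValueError:
        return "error"


# Collect the outcome for every ordering of the same three values.
outcomes = {outcome(p) for p in permutations([1, 0, -1])}
```

With this ordering of the checks, outcomes holds a single element: all six permutations of [1, 0, -1] are rejected, regardless of where the 0 sits.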

The second option appears to be the API that was intended when the function was implemented.  Returning 0 when one or more inputs are 0 (and the rest are positive numbers) is justified by taking a limit.  Just like we *define* the sinc function (https://en.wikipedia.org/wiki/Sinc_function) to be 1 at x=0 and sin(x)/x when x != 0 (because the limit as x approaches 0 of sin(x)/x is 1), we can define the value of the harmonic mean with zeros in the input as the limiting value as one (or more) of the positive values go to 0.  That limit is 0.  (Also note that, in the case where just one input is 0, the expression for the harmonic mean can be manipulated into a form that gives 0 without requiring a limit. E.g. for three values, 1/(1/x0 + 1/x1 + 1/x2) = x0*x1*x2/(x1*x2 + x0*x2 + x0*x1).  If just one of those values is 0, the denominator is nonzero, so the result is simply 0.)
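
The algebra and the limit above can be checked with exact rational arithmetic. The following is only an illustrative sketch: reciprocal_form and product_form are hypothetical helper names, and the sample values 2, 3 and 6 are arbitrary.

```python
from fractions import Fraction


def reciprocal_form(x0, x1, x2):
    """1/(1/x0 + 1/x1 + 1/x2); undefined if any argument is 0."""
    return 1 / (1 / x0 + 1 / x1 + 1 / x2)


def product_form(x0, x1, x2):
    """The same quantity rewritten so that a single zero is harmless."""
    return x0 * x1 * x2 / (x1 * x2 + x0 * x2 + x0 * x1)


b, c = Fraction(3), Fraction(6)

# For positive inputs the two forms agree exactly.
forms_agree = reciprocal_form(Fraction(2), b, c) == product_form(Fraction(2), b, c)

# With exactly one zero, the product form evaluates directly to 0.
single_zero = product_form(Fraction(0), b, c)

# Letting x0 shrink toward 0 drives the reciprocal form toward 0 as well.
shrinking = [reciprocal_form(Fraction(1, 10**k), b, c) for k in (1, 3, 6)]
```

Using Fraction keeps the comparison free of floating-point rounding, so the equality of the two forms is exact rather than approximate.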

There is a nice analogy from electrical circuit theory.  Given, say, three resistances R1, R2 and R3 wired in parallel, the total resistance R is

   R = 1/(1/R1 + 1/R2 + 1/R3) = harmonic_mean([R1, R2, R3])/3

(https://en.wikipedia.org/wiki/Series_and_parallel_circuits#Resistance_units_2). Intuitively, we know that if any of those resistances is 0 (i.e. there is a short circuit), the total resistance is also 0.

The resistance analogy also gives the correct interpretation for the case where the input includes both 0 and +inf.  An infinite resistance is an open circuit (i.e. no connection), so putting that in parallel with a short circuit still results in a total resistance of 0.
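
The circuit interpretation can be sketched in a few lines. This is an illustration only: parallel_resistance is a hypothetical helper, and treating a short circuit as dominating an open circuit is exactly the convention argued for above, not something taken from the statistics module.

```python
import math


def parallel_resistance(resistances):
    """Total resistance of resistors wired in parallel (illustrative sketch)."""
    rs = list(resistances)
    if any(r < 0 for r in rs):
        raise ValueError("resistances must be non-negative")
    # A short circuit (R == 0) forces the total to 0, even next to an
    # open circuit (R == inf).
    if any(r == 0 for r in rs):
        return 0.0
    # An open circuit contributes no conductance, since 1/inf == 0.0.
    conductance = sum(1 / r for r in rs)
    if conductance == 0:
        return math.inf  # nothing but open circuits: no connection at all
    return 1 / conductance


short_and_open = parallel_resistance([0, math.inf])
with_open_branch = parallel_resistance([6, 3, math.inf])
three_resistors = parallel_resistance([2, 3, 6])
```

Here short_and_open is 0.0, the open branch in [6, 3, inf] has no effect on the total, and for [2, 3, 6] the total is one third of the harmonic mean of the three values, matching the formula for R above.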

The third option, in which any zero in the input is treated as an error, is a change in behavior. Given the justification for returning 0, and given that currently a call such as harmonic_mean([1, 0, 2]) *does* return 0, making that an error seems like an undesirable change.

----------

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue38382>
_______________________________________