[issue35892] Fix awkwardness of statistics.mode() for multimodal datasets

Sat Feb 16 06:28:23 EST 2019

Francis MB <francismb at email.de> added the comment:

Good point, Raymond!

Only a minor observation on the other packages' APIs:

[1] SciPy: scipy.stats.mode(a, axis=0, nan_policy='propagate')
"Returns an array of the modal (most common) **value** in the passed array." --> Here it claims to return just ONE value

It also uses a policy parameter to select behaviour:
nan_policy : {‘propagate’, ‘raise’, ‘omit’}, optional
Defines how to handle when input contains nan. ‘propagate’ returns nan, ‘raise’ throws an error, ‘omit’ performs the calculations ignoring nan values. Default is ‘propagate’.

Equivalently, one could add a 'multivalue_policy' parameter.
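
To make the idea concrete, here is a minimal sketch of what such a parameter could look like. The function name mode_with_policy, the parameter name multivalue_policy and its values ('first', 'last', 'raise') are only assumptions for illustration; nothing like this exists in the statistics module.

```python
from collections import Counter


def mode_with_policy(data, multivalue_policy="first"):
    """Hypothetical helper (not a statistics API): pick a mode by policy."""
    counts = Counter(data)
    if not counts:
        raise ValueError("no mode for empty data")
    top = max(counts.values())
    # Counter keeps first-seen order (Python 3.7+), so ties stay in input order.
    tied = [value for value, count in counts.items() if count == top]
    if multivalue_policy == "first":
        return tied[0]
    if multivalue_policy == "last":
        return tied[-1]
    if multivalue_policy == "raise":
        if len(tied) > 1:
            raise ValueError("data has more than one mode")
        return tied[0]
    raise ValueError("unknown multivalue_policy: %r" % (multivalue_policy,))
```

With [1, 1, 2, 2, 3] the 'first' policy gives 1, 'last' gives 2, and 'raise' raises ValueError.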

[2] Matlab: Mode: "Most frequent **values** in array"

"...returns the sample mode of A, which is the most frequently occurring *value* in A..."

IMHO the wording is inconsistent: *values* vs. *value* (or is it a doc bug?).

And a question:
Does that mean that mode should, in that case, return an array of values, i.e. all the values that share the highest frequency?

Then the user has the chance to pick the first, the last, or just all of them.
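
A rough sketch of that alternative, assuming a hypothetical function all_modes (the name is mine, not an existing API) built on collections.Counter:

```python
from collections import Counter


def all_modes(data):
    """Hypothetical helper: return every value that has the highest count."""
    counts = Counter(data)
    if not counts:
        return []
    top = max(counts.values())
    # Ties are returned in the order the values were first seen.
    return [value for value, count in counts.items() if count == top]


modes = all_modes("aabbbbccddddeeffffgg")
first_mode = modes[0]    # caller picks the first ...
last_mode = modes[-1]    # ... or the last, or keeps the whole list
```

Here modes is ['b', 'd', 'f'], so the choice among tied values is left to the caller instead of being fixed by the function.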

----
[1] https://docs.scipy.org/doc/scipy-0.19.1/reference/generated/scipy.stats.mode.html

[2] https://la.mathworks.com/help/matlab/ref/mode.html

----------

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue35892>
_______________________________________