[issue35892] Fix awkwardness of statistics.mode() for multimodal datasets

Raymond Hettinger report at bugs.python.org
Sun Feb 17 04:10:34 EST 2019


Raymond Hettinger <raymond.hettinger at gmail.com> added the comment:

> Did I miss something?

Yes.  It doesn't really matter which mode is returned as long as it is deterministically chosen.  We're proposing to return the first mode rather than the smallest mode.  

Scipy returns the smallest mode because that is convenient given that the underlying operation is np.unique() which returns unique values in sorted order [1]. 

We want to return the first mode encountered because that is convenient given that the underlying operation is max() which returns the first maximum value it encounters.
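
As a rough illustration of the return-first behavior, here is a minimal sketch. It assumes a Counter-based tally; the helper name first_mode is hypothetical and this is not necessarily how statistics.mode() is or will be implemented.

```python
from collections import Counter

def first_mode(data):
    """Return the most common value; ties go to the value seen first."""
    # Hypothetical helper for illustration only.
    counts = Counter(data)  # keys keep first-seen order (dicts, Python 3.7+)
    if not counts:
        raise ValueError("no mode for empty data")
    # max() returns the first key whose count is maximal.
    return max(counts, key=counts.get)

# 3 and 1 both occur twice; 3 was encountered first, so it is returned.
print(first_mode([3, 1, 3, 1, 2]))
```

A return-smallest policy would pick 1 for the same input, since it needs to compare the tied values against each other.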

Another advantage of return-first rather than return-smallest is that our mode() would work for data values that are hashable but not orderable (e.g. frozensets).
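
A small sketch of why this matters, again using a hypothetical Counter-based helper (first_mode is an illustrative name, not the library's API). Only the counts are compared, never the data values themselves, so hashability is the sole requirement.

```python
from collections import Counter

def first_mode(data):
    """Hypothetical helper: most common value, ties go to the first seen."""
    counts = Counter(data)  # requires only that values are hashable
    # The comparison is on the integer counts, not on the values.
    return max(counts, key=counts.get)

a = frozenset({1, 2})
b = frozenset({3})

# a and b tie with two occurrences each; b was encountered first.
print(first_mode([b, a, a, b, frozenset()]))

# Complex numbers are also hashable but have no ordering.
print(first_mode([2j, 1j, 2j, 1j]))
```

Any approach that sorts the values first would raise TypeError on the complex-number input, and for frozensets the comparison operators test subset relations rather than defining a total order.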

[1] https://github.com/scipy/scipy/blob/v0.19.1/scipy/stats/stats.py#L378-L443

----------

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue35892>
_______________________________________

