[SciPy-Dev] binned_statistic: binnumber and array shapes
Luke Zoltan Kelley
lzkelley at gmail.com
Tue Nov 3 10:03:49 EST 2015
The docs for `scipy.stats.binned_statistic` describe the returned `binnumber` array as:
binnumber : 1-D ndarray of ints
This assigns to each observation an integer that represents the bin
in which this observation falls. Array has the same length as `values`.
However, it's very difficult to understand how the returned values match this description. For example:
>>> import scipy.stats
>>> a1 = [0.1, 0.1, 0.1, 0.6]
>>> a2 = [2.1, 2.6, 2.1, 2.1]
>>> b1 = [0.0, 0.5, 1.0]
>>> b2 = [2.0, 2.5, 3.0]
>>> stats = scipy.stats.binned_statistic_2d(a1, a2, None, 'count', bins=[b1,b2])
>>> stats
BinnedStatistic2dResult(statistic=array([[ 2., 1.],
       [ 1., 0.]]), x_edge=array([ 0. , 0.5, 1. ]), y_edge=array([ 2. , 2.5, 3. ]), binnumber=array([5, 6, 5, 9]))
The resulting `statistic` array makes sense, but the `binnumber` array is cryptic: with only two bins along each axis, it is not obvious what bin numbers 5, 6 and 9 refer to.
Before being returned, [`statistic` is reshaped and cleaned-up](https://github.com/scipy/scipy/blob/master/scipy/stats/_binned_statistic.py#L452-L461).
Should the same thing be happening to `binnumber`?
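One reading that reproduces the numbers above is that each `binnumber` entry is a flattened (row-major) index into a grid with one extra outlier bin on each side of every dimension, i.e. shape `(nx + 2, ny + 2)`. That layout is an assumption inferred from this example, not something the docstring states. A minimal sketch in plain Python (names `nx`, `ny`, `pairs`, `counts` are just for illustration; `numpy.unravel_index` would do the same split):

```python
# Assumption: binnumber is a row-major flat index into a grid of shape
# (nx + 2, ny + 2); the two extra bins per axis hold out-of-range values.
nx, ny = 2, 2                 # number of bins defined by b1 and b2 above
binnumber = [5, 6, 5, 9]      # values returned in the example above

# divmod splits each flat index into (x bin, y bin). Index 0 would be the
# low-side outlier bin, so in-range bins are numbered from 1.
pairs = [divmod(b, ny + 2) for b in binnumber]
print(pairs)   # [(1, 1), (1, 2), (1, 1), (2, 1)]

# Rebuild the count grid from the per-axis bin numbers as a cross-check.
counts = [[0] * ny for _ in range(nx)]
for ix, iy in pairs:
    counts[ix - 1][iy - 1] += 1
print(counts)  # [[2, 1], [1, 0]]
```

Under that assumption the rebuilt counts agree with the `statistic` array, which suggests `binnumber` is keeping the padded, flattened indexing that `statistic` has stripped away.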
(Unfortunately) I created an [issue for this](https://github.com/scipy/scipy/issues/5449), but it seemed like this (the mailing list) was probably far more appropriate; whoops.
One other minor point is that the docstring for `binned_statistic_2d` says that `x` and `y` can have different lengths. I think that's a mistake; they have to be the same shape, right?
Luke