[SciPy-user] scipy.stats.scoreatpercentile(...)
Robert Kern
rkern at ucsd.edu
Mon Aug 15 19:15:37 EDT 2005
David K wrote:
> Hi,
>
> I was trying the scipy.stats.scoreatpercentile function:
>
>
> >>> import scipy
> >>> a = scipy.arange(1,11)
> >>> a
>
> array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
>
> >>> scipy.stats.scoreatpercentile(a, 50)  # find the 50th percentile
>
> 5.9050000000000002
>
> Shouldn't the result be 5.5? Or perhaps I've misunderstood something?
I think scoreatpercentile() is kind of broken for this input. It uses
histogram() under the covers, and histogram()'s defaults (e.g. 10 bins)
and its boundary heuristics are a bit pathological here. They're not so
bad by themselves, but scoreatpercentile() trusts them more than is wise
in this case.
In [3]: a = arange(1,11)
In [4]: a
Out[4]: array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
In [5]: stats.scoreatpercentile(a, 100.0)
Out[5]: 10.265000000000001
In [6]: stats.scoreatpercentile(a, 10.0)
Out[6]: -9.3550000000000004
In [7]: stats.scoreatpercentile(a, 11.0)
Out[7]: 1.6540000000000001
In [8]: stats.histogram(a)
Out[8]:
(array([1, 1, 1, 1, 1, 1, 2, 1, 1, 0]),
0.45499999999999996,
1.0900000000000001,
0)
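To make the histogram output above easier to read, here is a rough sketch that rebuilds the bin edges from the lower limit and bin width reported in Out[8]. It uses NumPy's histogram rather than stats.histogram, so it only illustrates where the bins fall and is not a copy of what scoreatpercentile() does internally; the variable names are made up for the example.

```python
import numpy as np

a = np.arange(1, 11)

# Values copied from Out[8]: lower limit of the first bin and the bin width.
lower_limit = 0.45499999999999996
bin_width = 1.0900000000000001
num_bins = 10  # the default bin count mentioned above

# Edges of the bins: lower_limit, lower_limit + bin_width, ...
edges = lower_limit + bin_width * np.arange(num_bins + 1)

# Count the data against those edges (NumPy, not stats.histogram).
counts, _ = np.histogram(a, bins=edges)

print(counts)              # per-bin counts for 1..10
print(edges[5], edges[9])  # edges near the 50.0 and 100.0 results
```

The counts come out as one value per bin except for the seventh bin, which holds both 7 and 8, and an empty last bin, matching Out[8]. Edges 5 and 9 land at roughly 5.905 and 10.265, the same numbers reported for the 50th and 100th percentiles, which suggests those results sit on bin boundaries rather than on the data.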
Alternatively, with uniformly distributed random samples:
In [9]: a = stats.uniform.rvs(1.0, 9.0, size=1000)
In [10]: stats.histogram(a)
Out[10]:
(array([ 49, 124, 123, 118, 117, 124, 115, 137, 93, 0]),
0.46266254770569504,
1.0880739132501185,
0)
In [11]: stats.scoreatpercentile(a, 50.0)
Out[11]: 5.6147390258301879
In [12]: stats.scoreatpercentile(a, 10.0)
Out[12]: 1.9982507317280398
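For comparison, a percentile computed by sorting the data and interpolating between neighbours gives the 5.5 David expected. This is a minimal sketch, assuming NumPy is available; percentile_sorted is a hypothetical helper name, not a SciPy function, and linear interpolation between order statistics is only one of several percentile conventions.

```python
import numpy as np

def percentile_sorted(values, per):
    """Percentile by sorting and linearly interpolating between neighbours."""
    data = np.sort(np.asarray(values, dtype=float))
    # Fractional position of the requested percentile in the sorted data.
    pos = (per / 100.0) * (len(data) - 1)
    lo = int(np.floor(pos))
    hi = min(lo + 1, len(data) - 1)  # clamp at the last element
    frac = pos - lo
    return data[lo] + frac * (data[hi] - data[lo])

a = np.arange(1, 11)
print(percentile_sorted(a, 50))   # midway between 5 and 6
print(percentile_sorted(a, 100))  # the maximum
```

With this convention the result always stays within the range of the data, and numpy.percentile with its default settings should agree on this input.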
--
Robert Kern
rkern at ucsd.edu
"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
-- Richard Harter