scipy.stats.itemfreq: overflow with add.reduce

Hans Georg Krauthaeuser hgk at et.uni-magdeburg.de
Wed Dec 21 05:23:53 EST 2005


Hans Georg Krauthaeuser wrote:
> Hi All,
> 
> I was playing with scipy.stats.itemfreq when I observed the following 
> overflow:
> 
> In [119]:for i in [254,255,256,257,258]:
>    .....:    l=[0]*i
>    .....:    print i, stats.itemfreq(l), l.count(0)
>    .....:
> 254 [ [  0 254]] 254
> 255 [ [  0 255]] 255
> 256 [ [0 0]] 256
> 257 [ [0 1]] 257
> 258 [ [0 2]] 258
> 
> itemfreq is pretty small (in stats.py):
> 
> ----------------------------------------------------------------------
> def itemfreq(a):
>     """
> Returns a 2D array of item frequencies.  Column 1 contains item values,
> column 2 contains their respective counts.  Assumes a 1D array is passed.
> 
> Returns: a 2D frequency table (col [0:n-1]=scores, col n=frequencies)
> """
>     scores = _support.unique(a)
>     scores = sort(scores)
>     freq = zeros(len(scores))
>     for i in range(len(scores)):
>         freq[i] = add.reduce(equal(a,scores[i]))
>     return array(_support.abut(scores, freq))
> ----------------------------------------------------------------------
> 
> It seems that add.reduce is the source for the overflow:
> 
> In [116]:from scipy import *
> 
> In [117]:for i in [254,255,256,257,258]:
>    .....:    l=[0]*i
>    .....:    print i, add.reduce(equal(l,0))
>    .....:
> 254 254
> 255 255
> 256 0
> 257 1
> 258 2
> 
> Is there any possibility to avoid the overflow?
> 
> BTW:
> Python 2.3.5 (#2, Aug 30 2005, 15:50:26)
> [GCC 4.0.2 20050821 (prerelease) (Debian 4.0.1-6)] on linux2
> 
> scipy_version.scipy_version  --> '0.3.2'
> 
> 
> Thanks and best regards
> Hans Georg Krauthäuser

After some further investigation:

In [150]:add.reduce(array(equal([0]*256,0),typecode='l'))
Out[150]:256

In [151]:add.reduce(equal([0]*256,0))
Out[151]:0

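The problem occurs only with arrays of typecode 'b' (as returned by equal).
add.reduce appears to accumulate in that same 8-bit type, so the count
wraps around at 256 (256 -> 0, 257 -> 1, 258 -> 2).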
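
A minimal sketch of the wrap-around, assuming a current NumPy under
Python 3 rather than the scipy 0.3.2 setup shown above. Current NumPy
widens small integer types by default in add.reduce, so the 8-bit
accumulator has to be requested explicitly to reproduce the effect. The
names count_matches and acc_dtype are made up for this illustration.

```python
import numpy as np

def count_matches(n, acc_dtype):
    """Count n matching items with add.reduce, accumulating in acc_dtype."""
    matches = np.equal([0] * n, 0)  # boolean array holding n True values
    # dtype= forces the type used for the running sum
    return int(np.add.reduce(matches, dtype=acc_dtype))

for n in [254, 255, 256, 257, 258]:
    # 8-bit unsigned accumulator wraps modulo 256; 64-bit one does not
    print(n, count_matches(n, np.uint8), count_matches(n, np.int64))
```

With the 8-bit accumulator the counts for 256, 257 and 258 come out as
0, 1 and 2, matching the session above; with the 64-bit accumulator
they are correct.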

The workaround patch for itemfreq is obvious (convert the result of equal
to typecode 'l' before reducing), but ... is this a bug or a feature?
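
One possible shape of such a workaround, written as a sketch against a
current NumPy under Python 3 and not as the actual scipy patch. The
function name itemfreq_widened is hypothetical, and np.unique and
np.column_stack stand in for the _support.unique and _support.abut
helpers of the original, which is an assumption about their behaviour.

```python
import numpy as np

def itemfreq_widened(a):
    """Return a 2D table: column 0 holds item values, column 1 their counts."""
    a = np.asarray(a)
    scores = np.unique(a)  # np.unique returns the values already sorted
    freq = np.zeros(len(scores), dtype=np.int64)
    for i, score in enumerate(scores):
        # widen the comparison result before summing so the count cannot wrap
        matches = np.equal(a, score).astype(np.int64)
        freq[i] = np.add.reduce(matches)
    return np.column_stack((scores, freq))

for n in [254, 255, 256, 257, 258]:
    print(n, itemfreq_widened([0] * n).tolist())
```

The only change relative to the loop in the quoted itemfreq is the
explicit conversion of the comparison result to a wide integer type.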

regards
Hans Georg


