[SciPy-dev] possible bug in numpy.random.zipf

Wed Jul 9 09:56:40 EDT 2008

Robert Kern wrote:
> On Tue, Jul 8, 2008 at 19:46, Alan Jackson <alan at ajackson.org> wrote:
>   
>> I think I may have found a bug in numpy.random.zipf
>>
>> As I was documenting the zipf function, I started playing with it to create
>> some examples. It's pretty cool, actually. I had never heard of it
>> before.
>>
>> To my surprise, it started giving occasional very large negative numbers as
>> results whenever the 'a' parameter was less than 1.5. It appears that
>> there is some numerical instability for parameters less than 1.5.
>>
>> In [44]: min(np.random.zipf(1.4, 1000))
>> Out[44]: 1
>>
>> In [45]: min(np.random.zipf(1.4, 1000))
>> Out[45]: -2147483648
>>     
>
> This looks like an integer wraparound problem rather than numerical
> instability per se. The correct result is above upper bound that a
> signed long can represent. It gets casted to -sys.maxint-1. Since the
> algorithm is just a simple rejection algorithm, we can include this
> test in the rejection condition. The function then models a Zipf
> distribution truncated to the range [1,sys.maxint].
>
> The fix will be in SVN on the trunk and 1.1.x branch shortly.
>
>   
Hi,

Shouldn't this really depend on the precision being used because NumPy 
provides different precisions?

Also sys.maxint will be history in Python 3.0 
(http://docs.python.org/dev/3.0/whatsnew/3.0?highlight=maxint) as  
'[i]ntegers have unlimited precision'. Instead it could be sys.maxsize 
(http://docs.python.org/dev/3.0/library/sys.html#sys.maxsize) .

Bruce