[SciPy-dev] possible bug in numpy.random.zipf

Tue Jul 8 21:54:54 EDT 2008

On Tue, Jul 8, 2008 at 19:46, Alan Jackson <alan at ajackson.org> wrote:
> I think I may have found a bug in numpy.random.zipf
>
> As I was documenting the zipf function, I started playing with it to create
> some examples. It's pretty cool, actually. I had never heard of it
> before.
>
> To my surprise, it started giving occasional very large negative numbers as
> results whenever the 'a' parameter was less than 1.5. It appears that
> there is some numerical instability for parameters less than 1.5.
>
> In [44]: min(np.random.zipf(1.4, 1000))
> Out[44]: 1
>
> In [45]: min(np.random.zipf(1.4, 1000))
> Out[45]: -2147483648

This looks like an integer wraparound problem rather than numerical
instability per se. The correct result is above the upper bound that a
signed long can represent, so it gets cast to -sys.maxint-1. Since the
algorithm is just a simple rejection algorithm, we can add a range
check to the rejection condition and discard any candidate that
overflows. The function then models a Zipf distribution truncated to
the range [1, sys.maxint].
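
To illustrate the idea, here is a rough Python sketch of a rejection
sampler with the range check folded into the rejection step. It is not
the C code in numpy. The name zipf_truncated and the constant LONG_MAX
are made up for this example, LONG_MAX assumes a 32-bit signed long
(which matches the -2147483648 in the report), and the candidate and
acceptance formulas are the textbook rejection method for the Zipf
distribution, which I am assuming here rather than quoting from the
source.

```python
import math
import random

# Assumption: 32-bit signed long, standing in for sys.maxint.
LONG_MAX = 2**31 - 1


def zipf_truncated(a, rng=random, max_value=LONG_MAX):
    """Draw one Zipf(a) variate by rejection, truncated to [1, max_value].

    Hypothetical helper for illustration only.
    """
    if a <= 1.0:
        raise ValueError("a must be greater than 1")
    am1 = a - 1.0
    b = 2.0 ** am1
    while True:
        u = 1.0 - rng.random()  # in (0, 1], so the power below is defined
        v = rng.random()
        try:
            x = math.floor(u ** (-1.0 / am1))
        except OverflowError:
            # The candidate does not even fit in a double: reject it.
            continue
        # The range check: reject candidates a signed long cannot hold
        # instead of letting them wrap around to a negative number.
        if x < 1 or x > max_value:
            continue
        # Usual acceptance test of the rejection method.
        t = (1.0 + 1.0 / x) ** am1
        if v * x * (t - 1.0) / (b - 1.0) <= t / b:
            return x


if __name__ == "__main__":
    rng = random.Random(0)
    samples = [zipf_truncated(1.4, rng) for _ in range(1000)]
    print(min(samples), max(samples))
```

With a = 1.4 the candidate is u ** -2.5, so a uniform draw as small as
1e-4 already gives a candidate of about 1e10, well above 2**31 - 1.
That is why the overflow shows up every few thousand draws for small
values of 'a'.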

The fix will be in SVN on the trunk and 1.1.x branch shortly.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco