[Numpy-discussion] ndarray.count() ?

Thu Sep 7 20:14:22 EDT 2006

Robert Kern <robert.kern at gmail.com> [2006-09-07 16:35]:
> rex wrote:
> > Charles R Harris <charlesr.harris at gmail.com> [2006-09-07 15:04]:
> >> I don't know about count, but you can gin up something like this
> >>
> >> In [78]: a = ran.randint(0,2, size=(10,))
> >>
> >> In [79]: a
> >> Out[79]: array([0, 1, 0, 1, 1, 0, 0, 1, 1, 1])
> > 
> > This exposed inconsistent randint() behavior between SciPy and the Python
> > random module. The Python randint includes the upper endpoint. The SciPy
> > version excludes it.
> 
> numpy.random.random_integers() includes the upper bound, if you like. 
> numpy.random does not try to emulate the standard library's random module.

I'm not in a position to argue the merits, but IMHO, when code that
previously worked starts silently returning subtly bad results after
importing numpy, there is a problem. What possible upside is there in
having randint() behave one way in the random module and silently behave
differently in numpy?
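
For concreteness, here is a minimal sketch of the mismatch as I
understand it. It assumes numpy is imported as np; the sample size N and
the variable names are arbitrary choices for illustration. With the same
arguments (0, 2), the standard library can return 2 and numpy never does.

```python
import random

import numpy as np

N = 10000  # arbitrary sample size, large enough to see every possible value

# Standard library: random.randint(a, b) draws from the closed range [a, b].
stdlib_draws = {random.randint(0, 2) for _ in range(N)}

# numpy: numpy.random.randint(low, high) draws from the half-open range
# [low, high), so the upper bound itself is never returned.
numpy_draws = set(np.random.randint(0, 2, size=N).tolist())

# One way to reproduce the stdlib behaviour with numpy is to widen the
# upper bound by one.
numpy_inclusive = set(np.random.randint(0, 2 + 1, size=N).tolist())

print("random.randint(0, 2):         ", sorted(stdlib_draws))
print("numpy.random.randint(0, 2):   ", sorted(numpy_draws))
print("numpy.random.randint(0, 2+1): ", sorted(numpy_inclusive))
```

Code that swaps one randint() for the other runs without error either
way; only the distribution of results changes.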

More generally, since numpy.random does not try to emulate the random
module, how does one convert code that uses the random module to
numpy? Is randint() the only silent problem, or are there others? If
there are others, how does one discover them? Are they documented
anywhere?

I deeply appreciate the countless hours the core developers have
contributed to numpy/scipy, but sometimes I think you are too close to
the problems to fully appreciate the barriers to widespread adoption
that such silent "gotchas" present. If the code breaks, fine, you know
there's a problem. When it runs, but returns wrong -- but not obviously
wrong -- results, there's a serious problem that will deter a
significant number of people from ever trying the product again.

Again, what is the upside of changing the behavior of the standard
library's randint() without also changing the name?

-rex