[Numpy-discussion] fancy indexing/broadcasting question

Sat Jul 7 14:59:17 EDT 2007

Mark.Miller wrote:
> Sorry...here's a minor correction to the code.
> 
> #1st part
> import numpy
> normal=numpy.random.normal
> 
> RNDarray = normal(25,15,(50,50))
> tmp1 = (RNDarray < 0) | (RNDarray > 25)
> while tmp1.any():
>      print tmp1.size, tmp1.shape, tmp1[tmp1].size
>      RNDarray[tmp1] = normal(25,15, size = RNDarray[tmp1].size)
>      tmp1 = (RNDarray < 0) | (RNDarray > 25)
> 
> #2nd part
> import numpy
> normal=numpy.random.normal
> 
> RNDarray = normal(25,15,(50,50))
> tmp1 = (RNDarray < 0) | (RNDarray > 25)
> while tmp1.any():
>      print tmp1.size, tmp1.shape, tmp1[tmp1].size
>      RNDarray[tmp1] = normal(25,15, size = RNDarray[tmp1].size)
> *   tmp1 = (RNDarray[tmp1] < 0) | (RNDarray[tmp1] > 25)

The second version fails because, after the starred line, tmp1 is no longer a
mask into RNDarray. It is a 1-D mask into RNDarray[tmp1] (indexed with the old
tmp1), so its shape no longer matches RNDarray. For something as small as
(50, 50) and simple criteria (no looping), the first version will probably be
faster than any attempt to optimize it.
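
To make the shape mismatch concrete, here is a small sketch. It assumes a
recent NumPy (numpy.random.default_rng and the IndexError on a mismatched
boolean index are both newer than this thread), and the names arr, mask and
sub_mask are only illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
arr = rng.normal(25, 15, (50, 50))

# A mask computed from the full array has the array's shape.
mask = (arr < 0) | (arr > 25)

# A mask computed from arr[mask] is 1-D, with one entry per selected element.
sub_mask = (arr[mask] < 0) | (arr[mask] > 25)

print(mask.shape, sub_mask.shape)

# The 1-D mask cannot be used to index the 2-D array directly.
try:
    arr[sub_mask]
    mismatch = False
except IndexError:
    mismatch = True
print("index error on mismatched mask:", mismatch)
```

Older NumPy releases may have been more permissive about mismatched boolean
indices, so the exact failure mode can differ by version.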

However, if you do have larger arrays or slower criteria, you can reduce the
size of the re-evaluated array pretty simply. I'm still not sure it will be
faster, but here it is:

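import numpy
normal = numpy.random.normal

RNDarray = normal(25, 15, (50, 50))
# badmask always has RNDarray's shape and marks the entries still out of range.
badmask = (RNDarray < 0) | (RNDarray > 25)
nbad = badmask.sum()
while nbad > 0:
    # Redraw only the bad entries, and test only the new draws.
    new = normal(25, 15, size=nbad)
    RNDarray[badmask] = new
    newbad = (new < 0) | (new > 25)
    # A flagged entry stays flagged only if its replacement is also bad.
    badmask[badmask] = newbad
    nbad = newbad.sum()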
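
If the criteria are slower or the arrays larger, the same loop can be wrapped
so that the draw and the test are passed in. This is only a sketch:
resample_until_valid, draw and is_bad are hypothetical names, not anything
NumPy provides, and it uses numpy.random.default_rng from recent NumPy:

```python
import numpy as np

def resample_until_valid(arr, draw, is_bad):
    """Hypothetical helper: redraw entries of arr in place until none is bad.

    draw(n) must return n fresh samples; is_bad(x) must return a boolean
    array with the same shape as x.
    """
    badmask = is_bad(arr)              # full-shape mask into arr
    nbad = int(badmask.sum())
    while nbad > 0:
        new = draw(nbad)
        arr[badmask] = new
        newbad = is_bad(new)           # evaluated on the redrawn values only
        badmask[badmask] = newbad      # shrink the full-shape mask in place
        nbad = int(newbad.sum())
    return arr

rng = np.random.default_rng(0)
data = rng.normal(25, 15, (50, 50))
result = resample_until_valid(
    data,
    draw=lambda n: rng.normal(25, 15, size=n),
    is_bad=lambda x: (x < 0) | (x > 25),
)
print(result.min(), result.max())
```

The criteria are evaluated only on the values that were just redrawn, which is
where any saving would come from. Whether that is actually faster than
re-testing the whole array still has to be measured.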

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco