[Numpy-discussion] picking elements with boolean masks

Mon Mar 25 08:50:42 EDT 2013

Neal Becker wrote:

> Starting with an NxM array, I want to select elements of the array using a set
> of boolean masks.  The masks are simply where the indexes have a 0 or 1 in the
> corresponding bit position.  For example, consider the case where M = 4.
> 
> all_syms = np.arange (4)
> all_bits = np.arange (2)
> bit_mask = (all_syms[:,np.newaxis] >> all_bits) & 1
> mask0 = bit_mask == 0
> mask1 = bit_mask == 1
> 
> Maybe there's a more straightforward way to generate these masks.  That's not
> my question.
> 
> In [331]: mask1
> Out[331]:
> array([[False, False],
>        [ True, False],
>        [False,  True],
>        [ True,  True]], dtype=bool)
> 
> OK, now I want to use this mask on D
> In [333]: D.shape
> Out[333]: (32400, 4)
> 
> Just to simplify, let's just try the first row of D
> 
> In [336]: D[0]
> Out[336]: array([ 0.,  2.,  2.,  4.])
> 
> In [335]: D[0][mask1[...,0]]
> Out[335]: array([ 2.,  4.])
> 
> That worked fine.  But I don't want to apply just one of the masks in the set
> (mask1 has shape (4, 2), so it holds 2 masks); I want the results of applying
> all of the masks (2 in this case).
> 
> 
> In [334]: D[0][mask1]
> ---------------------------------------------------------------------------
> ValueError                                Traceback (most recent call last)
> <ipython-input-334-243c7a5e45a4> in <module>()
> ----> 1 D[0][mask1]
> 
> ValueError: boolean index array should have 1 dimension
> 
> Any ideas what's the best approach here?

Perhaps what I need is to use integer indexing, rather than boolean.

all_syms = np.arange (const.size)
all_bits = np.arange (BITS_PER_SYM)
bit_mask = (all_syms[:,np.newaxis] >> all_bits) & 1

ind = np.array ([np.nonzero (bit_mask[...,i])[0] for i in range (BITS_PER_SYM)])

In [366]: ind
Out[366]: 
array([[1, 3],
       [2, 3]])
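
For reference, here is a self-contained sketch of the same index table built
without the list comprehension.  M = 4 and BITS_PER_SYM = 2 are assumed
stand-ins for const.size and BITS_PER_SYM, which are not defined in this post.
The reshape only works when every bit position is set in the same number of
symbols, which holds when M is a power of two.

```python
import numpy as np

# Assumed stand-ins for const.size and BITS_PER_SYM, which the post does not define
M = 4
BITS_PER_SYM = 2

all_syms = np.arange(M)
all_bits = np.arange(BITS_PER_SYM)
bit_mask = (all_syms[:, np.newaxis] >> all_bits) & 1   # shape (M, BITS_PER_SYM)

# np.nonzero on the transposed mask returns (bit, symbol) pairs in row-major
# order, so the symbol indexes come out already grouped by bit position.
_, syms = np.nonzero(bit_mask.T)
ind = syms.reshape(BITS_PER_SYM, -1)                   # shape (BITS_PER_SYM, M // 2)

print(ind)
# [[1 3]
#  [2 3]]
```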

So now we have the 1-d indexes of the elements we want to select from D.
As a quick test, redefine D as a small 1-d array:

D = np.arange (4)+1

In [376]: D
Out[376]: array([1, 2, 3, 4])

In [377]: D[ind]
Out[377]: 
array([[2, 4],
       [3, 4]])

Looks like that does the job: row i of the result holds the elements of D
whose index has bit i set.
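
The same index table should also work on the full 2-d array by indexing along
the second axis.  A minimal sketch, assuming a made-up 2-row D in place of the
real (32400, 4) array; the first row is the D[0] shown in the original
question, the second row is invented for illustration:

```python
import numpy as np

# Index table from above: row i lists the symbol indexes with bit i set
ind = np.array([[1, 3],
                [2, 3]])

# Made-up stand-in for the (32400, 4) array; only the first row comes from the post
D = np.array([[0., 2., 2., 4.],
              [1., 2., 3., 4.]])

# Integer-array indexing on axis 1 applies every mask to every row at once
selected = D[:, ind]

print(selected.shape)   # (2, 2, 2): (rows of D, bit positions, M // 2)
print(selected[0])
# [[2. 4.]
#  [2. 4.]]
```

Here selected[r, i] holds the elements of row r whose index has bit i set, so
selected[0, 0] matches the D[0][mask1[...,0]] result from the question.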