[SciPy-User] masking an array ends up flattening it

Zachary Pincus zachary.pincus at yale.edu
Wed Feb 29 11:50:26 EST 2012


> Hi Zach, thanks a lot. I should know by now that naive expectations that are not met in numpy are generally so for lack of generalization! Your example makes perfect sense.
> My use case is a covariance matrix that has the dimension of all the parameters available, but some of them are fixed in a fit, and I have a bool array that tells me which parameters are fixed. I then would like to "extract" the covariance matrix of the free parameters.
> 
> I would rather go for masking and then reshaping than fancy indexing, which, if too fancy, starts scaring me :)
> Of course if there is a clean solution, I am all ears.

OK, so you have a list of parameter indices that are "good" and you want to get the sub-matrix out corresponding to just the rows and columns at those indices? E.g.:

import numpy
a = numpy.arange(25).reshape((5,5))
print(repr(a))
array([[ 0,  1,  2,  3,  4],
      [ 5,  6,  7,  8,  9],
      [10, 11, 12, 13, 14],
      [15, 16, 17, 18, 19],
      [20, 21, 22, 23, 24]])

Then, say you want to get the sub-matrix of 'a' corresponding to rows/columns 1 and 3? Is this equivalent to what you need to do?
That is, you want the following:
array([[ 6,  8],
      [16, 18]])

For this you might think to do the following:
a[[1,3], [1,3]]
but this returns 'array([ 6, 18])' -- you have pulled out a flat list of two elements, at indices [1,1] and [3,3]... This sort of fancy indexing is VERY useful in many cases, but not the case you want, which is more like a "cross product" sort of indexing problem.
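To see the pairing explicitly, here is a minimal sketch (the names `pairs` and `by_hand` are only for illustration) comparing the fancy-indexed result with the two elements picked out one at a time:

```python
import numpy

a = numpy.arange(25).reshape((5, 5))

# The two index lists are paired element by element: (1, 1) and (3, 3).
pairs = a[[1, 3], [1, 3]]

# The same two elements, picked out one at a time.
by_hand = numpy.array([a[1, 1], a[3, 3]])

print(pairs)    # a flat array of two elements, not a 2x2 block
print(by_hand)
```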

It turns out that what you really want is:
a[ [[1,1],[3,3]], [[1,3],[1,3]] ]
which yields:
array([[ 6,  8],
      [16, 18]])

This makes sense -- you pass in two 2D arrays, one containing the row indices and one the column indices, and you get out a 2D array of the same shape.
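As a small sketch of that rule, with the index arrays written out as named numpy arrays (`rows`, `cols` and `sub` are illustrative names, not anything from numpy itself):

```python
import numpy

a = numpy.arange(25).reshape((5, 5))

rows = numpy.array([[1, 1], [3, 3]])  # row index for each output element
cols = numpy.array([[1, 3], [1, 3]])  # column index for each output element

# sub[i, j] is a[rows[i, j], cols[i, j]], so sub takes the shape of rows/cols.
sub = a[rows, cols]

print(sub.shape == rows.shape)
print(sub)
```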

Perhaps-insanely, the above can be simplified to:
a[ [[1],[3]], [[1,3]] ]

If you understand numpy broadcasting rules, you may see how:
[[1],[3]], [[1,3]]
broadcasts to be the same as:
[[1,1],[3,3]], [[1,3],[1,3]]
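If you want to check the broadcasting step for yourself, one way (a sketch, using numpy.broadcast_arrays purely as a way to make the expansion visible) is:

```python
import numpy

rows = numpy.array([[1], [3]])  # shape (2, 1): a column of row indices
cols = numpy.array([[1, 3]])    # shape (1, 2): a row of column indices

# broadcast_arrays expands both inputs to their common shape, here (2, 2).
full_rows, full_cols = numpy.broadcast_arrays(rows, cols)

print(full_rows)
print(full_cols)
```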

Fortunately, all of this mind-bending stuff can be done behind the scenes with a cross-product indexing helper function:
a[ numpy.ix_([1,3], [1,3]) ]
takes care of it for you, and gives the desired
array([[ 6,  8],
      [16, 18]])
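For the covariance use case, numpy.ix_ should also accept 1D boolean arrays directly, so no conversion to integer indices is needed. A rough sketch, assuming a made-up 4x4 matrix `cov` and a made-up bool array `fixed` standing in for your real data:

```python
import numpy

# Stand-in for the full covariance matrix (hypothetical values).
cov = numpy.arange(16, dtype=float).reshape((4, 4))

# Hypothetical flags: True where a parameter was held fixed in the fit.
fixed = numpy.array([False, False, False, True])

free = ~fixed

# ix_ treats each 1D boolean array as a mask along its own axis,
# so this keeps the rows and columns of the free parameters only.
cov_free = cov[numpy.ix_(free, free)]

print(cov_free.shape)
```

Unlike cov[mask] with a 2D mask, this keeps the result two-dimensional, so there is nothing to reshape afterwards.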

This is all pretty advanced-sounding stuff... but most of it's laid out in sections 5 and 6 of the tentative tutorial:
http://www.scipy.org/Tentative_NumPy_Tutorial
You might also want to peruse Stéfan's advanced numpy tutorial -- the broadcasting and indexing sections are really useful.
http://mentat.za.net/numpy/numpy_advanced_slides/


Zach


> thanks again,
> johann
> 
> On 02/28/2012 11:35 PM, Zachary Pincus wrote:
>> Hi Johann,
>> 
>>> In [146]: mask
>>> Out[146]:
>>> array([[ True,  True,  True, False],
>>>       [ True,  True,  True, False],
>>>       [ True,  True,  True, False],
>>>       [False, False, False, False]], dtype=bool)
>>> 
>>> Naively, I thought I would end up with a (3,3) shaped array when
>>> applying the mask to m
>> 
>> So that would make some sense for the above mask, but obviously doesn't generalize... what shape output would you expect if 'mask' looked like the following?
>> 
>> array([[ True,  True,  True, False],
>>       [ True,  True,  True, False],
>>       [ True,  True,  True, False],
>>       [False, False, False,  True]], dtype=bool)
>> 
>> Flattening turns out to be the most-sensible general-case thing to do. Fortunately, this is generally not a problem, because often one winds up doing things like:
>> a[mask] = b[mask]
>> where a and b can both be n-dimensional, and the fact that you go through a flattened intermediate is no problem.
>> 
>> If, on the other hand, your task requires slicing square regions out of arrays, you could do that directly by other sorts of fancy-indexing or using programmatically-generated slice objects, or some such. Can you describe the overall task? Perhaps then someone could suggest the "idiomatic numpy" solution?
>> 
>> Zach
>> 
>> 
>> 
>>> , but instead I get :
>>> 
>>> In [147]: m[mask]
>>> Out[147]:
>>> array([  1.82243247e-23,  -5.53103453e-14,   4.32071039e-13,
>>>        -5.52425949e-14,   6.26697129e-02,  -5.12076585e-02,
>>>         4.31598429e-13,  -5.12102340e-02,   6.27539118e-02])
>>> 
>>> In [148]: m[mask].shape
>>> Out[148]: (9,)
>>> 
>>> Is there another way to proceed and get directly the (3,3) shaped masked
>>> array, or do I need to reshape it by hand?
>>> 
>>> thanks a lot in advance,
>>> Johann
>>> _______________________________________________
>>> SciPy-User mailing list
>>> SciPy-User at scipy.org
>>> http://mail.scipy.org/mailman/listinfo/scipy-user
>> 
>> 



