[Numpy-discussion] Broadcasting and indexing

josef.pktd at gmail.com josef.pktd at gmail.com
Fri Jan 22 09:51:13 EST 2010


On Thu, Jan 21, 2010 at 1:03 PM, Emmanuelle Gouillart
<emmanuelle.gouillart at normalesup.org> wrote:
>
> Hi Thomas,
>
> broadcasting rules are only for ufuncs (and by extension, some numpy
> functions using ufuncs). Indexing obeys different rules and always starts
> by the first dimension.

Just a clarification: if there are several index arrays, the standard
broadcasting rules apply to them. It gets a bit messier when index arrays
and slice objects are mixed.
There was an informative explanation in the March 2009 thread "Is this
a bug?", and there are lots of examples on the mailing list.
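
To make the point about index arrays concrete, here is a small sketch. The
array names (rows, cols, b) and the shapes are made up for illustration and
are not taken from the thread mentioned above.

```python
import numpy as np

a = np.arange(20).reshape(4, 5)

# Two index arrays are broadcast against each other, like ufunc operands.
rows = np.array([0, 2])[:, np.newaxis]   # shape (2, 1)
cols = np.array([1, 3, 4])               # shape (3,)
sub = a[rows, cols]                      # picks rows 0 and 2, columns 1, 3, 4

# Mixing index arrays with a slice: where the broadcast dimension ends up
# depends on whether the index arrays are next to each other.
b = np.zeros((3, 4, 5, 6))
adjacent = b[:, [0, 1], [0, 1]]          # index arrays side by side
separated = b[[0, 1], :, [0, 1]]         # index arrays split by a slice

print(sub.shape, adjacent.shape, separated.shape)
```

When the index arrays are adjacent, the broadcast dimension replaces the
indexed axes in place; when a slice separates them, it is moved to the front
of the result. That is the "messier" part.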

Josef

>
> However, you don't have to use broadcasting for such indexing operations:
> >>> a[:, c] = 0
> zeroes columns indexed by c.
>
> If you want to index along the 3rd dimension, you can use a[:, :, c],
> etc. If the dimension along which you index is a variable, you can also
> use the function np.rollaxis, which lets you change the order of the
> dimensions of an array. You can then index along the first dimension
> (a[c]) and change the order of the dimensions back. Here is an example:
> >>> a = np.ones((3,4,5,6))
> >>> c = np.array([1,0,1,0,1], dtype=bool)
> >>> tmp_a = np.rollaxis(a, 2, 0)
> >>> tmp_a.shape
> (5, 3, 4, 6)
> >>> tmp_a[c] = 0
> >>> a = np.rollaxis(tmp_a, 0, 3)
> >>> a.shape
> (3, 4, 5, 6)
>
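
As a possible alternative to the rollaxis example above when the axis is a
variable, one can build the index tuple by hand. This is only a sketch: the
helper name zero_along_axis is hypothetical, not a NumPy function. The second
half checks that np.rollaxis returns a view in current NumPy, so assigning
through the rolled array already changes the original.

```python
import numpy as np

def zero_along_axis(a, mask, axis):
    """Zero the entries of `a` selected by `mask` along `axis`, in place.

    Hypothetical helper for illustration; not part of NumPy.
    """
    index = [slice(None)] * a.ndim   # equivalent to a[:, :, ..., :]
    index[axis] = mask               # put the mask on the chosen axis
    a[tuple(index)] = 0
    return a

a = np.ones((3, 4, 5, 6))
c = np.array([1, 0, 1, 0, 1], dtype=bool)
zero_along_axis(a, c, axis=2)        # same effect as a[:, :, c] = 0

# np.rollaxis gives a view, so writing through it modifies `a2` directly.
a2 = np.ones((3, 4, 5, 6))
tmp = np.rollaxis(a2, 2, 0)
tmp[c] = 0
print(np.array_equal(a, a2))
```

Both routes should leave the array with the same shape and the same zeroed
slices; the index-tuple version just avoids reordering the axes.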
> Hope this helps.
>
> Cheers,
>
> Emmanuelle
>
> On Thu, Jan 21, 2010 at 11:37:09AM -0500, Thomas Robitaille wrote:
>> Hello,
>
>> I'm trying to understand how array broadcasting can be used for indexing. In the following, I use the term 'row' to refer to the first dimension of a 2D array, and 'column' to the second, just because that's how numpy prints them out.
>
>> If I consider the following example:
>
>> >>> a = np.random.random((4,5))
>> >>> b = np.random.random((5,))
>> >>> a + b
>> array([[ 1.45499556,  0.60633959,  0.48236157,  1.55357393,  1.4339261 ],
>>        [ 1.28614593,  1.11265001,  0.63308615,  1.28904227,  1.34070499],
>>        [ 1.26988279,  0.84683018,  0.98959466,  0.76388223,  0.79273084],
>>        [ 1.27859505,  0.9721984 ,  1.02725009,  1.38852061,  1.56065028]])
>
>> I understand how this works, because it works as expected as described in
>
>> http://docs.scipy.org/doc/numpy/reference/ufuncs.html#broadcasting
>
>> So b gets broadcast to shape (1,5), then because the first dimension is 1, the operation is applied to all rows.
>
>> Now I am trying to apply this to array indexing. So for example, I want to set specific columns, indicated by a boolean array, to zero, but the following fails:
>
>> >>> c = np.array([1,0,1,0,1], dtype=bool)
>> >>> a[c] = 0
>> Traceback (most recent call last):
>>   File "<stdin>", line 1, in <module>
>> IndexError: index (4) out of range (0<=index<3) in dimension 0
>
>> However, if I try reducing the size of c to 4, then it works, and sets rows, not columns, equal to zero
>
>> >>> c = np.array([1,0,1,0], dtype=bool)
>> >>> a[c] = 0
>> >>> a
>> array([[ 0.        ,  0.        ,  0.        ,  0.        ,  0.        ],
>>        [ 0.41526315,  0.7425491 ,  0.39872546,  0.56141914,  0.69795153],
>>        [ 0.        ,  0.        ,  0.        ,  0.        ,  0.        ],
>>        [ 0.40771227,  0.60209749,  0.7928894 ,  0.66089748,  0.91789682]])
>
>> But I would have thought that the indexing array would have been broadcast in the same way as for a sum, i.e. c would be broadcast to have dimensions (1,5) and then would have been able to set certain columns in all rows to zero.
>
>> Why is it that for indexing, the broadcasting seems to happen in a different way than when performing operations like additions or multiplications? For background info, I'm trying to write a routine which performs a set of operations on an n-d array, where n is not known in advance, with a 1D array, so I can use broadcasting rules for most operations without knowing the dimensionality of the n-d array, but now that I need to perform indexing, and the convention seems to change, this is a real issue.
>
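
For the n-d case described here, where the 1D array lines up with the last
dimension (as it does for broadcasting in a + b), an Ellipsis index may be
enough. A minimal sketch, with made-up shapes:

```python
import numpy as np

c = np.array([1, 0, 1, 0, 1], dtype=bool)

results = []
for shape in [(5,), (4, 5), (2, 3, 5)]:
    a = np.ones(shape)
    # "..." stands for however many leading dimensions `a` has,
    # so the mask is always applied along the last axis.
    a[..., c] = 0
    results.append(a)

print([r.sum() for r in results])
```

The same line works regardless of the number of dimensions, as long as the
length of c matches the size of the last axis.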
>> Thanks in advance for any advice,
>
>> Thomas
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion