[Numpy-discussion] Incrementing with advanced indexing: why don't repeated indexes repeatedly increment?

Wed Jun 6 11:52:13 EDT 2012

On 06/06/2012 05:34 PM, Nathaniel Smith wrote:
> On Wed, Jun 6, 2012 at 4:30 PM, Robert Cimrman<cimrman3 at ntc.zcu.cz>  wrote:
>> On 06/06/2012 05:06 PM, Nathaniel Smith wrote:
>>> On Wed, Jun 6, 2012 at 9:48 AM, John Salvatier
>>> <jsalvati at u.washington.edu>    wrote:
>>>> Hello,
>>>>
>>>> I've noticed that If you try to increment elements of an array with advanced
>>>> indexing, repeated indexes don't get repeatedly incremented. For example:
>>>>
>>>> In [30]: x = zeros(5)
>>>>
>>>> In [31]: idx = array([1,1,1,3,4])
>>>>
>>>> In [32]: x[idx] += [2,4,8,10,30]
>>>>
>>>> In [33]: x
>>>> Out[33]: array([  0.,   8.,   0.,  10.,  30.])
>>>>
>>>> I would intuitively expect the output to be array([0,14, 0,10,30]) since
>>>> index 1 is incremented by 2+4+8=14, but instead it seems to only increment
>>>> by 8. What is numpy actually doing here?
>>>>
>>>> The authors of Theano noticed this behavior a while ago so they python loop
>>>> through the values in idx (this kind of calculation is necessary for
>>>> calculating gradients), but this is a bit slow for my purposes, so I'd like
>>>> to figure out how to get the behavior I expected, but faster.
>>>>
>>>> I'm also not sure how to navigate the numpy codebase, where would I look for
>>>> the code responsible for this behavior?
>>>
>>> Strictly speaking, it isn't actually in the numpy codebase at all --
>>> what's happening is that the Python interpreter sees this code:
>>>
>>>     x[idx] += vals
>>>
>>> and then it translates it into this code before running it:
>>>
>>>     tmp = x.__getitem__(idx)
>>>     tmp = tmp.__iadd__(vals)
>>>     x.__setitem__(idx, tmp)
>>>
>>> So you can find the implementations of the ndarray methods
>>> __getitem__, __iadd__, __setitem__ (they're called
>>> array_subscript_nice, array_inplace_add, and array_ass_sub in the C
>>> code), but there's no way to fix them so that this works the way you
>>> want it to, because there's no way for __iadd__ to know that the
>>> temporary values that it's working with are really duplicate copies of
>>> "the same" value in the original array.
>>>
>>> It would be nice if numpy had some sort of standard API for doing what
>>> you want, but not sure what a good API would look like, and someone
>>> would have to implement it.
>>
>> This operation is also heavily used for the finite element assembling, and a
>> similar question has been raised already several times (e.g.
>> http://old.nabble.com/How-to-assemble-large-sparse-matrices-effectively-td33833855.html).
>> So why not adding a function np.assemble()?
>
> I read that message, but I don't see what it has to do with this
> discussion? It seemed to be about fast ways to assign dense matrices
> into sparse matrices, not fast ways of applying in-place arithmetic to
> specific spots in a dense matrix.

Yes (in that thread), but it applies also adding/assembling vectors into a 
global vector - this is just x[idx] += vals. I linked that discussion as that 
was recent enough for me to recall it, but there were other.

Anyway, my point was that a having a function with the "adding" semantics in 
NumPy would be handy.

r.