[Numpy-discussion] optimizing ndarray.__setitem__

Thu May 5 12:54:40 EDT 2011

On Thu, May 5, 2011 at 02:29, Christoph Groth <cwg at falma.de> wrote:
>> On Wed, May 4, 2011 at 6:19 AM, Christoph Groth <cwg at falma.de> wrote:
>>>
>>>     Dear numpy experts,
>>>
>>>     I have noticed that with Numpy 1.5.1 the operation
>>>
>>>     m[::2] += 1.0
>>>
>>>     takes twice as long as
>>>
>>>     t = m[::2]
>>>     t += 1.0
>
> Mark Wiebe <mwwiebe at gmail.com> writes:
>
>> You'd better time this in 1.6 too. ;)
>>
>> https://github.com/numpy/numpy/commit/f60797ba64ccf33597225d23b893b6eb11149860
>
> This seems to be exactly what I had in mind.  Thanks for finding this.
>
>> The case of boolean mask indexing can't benefit so easily from this
>> optimization, but I think could see a big performance benefit if
>> combined __index__ + __i<op>__ operators were added to
>> Python. Something to consider, anyway.
>
> Has something like __index_iadd__ ever been considered seriously?  Not
> to my (limited) knowledge.

Only on this list, I think. :-)

I don't think it will ever happen. Only numpy really cares about it,
and adding another __special__ method for each __iop__ is a lot of
additional methods that need to be supported.
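
For context, here is a minimal sketch of how Python evaluates an indexed
in-place operation today. The Logged class and its calls attribute are
hypothetical names used only for illustration; they are not part of numpy
or of any proposal. The statement box[0] += 1.0 is carried out as a
__getitem__ call, an in-place (or ordinary) add on the returned object,
and then a __setitem__ call. A combined method such as the __index_iadd__
mentioned above would have to replace that whole sequence.

```python
class Logged:
    """Hypothetical container that records which special methods run."""

    def __init__(self, data):
        self.data = list(data)
        self.calls = []

    def __getitem__(self, key):
        self.calls.append("__getitem__")
        return self.data[key]

    def __setitem__(self, key, value):
        self.calls.append("__setitem__")
        self.data[key] = value


box = Logged([1.0, 2.0, 3.0])
box[0] += 1.0      # fetch box[0], add 1.0, store the result back

print(box.calls)   # ['__getitem__', '__setitem__']
print(box.data)    # [2.0, 2.0, 3.0]
```

The container never sees the addition itself, only the fetch and the
store. That is why it cannot fuse the three steps on its own.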

> Indeed, the second loop executes twice as fast as the first in the
> following example (again with Numpy 1.5.1).
>
> import numpy
> m = numpy.zeros((1000, 1000))
> mask = numpy.arange(0, 1000, 2, dtype=int)
>
> for i in xrange(40):
>    m[mask] += 1.0
>
> for i in xrange(40):
>    t = m[mask]
>    t += 1.0
>
> But wouldn't it be easy to optimize this as well, by not executing
> assignments where the source and the destination are indexed by the same
> mask object?

No. These two are not semantically equivalent. Your second example
does not actually modify m. For integer and bool mask arrays, m[mask]
necessarily makes a copy, so when you modify t via inplace addition,
you have only modified t and not m. The assignment back to m[mask] is
necessary.
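
A small sketch of the difference, using a 4x3 array instead of the
1000x1000 one so the result is easy to inspect. The variable names are
chosen here for illustration only, and numpy.shares_memory assumes a
reasonably recent numpy (it did not exist in 1.5.1).

```python
import numpy

m = numpy.zeros((4, 3))
mask = numpy.arange(0, 4, 2, dtype=int)   # selects rows 0 and 2

# Integer-array indexing returns a copy, so this only changes t.
t = m[mask]
t += 1.0
copy_left_m_unchanged = not m.any()

# The indexed in-place form fetches the copy, adds to it, and then
# assigns the result back into m, so m really is updated.
m[mask] += 1.0
rows_updated = bool((m[mask] == 1.0).all() and (m[1::2] == 0.0).all())

# Slice indexing returns a view, so the two-step form does modify m.
v = m[::2]
v += 1.0
view_modified_m = bool((m[::2] == 2.0).all())

print(copy_left_m_unchanged, rows_updated, view_modified_m)
print(numpy.shares_memory(m, t), numpy.shares_memory(m, v))
```

So skipping the assignment is only safe in the slice case, where the
temporary already aliases m.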

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco
