[Numpy-discussion] Re: Histograms via indirect index arrays

Fri Mar 17 12:21:01 EST 2006

On Friday 17 March 2006 14:58, Robert Kern wrote:
> Piotr Luszczek wrote:
> > On Friday 17 March 2006 13:29, Travis Oliphant wrote:
> >>how does python interpret
> >>
> >>g[idx] += 1
> >>
> >>How does this get compiled to byte-code?
>
> In [161]: c = compile('g[idx] += 1', '<str>', 'single')
>
> In [162]: import dis
>
> In [163]: dis.dis(c)
>   1           0 LOAD_NAME                0 (g)
>               3 LOAD_NAME                1 (idx)
>               6 DUP_TOPX                 2
>               9 BINARY_SUBSCR
>              10 LOAD_CONST               0 (1)
>              13 INPLACE_ADD
>              14 ROT_THREE
>              15 STORE_SUBSCR
>              16 LOAD_CONST               1 (None)
>              19 RETURN_VALUE

This proves my point: the statement compiles to a subscript read,
an in-place add, and a subscript store.
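
The three steps in the listing above can be spelled out in plain Python.
This is only a sketch: the helper name subscript_iadd is made up for
illustration, a list stands in for the array, and operator.iadd is the
standard-library function form of '+='.

```python
import operator

def subscript_iadd(g, idx, x):
    """Spell out g[idx] += x as the three calls the bytecode performs."""
    tmp = g[idx]                 # BINARY_SUBSCR -> g.__getitem__(idx)
    val = operator.iadd(tmp, x)  # INPLACE_ADD   -> tmp.__iadd__(x), or tmp + x as fallback
    g[idx] = val                 # STORE_SUBSCR  -> g.__setitem__(idx, val)

a = [10, 20, 30]
b = [10, 20, 30]
a[1] += 1                # the augmented assignment itself
subscript_iadd(b, 1, 1)  # the same thing, one call at a time
print(a == b)  # True
```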

> >>There are two possibilities:
> >>
> >>1) g[idx] creates a new object which then has 1 added to it using
> >>in-place addition.
> >>
> >>     This would not produce the desired behavior as g[idx] is a
> >> copy of the data when idx is a
> >>      general indexing array as it is in this case.  So, you make a
> >>copy of those indices, add 1 to them
> >>      and then do what with the result?
> >>
> >>2) g[idx] += 1  gets converted to g[idx] = g[idx] + 1
> >>
> >>    This appears to be effectively what Python actually does. 
> >> Notice that there is no way for us to control this behavior
> >> because there is no __inplace_with_indexing_add__ operator to
> >> over-ride.
> >>
> >>There is no such single operation to over-ride for the object.   In
> >>other words, I don't see any way for us to even alter the object to
> >> get the behavior you want from that syntax.  We can, of course,
> >> add a function or method to do that, but we would have to extend
> >> Python to get the behavior you want here.
> >
> > Hardly, at least judging from what I see happen in a small example.
> > 'g[idx] += 1' becomes ('g' and 'idx' are generic objects):
> > __getitem__(self, idx)
> > __iadd__(1)
> > __setitem__(result of __iadd__)
> >
> > By design numpy returns views from __getitem__
>
> Only for slices.
>
> In [132]: a = arange(10)
>
> In [133]: idx = [2,2,3]
>
> In [134]: a[idx]
> Out[134]: array([2, 2, 3])
>
> In [135]: b = a[idx]
>
> In [136]: b[-1] = 100
>
> In [137]: a
> Out[137]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

Your example uses lists as indices. This is not interesting.
I'm talking solely about arrays indexing other arrays.
To me it is a special and very important case.

> > In this case, it would be view into 'self' and 'idx' so the
> > __iadd__ would just use the 'idx' directly rather than a copy.
> > Finally, __setitem__ doesn't do anything since 'self' and 'value'
> > will be the same.
>
> No, value is the result of __iadd__ on the temporary array.
>
> 'g[idx] += 1' expands to:
>
>   tmp = g.__getitem__(idx)
>   val = tmp.__iadd__(1)
>   g.__setitem__(idx, val)

You're missing the point. 'tmp' can be of a very specific type
so that 'g.__setitem__' doesn't have to do anything: the 'add 1'
was done by '__iadd__'.
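
A rough sketch of that idea, with plain Python lists standing in for
numpy arrays. The names Grid and IndexedView are hypothetical and this
is not how numpy behaves; it only shows that the '+=' syntax itself
permits such a design.

```python
class IndexedView:
    """Hypothetical proxy that remembers the parent data and the indices."""
    def __init__(self, data, idx):
        self.data = data
        self.idx = idx

    def __iadd__(self, x):
        # Update the parent storage directly; repeated indices accumulate.
        for i in self.idx:
            self.data[i] += x
        return self


class Grid:
    """Hypothetical container whose __getitem__ returns a proxy, not a copy."""
    def __init__(self, n):
        self.data = [0] * n

    def __getitem__(self, idx):
        return IndexedView(self.data, idx)

    def __setitem__(self, idx, value):
        # If the value is a proxy onto our own data, __iadd__ has
        # already done the work and there is nothing left to store.
        if isinstance(value, IndexedView) and value.data is self.data:
            return
        for i, v in zip(idx, value):
            self.data[i] = v


g = Grid(4)
g[[0, 2, 2, 1]] += 1
print(g.data)  # [1, 1, 2, 0]
```

Index 2 appears twice and is counted twice, which is the histogram
behaviour under discussion.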

> Given these class definitions:
>
>   class A(object):
>       def __getitem__(self, idx):
>           print 'A.__getitem__(%r)' % idx
>           return B()
>       def __setitem__(self, idx, value):
>           print 'A.__setitem__(%r, %r)' % (idx, value)
>
>
>   class B(object):
>       def __iadd__(self, x):
>           print 'B.__iadd__(%r)' % x
>           return self
>       def __repr__(self):
>           return 'B()'
>
> In [153]: a = A()
>
> In [154]: a[[0, 2, 2, 1]] += 1
> A.__getitem__([0, 2, 2, 1])
> B.__iadd__(1)
> A.__setitem__([0, 2, 2, 1], B())
>
> > Of course, this is just a quick draft. I don't know how it would
> > work in practice and in other cases.
>
> Aye, there's the rub.

Show me code that breaks.