[Numpy-discussion] Structured array copying by field name (was: Numpy 1.6 schedule)

Fri Mar 11 17:51:12 EST 2011

On Fri, Mar 11, 2011 at 2:20 PM, Mark Wiebe <mwwiebe at gmail.com> wrote:

> On Fri, Mar 11, 2011 at 1:07 AM, Ralf Gommers <ralf.gommers at googlemail.com> wrote:
>
>> On Tue, Mar 8, 2011 at 1:35 AM, Pauli Virtanen <pav at iki.fi> wrote:
>> >
>> > Structured array copying copies by field name.
>> >
>> > Commit 22d96096bf7d5fb199ca80f2fcd04e8d27815476
>> >
>> > Before:
>> >
>> > >>> x = np.array([(0, 1)], dtype=[('a', int), ('b', int)])
>> > >>> y = np.array([(2, 3)], dtype=[('b', int), ('a', int)])
>> > >>> x[:] = y
>> > >>> x
>> > array([(2, 3)],
>> >      dtype=[('a', '<i4'), ('b', '<i4')])
>> >
>> > After:
>> >
>> > >>> x = np.array([(0, 1)], dtype=[('a', int), ('b', int)])
>> > >>> y = np.array([(2, 3)], dtype=[('b', int), ('a', int)])
>> > >>> x[:] = y
>> > >>> x
>> > array([(3, 2)],
>> >      dtype=[('a', '<i4'), ('b', '<i4')])
>> >
>> > This seems like a pretty hazardous change. Granted, it's in
>> > a bit of a grey area, but people may rely on this.
>>
>> This is still backwards incompatible in current master. Should it be
>> changed back for 1.6?
>>
>
> I strongly dislike the behavior in 1.5 and earlier, for reasons such as the
> example given here:
>
> http://mail.scipy.org/pipermail/numpy-discussion/2011-March/055214.html
>
> No
> problems so far have been traced back to this change, indicating that this
> type of assignment was previously utilized very little, so I'm strongly in
> favor of keeping it in. Based on my knowledge of the code, I'm pretty sure
> it's a significant performance improvement as well.
>
> -Mark
>
>
Ditto on this.  However, it isn't like this is fully featured anyway.
Consider the following:

import numpy as np

x = np.array([(0, 1.1)], dtype=[('a', int), ('b', float)])
y = np.array([(2.1, 4)], dtype=[('b', float), ('a', int)])

print(x + y)

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'numpy.ndarray' and
'numpy.ndarray'

So, how much have people been relying on either behavior, given that basic
math operations weren't permitted in a similar manner?  I doubt either
approach would be that noticeable.
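
For what it's worth, the addition above can be done by hand, one field at a
time. The following is only a rough sketch: add_by_field is a hypothetical
helper name, not something NumPy provides, and it assumes both arrays have the
same set of field names and the same shape.

```python
import numpy as np

x = np.array([(0, 1.1)], dtype=[('a', int), ('b', float)])
y = np.array([(2.1, 4)], dtype=[('b', float), ('a', int)])

def add_by_field(left, right):
    """Hypothetical helper: add two structured arrays field by field,
    matching fields by name and keeping the field order of `left`."""
    if set(left.dtype.names) != set(right.dtype.names):
        raise ValueError("arrays must have the same field names")
    out = np.empty(left.shape, dtype=left.dtype)
    for name in left.dtype.names:
        # Each field is an ordinary ndarray view, so plain arithmetic works.
        out[name] = left[name] + right[name]
    return out

result = add_by_field(x, y)
print(result)
```

Since the fields are matched by name, the differing field order of x and y
does not matter here.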

Note, I would love to be able to do the above eventually.  I see a lot of
uses for labeled arrays, especially in the matplotlib library.  I
personally use the larry package from time to time.
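
As for the assignment case quoted above, code that cares about the outcome can
spell out the by-name copy itself rather than depend on what x[:] = y does.
A minimal sketch, assuming both arrays share the same field names:

```python
import numpy as np

x = np.array([(0, 1)], dtype=[('a', int), ('b', int)])
y = np.array([(2, 3)], dtype=[('b', int), ('a', int)])

# Copy each field explicitly, so the result does not depend on whether
# `x[:] = y` matches fields by name or by position.
for name in x.dtype.names:
    x[name] = y[name]

print(x)
```

This gives the same values as the "After" example, since 'a' is filled from
y['a'] and 'b' from y['b'].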

Just my two cents.
Ben Root