[Numpy-discussion] broadcasting behavior for 1.6 (was: Numpy 1.6 schedule)

Travis Oliphant oliphant at enthought.com
Fri Mar 11 11:19:33 EST 2011


This discussion is interesting and useful for NumPy 2.0, but the subtle change is not acceptable for NumPy 1.6.

The rules were consistent, even if seen as unintuitive by some.

The fact that two libraries we know of already had tests break should be a big red warning flag.  There are a lot of other libraries that may have tests break.  

Having to fix tests would be expected for NumPy 2.0, but not 1.6.

I am a big -10 on rolling out a change like this in NumPy 1.6.

For NumPy 2.0, the discussion is interesting, but I am still a -0.

Travis


--
(mobile phone of)
Travis Oliphant
Enthought, Inc.
http://www.enthought.com

On Mar 11, 2011, at 10:51 AM, Charles R Harris <charlesr.harris at gmail.com> wrote:

> 
> 
> On Fri, Mar 11, 2011 at 8:06 AM, Wes McKinney <wesmckinn at gmail.com> wrote:
> On Fri, Mar 11, 2011 at 9:57 AM, Charles R Harris
> <charlesr.harris at gmail.com> wrote:
> >
> >
> > On Fri, Mar 11, 2011 at 7:42 AM, Charles R Harris
> > <charlesr.harris at gmail.com> wrote:
> >>
> >>
> >> On Fri, Mar 11, 2011 at 2:01 AM, Ralf Gommers
> >> <ralf.gommers at googlemail.com> wrote:
> >>>
> >>> I'm just going through the very long 1.6 schedule thread to see what
> >>> is still on the TODO list before a 1.6.x branch can be made. So I'll
> >>> send a few separate mails, one for each topic.
> >>>
> >>> On Mon, Mar 7, 2011 at 8:30 PM, Francesc Alted <faltet at pytables.org>
> >>> wrote:
> >>> > On Sunday 06 March 2011 06:47:34 Mark Wiebe wrote:
> >>> >> I think it's ok to revert this behavior for backwards compatibility,
> >>> >> but believe it's an inconsistent and unintuitive choice. In
> >>> >> broadcasting, there are two operations, growing a dimension 1 -> n,
> >>> >> and prepending a new 1 dimension on the left. The behaviour under
> >>> >> discussion in assignment is different from normal broadcasting in
> >>> >> that only the second one is permitted. It is broadcasting the output
> >>> >> to the input, rather than broadcasting the input to the output.
> >>> >>
> >>> >> Suppose a has shape (20,), b has shape (1,20), and c has shape
> >>> >> (20,1). Then a+b has shape (1,20), a+c has shape (20,20), and b+c
> >>> >> has shape (20,20).
> >>> >>
> >>> >> If we do "b[...] = a", a will be broadcast to match b by adding a 1
> >>> >> dimension to the left. This is reasonable and consistent with
> >>> >> addition.
> >>> >>
> >>> >> If we do "a[...]=b", under 1.5 rules, a will once again be broadcast
> >>> >> to match b by adding a 1 dimension to the left.
> >>> >>
> >>> >> If we do "a[...]=c", we could broadcast both a and c together to the
> >>> >> shape (20,20). This results in multiple assignments to each element
> >>> >> of a, which is inconsistent. This is not analogous to a+c, but
> >>> >> rather to np.add(c, c, out=a).
> >>> >>
> >>> >> The distinction is subtle, but the inconsistent behavior is harmless
> >>> >> enough for assignment that keeping backwards compatibility seems
> >>> >> reasonable.
> >>> >
> >>> > For what it's worth, I also like the behaviour that Mark proposes, and
> >>> > have updated the tables test suite to adapt to it.  But I'm fine if it
> >>> > is decided to revert to the previous behaviour.
> >>>
> >>> The conclusion on this topic, as I read the discussion, is that we
> >>> need to keep backwards compatible behavior (even though the proposed
> >>> change is more intuitive). Has backwards compatibility been fixed
> >>> already?
> >>>
> >>
> >> I don't think an official conclusion was reached, at least in so far as
> >> numpy has an official anything ;) But this change does show up as an error
> >> in one of the pandas tests, so it is likely to affect other folks as well.
> >> Probably the route of least compatibility hassle is to revert to the old
> >> behavior and maybe switch to the new behavior, which I prefer, for 2.0.
> >>
> >
> > That said, apart from pandas and pytables, and the latter has been fixed,
> > the new behavior doesn't seem to have much fallout. I think it actually
> > exposes unnoticed assumptions in code that slipped by because there was no
> > consequence.
> >
> > Chuck
> >
> > _______________________________________________
> > NumPy-Discussion mailing list
> > NumPy-Discussion at scipy.org
> > http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >
> >
> 
> I've fixed the pandas issue-- I'll put out a bugfix release whenever
> NumPy 1.6 final is out. I don't suspect it will cause very many
> problems (and those problems will--hopefully--be easy to fix).
> 
> Now I'm really vacillating. I do prefer the new behavior and the fallout does seem minimal. Put me +1 for the change unless a strong objection surfaces.
> 
> Chuck 
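
For readers following the shapes in Mark's example, the two broadcasting operations he names (adding a 1 dimension on the left, and growing a dimension 1 -> n) can be sketched in plain Python. This is only an illustration of the rule as described in the thread, not NumPy's implementation; the helper names broadcast_shape and can_assign are made up for this sketch. can_assign models the stricter "broadcast the input to the output" rule, so it rejects the a[...] = b case that the 1.5 behaviour accepts.

```python
from itertools import zip_longest


def broadcast_shape(*shapes):
    """Return the shape that all `shapes` broadcast to, or raise ValueError.

    Hypothetical helper for illustration; not part of NumPy.
    """
    result = []
    # Walk the shapes from the rightmost axis. A missing axis counts as 1,
    # which is the "add a 1 dimension on the left" step.
    for dims in zip_longest(*(reversed(s) for s in shapes), fillvalue=1):
        sizes = {d for d in dims if d != 1}
        if len(sizes) > 1:
            raise ValueError(f"shapes {shapes} do not broadcast together")
        # Axes of length 1 are stretched to the common size (1 -> n).
        result.append(sizes.pop() if sizes else 1)
    return tuple(reversed(result))


def can_assign(dst_shape, src_shape):
    """True if src broadcasts to dst without changing dst's shape.

    Models the strict input-to-output rule only (an assumption of this
    sketch); the 1.5 behaviour discussed above is more permissive.
    """
    try:
        return broadcast_shape(dst_shape, src_shape) == tuple(dst_shape)
    except ValueError:
        return False


a, b, c = (20,), (1, 20), (20, 1)
print(broadcast_shape(a, b))  # (1, 20)
print(broadcast_shape(a, c))  # (20, 20)
print(broadcast_shape(b, c))  # (20, 20)
print(can_assign(b, a))       # True: a gains a leading 1 to match b
print(can_assign(a, b))       # False under the strict rule
print(can_assign(a, c))       # False: the result would need shape (20, 20)
```

The first three results match the shapes Mark gives for a+b, a+c and b+c. The disagreement in the thread is about the fifth case: whether assignment should keep accepting a source with an extra leading 1 dimension.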