[Numpy-discussion] Broadcasting rules (Ticket 76).

Tue Apr 25 22:26:01 EDT 2006

Sasha wrote:
> On 4/25/06, tim.hochberg at cox.net <tim.hochberg at cox.net> wrote:
>   
>> ---- Travis Oliphant <oliphant.travis at ieee.org> wrote:
>>     
>>> Sasha wrote:
>>>       
>>>> In this category, I would suggest allowing broadcasting to any
>>>> multiple of the dimension even if the dimension is not 1.  I don't see
>>>> what makes 1 so special.
>>>>
>>>>         
>>> What's so special about 1 is that the code for it is relatively
>>> straightforward and already implemented using strides.  Altering the
>>> code to allow any multiple of the dimension would be harder and slower.
>>>       
>
> I don't think so. The same zero-stride trick that allows size-1
> broadcasting can be used to implement repetition.  I did not review
> the C code, but the following Python fragment shows that the loop that
> is already in numpy can be used to implement repetition by simply
> manipulating shapes and strides:
>   

I don't think anyone is fundamentally opposed to multiple repetitions.   
We're just being cautious.   Also, as you've noted, the assignment code 
is currently not using the ufunc broadcasting code and so they really 
aren't the same thing, yet.
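
For readers following along, here is a minimal sketch of the zero-stride idea mentioned in the quoted text. It is not the fragment from the original message and it is not how the C code is written; it only assumes that `numpy.lib.stride_tricks.as_strided` is available and uses it to show that one zero stride gives both size-1 style broadcasting and a simple repetition.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

a = np.arange(3.0)  # shape (3,), one stride of a.itemsize bytes

# Size-1 style broadcasting: add a leading axis of length 4 whose stride is 0,
# so every "row" of the view points at the same three elements.
rows = as_strided(a, shape=(4, 3), strides=(0, a.strides[0]), writeable=False)
assert np.array_equal(rows, np.broadcast_to(a, (4, 3)))

# Repetition: the same zero-stride view with 2 rows, flattened, is `a`
# repeated twice. The reshape has to copy because the view is not contiguous.
tiled = as_strided(a, shape=(2, 3), strides=(0, a.strides[0]),
                   writeable=False).reshape(-1)
assert np.array_equal(tiled, np.tile(a, 2))
```

The view is marked read-only here because several positions share one memory location; writing through such a view is easy to get wrong.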
>   
>> It's my expectation that opening up broadcasting will be more effective in masking
>> errors than in enabling useful new behaviour.
>>
>>     
> In my experience broadcasting length-1 and not broadcasting other
> lengths is very error prone as it is. 

That's not been my experience.  But, I don't know R very well.  I'm very 
interested in what ideas you can bring. 

>  I understand that restricting
> broadcasting to make it a strictly dimension-increasing operation is
> not possible for two reasons:
>
> 1. Numpy cannot break legacy Numeric code.
> 2. It is not possible to differentiate between a 1-d array that
> broadcasts column-wise vs. one that broadcasts row-wise.
>
> In my view neither of these reasons is valid.  In my experience Numeric
> code that relies on dimension-preserving broadcasting is already
> broken, only in a subtle and hard to reproduce way.

I definitely don't agree with you here.  Dimension-preserving 
broadcasting is at the heart of the utility of broadcasting and it is 
very, very useful for that.  Calling it subtly broken suggests that you 
don't understand it and have never used it for its intended purpose.   
I've used dimension-preserving broadcasting literally hundreds of 
times.  It's rather bold of you to say that all of that code is "broken".

Now, I'm sure there are other useful ways to "broadcast",  but 
dimension-preserving is essentially what broadcasting *is* in NumPy.   
If anything it is the dimension-increasing rule that is somewhat 
arbitrary (e.g. why prepend with ones).
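
As an illustration of the two rules named above, here is a small sketch using ordinary NumPy arithmetic. The array names are made up for the example; the behaviour shown is that of current NumPy releases.

```python
import numpy as np

# Dimension-preserving: both operands are 2-d, and each length-1 axis
# is stretched to match the other operand.
col = np.arange(3).reshape(3, 1)   # shape (3, 1)
row = np.arange(4).reshape(1, 4)   # shape (1, 4)
table = col + row                  # shape (3, 4), table[i, j] == i + j

# Dimension-increasing: a 1-d operand has ones prepended to its shape,
# so (4,) is treated as (1, 4) before the rule above is applied.
grid = np.zeros((3, 4))
vec = np.arange(4)                 # shape (4,)
total = grid + vec                 # shape (3, 4)

# A length-3 vector is also treated as (1, 3); it does not line up with
# the trailing axis of length 4, so the operation is rejected.
try:
    grid + np.arange(3)
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
```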

Perhaps you want to introduce some other way for non-commensurate shapes 
to interact in an operation.   I think you will find many open minds on 
this list (although probably not anyone who will want to code it up :-) ).
We do welcome the discussion.    Your experience with other 
array-like languages is helpful.

>   Similarly the
> need to broadcast over non-leading dimension is a sign of bad design. 
> In rare cases where such broadcasting is desirable, it can be easily
> done via swapaxes which is a cheap operation.
>   

Again, it would help if you would refrain from using negative words 
about coding styles that are different from your own.     Such 
broadcasting is not that rare.  It happens quite frequently, actually.   
The point of a language like Python is that you can write algorithms 
simply without struggling with optimization questions up front like you 
seem to be hinting at. 
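
To make the two styles under discussion concrete, here is a short sketch that subtracts a per-row offset from a 2-d array, once by broadcasting over the non-leading axis directly and once via `swapaxes` as suggested in the quoted text. The names `data` and `offsets` are illustrative only.

```python
import numpy as np

data = np.arange(12.0).reshape(3, 4)    # shape (3, 4)
offsets = np.array([0.0, 10.0, 20.0])   # one value per row, shape (3,)

# Direct style: give offsets a trailing length-1 axis, shape (3, 1),
# and let it broadcast along the second axis.
direct = data - offsets[:, np.newaxis]

# swapaxes style: move the row axis last so the 1-d offsets line up with it,
# subtract, then swap back. Both swaps are views, not copies.
swapped = (data.swapaxes(0, 1) - offsets).swapaxes(0, 1)

assert np.array_equal(direct, swapped)
```

Both forms give the same result; which one reads better is the point being argued here.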

> On the other hand I don't see much problem in making
> dimension-preserving broadcasting more permissive.  In R, for example,
> (1-d) arrays can be broadcast to arbitrary size.  This has an
> additional benefit that 1-d to 2-d broadcasting requires no special
> code, it just happens because matrices inherit arithmetics from
> vectors.  I've never had a problem with R rules being too loose.
>   

So, please explain exactly what you mean.   Only a few on this list know 
what the R rules even are. 
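
For reference, a rough Python sketch of what R-style recycling of 1-d operands looks like, under the assumption that the shorter vector is simply cycled until it matches the longer one. `recycle_add` is a hypothetical helper written for this example, not a NumPy or R function, and it does not reproduce R's warning when the longer length is not a multiple of the shorter one.

```python
import numpy as np

def recycle_add(x, y):
    """Hypothetical helper: add two 1-d sequences, cycling the shorter one."""
    x = np.asarray(x)
    y = np.asarray(y)
    n = max(x.size, y.size)
    # np.resize fills the requested length with repeated copies of its input.
    return np.resize(x, n) + np.resize(y, n)

result = recycle_add([1, 2, 3, 4, 5, 6], [10, 20])
```

Here the length-2 operand is used three times over, which NumPy's own broadcasting would reject as a shape mismatch.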

> In my view the problem that your ticket highlighted is not so much in
> the particular set of broadcasting rules, but in the fact that a[...]
> = b uses one set of rules while a[...] += b uses another.  This is
> *very* confusing.
>   

Yes, this is admittedly confusing.  But, it's an outgrowth of the way 
Numeric code developed.  Broadcasting was always only a ufunc concept in 
Numeric, and copying was not a ufunc.    NumPy grew out of Numeric 
code.   I was not trying to mimic broadcasting behavior when I wrote 
the array copy and array setting code.  Perhaps I should have been. 

I'm willing to change the code on this one, but only if the new copy 
code actually does implement broadcasting behavior equivalently.  And 
going through the ufunc machinery is probably a waste of effort because 
the copy code must be written for variable length arrays anyway (and 
ufuncs don't support them). 

However, the broadcasting machinery has been abstracted in NumPy and can 
therefore be re-used in the copying code.  In Numeric, broadcasting was 
basically implemented deep inside a confusing while loop. 
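
As a small illustration of the abstracted machinery, the sketch below uses the Python-level `np.broadcast` object to compute a broadcast shape without going through a ufunc, and then shows plain assignment and in-place addition side by side. It reflects current NumPy releases, where the two follow the same rule; it says nothing about how the 2006 code behaved.

```python
import numpy as np

a = np.zeros((3, 4))
b = np.arange(4)

# np.broadcast exposes the broadcasting logic on its own.
bc = np.broadcast(a, b)
shape = bc.shape          # (3, 4)

# Assignment: b is broadcast across the rows of a.
a[...] = b

# In-place ufunc: b is broadcast the same way, so each row becomes 2 * b.
a[...] += b
```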

-Travis