[Numpy-discussion] Broadcasting rules (Ticket 76).

Wed Apr 26 21:02:10 EDT 2006

I haven't fully waded through all the various replies to this
thread. I plan to do that and send a reply on specific points later.
This message is of a more historical, motivational or possibly
philosophical nature.

First off, NumPy has used the term "broadcast" to mean the same thing 
since its inception and changing the terminology now is asking for 
confusion. *In the context of this mailing list*, I think we should use
"broadcast" in the numpy sense and use appropriate qualifiers when
referring to how other array packages practice broadcasting.  Referring
to broadcasting as "shape-preserving broadcasting" or some such doesn't
seem to make things any clearer and adds a bunch of excess verbiage. In
any event, I plan to omit any "broadcast" qualifiers here.

The following understanding was formed by using and occasionally helping 
with development of NumPy since it was developed in 1995 or thereabouts. 
That doesn't mean that my understanding agrees with that of the primary
developers of the time. I may misremember things, and my recollections
are likely tinged by the experience I've had with NumPy in the interim.
So, don't take this as definitive, but perhaps it will help provide some 
insight into what NumPy's broadcasting is supposed to be.

Let's first dispense with the padding of dimensions. As I recall, this 
was a way to make matrix-like operations easier. This was way before
there was a matrix class, and by defining padding in this way 1-D vectors
could generally be treated as column vectors. Row vectors still needed
to be 2-D (1xN), but they tended to be less frequent, so that was less
of a burden. Or maybe I have that backwards; in any event, the padding
was put there to facilitate matrix-like uses of numpy arrays. Given that
there is a matrix class at this point, I doubt I would automagically pad 
the dimensions if I were designing numpy from scratch now. Since the 
dimension padding is at least partly historical accident and since it is 
in some sense orthogonal to the main point of numpy's broadcasting I'm 
going to pretend it doesn't exist for the rest of this discussion.
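
For anyone who wants to see the padding before setting it aside, here is
a small sketch of what it does in present-day NumPy. Missing dimensions
are added on the left as length-1 axes, so a 1-D array of length N lines
up like a 1xN array, and a column has to be spelled out as 2-D. The
variable names are made up for illustration.

```python
import numpy as np

a = np.ones((2, 3))
v = np.arange(3)                  # shape (3,)

# The 1-D operand is padded on the left: (3,) is treated as (1, 3).
padded_sum = a + v
explicit_sum = a + v.reshape(1, 3)

print(padded_sum.shape)                          # (2, 3)
print(np.array_equal(padded_sum, explicit_sum))  # True

# A column is not produced by padding; it must be given as shape (2, 1).
col = np.arange(2).reshape(2, 1)
print((a + col).shape)                           # (2, 3)
```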

At its core, broadcasting is about adjusting the shapes of two arrays so
that they match. Consider an array 'A' and an array 'B' with shapes
(3, Any) and (Any, 4). Here, 'Any' means that the given dimension of the
array is unspecified and can take on any value that is convenient for 
functions operating on the array.  If we add 'A' and 'B' together we'd 
like the two 'Any' dimensions to stretch appropriately so that the 
result was an array of shape (3, 4). Similarly, adding an array of shape
(3, 4) to an array of shape (Any, 4) should work and produce an array of
shape (3, 4). So far, this is pretty straightforward; I believe it also
bears a fair amount of resemblance to Sasha's 0-stride ideas.
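
A minimal sketch of the two cases above, assuming present-day NumPy.
NumPy has no literal 'Any', so a length-1 axis stands in for it here.
The last lines use np.broadcast_to only to make the stretching visible:
the stretched axis gets a stride of 0, which is my reading of the
connection to the 0-stride idea, not a statement of Sasha's proposal.

```python
import numpy as np

A = np.arange(3).reshape(3, 1)    # stands in for shape (3, Any)
B = np.arange(4).reshape(1, 4)    # stands in for shape (Any, 4)

# Both stretchable axes expand so that the shapes match.
C = A + B
print(C.shape)                    # (3, 4)

# A full (3, 4) array plus an (Any, 4) array also gives (3, 4).
D = np.zeros((3, 4)) + B
print(D.shape)                    # (3, 4)

# The stretched view repeats the single row without copying it.
stretched = np.broadcast_to(B, (3, 4))
print(stretched.strides[0])       # 0
```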

The complicating factor is that there wasn't a good way to spell 'Any' 
at the time. Or maybe we were lazy. Or maybe there was some other reason 
that I'm forgetting. In any event, we ended up spelling 'Any' as '1'. 
That means that there's no way to distinguish between a dimension that
has length 1 for some legitimate reason and one that has that length just
for stretchability. It would be an interesting experiment to see how
things would work with no padding and with an explicit 'Any' value 
available for dimensions. However, it's probably too much work and would 
result in too many backwards compatibility problems for NumPy proper.
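
To illustrate the ambiguity, here is a small sketch assuming present-day
NumPy. The array names and the "samples" scenario are invented for the
example. An axis that has length 1 for a legitimate reason is stretched
silently, while any other mismatched length is reported as an error.

```python
import numpy as np

samples = np.ones((1, 3))     # happens to hold one sample; the 1 is "real"
weights = np.ones((4, 3))     # written with four samples in mind

# No error: the length-1 axis is treated as stretchable and becomes 4.
result = samples + weights
print(result.shape)           # (4, 3)

# With two samples the mismatch is caught instead.
two_samples = np.ones((2, 3))
try:
    two_samples + weights
except ValueError as exc:
    print("shape mismatch reported:", exc)
```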

[Half baked thoughts on how to do this though: newaxis would produce a 
new axis with length -1 (or some other marker length). This would be 
treated as length-1 axes are treated now. However, length-1 axes would no
longer broadcast. Padding would be right out.]
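
One way the shape matching under these half-baked rules might look, as a
plain Python sketch. ANY and match_shapes are hypothetical names invented
here; nothing like them exists in NumPy. The sketch assumes -1 as the
marker length, no padding (ranks must agree), and no stretching of
ordinary length-1 axes.

```python
ANY = -1  # hypothetical marker length for a stretchable axis


def match_shapes(shape_a, shape_b):
    """Return the result shape under the sketched rules, or raise ValueError."""
    if len(shape_a) != len(shape_b):
        # Padding is "right out": the ranks have to agree.
        raise ValueError("rank mismatch")
    result = []
    for a, b in zip(shape_a, shape_b):
        if a == ANY:
            result.append(b)      # stretch a to b (stays ANY if both are ANY)
        elif b == ANY:
            result.append(a)      # stretch b to a
        elif a == b:
            result.append(a)
        else:
            # Length-1 axes get no special treatment here.
            raise ValueError(f"cannot match length {a} with length {b}")
    return tuple(result)


print(match_shapes((3, ANY), (ANY, 4)))   # (3, 4)
print(match_shapes((3, 4), (ANY, 4)))     # (3, 4)

try:
    match_shapes((1, 4), (3, 4))          # would broadcast in NumPy today
except ValueError as exc:
    print("rejected:", exc)
```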

In summary, the platonic ideal of broadcasting is simple and clean. In 
practice it's more complicated for two reasons. The first is the padding
of dimensions; I believe that this is mostly historical baggage. The
second is the conflation of '1' and 'Any' (a name that I made up for this
message, so don't go searching for it). This may be a historical
accident and/or implementation artifact, but there may actually be some 
more practical reasons behind this as well that I am forgetting.

Hopefully that is mildly informative,

Regards,

-tim