[Numpy-discussion] What is up with raw boolean indices (like a[False])?

Aaron Meurer asmeurer at gmail.com
Thu Aug 20 14:21:50 EDT 2020


You're right. I was confusing the broadcasting logic for boolean arrays.

However, I did find this example:

>>> np.arange(10).reshape((2, 5))[np.array([[0, 0, 0, 0, 0]], dtype=np.int64), False]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: shape mismatch: indexing arrays could not be broadcast together with shapes (1,5) (0,)

That certainly seems to imply there is some broadcasting being done.
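
For comparison, here is a minimal sketch of the same index with True in place of False, and with an explicit empty integer index array standing in for the scalar False. The names `a`, `idx`, and `try_index` are only for illustration and are not from the thread. The sketch assumes the scalar boolean takes part in broadcasting as a length-1 (True) or length-0 (False) index, which is what the error message above suggests:

```python
import numpy as np

a = np.arange(10).reshape((2, 5))
idx = np.array([[0, 0, 0, 0, 0]], dtype=np.int64)  # shape (1, 5)


def try_index(index):
    """Return the result shape, or the error message if indexing fails."""
    try:
        return a[index].shape
    except IndexError as exc:
        return str(exc)


# A scalar True next to the integer array indexes without error.
shape_true = try_index((idx, True))

# A scalar False fails with the shape-mismatch error shown above.
err_false = try_index((idx, False))

# An explicit empty integer index in the same position also fails to
# broadcast against shape (1, 5). Unlike False it consumes axis 1, so this
# is only an analogy for the shape (0,) reported in the error message.
err_empty = try_index((idx, np.array([], dtype=np.intp)))

print(shape_true)  # (1, 5, 5)
print(err_false)
print(err_empty)
```

If the True case succeeds while both the False case and the explicit empty index fail to broadcast, that is at least consistent with False contributing an index of shape (0,).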

Aaron Meurer

On Wed, Aug 19, 2020 at 6:55 PM Sebastian Berg
<sebastian at sipsolutions.net> wrote:
>
> On Wed, 2020-08-19 at 18:07 -0600, Aaron Meurer wrote:
> > > > 3. If you have multiple advanced indexing you get annoying
> > > > broadcasting
> > > >    of all of these. That is *always* confusing for boolean
> > > > indices.
> > > >    0-D should not be too special there...
> >
> > OK, now that I am learning more about advanced indexing, this
> > statement is confusing to me. It seems that scalar boolean indices do
> > not broadcast. For example:
>
> Well, broadcasting means you broadcast the *nonzero result* unless I am
> very confused... There is a reason I dismissed it. We could (and
> arguably should) just deprecate it.  And I have doubts anyone would
> even notice.
>
> >
> > >>> np.arange(2)[False, np.array([True, False])]
> > array([], dtype=int64)
> > >>> np.arange(2)[tuple(np.broadcast_arrays(False, np.array([True, False])))]
> > Traceback (most recent call last):
> >   File "<stdin>", line 1, in <module>
> > IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed
> >
> > And indeed, the docs even say, as you noted, "the nonzero equivalence
> > for Boolean arrays does not hold for zero dimensional boolean
> > arrays,"
> > which I guess also applies to the broadcasting.
>
> I actually think that probably also holds. Nonzero just behaves weirdly
> for 0-D arrays (because it returns a tuple).
> But since broadcasting the nonzero result is so weird, and since 0-D
> booleans require some additional logic and don't generalize 100% (code
> wise), I won't rule out there are differences.
>
> > From what I can tell, the logic is that all integer and boolean
> > arrays
>
> Did you try that? Because as I said above, IIRC broadcasting the
> boolean array without first calling `nonzero` isn't really what's going
> on. And I don't know how it could be what's going on, since adding
> dimensions to a boolean index would have many more implications.
>
> - Sebastian
>
>
> > (and scalar ints) are broadcast together, *except* for boolean
> > scalars. Then the first boolean scalar is replaced with and(all
> > boolean scalars) and the rest are removed from the index. Then that
> > index adds a length 1 axis if it is True and 0 if it is False.
> >
> > So they don't broadcast, but rather "fake broadcast". I still contend
> > that it would be much more useful if True were a synonym for newaxis
> > and False worked like newaxis but instead added a length 0 axis.
> > Alternately, True and False scalars should behave exactly like all
> > other boolean arrays with no exceptions (i.e., work like
> > np.nonzero(),
> > broadcast, etc.). This would be less useful, but more consistent.
> >
> > Aaron Meurer
> > _______________________________________________
> > NumPy-Discussion mailing list
> > NumPy-Discussion at python.org
> > https://mail.python.org/mailman/listinfo/numpy-discussion
> >
>
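
The rule described in the quoted message above (scalar booleans collapse to a single axis of length 1 or 0, according to the logical AND of all of them) can be checked with a short sketch. The array names `a`, `b`, and `mask` are illustrative only, and the shapes in the comments are what current NumPy is expected to return rather than anything guaranteed by the documentation:

```python
import numpy as np

a = np.arange(6).reshape((2, 3))

# A lone scalar boolean adds a leading axis:
# length 1 for True, length 0 for False.
shape_true = a[True].shape    # (1, 2, 3)
shape_false = a[False].shape  # (0, 2, 3)

# For True, the result has the same shape as indexing with np.newaxis.
shape_newaxis = a[np.newaxis].shape  # (1, 2, 3)

# Several scalar booleans still add only one axis, and its length
# follows the logical AND of all of them.
shape_tt = a[True, True].shape   # (1, 2, 3)
shape_tf = a[True, False].shape  # (0, 2, 3)

# Combined with a boolean array, the scalar only decides whether the
# broadcast result keeps its elements or becomes empty.
b = np.arange(2)
mask = np.array([True, False])
kept = b[True, mask]    # one element, the value 0
empty = b[False, mask]  # no elements

print(shape_true, shape_false, shape_newaxis)
print(shape_tt, shape_tf)
print(kept, empty)
```

The last two lines reuse the `np.arange(2)[False, np.array([True, False])]` example from the thread, with the True variant added for contrast.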

