[Numpy-discussion] Why are empty arrays False?

Michael Lamparski diagonaldevice at gmail.com
Fri Aug 18 17:45:23 EDT 2017


Greetings, all.  I am troubled.

The TL;DR is that `bool(array([])) is False` is misleading, dangerous, and
unnecessary. Let's begin with some examples:

>>> import numpy as np
>>> bool(np.array(1))
True
>>> bool(np.array(0))
False
>>> bool(np.array([0, 1]))
ValueError: The truth value of an array with more than one element is
ambiguous. Use a.any() or a.all()
>>> bool(np.array([1]))
True
>>> bool(np.array([0]))
False
>>> bool(np.array([]))
False

One of these things is not like the other.

The first three results embody a design that is consistent with some of the
most fundamental design choices in numpy, such as the choice to have
comparison operators like `==` work elementwise.  And it is the only such
design I can think of that is consistent in all edge cases (see footnote 1).

The next two examples (involving arrays of shape (1,)) are a
straightforward extension of the design to arrays that are isomorphic to
scalars.  I can't say I recall ever finding a use for this feature... but
it seems fairly harmless.

So how about that last example, with array([])?  Well... it's /kind of/
like how other Python containers work, right? Falseness is emptiness (see
footnote 2)...  Except that this is actually *a complete lie*, due to /all
of the other examples above/!

Here's what I would like to see:

>>> bool(np.array([]))
ValueError: The truth value of a non-scalar array is ambiguous. Use a.any()
or a.all()

Why do I care?  Well, I myself wasted an hour barking up the wrong tree
while debugging some code when it turned out that I was mistakenly using
truthiness to identify empty arrays. It just so happened that the arrays
always contained 1 or 0 elements, so it /appeared/ to work except in the
rare case of array([0]) where things suddenly exploded.
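
The failure mode described above can be sketched roughly as follows. This is
only an illustration, not the code I was debugging: the helper name
`is_empty` and the sample arrays are made up, and the check relies on
`ndarray.size`, which counts elements regardless of their values.

```python
import numpy as np

def is_empty(a):
    """Hypothetical helper: True when the array holds no elements."""
    return np.asarray(a).size == 0

# A one-element array holding zero is falsy even though it is not empty,
# so truthiness cannot stand in for an emptiness test.
zero = np.array([0])
zero_is_truthy = bool(zero)      # truth value of the single element
zero_is_empty = is_empty(zero)   # explicit element count

# The explicit check answers the same question for every input.
results = [is_empty(a) for a in (np.array([]), np.array([1]), zero)]
```

Counting elements keeps the emptiness question separate from the values
stored in the array, which is exactly where the truthiness shortcut went
wrong.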

I posit that there is no usage of the fact that `bool(array([])) is False`
in any real-world code which is not accompanied by a horrible bug writhing
in hiding just beneath the surface. For this reason, I wish to see this
behavior *abolished*.

Thank you.
-Michael

Footnotes:
1: Every now and then, I wish that `ndarray.__{bool,nonzero}__` would just
implicitly do `all()`, which would make `if a == b:` work like it does for
virtually every other reasonably-designed type in existence.  But then I
recall that, if this were done, then the behavior of `if a != b:` would
stand out like a sore thumb instead.  Truly, punting on 'any/all' was the
right choice.
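
To make that trade-off concrete, here is a small sketch of spelling out the
reduction by hand. The arrays are invented for illustration; `np.array_equal`
is NumPy's existing whole-array comparison.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([1, 2, 4])

# Comparison operators are elementwise, so each yields a boolean array.
eq = a == b
ne = a != b

# "Are they the same?" wants all(); "do they differ?" wants any().
same = bool(eq.all())
differ = bool(ne.any())

# An implicit all() would answer the second question wrongly here:
# not every element differs, even though the arrays are not equal.
implicit_all_on_ne = bool(ne.all())

# np.array_equal answers the whole-array question directly.
same_via_helper = bool(np.array_equal(a, b))
```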

2: np.array([[[[]]]]) is also False, which makes this an interesting sort
of n-dimensional emptiness test; but if that's really what you're looking
for, you can achieve this much more safely with `np.all(x.shape)` or
`bool(x.flat)`, both of which are True exactly when x has at least one
element.
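
A quick sketch of those explicit checks, for comparison. The helper
`has_elements` is a hypothetical name, and its use of `x.size` is an
additional option I am assuming here rather than one of the two spellings
in the footnote.

```python
import numpy as np

def has_elements(x):
    """Hypothetical helper: True when x contains at least one element."""
    return x.size > 0

nested = np.array([[[[]]]])   # four dimensions, but no elements
scalar = np.array(5)          # zero dimensions, one element

# Each tuple holds: np.all(shape), bool(flat), and the size-based helper.
nested_checks = (bool(np.all(nested.shape)), bool(nested.flat), has_elements(nested))
scalar_checks = (bool(np.all(scalar.shape)), bool(scalar.flat), has_elements(scalar))
```

None of these checks looks at the stored values, so an array of zeros is
still reported as non-empty.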