Rich Comparisons Gotcha
James Stroud
jstroud at mbi.ucla.edu
Sun Dec 7 22:36:17 EST 2008
Robert Kern wrote:
> James Stroud wrote:
>> I'm missing how a.all() solves the problem Rasmus describes, namely
>> that the order of a python *list* affects the results of containment
>> tests by numpy.array. E.g. "y in ll1" and "y in ll2" evaluate to
>> different results in his example. It still seems like a bug in numpy
>> to me, even if too much other stuff is broken if you fix it (in which
>> case it apparently becomes an "issue").
>
> It's an issue, if anything, not a bug. There is no consistent
> implementation of bool(some_array) that works in all cases. numpy's
> predecessor Numeric used to implement this as returning True if at least
> one element was non-zero. This works well for bool(x!=y) (which is
> equivalent to (x!=y).any()) but does not work well for bool(x==y) (which
> should be (x==y).all()), but many people got confused and thought that
> bool(x==y) worked. When we made numpy, we decided to explicitly not
> allow bool(some_array) so that people will not write buggy code like
> this again.
>
> The deficiency is in the feature of rich comparisons, not numpy's
> implementation of it. __eq__() is allowed to return non-booleans;
> however, there are some parts of Python's implementation like
> list.__contains__() that still expect the return value of __eq__() to be
> meaningfully cast to a boolean.
>
You have explained
py> ll2 = [1, y]
py> y in ll2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: The truth value of an array with more than one element is...
but not
py> ll1 = [y,1]
py> y in ll1
True
It's this discrepancy that seems like a bug, not that a ValueError is
raised in the former case, which is perfectly reasonable to me.
All I can imagine is that something like the following lives in the
bowels of the python code for list:
def __contains__(self, other):
    foundit = False
    for i, v in enumerate(self):
        if i == 0:
            # evaluates to bool numpy array
            foundit = one_kind_of_test(v, other)
        else:
            # raises exception for numpy array
            foundit = another_kind_of_test(v, other)
        if foundit:
            break
    return foundit
I'm trying to imagine some other way to get the results mentioned but I
honestly can't. It's beyond me why someone would do such a thing, but
perhaps it's an optimization of some sort.
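For what it's worth, one candidate for such an optimization is an identity
check tried before ==. The language reference describes "x in y" for lists
as equivalent to any(x is e or x == e for e in y). Below is a minimal sketch
of that reading without numpy. The names Unbool, FakeArray and contains are
made up for illustration; FakeArray only assumes that the array's __eq__()
returns an object that refuses to be cast to a single boolean.

```python
class Unbool:
    """Hypothetical comparison result that refuses to become one truth value."""

    def __bool__(self):
        raise ValueError("The truth value of this result is ambiguous")


class FakeArray:
    """Hypothetical stand-in for a numpy array with more than one element."""

    def __eq__(self, other):
        # Like an elementwise comparison, return a non-boolean object.
        return Unbool()


def contains(seq, item):
    """Pure-Python model of the documented equivalence for `item in seq`."""
    # Identity is tried first; only if it fails is the result of == truth-tested.
    return any(e is item or e == item for e in seq)


y = FakeArray()

# y is the first element, so the identity check succeeds before == is consulted.
print(y in [y, 1])            # True
print(contains([y, 1], y))    # True

# 1 comes first: `1 is y` is false, so the result of `1 == y` gets truth-tested.
try:
    y in [1, y]
except ValueError as exc:
    print("ValueError:", exc)
```

If this reading is right, no per-index special case is needed: the same test
runs on every element, and the order of the list only decides whether the
identity check succeeds before an == result has to be cast to a boolean.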
James