Rich Comparisons Gotcha

James Stroud jstroud at mbi.ucla.edu
Sun Dec 7 16:57:54 EST 2008


Rasmus Fogh wrote:
> Current behaviour is both inconsistent and counterintuitive, as these
> examples show.
> 
> >>> x = float('NaN')
> >>> x == x
> False

Perhaps this should raise an exception? I think the problem is not with 
comparisons in general but with the fact that nan is type float:

py> type(float('NaN'))
<type 'float'>

No float can be equal to nan, but nan is a float. How can something be
not a number and a float at the same time? The illogicality of nan's
type opens the door to illogical results when comparing against nan,
including when comparing nan to itself.
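
For what it's worth, the self-inequality comes from IEEE 754 floating
point rather than from anything Python-specific, so a test for nan has
to go through something other than ==. A minimal sketch, assuming only
the standard library's math.isnan (the variable names are just for
illustration):

```python
import math

x = float('nan')

# nan compares unequal to everything, itself included.
self_equal = (x == x)

# math.isnan is the stdlib test that does not rely on ==.
is_nan = math.isnan(x)

# The x != x idiom also detects nan, since only nan is unequal to itself.
idiom = (x != x)

print(self_equal, is_nan, idiom)
```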

> >>> ll = [x]
> >>> x in ll
> True
> >>> x == ll[0]
> False

But there is consistency on the basis of identity, which containment
(in) tests before it falls back to equality:

py> x is x
True
py> x in [x]
True

Identity and equality are two different concepts. Comparing identity to 
equality is like comparing apples to oranges ;o)
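
To make the distinction concrete, here is a rough pure-Python model of
what list containment does. The helper name contains is made up for
illustration; the real check lives in the interpreter, which tries
identity first and only then calls ==:

```python
def contains(seq, item):
    """Hypothetical pure-Python model of `item in seq` for lists."""
    for element in seq:
        # Identity is checked first, so == is never called on a match.
        if element is item or element == item:
            return True
    return False


x = float('nan')
same_object = contains([x], x)               # found by identity
other_object = contains([float('nan')], x)   # distinct nan object, == is False
print(same_object, other_object)
```

So a nan is "in" a list only when the list holds that very object, not
merely another nan.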

> 
> >>> import numpy
> >>> y = numpy.zeros((3,))
> >>> y
> array([ 0.,  0.,  0.])
> >>> bool(y==y)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ValueError: The truth value of an array with more than one element is
> ambiguous. Use a.any() or a.all()

But the equality test is not what fails here. It's the conversion to
bool that fails: y==y succeeds and returns an array of booleans, and
numpy refuses to collapse an array with more than one element into a
single truth value. The designers of numpy thought that this would be
the more desirable behavior. The test for equality is likewise an
elementwise operation (a binary ufunc), and that behavior was chosen in
numpy for practical reasons. C has no operator overloading, but in a
language that does, such as C++, you could achieve the same behavior.
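
The same split between "== works" and "bool() refuses" can be sketched
without numpy. The class Vec below is a hypothetical stand-in written
for this example, not numpy's implementation, and it uses the Python 3
spelling __bool__ (Python 2 calls the hook __nonzero__):

```python
class Vec:
    """Hypothetical stand-in for a numpy array; not numpy's implementation."""

    def __init__(self, values):
        self.values = list(values)

    def __eq__(self, other):
        # Elementwise comparison: returns a Vec of booleans, not a bool.
        if isinstance(other, Vec):
            return Vec(a == b for a, b in zip(self.values, other.values))
        return Vec(a == other for a in self.values)

    def __bool__(self):
        # Refuse to guess whether "any" or "all" was meant.
        raise ValueError("truth value of a multi-element Vec is ambiguous")

    def all(self):
        # Explicit reduction to a single bool.
        return all(self.values)


y = Vec([0.0, 0.0, 0.0])
result = y == y              # fine: a Vec holding three True values
everything = result.all()    # the caller states which reduction is meant

try:
    bool(result)
    failed = False
except ValueError:
    failed = True

print(result.values, everything, failed)
```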

> >>> ll1 = [y,1]
> >>> y in ll1
> True
> >>> ll2 = [1,y]
> >>> y in ll2
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ValueError: The truth value of an array with more than one element is
> ambiguous. Use a.any() or a.all()

I think you would be safe calling this a bug in numpy. But the fact
that someone can create a bug with a language is not a condemnation of
the language. For example, C makes it really easy to crash a program by
overrunning the limits of an array, but no one would suggest removing
arrays from C.
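
The order dependence itself follows from containment checking identity
before equality: in [y,1] the first element is y, so == is never
called, while in [1,y] the comparison with 1 runs first and its
array-valued result is handed to bool(). A sketch of that, again using
a made-up array-like class (Vec) rather than numpy, plus a hypothetical
helper contains_by_identity as one possible workaround:

```python
class Vec:
    """Hypothetical array-like: == is elementwise, bool() is refused."""

    def __init__(self, values):
        self.values = list(values)

    def __eq__(self, other):
        return Vec(v == other for v in self.values)

    def __bool__(self):
        raise ValueError("ambiguous truth value")


def contains_by_identity(seq, item):
    # Never calls ==, so it cannot trip over array-like elements.
    return any(element is item for element in seq)


y = Vec([0.0, 0.0, 0.0])

first = y in [y, 1]    # identity matches at index 0; == is never called

try:
    y in [1, y]        # the comparison with 1 comes first and hits bool()
    raised = False
except ValueError:
    raised = True

safe = contains_by_identity([1, y], y)
print(first, raised, safe)
```

Whether an identity-only test is what you want depends on the
application, of course.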

> Can anybody see a way this could be fixed (please)? I may well have to
> live with it, but I would really prefer not to.

Your only hope is to somehow convince the language designers to remove
the ability to overload ==, and then get them to agree with what you
think the proper behavior for comparisons should be. I think the
probability of that happening is about zero, though, because such a
change would run counter to the dynamic nature of the language.

James


-- 
James Stroud
UCLA-DOE Institute for Genomics and Proteomics
Box 951570
Los Angeles, CA 90095

http://www.jamesstroud.com


