Rich Comparisons Gotcha
James Stroud
jstroud at mbi.ucla.edu
Sun Dec 7 16:57:54 EST 2008
Rasmus Fogh wrote:
> Current behaviour is both inconsistent and counterintuitive, as these
> examples show.
>
>>>> x = float('NaN')
>>>> x == x
> False
Perhaps this should raise an exception? I think the problem is not with
comparisons in general but with the fact that nan is type float:
py> type(float('NaN'))
<type 'float'>
No float can be equal to nan, but nan is a float. How can something be
not a number and a float at the same time? The illogicality of nan's
type creates the possibility for the illogical results of comparisons to
nan, including comparing nan to itself.
>>>> ll = [x]
>>>> x in ll
> True
>>>> x == ll[0]
> False
But there is consistency on the basis of identity, which is what
containment (in) checks first, before it falls back to equality:
py> x is x
True
py> x in [x]
True
Identity and equality are two different concepts. Comparing identity to
equality is like comparing apples to oranges ;o)
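A small sketch of the distinction, assuming CPython's list containment,
which tries identity before equality. The variable names are only for
illustration:

```python
import math

x = float("nan")
other = float("nan")  # a second, distinct NaN object

equal_to_self = x == x          # False: NaN never compares equal
same_object = x is x            # True: identity ignores the value
found_by_identity = x in [x]    # True: the list holds this very object
found_other = other in [x]      # False: neither identical nor equal
detected = math.isnan(x)        # True: the reliable test for NaN

print(equal_to_self, same_object, found_by_identity, found_other, detected)
```

So a NaN is only "in" a list when the list holds that same object, and
math.isnan() is the safer test when the value is what matters.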
>
>>>> import numpy
>>>> y = numpy.zeros((3,))
>>>> y
> array([ 0., 0., 0.])
>>>> bool(y==y)
> Traceback (most recent call last):
> File "<stdin>", line 1, in <module>
> ValueError: The truth value of an array with more than one element is
> ambiguous. Use a.any() or a.all()
But the equality test is not what fails here. It's the conversion to
bool that fails, because numpy refuses to guess a single truth value for
an array with more than one element. The designers of numpy thought that
this would be a more desirable behavior. The test for equality likewise
is an element-wise binary ufunc, and that behavior was chosen in numpy
for practical reasons. C itself has no operator overloading, but in C++
you can overload the == operator and achieve the same behavior.
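To make the split between the comparison and the bool conversion
concrete, here is a rough pure-Python sketch. ElementwiseVec is a
hypothetical stand-in I made up for illustration; it is not how numpy is
implemented, it only mimics the two behaviors discussed above:

```python
class ElementwiseVec:
    """Hypothetical stand-in for an array type (not numpy's code)."""

    def __init__(self, values):
        self.values = list(values)

    def __eq__(self, other):
        # Compare element by element; a scalar is compared to every element.
        if isinstance(other, ElementwiseVec):
            others = other.values
        else:
            others = [other] * len(self.values)
        return ElementwiseVec(a == b for a, b in zip(self.values, others))

    def __bool__(self):
        # Refuse to collapse several elements into one truth value.
        if len(self.values) != 1:
            raise ValueError("truth value of a multi-element vector is ambiguous")
        return bool(self.values[0])

    def all(self):
        return all(self.values)


y = ElementwiseVec([0.0, 0.0, 0.0])
eq = y == y                 # works: an element-wise result
print(eq.values, eq.all())

try:
    bool(eq)                # this is the step that fails
    failed = False
except ValueError as exc:
    failed = True
    print("bool() failed:", exc)
```

The comparison itself succeeds and returns a vector of results; only the
attempt to squeeze that vector into a single True or False raises.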
>>>> ll1 = [y,1]
>>>> y in ll1
> True
>>>> ll2 = [1,y]
>>>> y in ll2
> Traceback (most recent call last):
> File "<stdin>", line 1, in <module>
> ValueError: The truth value of an array with more than one element is
> ambiguous. Use a.any() or a.all()
I think you could safely call this a bug in numpy. But the fact that
someone can create a bug with a language is not a condemnation of the
language. For example, C makes it really easy to crash a program by
overrunning the limits of an array, but no one would suggest removing
arrays from C.
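Why the order of the list matters can be sketched without numpy.
AmbiguousEq below is a hypothetical class whose == returns something
that cannot be turned into a bool, which is the only property needed to
reproduce the two results quoted above (again assuming containment
checks identity before equality):

```python
class AmbiguousEq:
    """Hypothetical stand-in: == yields a result with no single truth value."""

    class _Result:
        def __bool__(self):
            raise ValueError("truth value is ambiguous")

    def __eq__(self, other):
        return AmbiguousEq._Result()


y = AmbiguousEq()

first = y in [y, 1]     # identity matches at index 0, so == is never asked

try:
    y in [1, y]         # 1 is compared with y first, and that result has no bool
    second = "no error"
except ValueError as exc:
    second = "ValueError: %s" % exc

# An identity-only membership test never calls == at all.
by_identity = any(item is y for item in [1, y])

print(first, second, by_identity)
```

If identity is really what you want to test, the any(... is ...) form
avoids the comparison altogether, whatever the order of the list.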
> Can anybody see a way this could be fixed (please)? I may well have to
> live with it, but I would really prefer not to.
Your only hope is to somehow convince the language designers to remove
the ability to overload ==, and then get them to agree on what the
proper behavior should be for comparisons. I think the probability of
that happening is about zero, though, because such a change would run
counter to the dynamic nature of the language.
James
--
James Stroud
UCLA-DOE Institute for Genomics and Proteomics
Box 951570
Los Angeles, CA 90095
http://www.jamesstroud.com