Rich Comparisons Gotcha

Sat Dec 6 11:42:54 EST 2008

Dear All,

For the first time I have come across a Python feature that seems
completely wrong. After the introduction of rich comparisons, equality
comparison does not have to return a truth value, and may indeed return
nothing at all and throw an error instead. As a result, code like
  if foo == bar:
or
  foo in alist
cannot be relied on to work.

This is clearly no accident. According to the documentation, all comparison
operators are allowed to return non-booleans, or to throw errors. There is
explicitly no guarantee that x == x is True.
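
To make the point concrete without involving numpy, here is a minimal
sketch. The classes Elementwise and Unfriendly are made up for this
illustration and do not come from any library; they only show that __eq__
may return a non-boolean or raise, and that NaN is not equal to itself.

```python
class Elementwise(object):
    """Hypothetical array-like type (an assumption, not a real library class)."""

    def __init__(self, values):
        self.values = list(values)

    def __eq__(self, other):
        # Rich comparison: one result per element instead of a single bool.
        return [v == other for v in self.values]


class Unfriendly(object):
    """Hypothetical type whose equality test refuses to answer at all."""

    def __eq__(self, other):
        raise TypeError("comparison not supported")


e = Elementwise([1, 2, 1])
result = (e == 1)                 # a list, not a truth value

nan = float("nan")
nan_equals_itself = (nan == nan)  # False, even though it is the same object

try:
    Unfriendly() == 0
    raised = False
except TypeError:
    raised = True
```

Note that 'if e == 1:' would silently take the true branch here, because a
non-empty list is truthy, which is arguably worse than an error.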

Personally I would like to get these !@#$%&* misfeatures removed, and
constrain the __eq__ function to always return a truth value. That is
clearly not likely to happen. Unless I have misunderstood something, could
somebody explain to me:

1) Why was this introduced? I can understand relaxing the restrictions on
'<', '<=' etc. - after all you cannot define an ordering for all types of
object. But surely you can define an equal/unequal classification for all
types of object, if you want to? Is it just the numpy people wanting to
type 'a == b' instead of 'equals(a,b)', or is there a better reason?

2) If I want to write generic code, can I somehow work around the fact
that
  if foo == bar:
or
  foo in alist
do not work for arbitrary objects?

Yours,

Rasmus

Some details:

CCPN has a table display class that maintains a list of arbitrary objects,
one per line in the table. The table class is completely generic, and
subclassed for individual cases. It contains the code:

  if foo in tbllist:
    ...
  else:
    ...
    tbllist.append(foo)
    ...

One day the 'if' statement gave this rather obscure error:
"ValueError:
 The truth value of an array with more than one element is ambiguous.
 Use a.any() or a.all()"
A subclass had used objects passed in from some third-party code, and, as
it turned out, foo happened to be a tuple containing a tuple containing a
numpy array.
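
The failure can be sketched without numpy, under some assumptions. FakeArray
below is a hypothetical stand-in that mimics only two aspects of a
multi-element array: '==' returns another array, and asking for its truth
value raises ValueError. The helper is_in and the sample data are likewise
invented for the illustration. As far as I can tell, the list membership
test checks identity before equality, so it succeeds for the very same
object and only fails for an equal-looking copy.

```python
class FakeArray(object):
    """Hypothetical stand-in for a multi-element numpy array (assumption)."""

    def __init__(self, values):
        self.values = list(values)

    def __eq__(self, other):
        # Element-wise comparison: the result is another array, not a bool.
        if isinstance(other, FakeArray):
            others = other.values
        else:
            others = [other] * len(self.values)
        return FakeArray(a == b for a, b in zip(self.values, others))

    def __bool__(self):
        raise ValueError("truth value of a multi-element array is ambiguous")

    __nonzero__ = __bool__  # Python 2 spelling of the same hook


def is_in(item, container):
    """Run 'item in container' and report an error instead of propagating it."""
    try:
        return item in container
    except ValueError as exc:
        return "ValueError: %s" % exc


# A tuple containing a tuple containing an array, as in the table case.
first = (("peak", FakeArray([0, 0])),)
second = (("peak", FakeArray([0, 0])),)
tbllist = [first]

same_object = is_in(first, tbllist)   # identity matches, __eq__ never called
equal_copy = is_in(second, tbllist)   # falls through to bool(array == array)
```

Here same_object is True, while equal_copy holds the ValueError text: the
nested tuples are compared element by element until the two arrays meet.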

Some more precise tests gave the following:
# Python 2.5.2 (r252:60911, Jul 31 2008, 17:31:22)
# [GCC 4.2.3 (Ubuntu 4.2.3-2ubuntu7)] on linux2
# set up
import numpy
a = float('NaN')
b = float('NaN')
ll = [a,b]
c = numpy.zeros((2,3))
d = numpy.zeros((2,3))
mm = [c,d]

# try NaN
print (a == a)        # gives False
print (a is a)        # gives True
print (a == b)        # gives False
print (a is b)        # gives False
print (a in ll)       # gives True
print (b in ll)       # gives True
print (ll.index(a))   # gives 0
print (ll.index(b))   # gives 1

# try numpy array
print (c is c)       # gives True
print (c is d)       # gives False
print (c in mm)      # gives True
print (mm.index(c))  # gives 0
print (c == c)       # gives [[ True  True  True][ True  True  True]]
print (c == d)       # gives [[ True  True  True][ True  True  True]]
print (bool(1 == c)) # raises error - see below
print (d in mm)      # raises error - see below
print (mm.index(d))  # raises error - see below
print (c in ll)      # raises error - see below
print (ll.index(c))  # raises error - see below

The error was the same in each case:
"ValueError:
 The truth value of an array with more than one element is ambiguous.
 Use a.any() or a.all()"
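
One possible workaround, offered only as a sketch: if the generic code
really wants to know whether this very object is already in the list, it
can compare by identity and never call __eq__ at all. The helper names
contains_identity and index_identity are invented here, and NoTruth is a
hypothetical stand-in for an object like the numpy array above. The
semantics differ from 'in': an equal but distinct object is not found.

```python
def contains_identity(container, item):
    """True if container holds this very object (compared with 'is')."""
    return any(element is item for element in container)


def index_identity(container, item):
    """Position of this very object in container; ValueError if absent."""
    for position, element in enumerate(container):
        if element is item:
            return position
    raise ValueError("object is not in container")


class NoTruth(object):
    """Hypothetical object whose '==' result has no truth value (assumption)."""

    def __eq__(self, other):
        return self

    def __bool__(self):
        raise ValueError("ambiguous truth value")

    __nonzero__ = __bool__  # Python 2 spelling of the same hook


c, d = NoTruth(), NoTruth()
mm = [c, d]

found_d = contains_identity(mm, d)              # no __eq__ call, no error
position_d = index_identity(mm, d)
found_other = contains_identity(mm, NoTruth())  # distinct object: not found
```

Whether identity is an acceptable replacement for equality depends on the
application; for a table of arbitrary objects, one per line, it may well be.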

---------------------------------------------------------------------------
Dr. Rasmus H. Fogh                  Email: r.h.fogh at bioc.cam.ac.uk
Dept. of Biochemistry, University of Cambridge,
80 Tennis Court Road, Cambridge CB2 1GA, UK.     FAX (01223)766002