Rich Comparisons Gotcha

Sat Dec 6 14:56:32 EST 2008

Rasmus Fogh wrote:
> Dear All,
> 
> For the first time I have come across a Python feature that seems
> completely wrong. After the introduction of rich comparisons, equality
> comparison does not have to return a truth value, and may indeed return
> nothing at all and throw an error instead. As a result, code like
>   if foo == bar:
> or
>   foo in alist
> cannot be relied on to work.
> 
> This is clearly no accident. According to the documentation all comparison
> operators are allowed to return non-booleans, or to throw errors. There is
> explicitly no guarantee that x == x is True.

You have touched on a real and known issue that accompanies dynamic 
typing and the design of Python.  *Every* Python function can return any 
Python object and may raise any exception either actively, by design, or 
passively, by not catching exceptions raised in the functions *it* calls.

> Personally I would like to get these !@#$%&* misfeatures removed,

What you are calling a misfeature is an absence, not a presence that can 
be removed.

> and constrain the __eq__ function to always return a truth value.

It is impossible to do that with certainty by any mechanical
creation-time checking.  So the implementation of operator.eq would have
to check the return value of the ob.__eq__ function it calls *every
time*.  That would slow down the 99.xx% of cases where the check is not
needed and would still not prevent exceptions.  And if the return value
were bad, all operator.eq could do is raise an exception anyway.
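
A minimal sketch of what such a per-call check might look like.  The
names checked_eq and Elementwise are hypothetical, made up for this
illustration; nothing like them exists in the operator module.

```python
import operator


def checked_eq(a, b):
    """Hypothetical strict equality: insist that == hands back a real bool."""
    result = operator.eq(a, b)  # may itself raise; the check cannot prevent that
    if not isinstance(result, bool):
        # The only thing left to do with a bad return value is to raise.
        raise TypeError("== returned %s, not bool" % type(result).__name__)
    return result


class Elementwise:
    """Toy stand-in for a type whose == compares element by element."""

    def __init__(self, items):
        self.items = list(items)

    def __eq__(self, other):
        return [x == y for x, y in zip(self.items, other.items)]
```

Every call pays for the isinstance test, including the ordinary ones
such as checked_eq(1, 1), and the non-boolean case still ends in an
exception.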

> That is clearly not likely to happen. Unless I have misunderstood something, could
> somebody explain to me.

a. See above.
b. Python programmers are allowed to define 'weird' but possibly
useful-in-context behaviors, such as trying out 3-valued logic or
operating on collections element by element (as with numpy).

> 1) Why was this introduced?

The 6 comparisons were previously done with one __cmp__ function that
was supposed to return -1, 0, or 1, and whose callers accepted any
negative, zero, or positive result, but which could return anything or
raise an exception.  The compare functions could mask but not prevent
weird returns.

> I can understand relaxing the restrictions on
> '<', '<=' etc. - after all you cannot define an ordering for all types of
> object. But surely you can define an equal/unequal classification for all
> types of object, if you want to? Is it just the numpy people wanting to
> type 'a == b' instead of 'equals(a,b)', or is there a better reason?
> 
> 2) If I want to write generic code, can I somehow work around the fact
> that
>   if foo == bar:
> or
>   foo in alist
> does not work for arbitrary objects?

Every Python function is 'generic' unless restrained by type tests.
However, even 'generic' functions can only work as expected with objects
that meet the assumptions embodied in the function.  In my Python-based
algorithm book-in-progress, I am stating this explicitly.  In particular,
I say that the book only applies to objects for which '==' gives a
boolean result that is reflexive, symmetric, and transitive.  This
excludes float('nan'), for instance (as I see you discovered), which
follows the IEEE mandate to act otherwise.

> CCPN has a table display class that maintains a list of arbitrary objects,
> one per line in the table. The table class is completely generic,

but only for the objects that meet the implied assumption.  This is true 
for *all* Python code.  If you want to apply the function to other 
objects, you must either adapt the function or adapt or wrap the objects 
to give them an interface that does meet the assumptions.
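
One way to adapt the function rather than the objects, assuming the
table only needs to know whether this very object is already listed,
is to test identity instead of equality.  The helper name
contains_same_object is hypothetical, and this changes the meaning of
the test: equal-but-distinct objects no longer match.

```python
def contains_same_object(items, target):
    """Return True if target itself (by identity) is in items.

    Never calls ==, so it cannot trip over a non-boolean __eq__.
    """
    return any(item is target for item in items)


row = [1, 2]
same_values = [1, 2]
print(contains_same_object([row], row))          # True
print(contains_same_object([row], same_values))  # False: equal, but another object
```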

> and subclassed for individual cases. It contains the code:
> 
>   if foo in tbllist:
>     ...
>   else:
>     ...
>     tbllist.append(foo)
>     ...
> 
> One day the 'if' statement gave this rather obscure error:
> "ValueError:
>  The truth value of an array with more than one element is ambiguous.
>  Use a.any() or a.all()"
> A subclass had used objects passed in from some third party code, and as
> it turned out foo happened to be a tuple containing a tuple containing a
> numpy array.

Right.  'in' calls '==' and assumes a boolean return.  Assumption 
violated, exception raised.  Completely normal.  The error message even 
suggests a solution: wrap the offending objects in an adaptor class that 
gives them a normal interface with .all (or perhaps the all() builtin).
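
A sketch of such an adaptor, written against a toy array type so that
it runs without numpy.  ToyArray, AmbiguousResult, and BoolEq are all
hypothetical names for this illustration.  The sketch assumes the
wrapped object's == returns something with an .all() method, as a numpy
array comparison does.

```python
class AmbiguousResult:
    """Toy element-wise comparison result that refuses to act as a bool."""

    def __init__(self, flags):
        self.flags = list(flags)

    def __bool__(self):
        raise ValueError("The truth value of this result is ambiguous.")

    def all(self):
        return all(self.flags)


class ToyArray:
    """Toy stand-in for an array whose == is element by element."""

    def __init__(self, items):
        self.items = list(items)

    def __eq__(self, other):
        if not isinstance(other, ToyArray):
            return NotImplemented
        return AmbiguousResult(x == y for x, y in zip(self.items, other.items))


class BoolEq:
    """Adaptor that gives the wrapped object a plain boolean ==."""

    def __init__(self, obj):
        self.obj = obj

    def __eq__(self, other):
        if not isinstance(other, BoolEq):
            return NotImplemented
        result = self.obj == other.obj
        if isinstance(result, bool):
            return result
        return bool(result.all())  # collapse the element-wise answer to one bool
```

With this, BoolEq(a) in [BoolEq(b)] gives True or False where
a in [b] would raise ValueError.  An object buried inside nested
tuples, as in the case reported above, would need the wrapping applied
at the level where the comparison actually happens.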

Terry Jan Reedy