Would there be support for a more general cmp/cmp

Mon Oct 24 05:23:58 EDT 2005

Op 2005-10-21, Christopher Subich schreef <csubich.spam.block at spam.subich.block.com>:
> Antoon Pardon wrote:
>> It would be better if cmp would give an indication it
>> can't compare two objects instead of giving incorrect
>> and inconsistent results.
>
> If two objects aren't totally comparable, then using 'cmp' on them is 
> ill-defined to begin with.

Where is that documented?

> The Standard Thing To Do is throw an 
> exception; see the Highly Obscure Case of the Complex Numbers.
>
> >>>1 == 1j
> False
> >>>1 != 1j
> True
> >>>1 < 1j
> Traceback (most recent call last):
>    File "<stdin>", line 1, in ?
> TypeError: cannot compare complex numbers using <, <=, >, >=

I would say this is the wrong answer. The right answer should
be False IMO.

Especially in the following case do I think the TypeError is
the wrong answer:

>>> 3 + 0j < 2 + 0j
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: cannot compare complex numbers using <, <=, >, >=
>>> 

Look at sets:

>>> set([1]) <=  set([2])
False
>>> set([2]) <= set([1])
False
>>> 

> >>>cmp(1j,1j)
> 0
> >>>cmp(1,1j)
> Traceback (most recent call last):
>    File "<stdin>", line 1, in ?
> TypeError: cannot compare complex numbers using <, <=, >, >=
>
> So using the well-known case of complex numbers, the semantics are 
> already well-defined.

I think one can argue against this, but I'm willing to leave this
as it is for complex numbers. But I think the situation with sets
should be trated differently than currently.

> >>> class Incomparable:
> ...     def __cmp__(self,other):
> ...             raise TypeError("cannot compare Incomparables using <, 
><=, >, >=")
> ...     def __eq__(self,other):
> ...             return self is other
> ...     def __neq__(self,other):
> ...             return self is not other
> >>> a1 = Incomparable()
> >>> a2 = Incomparable()
> >>> a1 == a1
> 1 # I'm running on python 2.2.3 at the moment, so hence the 1.
> >>> a2 == a2
> 1
> >>> a1 == a2
> 0
> >>> a1 < a2
> Traceback (most recent call last):
>    File "<stdin>", line 1, in ?
>    File "<stdin>", line 3, in __cmp__
> TypeError: cannot compare Incomparables using <, <=, >, >=
> >>> cmp(a1,a2)
> Traceback (most recent call last):
>    File "<stdin>", line 1, in ?
>    File "<stdin>", line 3, in __cmp__
> TypeError: cannot compare Incomparables using <, <=, >, >=
>
>
> So your use-case is already well-defined, and rather simple.  Make 
> __cmp__ raise an exception of your choice,

That is not well defined. It doesn't allow to distinghuish between
something went wrong in __cmp__ and and answer that indicates the
only usefull thing you can say is that they are unequal.

Besides I find the following result inconsistent:

>>> set([1]) <= set([1,2])
True
>>> cmp(set([1]), set([1,2]))
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: cannot compare sets using cmp()

The first set is smaller than the second so acoording to the
documentation, I should get a negative number as a result.

> and define rich comparators 
> only for the comparisons that are supported.  If, as you say in another 
> post, "some" pairs in D cross D are comparable on an operator but not 
> all of them (and further that this graph is not transitive),

I was talking about partial orders, partial orders are transitive
and python still doesn't handle them right.

> then your 
> _ONLY_ choices, no matter your implementation, is to return some 
> possibly inconsistent result (a < b == 1, b < c == 1, a < c == 0) or 
> raise an exception for unapplicable comparisons.
>
> This isn't a Python problem; this is a problem from formal mathematics.

I think it is python's problem if python gives the wrong results.

when a < b then cmp(a,b) should give a negative number as a result.
Python sometimes does not!

when not a < b then cmp(a,b) should not give a negative number as
a result. Python sometimes does!

I think those are python problems.

> Personally, I'd favor the "raise an exception" case, which is exactly 
> what will happen if you define rich comparisons and let cmp throw an 
> exception.

But I don't want cmp to raise an exception when the two elements are
comparable.

That cmp(set([1]), set([2])) would raise an exception, I can understand,
although I would prefer it returned None, but why should
cmp(set([1]), set([1,2])) raise an exception. There is a perfectly
well defined answer for this.

> Operators that assume comparable objects, like sort, also 
> almost always assume a total order -- inconsistent operations can give 
> weird results, while throwing an exception will at least (usually) give 
> some sort of error that can be caught.

No it wont. the sort method will happily sort a list of sets.

>>> lst
[set([1]), set([2])]
>>> lst.sort()

I agree that it would have been better if an exception had been in
raised in the sort method here.

> Another poster mentioned the B-tree example, and that isn't solvable in 
> this case -- B-trees assume a total order, but by their nature aren't 
> explicit about checking for it; inserting a "partial plus exception" 
> order might result in tree failure at weird times.  An inconsistent 
> order, however, is even worse -- it would corrupt the tree at the same 
> times.

Yes, and it is almost unavoidable now in python. There is no
documentation about handling these cases. So the most likely
scenario is that the person writing the partial ordered class
will use rich comparisons and leave it at that. In that case
using such a class in a tree will certainly result in a corrupt
tree. But even if __cmp__ raises an exceptions at the appropiate
time, that is no good if the implementation of the tree won't
be affected by it because it uses the comparasons operators instead
of the cmp function.

That is why I would suggest the following:

Define a new exception: UnequalException.

If two elements are just unequal whithout one being smaller
or greater than the other, this exception should be raised
by cmp and programmers can do so in __cmp__.

If __cmp__ is used to solve one of the comparisons, then
this exception should result in 'False' except for !=

If the rich comparisons are used to give a result for cmp(a,b)
then the following should happen. 

  test if a <= b

  if so test a == b to decide between a negative number
                    and zero

  if not test a > b to decide between a positive number
                    and raising an exception.

This would allow for people implementing tree's or sort
algorithms to detect when the data used is not appropiate
and throw an exception instead of giving meaningless results
or corrupt data structures.

-- 
Antoon Pardon

Would there be support for a more general cmp/__cmp__

Would there be support for a more general cmp/cmp