Python 2.x breaks cmp() (was Re: A suspected bug)

Tim Peters tim.one at home.com
Thu Feb 22 01:42:31 EST 2001


[Huaiyu Zhu]
> ...
> Now that we have rich comparison, is there any reason, other than
> historical, to allow __cmp__ to raise exceptions?  What would
> happen if sort just treat exception of cmp(a,b) as cmp(a,b)==0?
> Will it go into an infinite loop?

list.sort() only looks at __lt__, unless you pass your own comparison
function.  So __cmp__ is usually irrelevant here.  (BTW, there's nothing you
can do to send list.sort() into an infinite loop, no matter how whacky a cmp
function you supply -- return a random result each time, and it will still
end eventually.)

But even if list.sort() did still use cmp(), I don't see what's gained by
treating exceptions as equality, except to open the door to unbounded
surprises:  as you said earlier, a partial ordering can be extended to a
full ordering, but this treatment of cmp exceptions isn't good enough to do
that.  For example, consider

    x, y, z == [5, 3j, 1]

Under the "exception implies equality" gimmick, we would have

    cmp(x, y) == 0 == cmp(y, z)

because those both raise exceptions.  But in a total ordering that would
imply cmp(x, z) == 0 (5==1) too.  Ain't so.  In fact, that specific list
would be a fixed-point for list.sort() (based on details of its
implementation I happen to know).  That list reversed would also be a
fixed-point.  So it doesn't even manage to get objects of the same type next
to each other.

Because complex numbers are built in, Python could impose a total ordering
on complex numbers (and, indeed, it used to!).  That surprises people too.
Mixing 8-bit with Unicode strings in the absence of intended encoding
information about the former is wholly analogous, and not so esoteric that
people are willing to endure senseless orderings:

    x, y, z = ["\xbb", u"\u3456", "\x81"]

This is "just like" the mixed int and complex case:  both that list and its
reverse would be sort fixed-points, and for the same reasons.

list.sort() simply can't do something sensible in the presence of
exceptions.  So what's better:  Pass exceptions up to the caller?  Or
silently return a senseless result?  Note that I don't say "senseless" just
because it doesn't make sense <wink>, but also because the result is
*unpredictable* to the point of uselessness (as in the examples above).

Disclaimer:  If list.sort() were willing to embrace a quadratic-time
algorithm, it could deduce a total ordering consistent with whatever partial
ordering is present, provided that the latter is itself consistent (but note
that we have no a priori guarantee about that, and user-defined __cmp__s
often aren't consistent).

doesn't-sound-likely-to-me-ly y'rs  - tim





More information about the Python-list mailing list