Python 2.x breaks cmp() (was Re: A suspected bug)
Tim Peters
tim.one at home.com
Thu Feb 22 01:42:31 EST 2001
[Huaiyu Zhu]
> ...
> Now that we have rich comparison, is there any reason, other than
> historical, to allow __cmp__ to raise exceptions? What would
> happen if sort just treat exception of cmp(a,b) as cmp(a,b)==0?
> Will it go into an infinite loop?
list.sort() only looks at __lt__, unless you pass your own comparison
function. So __cmp__ is usually irrelevant here. (BTW, there's nothing you
can do to send list.sort() into an infinite loop, no matter how whacky a cmp
function you supply -- return a random result each time, and it will still
end eventually.)
But even if list.sort() did still use cmp(), I don't see what's gained by
treating exceptions as equality, except to open the door to unbounded
surprises: as you said earlier, a partial ordering can be extended to a
full ordering, but this treatment of cmp exceptions isn't good enough to do
that. For example, consider
x, y, z == [5, 3j, 1]
Under the "exception implies equality" gimmick, we would have
cmp(x, y) == 0 == cmp(y, z)
because those both raise exceptions. But in a total ordering that would
imply cmp(x, z) == 0 (5==1) too. Ain't so. In fact, that specific list
would be a fixed-point for list.sort() (based on details of its
implementation I happen to know). That list reversed would also be a
fixed-point. So it doesn't even manage to get objects of the same type next
to each other.
Because complex numbers are built in, Python could impose a total ordering
on complex numbers (and, indeed, it used to!). That surprises people too.
Mixing 8-bit with Unicode strings in the absence of intended encoding
information about the former is wholly analogous, and not so esoteric that
people are willing to endure senseless orderings:
x, y, z = ["\xbb", u"\u3456", "\x81"]
This is "just like" the mixed int and complex case: both that list and its
reverse would be sort fixed-points, and for the same reasons.
list.sort() simply can't do something sensible in the presence of
exceptions. So what's better: Pass exceptions up to the caller? Or
silently return a senseless result? Note that I don't say "senseless" just
because it doesn't make sense <wink>, but also because the result is
*unpredictable* to the point of uselessness (as in the examples above).
Disclaimer: If list.sort() were willing to embrace a quadratic-time
algorithm, it could deduce a total ordering consistent with whatever partial
ordering is present, provided that the latter is itself consistent (but note
that we have no a priori guarantee about that, and user-defined __cmp__s
often aren't consistent).
doesn't-sound-likely-to-me-ly y'rs - tim
More information about the Python-list
mailing list