[Python-Dev] RE: Rich comparison of lists and tuples

Tim Peters tim.one@home.com
Mon, 21 May 2001 03:53:24 -0400


[Guido]
> I would like to break this down by defining the mapping between cmp()
> and rich comparisons.

Good idea!

> I propose:
>
> - If cmp() is requested but not defined, and rich comparisons are
>   defined, try ==, <, > in order; if all three yield false, act as if
>   rich comparisons were not defined, and use the fallback comparison
>   (i.e. by address).

This rule and the ones below don't cover the case where cmp() is requested
and is defined.  I believe it's agreed now (but wasn't yet at the time you
wrote this) that cmp() will be called in that case (which requires changes to
the current implementation).
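
For concreteness, here is a rough sketch of how I read the first rule.  The
name cmp_via_rich is made up for illustration, and id() merely stands in for
"compare by address"; none of this is the actual implementation.

```python
def cmp_via_rich(x, y):
    """Hypothetical helper: build a cmp()-style result from ==, <, >."""
    # Try the rich comparisons in the proposed order: ==, then <, then >.
    if x == y:
        return 0
    if x < y:
        return -1
    if x > y:
        return 1
    # All three came back false: act as if rich comparisons were not
    # defined and fall back to address order (id() is an assumption here).
    a, b = id(x), id(y)
    return (a > b) - (a < b)
```

A float NaN is a handy test object, since all three of its comparisons
against itself are false and so the fallback branch is the one that runs.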

> - If a rich comparison is requested but not defined, use cmp() and use
>   the obvious mapping.

Cool, except this is missing detail that I believe was intended, e.g. that
when given "x < y" and x.__lt__ is not implemented, y.__gt__ will be tried
before falling back to cmp().  Also note this today:

class C:
    def __lt__(x, y):
        print "in __lt__"
        return NotImplemented

    def __gt__(x, y):
        print "in __gt__"
        return NotImplemented

C() < C()

That prints

in __lt__
in __gt__
in __gt__
in __lt__

I don't know how to explain why each method gets called twice (well, I do,
but it's hard to swallow <wink>).  Again this can have semantic consequences,
e.g. if the methods have side-effects; and it's unclear whether this is
intended, a bug, or implementation-defined.

> - Continue to define the comparison of unequal sequences in terms of
>   cmp().

"the comparison" is ambiguous there:  you mean all comparisons?  just cmp()
comparisons?  just rich comparisons?

In any case, it's also unclear what "in terms of cmp()" means:  that every
pair of corresponding elements must be compared via cmp()?  Or that only the
first non-Py_EQ pair must be compared via cmp()?  Pseudo-code would be much
clearer than English here.
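
To make the two readings concrete, here is a sketch of each.  Both function
names are invented for illustration, and the local cmp() is only a stand-in
so the sketch also runs on Pythons that lack the builtin; neither function is
claimed to be what the proposal intends.

```python
def cmp(a, b):
    """Stand-in for the builtin cmp(): returns -1, 0 or 1."""
    return (a > b) - (a < b)

def seq_cmp_every_pair(xs, ys):
    """Reading 1: every pair of corresponding elements goes through cmp()."""
    for x, y in zip(xs, ys):
        c = cmp(x, y)
        if c != 0:
            return c
    # One sequence is a prefix of the other: the shorter one is smaller.
    return cmp(len(xs), len(ys))

def seq_cmp_first_unequal(xs, ys):
    """Reading 2: scan with ==, call cmp() only on the first unequal pair."""
    for x, y in zip(xs, ys):
        if not (x == y):
            return cmp(x, y)
    return cmp(len(xs), len(ys))
```

For well-behaved elements the two agree; they can differ only when an
element's == and its ordering comparisons tell different stories, or when the
calls have side-effects.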

> - Testing == or != for sequences takes these shortcuts:

Must take these shortcuts, or may take these shortcuts?

>   1. if the lengths differ, the sequences differ

Note that I removed the tuple_richcompare code for doing this, because I
never found a case where tuples were compared via Py_EQ/Py_NE and the lengths
differed.  So the length-check in this case was a waste of time.  It isn't a
waste of time for lists or strings, but I believe there are strong reasons
why programs simply will not compare different-sized tuples for equality.  I
would not like to pay for tuple length checks if only one case in 500 billion
would benefit, but if #1 is a mandatory shortcut there's no choice.

>   2. compare the elements using == until a false return is found

Currently the sequence rich-compare code does #2 for all 6 comparison
operators.  Is that wrong?  Looked reasonable to me!
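
Roughly, and only as a sketch, "doing #2 for all 6 operators" amounts to the
following.  The name seq_richcompare is hypothetical, and op is assumed to be
one of the six functions from the operator module; this is not the C code.

```python
import operator

def seq_richcompare(xs, ys, op):
    """Hypothetical sketch; op is operator.lt, le, eq, ne, gt or ge."""
    # Step #2: walk both sequences with == until a false return is found.
    for x, y in zip(xs, ys):
        if not (x == y):
            # The first unequal pair settles == and != outright ...
            if op is operator.eq:
                return False
            if op is operator.ne:
                return True
            # ... and the other four operators are applied to that pair.
            return op(x, y)
    # No unequal pair in the common prefix: the lengths decide.
    return op(len(xs), len(ys))
```

Note there is no up-front length check in this sketch, so == on
different-sized sequences still scans the common prefix first.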

> Note that this defines 'x!=y' as 'not x==y' for sequences.  We could
> easily go the extra mile and define != to use only != on the items;
> but is this worth the extra complexity?

Not at all:  tuples and lists are Python's sequence types, so Python is
entitled to define what comparison means for them in any way it likes.  We've
already got cases where (see the first msg in this thread)

    [x] cmpop [y]

may yield a different result than

    x cmpop y

so we've already punted on doing the best-possible job of mimicking whatever
crazy-ass comparisons user-defined objects implement, when those objects are
contained in Python sequences.
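
One way to see such a difference, as a sketch only (the class Odd is made up,
is not necessarily the case from the first msg, and the behavior described is
that of a current CPython):

```python
class Odd:
    """Hypothetical class with deliberately inconsistent comparisons."""

    def __eq__(self, other):
        # Claims to be equal to everything ...
        return True

    def __lt__(self, other):
        # ... and also less than everything.
        return True

x, y = Odd(), Odd()
direct = x < y        # asks Odd.__lt__ directly
wrapped = [x] < [y]   # list comparison looks for an unequal pair via == first
```

Here direct is true, but wrapped is false:  the list comparison finds no
unequal pair, falls back to comparing the lengths, and never calls __lt__.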

My bias is showing <wink>:  I want Python's builtin sequence types to be as
efficient as possible.

Nasty example:  two conformable (same rank and dimensions) NumPy matrices A
and B return a conformable matrix of 0/1 bits when compared via "<" (well,
maybe they actually don't, but that's what drove richcmps to begin with!).
It may well be *convenient* for them if

    (A1, A2, A3) < (B1, B2, B3)

always returned a list (or tuple) of 3 0/1 matrices too:

    [A1 < B1, A2 < B2, A3 < B3]

So builtin sequence comparisons can't be all things to all people regardless.
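
A toy illustration of the mismatch, assuming a made-up Vec class in place of
a real NumPy matrix (this is not NumPy's API, and real arrays may behave
differently when their comparison results are tested for truth):

```python
class Vec:
    """Toy stand-in for an array whose comparisons work elementwise."""

    def __init__(self, *items):
        self.items = list(items)

    def __lt__(self, other):
        # Elementwise "<": a Vec of 0/1 flags, not a single truth value.
        return Vec(*[int(a < b) for a, b in zip(self.items, other.items)])

    def __eq__(self, other):
        # Elementwise "==", again a Vec of 0/1 flags.
        return Vec(*[int(a == b) for a, b in zip(self.items, other.items)])

A1, A2 = Vec(1, 5), Vec(0, 0)
B1, B2 = Vec(2, 4), Vec(1, 1)

# What might be convenient for such types: one result per pair.
elementwise = [a < b for a, b in zip((A1, A2), (B1, B2))]

# What the builtin tuple comparison gives instead: one plain truth value.
whole = (A1, A2) < (B1, B2)
```

The tuple comparison treats each Vec returned by == as a plain truth value,
so whole is a single bool rather than a sequence of elementwise results.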