[Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Objects tupleobject.c,2.48,2.49

Wed, 16 May 2001 15:07:49 -0400

[Tim]
> The question remaining is how much of this list/tuple richcmp behavior is
> guaranteed by the language and how much is just implementation-dependent
> fuzz.

[Guido]
> Unclear what you're asking.  The language doesn't require any
> particular semantics for sequence comparisons, but the language of
> course includes the tuple and list squence types, and it describes
> (albeing lacking some rigorous detail) what comparisons for those do.

The current

    Tuples and lists are compared lexicographically using comparison
    of corresponding items.

was quite clear in a cmp-only world.  In a richcmp world, "compared
lexicographically" is fuzzy enough that different implementations may do
different things in good faith, competent users may disagree about what it
means in specific cases, and programs may yield different results across
implementations (or random CVS patches <wink>).

> If there are specific lacks of detail, it probably helps to think
> about filling those in.

The *level* of additional detail intended is the cutoff between what's
guaranteed by the language and what's left up to the implementation.

The full truth before was relatively simple.  For a pair x, y of lists or
tuples,

def __cmp__(x, y):  # pretending this is a method on lists and tuples
    i = 0
    while i < len(x) and i < len(y):
        c = cmp(x[i], y[i])
        if c:
            return c
        i += 1
    return cmp(len(x), len(y))

was *almost* the entire tale, incl. that lengths were re-fetched on each
iteration.  What's left unexplained is the treatment of recursive lists, and
so the result of comparing them is a prime suspect for different behavior
across implementations and releases.

In a richcmp world, there are several additional ways in which the above
fails to capture the full truth, and each of those ways is another prime
suspect for surprises.

For example, I believe it's *intended* that:

1. Element comparisons continue to be strictly left-to-right, and
   that no element comparisons are to be performed after the leftmost
   element comparison that settles the issue (if any).

2. tuple/list comparison via == or != must use only == comparison on
   elements, and that implementations are allowed (but not required)
   to skip all element comparisons when == or != comparison is given
   lists/tuples of different sizes.

OTOH, I doubt (but don't know) it's intended that all implementations must
emulate other semantically significant details of the current implementation,
like:

1. <=, <, > and >= comparisons will do at most one element comparison
   that is not an == comparison.

2. Whenever a <, <=, > or >= element comparison is needed, the long-
   winded details of how that works, incl. but not limited to the
   specific "first try ==, then try <, then try >" strategy used to
   simulate a pre-richcmp cmp() when all else fails.

Going back to the original example:

>>> class C:
...     def __lt__(x, y): return 1
...     __eq__ = __lt__
...
>>> a, b = C(), C()
>>> a < b       #1
1
>>> [a] < [b]   #2
0
>>> cmp(a, b)   #3
0
>>> a > b       #4
1
>>> a == b      #5
1
>>> a != b      #6
1
>>>

Which of those results are *required* by the language, and which merely
*allowed*?

+ I believe #1, #4 and #5 are required.

+ I have no idea whether to call it "a bug" if the #2 and/or #3
  and/or #6 results differed, e.g., under Jython, or under
  CPython 2.3.  Indeed, I'm not even sure why #6 returns 1 under
  CPython today, and I've been staring at this a lot lately <wink>
  ... OK, #6 ends up getting resolved by comparing object
  addresses, which leaves "required or not?" fuzzy (i.e., *must*
  it be resolved that way?  or is it implementation-defined?).