Python 3.0, rich comparisons and sorting order

Andrew Dalke adalke at mindspring.com
Tue Sep 21 16:01:10 EDT 2004


Carlos Ribeiro suggested the following as a use case for
requiring that any two Python objects be compare-able:
> Assume that you're implementing a spreadsheet like application in
> Python. The user fills a column with arbitrary data, and asks for it
> to be sorted. What is the sorting order? Excel, for instance, defines
> an ordering (it's arbitrary, but it's deterministic).

In that case they most likely have a CellData object which
can be compared with other CellData objects.  To the list
it's a bunch of homogeneous objects.

To sort, either the CellData objects define how to
compare two of themselves, or there's a way to generate
a value which is used as the sort key.
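
A minimal sketch of the sort-key approach, assuming a hypothetical
CellData class with a hypothetical sort_key() method. The ordering
rule here (numbers, then text, then everything else) is made up for
illustration; it is not Excel's rule.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class CellData:
    """Hypothetical spreadsheet cell; not taken from any real library."""
    value: Any

    def sort_key(self):
        # Assumed ordering: numbers first, then text (case-insensitive),
        # then everything else ordered by its repr().
        v = self.value
        if isinstance(v, (int, float)):
            return (0, v)
        if isinstance(v, str):
            return (1, v.lower())
        return (2, repr(v))


cells = [CellData("pear"), CellData(3), CellData(None),
         CellData(1.5), CellData("Apple")]

# The list only ever sees homogeneous tuples of (rank, value).
cells.sort(key=CellData.sort_key)
ordered = [c.value for c in cells]
print(ordered)
```

The policy for mixed data lives in the application's own class, so
the list never has to compare two unrelated types directly.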

> sort() should work regardless of the list elements, and return a
> reasonable result, even if not strictly correct in the numerical
> sense.

L = [open("/etc/passwd"), {"A": 1}, urllib.urlopen("http://python.org"),
       IOError(), Tkinter.Toplevel(), PIL.Image.open("my.jpg")]

L.sort()

What's a reasonable sort here?
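
For comparison, here is a small sketch of what happens under the
Python 3 comparison rules, where unrelated types have no default
ordering. The list uses stand-in objects (a dict, an exception, a
plain object) rather than the files, windows and images above, so
that it runs without any external resources.

```python
# Stand-ins for "objects with no natural ordering".
mixed = [{"A": 1}, OSError(), object()]

try:
    mixed.sort()
    outcome = "sorted"
except TypeError:
    # Python 3 refuses to invent an order for unrelated types.
    outcome = "TypeError"

print(outcome)
```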

All solutions eventually end up comparing object ids when
everything else fails.  But that breaks another reasonable
expectation, which is that

L1 = [ .. some list with items that can't be compared .. ]
L1.sort()
L2 = pickle.loads(pickle.dumps(L1))
L2.sort()
assert L1 == L2

This doesn't work because the reconstituted list will
have different object ids, which might be in a different order.
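
A rough sketch of that round trip, with the id fallback written out
explicitly as key=id. Dicts stand in for the uncomparable items (an
assumption for illustration; they have no ordering under Python 3
rules). Whether the two lists end up in the same order depends on
where the interpreter happens to allocate the new objects, so the
result is printed rather than asserted.

```python
import pickle

# Items with no natural ordering; each carries a label so we can
# see where it lands after sorting.
L1 = [{"label": name} for name in "abcdef"]
L1.sort(key=id)                        # fall back to object ids

L2 = pickle.loads(pickle.dumps(L1))    # fresh objects, fresh ids
L2.sort(key=id)

labels1 = [d["label"] for d in L1]
labels2 = [d["label"] for d in L2]
same_order = (L1 == L2)                # not guaranteed either way

print(labels1)
print(labels2)
print(same_order)
```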

What should this do?

class NoCompare:
   def __cmp__(self, other): raise SystemExit("no compare")
   __lt__ = __gt__ = __eq__ = __ne__ = .... = __cmp__

L3 = [NoCompare(), NoCompare(), NoCompare()]
L3.sort()
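
One possible answer, sketched for Python 3, where list.sort() only
uses __lt__: the exception raised by the comparison simply propagates
out of sort(). This cut-down NoCompare defines only __lt__, which is
an assumption made to keep the example short.

```python
class NoCompare:
    def __lt__(self, other):
        # Refuse every ordering comparison.
        raise SystemExit("no compare")


L3 = [NoCompare(), NoCompare(), NoCompare()]

try:
    L3.sort()
    result = "sorted"
except SystemExit as exc:
    # sort() does not swallow the exception or fall back to ids.
    result = "SystemExit: %s" % exc

print(result)
```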



> The set, in this particular case,
> is a Python list, that *can* contain arbitrary data. So it does not
> make sense (in my not-so-humble opinion) for sort to impose
> restrictions based on the list element type.

By that definition you require that all other list methods
can work on all data.  Consider

 >>> class NoCompare:
...   def __eq__(self, other): raise AssertionError("no compare")
...
 >>> L = [1, 3, NoCompare(), 5]
 >>> 3 in L
True
 >>> 5 in L
Traceback (most recent call last):
   File "<stdin>", line 1, in ?
   File "<stdin>", line 2, in __eq__
AssertionError: no compare
 >>>

The "__contains__" method imposes its own restriction: the
elements must be comparable for equality.
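
The restriction follows from how a membership test works. A minimal
sketch, using the equivalence given in the Python language reference
for "x in y" on built-in sequences; the helper name contains() is
made up for illustration.

```python
def contains(seq, x):
    # Walk the sequence, stopping at the first identical or equal item.
    return any(x is e or x == e for e in seq)


class NoCompare:
    def __eq__(self, other):
        raise AssertionError("no compare")


L = [1, 3, NoCompare(), 5]

found_3 = contains(L, 3)      # stops at index 1, never reaches NoCompare

try:
    contains(L, 5)            # must test against NoCompare() on the way
    found_5 = "no error"
except AssertionError:
    found_5 = "AssertionError"

print(found_3, found_5)
```

The search for 3 succeeds only because it stops before it reaches
the object that refuses to be compared.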


> (BTW, if we extend this reasoning, the same could be said for other
> types of functions that work over sets -- sum() should ignore
> non-numeric values, etc. But that's another philosophical battle)

I think my example is more relevant.  The documentation for
sum() explicitly says it only works on numeric types and gives
justification for why that choice was made.

				Andrew
				dalke at dalkescientific.com


