Incomparable abominations

David Mertz mertz at gnosis.cx
Mon Mar 24 23:59:25 EST 2003


"Tim Peters" <tim_one at email.msn.com> wrote previously:
|what your use case is:  for what purpose do you think you need to be
|able to sort lists with a mix of element types for which Python can't
|dream up a better ordering than comparing the objects' memory addresses?

I fairly often use lists to collect heterogeneous objects together.
This might often be in an introspective context: what attributes (and
attribute values) does an object have?  But sometimes I want to be
forgiving in the calls made to my functions with various arguments
(maybe arguments that are not wholly proper, but I still want to do
something slightly reasonable).  Or I parse values of various datatypes
out of a textual report.  Or I get an XML-RPC or CORBA call with
heterogeneous parameters.

Admittedly, sorting those heterogeneous collections is probably not the
-most- common thing to do with them.  And sometimes dictionaries are
a better data structure--and Sets, especially, will be useful once I
start really using 2.3.  But there is something so simple and elegant
about throwing 'collection.append(...)' statements around liberally.

When I do want to sort a heterogeneous collection, often what I want to
do with the ordered collection is extract subsequences that have a
"natural" order.  Like loop through for a while, dealing with numbers;
then when I start seeing strings, deal with those in a different way
(but still within ordered subsequences).  In such a story, I probably
would more-or-less ignore any objects that aren't of "interesting"
types, whenever I got to them.
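
A minimal sketch of that pattern, assuming a modern Python 3 interpreter
(not the 2.3 discussed in this thread).  The names category() and
ordered_runs() are hypothetical, and the choice of "interesting" types
is only an illustration:

```python
def category(obj):
    """Hypothetical classifier: name the 'interesting' group, or None."""
    if isinstance(obj, bool):
        return None                      # assumption: bools are not "numbers" here
    if isinstance(obj, (int, float)):
        return "number"
    if isinstance(obj, str):
        return "string"
    return None                          # everything else is ignored


def ordered_runs(collection):
    """Return [(category, sorted_values), ...] for the interesting objects."""
    runs = {}
    for obj in collection:
        cat = category(obj)
        if cat is not None:
            runs.setdefault(cat, []).append(obj)
    # Each run holds mutually comparable values, so sorted() is safe here.
    return [(cat, sorted(objs)) for cat, objs in sorted(runs.items())]


for cat, values in ordered_runs([3, 'b', 1.5, None, 'a', 2j, 2]):
    print(cat, values)
```

Splitting before sorting sidesteps the cross-type comparisons entirely;
the price is that the caller has to decide up front which types count
as interesting.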

Unlike almost any other operation, [...].sort() cannot be saved by any
simple try/except block.  It is extremely difficult to accurately
specify in a program just when it will fail, and harder still to decide
exactly what to do to remedy the situation.  I could run the list
through filter() to try to get rid of everything I don't like (and that
might not sort).  But it's not obvious what would need to be filtered.
Complex numbers are OK sometimes, but not others.  Unicode strings are
OK sometimes, but not others.  It depends not on the individual object,
but on what else happens to be in the list, and even on the order the
list started in.  Contrast:

    Python 2.3a2 (#0, Feb 21 2003, 19:35:57)
    ...
    >>> [u'x', 'x', chr(255)].sort()
    >>> ['x', chr(255), u'x'].sort()
    Traceback (most recent call last):
    ...
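
One possible workaround is to sort on a key that never compares values
of different types.  The sketch below assumes a modern Python 3
interpreter (where str and bytes play roughly the roles of unicode and
str in 2.3) and is not how [].sort() itself behaves.  The names
ORDERABLE and mixed_key() are hypothetical, and falling back to id()
for unknown types is an arbitrary choice:

```python
# Assumption: built-in types whose values can safely be compared to each other.
ORDERABLE = (int, float, str, bytes)


def mixed_key(obj):
    """Hypothetical sort key: group by type name, then order within the group."""
    name = type(obj).__name__
    if isinstance(obj, complex):
        # Complex numbers have no natural order; (real, imag) is arbitrary.
        return (name, 0, (obj.real, obj.imag))
    if type(obj) in ORDERABLE:
        return (name, 0, obj)
    # Unknown types: consistent but meaningless order by identity.
    return (name, 1, id(obj))


data = ['x', b'\xff', 'x', 3, 1.5, 2j, 1j, None]
print(sorted(data, key=mixed_key))
print(sorted(reversed(data), key=mixed_key))
```

With a key like this, whether the sort succeeds no longer depends on
what else is in the list or on the order the list started in.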

This is seriously pathological!  Btw, something like chr(255) has a
genuine value without an encoding.  For example, I was recently doing
some stuff with cryptographic key material that looks at raw byte
values--the question of charset just isn't relevant there.

There IS NO "use case" in the sense that I simply cannot do something
like GUI callbacks or matrix manipulation or whatever because of the
current [].sort() behavior.  I can always find SOME way around the
inconvenience.  But it's one of those things that comes back to bite you
at unexpected times, and for no good reason.  That's why it bugs me so
much.

Yours, David...

--
mertz@  | The specter of free information is haunting the `Net!  All the
gnosis  | powers of IP- and crypto-tyranny have entered into an unholy
.cx     | alliance...ideas have nothing to lose but their chains.  Unite
        | against "intellectual property" and anti-privacy regimes!
-------------------------------------------------------------------------





