Incomparable abominations

John Roth johnroth at ameritech.net
Mon Mar 24 12:54:55 EST 2003


"Lulu of the Lotus-Eaters" <mertz at gnosis.cx> wrote in message
news:mailman.1048482165.23801.python-list at python.org...
> |> the relation "1j < 2j" is self-evident and natural.
>
> "John Roth" <johnroth at ameritech.net> wrote previously:
> |That is, however, a border case. Is 1+2j < 2+1j true or false?
>
> True that the latter case has no natural order.  But the former case
> does.  Likewise, I find this order natural:
>
>     "A" < "B"
>
> And this order is completely arbitrary:
>
>     u"A" < unicodedata.lookup('HEBREW LETTER ALEF')
>
> In what sense is the Roman alphabet "less than" the Hebrew alphabet?...
> I have no hunch at all about where the "A"-like letter in Arabic, Greek,
> Cyrillic, etc. would fall in the sequence, FWIW.  Nor for various
> A-diacritics that occur in Romanesque and Cyrillic-ish alphabets.  I
> expect the order to be stable, but there is no "natural" answer to the
> order.

This isn't a new problem with character string comparisons. It's been
that way as long as we've had computers that could compare characters!

Different character sets do it differently, even for 8 bit characters.
For example, in EBCDIC, lower case letters compare before
upper case letters, while in ASCII the reverse is the case. The
character encoding that IBM invented for Stretch (7030) had
the lower case letters interleaved with the upper case letters so that
comparisons would come out "right."
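
To make the EBCDIC/ASCII difference concrete, here is a minimal sketch. It assumes a current Python 3 interpreter (not the Python of this thread) and uses the "cp037" codec, one of the EBCDIC code pages that ships with Python; the helper name byte_order is made up for illustration.

```python
# Compare the encoded byte values of "a" and "A" under a given encoding.
# byte_order is a hypothetical helper, not part of any library.
def byte_order(encoding):
    lower = "a".encode(encoding)
    upper = "A".encode(encoding)
    # bytes objects compare by their numeric byte values
    return lower < upper

# ASCII: "A" is 0x41 and "a" is 0x61, so upper case sorts first.
ascii_lower_first = byte_order("ascii")    # False

# EBCDIC (cp037): "a" is 0x81 and "A" is 0xC1, so lower case sorts first.
ebcdic_lower_first = byte_order("cp037")   # True
```

The same two letters come out in opposite orders purely because of the encoding, which is the point: the raw comparison reflects the code table, not the language.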

If you want a "real world" ordering out of characters, you have to
use the comparison functions that go along with the appropriate locale.
They apply the collation order established for each language.
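
A rough sketch of what that looks like with the standard locale module, again assuming Python 3. The helper name sort_with_locale is made up for illustration, and which locale names are available depends entirely on the system; only "C" can be counted on everywhere.

```python
import locale

# sort_with_locale is a hypothetical helper: it sorts words using the
# collation rules of the named locale, then restores the old setting.
def sort_with_locale(words, locale_name):
    previous = locale.setlocale(locale.LC_COLLATE)  # query current setting
    try:
        # Raises locale.Error if the locale is not installed.
        locale.setlocale(locale.LC_COLLATE, locale_name)
        # strxfrm turns each string into a key that sorts in locale order.
        return sorted(words, key=locale.strxfrm)
    finally:
        locale.setlocale(locale.LC_COLLATE, previous)

words = ["b", "A", "a"]
c_order = sort_with_locale(words, "C")  # plain code-point order
```

Passing a language locale instead of "C" (for example "de_DE.UTF-8", if it happens to be installed) should give that language's dictionary order; the call raises locale.Error when the name is unknown, so real code would want to catch that.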

> And yet, both the character/unicode comparisons provide answers, while
> the complex comparisons raise exceptions.
>
> To paraphrase the Timbot, Python seems to be aiming for a "principle of
> maximum surprise!"
>
> Yours, Lulu...
>






