Would there be support for a more general cmp/__cmp__
Christopher Subich
csubich.spam.block at spam.subich.block.com
Tue Oct 25 10:09:29 EDT 2005
Antoon Pardon wrote:
> I also think there is the problem that people aren't used to partial
> ordering. There is an ordering over sets, it is just not a total
> ordering. But that some pairs are uncomparable (meaning that neither
> one is smaller or greater) doesn't imply that comparing them is
> ill defined.
It depends on your definition of "comparison." Admittedly, <, =, !=,
and > can be defined for a partial ordering, but classical mathematics,
as understood by most people (even programmers), assumes that unless a
== b, a > b or a < b.
The comparisons, as defined this way, are done on a totally ordered set,
yes. But since most comparisons ever done -are- done on a totally
ordered set (namely the integers or reals), it's entirely fair to say
that "a programmer's expectation" is that comparisons should work more
or less like a totally ordered list.
With that caevat in mind, comparison on a partially ordered domain /is/
ill-defined; it can give inconsistent results from the "a < b or a > b
or a == b" rule.
>
> Well it is a wrong assumption is general. There is nothing impure about
> partial ordering.
Impure? Probably not. Useless from many perspective when "comparisons"
are needed, to the point where it's probably safer to throw an exception
than define results.
> That is IMO irrelevant. The subset relationship is an ordering and as
> such has all characteristics of other orderings like "less than",
> except that the subset relationship is partial and the less than
> relationship is total. How it is called "subset" vs "less than" is
> IMO irrelevant. It is about mathematical characteristics.
Still accident. < wouldn't be used for sets if we had a subset symbol
on the standard keyboard, APL fans notwithstanding. Simce programmers
familiar with Python understand that < on sets isn't a "real" comparison
(i.e. provide a total order), they don't expect unique, consistent
results from something like sort (or a tree, for that matter).
>
>
>>By analogy, one can ask, "is the cat inside the box?" and get the answer
>>"No", but this does not imply that therefore the box must be inside the
>>cat.
>
>
> Bad analogy, this doesn't define a mathematical ordering, the subset
> relationship does.
Yes, it does. Consider "in" as a mathematical operator:
For the set (box, cat-in-box)
box in box: False
box in cat-in-box: False
cat-in-box in box: True
cat-in-box in cat-in-box: False
For the set (box, smart-cat) # cat's too smart to get in the box
box in box: False
box in smart-cat: False
smart-cat in box: False
smart-cat in smart-cat: False
In both these cases, the "in" operator is irreflexive, asymmetric, and
transitive (extend to mouse-in-cat if you really care about transitive),
so "in" is a partial order sans equality. A useless one, but a partial
order nonetheless.
>>Notice that L1 and L2 contain the same elements, just in different orders.
>>Sorting those two lists should give the same order, correct?
>
>
> No. Since we don't have a total ordering, sorting doesn't has to give
> the same result. For as far as sorting is defined on this kind of
> object, the only thing I would expect is that after a sort the
> following condition holds.
>
> for all i,j if i < j then not L[i] > L[j]
Highly formal, aren't you? Again, common programming allows for "not a
> b" == "a <= b", so your condition becomes the more familiar:
for all i,j in len(list), if i < j then L[i] <= L[j]
This condition is also assumed implicitly by many other assumed "rules"
of sorted lists, namely their uniqueness:
"If L is a list of unique keys, then sort(L) is a unique list of keys in
sorted order."
Yes, this assumes that L has a total order. Big whoop -- this is a
matter of programming practiciality rather than mathematical purity.
>>Personally, I argue that sorting is something that you do to lists, and
>>that all lists should be sortable regardless of whether the objects within
>>them have a total order or not. If not, the list should impose a sensible
>>ordering on them, such as (perhaps) lexicographical order ("dictionary
>>order").
To reply to the grandparent poster here: that's not always easy to do.
A "sensible order" isn't always easy to define on a set where a partial
order exists, if it includes that order already. Adding
comparison-pairs to a partially ordered set (where incomparable elements
throw an exception, rather than just return False which is confusing
from sort's perspective) can easily result in a graph that isn't
actually transitive and antisymmetric, i.e.:
A > B > C, but
C > D > A for some operator ">"
With this in mind, the only sensible thing for .sort to do when it
encounters an exception is to fall back to its "backup" comparator (id,
for example), and resort /the entire list/ using that comparator. The
results returned will then be valid by sort's comparison, but a subset
of that list containing only "good" objects (like integers) may not (and
probably won't be) sorted itself using the object's comparison.
The difficulty arises because while we may consider a single type to be
fully ordered, it may also have some comparisons with other types;
sorting the list by type, then value seems like it would work (and
that's what the library does), but since we can sometimes compare across
types, this may not give a valid comparison. Consider:
[apple, wheat, cow]
apple < cow (defined, totally ordered)
Mammal > Grain > Fruit (accident of type id numbers)
apple < wheat throws TypeError, also wheat < cow
Since we encounter a TypeError when sorting this list, we first sort by
type:
[cow, wheat, apple]
and would then sort-by-comparison within types, if we had more than one.
But since apple < cow, the sublist of this list [cow, apple] is not
itself sorted. This is paradoxical under a single comparison, because a
sublist of any sorted list is itself supposed to be sorted.
The ordering that .sort used on the full list is well-defined and total,
but it's also useless.
> But will all libraries sort the shelves in the same way. As far as I
> understand a book gets somekind of code in a library and books are
> sorted according to these codes. Now everybody sorts these codes the
> same way, but the attibution of the code may differ from library to
> library.
Irrelevant; if you swap the order of any two (nonidentical) books in a
single library, then the library is unsorted. The library has imposed a
total order on books in its collection.
The library example is irrelevant, anyway; two books can always be
compared by the keys (author last, author first/middle, [other authors],
title, editor, year published, publisher, number of pages, colour of the
brightest line in the cover's emmission spectrum at 2000k in spectral
order [within visible spectrum]), and for the most part this is a useful
keyspace. Libraries just choose to impose a category/numbering at the
head of the key.
More information about the Python-list
mailing list