Would there be support for a more general cmp/__cmp__

Antoon Pardon apardon at forel.vub.ac.be
Thu Oct 27 04:12:15 EDT 2005


Op 2005-10-26, Ron Adam schreef <rrr at ronadam.com>:
>
>
> Antoon Pardon wrote:
>
>> Op 2005-10-25, Steven D'Aprano schreef <steve at REMOVETHIScyber.com.au>:
>
>>>Can somebody remind me, what is the problem Antoon is trying to solve here?
>> 
>> 
>> Well there are two issues. One about correct behaviour and one about
>> practicallity.
>> 
>> The first problem is cmp. This is what the documentation states:
>> 
>> cmp(x, y)
>>     Compare the two objects x and y and return an integer according to
>>     the outcome. The return value is negative if x < y, zero if x == y
>>     and strictly positive if x > y. 
>> 
>> The problem is, that if someone implements a partial ordered class,
>> with the rich comparisons, cmp doesn't behave correct. It can't
>> behave correct because it will always give an number as a result
>> and such a number implies one of three relations between two
>> elements, but with partial ordered classes, it is possible none
>> of these relations exist between those two elements. So IMO cmp
>> should be changed to reflect this. This can be done either by cmp
>> returning None or UnequalValues.  or by cmp raising UnequalValues
>> in such a case.
>
> Instead of changing cmp, why not just have several common versions to 
> choose from instead of trying to make one tool do it all?

That would be an option. Lets call such an alternative comp.
Would that mean also have a __comp__ method for customization.

>> The second part is one of practicallity. Once you have accepted cmp,
>> should reflect this possibility, it makes sense that the __cmp__
>> method should get the same possibilities, so that where it is more
>> pratical to do so, the programmer can implement his partial order
>> through one __cmp__ method instead of having to write six.
>> 
>> 
>> In my opinion such a change would break no existing code. This
>> is because there are two possibilities.
> >
>> 1) The user is using cmp on objects that form a total order.
>> In such circumstances the behaviour of cmp doesn't change.
>> 
>> 2) The user is using cmp on objects that don't form a total
>> order. In such circumstances cmp doesn't give consistent
>> results, so the code is already broken.
>
> Adding complexity to cmp may not break code, but it could probably slow 
> down sorting in general.

The evidence suggests that cmp is not used in sorting. If you have a
list of sets, sort will happily try to sort it, while calling cmp
with a set as an argument throws an exception.

I have a feeling that adding comp (and its brother __comp__) would
have a detrimental effect on sorting, because using '<' would then
mean one extra method to look for that could be used in determining
if a < b or not.

> So I would think what ever improvements or 
> alternatives needs to be careful not to slow down existing sorting cases.

I have done some tests and have come to the conclusion that other
factors will have a greater influence on sorting times. I have written
the following program to estimate effects:

from random import shuffle

from time import time

class UnequalValues(Exception):
  pass

class cla:
  def __init__(self, i):
    self.value = int(i)

  def __cmp__(self, other):
    return self.value - other.value

class clb:
  def __init__(self, i):
    self.value = int(i)

  def __lt__(self, other):
    return self.value < other.value

class clc:
  def __init__(self, i):
    self.value = int(i)

  def __comp__(self, other):
    return self.value - other.value

  def __lt__(self, other):
    try:
      return self.__comp__(other) < 0
    except UnequalValues:
      return False

def test(lng, rep):

  for cl in cla, clb, clc:
    total = 0.0
    for _ in xrange(rep):
      lst = [cl(i) for i in xrange(lng)]
      shuffle(lst)
      start = time()
      lst.sort()
      stop = time()
      total += stop - start
    print "%s: %d repeats, %d long, %9.6f secs" % (cl, rep, lng, total)

test(1000,1000)


This is the result I get:
  
__main__.cla: 1000 repeats, 1000 long, 31.986618 secs
__main__.clb: 1000 repeats, 1000 long,  8.963896 secs
__main__.clc: 1000 repeats, 1000 long, 16.893321 secs


I find it very odd that clc sorts about twice as fast as cla.

That means that every class that has a __cmp__ method can be
speeded up with sorting by writing a similar __lt__ method
as in clc. I do wonder what is causing this.

-- 
Antoon Pardon



More information about the Python-list mailing list