Would there be support for a more general cmp/__cmp__

Antoon Pardon apardon at forel.vub.ac.be
Fri Oct 28 04:44:17 EDT 2005


On 2005-10-26, Ron Adam <rrr at ronadam.com> wrote:
>
> Adding complexity to cmp may not break code, but it would probably slow
> down sorting in general.  So I would think whatever improvements or
> alternatives need to be careful not to slow down existing sorting cases.

As a result of Bengt's post, I rewrote my test program; this is the
program now:

from random import shuffle

from time import time

# Exception a generalized comparison can raise when two values are incomparable.
class UnequalValues(Exception):
  pass

__metaclass__ = type    # make the test classes below new-style classes

class cla:              # old-style comparison: __cmp__ only
  def __init__(self, i):
    self.value = int(i)

  def __cmp__(self, other):
    return self.value - other.value

class clb:              # rich comparison: __lt__ only
  def __init__(self, i):
    self.value = int(i)

  def __lt__(self, other):
    return self.value < other.value

class clc:              # rich comparison: __gt__ only
  def __init__(self, i):
    self.value = int(i)

  def __gt__(self, other):
    return self.value > other.value

class cld:              # __lt__ delegating to a cmp-like __comp__ helper
  def __init__(self, i):
    self.value = int(i)

  def __comp__(self, other):
    return self.value - other.value

  def __lt__(self, other):
    return self.__comp__(other) < 0

class cle:              # like cld, but guarded by a try/except on UnequalValues
  def __init__(self, i):
    self.value = int(i)

  def __comp__(self, other):
    return self.value - other.value

  def __lt__(self, other):
    try:
      return self.__comp__(other) < 0
    except UnequalValues:
      return False

def test(lng, rep):

  for cl in cla, clb, clc, cld, cle:
    total = 0.0
    for _ in xrange(rep):
      lst = [cl(i) for i in xrange(lng)]
      shuffle(lst)
      start = time()
      lst.sort()
      stop = time()
      total += stop - start
      # sanity check: the list must come out sorted
      for i in xrange(1, lng):
        assert lst[i - 1] < lst[i]
    print "%s: %d repeats, %d long, %9.6f secs" % (cl, rep, lng, total)

test(1000, 1000)

---

These are the results:

<class '__main__.cla'>: 1000 repeats, 1000 long, 10.061425 secs
<class '__main__.clb'>: 1000 repeats, 1000 long,  9.544035 secs
<class '__main__.clc'>: 1000 repeats, 1000 long, 10.450864 secs
<class '__main__.cld'>: 1000 repeats, 1000 long, 15.566061 secs
<class '__main__.cle'>: 1000 repeats, 1000 long, 15.776443 secs

Results on a longer sequence were:

<class '__main__.cla'>: 1000 repeats, 10000 long, 146.722443 secs
<class '__main__.clb'>: 1000 repeats, 10000 long, 139.480863 secs
<class '__main__.clc'>: 1000 repeats, 10000 long, 152.623424 secs
<class '__main__.cld'>: 1000 repeats, 10000 long, 224.630926 secs
<class '__main__.cle'>: 1000 repeats, 10000 long, 228.663825 secs

The most interesting result of this test is the difference between
cld and cle. IMO that difference is an indication of the cost my idea
would have on sorting, should it be implemented: around 2% (15.78 vs
15.57 seconds on the 1000-element runs, 228.66 vs 224.63 on the
10000-element runs). I think that is bearable, especially since these
were very simple comparison routines. The moment the comparison
calculation becomes heavier, the relative contribution of the
try/except will diminish.
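
To make concrete what the try/except in cle is paying for, here is a
small sketch, in the style of cle, of the kind of class such a
generalized comparison is meant to support. The PoSet name and the
subset ordering are only an illustration of a partial order, not part
of the proposal itself:

class PoSet:
  """Sets ordered by inclusion: a partial order, not a total one."""

  def __init__(self, elements):
    self.elements = frozenset(elements)

  def __comp__(self, other):
    # Generalized comparison: raise UnequalValues (as defined in the
    # test program above) when neither value is ordered before the other.
    if self.elements == other.elements:
      return 0
    if self.elements < other.elements:
      return -1
    if self.elements > other.elements:
      return 1
    raise UnequalValues(self.elements, other.elements)

  def __lt__(self, other):
    try:
      return self.__comp__(other) < 0
    except UnequalValues:
      return False

a = PoSet([1, 2])
b = PoSet([1, 2, 3])
c = PoSet([3, 4])
print a < b    # True:  {1, 2} is a subset of {1, 2, 3}
print a < c    # False: incomparable, __comp__ raised UnequalValues

A class like this can still be handed to list.sort() without blowing
up, because the except clause simply answers False for incomparable
pairs; for the existing, totally ordered cases the only new cost is
the try/except overhead measured above.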

If you are concerned about sorting times, I think you should be more
concerned about Guido's idea of doing away with __cmp__. Sure, __lt__
is faster. But in a number of cases writing __cmp__ is of the same
complexity as writing __lt__. So if you then need __lt__, __le__,
__eq__, __ne__, __gt__ and __ge__, it would be a lot easier to write
a single __cmp__ and have all the rich comparison methods call it,
instead of duplicating the code about six times. You would then be
more or less forced to write your class like cld or cle, and that
would have a bigger impact on sorting times than my suggestion.
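
By way of illustration, this is the pattern I mean; the Record class
and its key are made up for the example, but the shape is exactly
that of cld taken to its logical conclusion:

class Record:
  def __init__(self, key):
    self.key = key

  def __comp__(self, other):
    # all of the real comparison logic lives in this one place
    return cmp(self.key, other.key)

  # the six rich comparison methods just delegate
  def __lt__(self, other): return self.__comp__(other) < 0
  def __le__(self, other): return self.__comp__(other) <= 0
  def __eq__(self, other): return self.__comp__(other) == 0
  def __ne__(self, other): return self.__comp__(other) != 0
  def __gt__(self, other): return self.__comp__(other) > 0
  def __ge__(self, other): return self.__comp__(other) >= 0

Every comparison then pays for one extra method call, which is the
kind of overhead that shows up in the cld and cle timings above.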

-- 
Antoon Pardon


