Guido rethinking removal of cmp from sort method

Paul Rubin no.email at nospam.invalid
Mon Mar 28 21:58:25 EDT 2011


Steven D'Aprano <steve+comp.lang.python at pearwood.info> writes:
> There are lots of problems that Python is not well-suited for.
> It is not well-suited for writing the driver to a graphics card.
> It is not well-suited to calculating pi to a trillion decimal places.
>...

But sorting moderate amounts of data on complicated criteria is an area
Python IS supposed to be well-suited for.

> 99% of requests should be smothered. That is a *good thing*. 

We're not talking about a "request" for a feature to be added.  There
was an existing feature that worked perfectly well, that was removed for
some ideological reason that makes no sense to me, with some slight
allusions to technical reasons that I've also never completely
understood.

> More memory yes; take longer to sort, almost certainly not (except, 
> perhaps, for a very narrow window where the addition of a key causes

I don't see how using something like cmp_to_key could ever fail to slow
the program down.  It's calling the exact same comparison function the
same number of times, but also creating and releasing a pile of class
instances, and doing a lot of runtime lookups to access the comparison
function in each instance on each comparison.

> data.sort(key=functools.cmp_to_key(myfunc))
> which is a little longer to type, and uses a little more memory, but 
> otherwise is no worse than using cmp.

I don't believe that for one second.  Have you actually timed it?  Like,
sort an array of 100000 random (int,int) pairs ascending on the first
int and descending on the second, using a simple comparison function and
using cmp_to_key?  That's Anton's example, and by coincidence that
exact situation came up in something I'm hacking on RIGHT NOW.  I
have (roughly) a list of transactions in the form
  (person's name, timestamp, transaction info)
and I want to print a report sorted alphabetically by person, listing
each person's transactions most-recent-first.  Of course I was able
to concoct a solution with a few moments of head-scratching, but 
using a cmp argument is much more natural and transparent in my
opinion.  DSU is a clever and useful design pattern, but comparison
sorting is what all the sorting textbooks are written about.



More information about the Python-list mailing list