Guido rethinking removal of cmp from sort method

Antoon Pardon Antoon.Pardon at rece.vub.ac.be
Tue Mar 29 04:46:57 EDT 2011


On Mon, Mar 28, 2011 at 03:43:03PM +0000, Steven D'Aprano wrote:
> On Mon, 28 Mar 2011 14:39:04 +0200, Antoon Pardon wrote:
> 
> > I tried to sort lists of 10000 elemens. Each element a tuple two items
> > that was to be sorted first according to the first item in ascending
> > order, then according to the second item in descending order. This was
> > done on python 2.6 so I had to write my own cmp_to_key function.
> > 
> > I then sorted these lists of 10000 elements a 1000 times with the
> > following methods.
> 
> Thank you for spending the time to get some hard data, but I can't 
> replicate your results since you haven't shown your code. Rather than 
> attempt to guess what you did and duplicate it, I instead came up with my 
> own timing measurements. Results are shown here, my code follows below:

But why expect me to come up with such data? Shouldn't the dev team have
come up with such data before they decided to remove the cmp-argument,
instead of afterwards expecting the python users to come up with data
to support the reversing of that decision?

> [steve at sylar ~]$ python2.7 sort_test.py
> Comparison function: 12.1256039143
> Key function: 3.51603388786
> Double sort: 2.33165812492
> cmp_to_key: 28.1129128933
> 
> 
> By far the best result comes from two calls to sort. Not far off is the 
> key function solution. (Admittedly, coming up with a key function for non-
> numeric data would be challenging.) The comparison function, and the 
> cmp_to_key solution, are clearly *much* slower.

So, that means that the fastest solutions don't generalize.

The double sort is useless if the actual sorting is done in a different
module/function/method than the module/function/method where the order
is implemented. It is even possible you didn't write the module 
where the sorting actually occurs.

Your key-function only works with numeric data, strings already pose
a problem here. 

The cmp function can be externally provided, without you having the
source.

So sooner or later there will be a user who just doesn't have a
choice but to use cmp_to_key or something similar. and who will
have to suffer this kind of speed loss.

-- 
Antoon Pardon



More information about the Python-list mailing list