how can I sort a bunch of lists over multiple fields?

Steven Bethard steven.bethard at gmail.com
Thu Apr 28 21:56:53 EDT 2005


Lonnie Princehouse wrote:
>>Likewise, the above is basically just an inefficient way of writing:
>>
>>def date_key(book):
>>    return book.data
>>
>>def author_and_date_key(book):
>>    return (author_key(book), date_key(book))
> 
> 
> It's certainly more elegant, but I wanted to talk about the mechanics
> of comparison functions =)
> 
> I don't know that it's more or less efficient in execution.  That
> depends on a few factors.

For most cases, key= will be more efficient than cmp=.  The key= 
argument will only be called once for each item in the list.  The cmp= 
argument will be called once for each comparison:

py> class Cmp(object):
...     def __init__(self):
...         self.count = 0
...     def __call__(self, x, y):
...         self.count += 1
...         return cmp(x, y)
...
py> class Key(object):
...     def __init__(self):
...         self.count = 0
...     def __call__(self, x):
...         self.count += 1
...         return x
...
py> import random
py> lst = range(1000)
py> random.shuffle(lst)
py> lst2 = list(lst)
py> c = Cmp()
py> lst.sort(cmp=c)
py> c.count
8599
py> k = Key()
py> lst.sort(key=k)
py> k.count
1000

Since in all but a few corner cases (e.g. where the list is already 
sorted) there will be way more comparisons than items, the key= argument 
will minimize the number of function calls.  Since function calls are 
one of the more expensive operations in Python, my guess is that there 
are very few cases where you would want to use the cmp= argument if you 
know you can use the key= argument.

STeVe



More information about the Python-list mailing list