how can I sort a bunch of lists over multiple fields?

Steven Bethard steven.bethard at gmail.com
Thu Apr 28 16:22:28 EDT 2005


Lonnie Princehouse wrote:
> So far, we've been using the "key" parameter of list.sort.  If you want
> sort criteria more complicated than a single attribute, you can sort
> based on a custom comparison function.

Actually, the key= parameter can do anything the cmp= parameter can:

class Key(object):
     def __init__(self, item):
         self.item = item
     def __cmp__(self, other):
         # put your usual cmp code here
         return cmp(self.item, other.item)
lst.sort(key=Key)

Of course this is a pretty silly way to write a cmp function.  But the 
point is that you shouldn't think of the key= parameter as only useful 
for simple comparisons. See
     http://mail.python.org/pipermail/python-list/2005-April/277448.html
for a recent example of a pretty complex key function.
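
For anyone trying this on a Python 3 interpreter, where cmp(), cmp= and __cmp__ are no longer available, here is a minimal sketch of the same wrapper idea. It relies on the sort only needing "<" (that is, __lt__) on the key objects. The sample data and the length-then-alphabetical rule are made up purely for illustration.

```python
class Key(object):
    """Wrap an item so the sort compares wrappers instead of raw items."""
    def __init__(self, item):
        self.item = item

    def __lt__(self, other):
        # Put your usual comparison logic here.  This made-up rule orders
        # strings by length first, then alphabetically.
        if len(self.item) != len(other.item):
            return len(self.item) < len(other.item)
        return self.item < other.item

# Hypothetical sample data.
lst = ["pear", "fig", "banana", "kiwi", "apple"]
lst.sort(key=Key)
print(lst)
```

As with the __cmp__ version, this is still a roundabout way to write a comparison; it is only meant to show that a key can be an arbitrary object.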

I would guess that 80-90% of all uses of sort that need a custom 
comparison function can be met easily by using the key= parameter and 
will be more efficient than using the cmp= parameter.

The above "Key" example has the same inefficiency problems that the cmp= 
parameter normally does, because its __cmp__ method is Python code that 
runs on every comparison.  In most cases, though, you won't need to define 
a custom __cmp__ method, and you can rely on comparisons implemented in C, 
like those of strs and tuples (as I do below).
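
To make the efficiency argument a bit more concrete, here is a rough sketch that counts how often each kind of function gets called. It is written for Python 3, so it uses functools.cmp_to_key to stand in for the old cmp= parameter. The data and the counter names are invented for the example.

```python
from functools import cmp_to_key

# Made-up data: 100 distinct integers in a scrambled order.
data = [(i * 37) % 101 for i in range(100)]

calls = {"key": 0, "cmp": 0}

def counting_key(x):
    calls["key"] += 1
    return x

def counting_cmp(a, b):
    calls["cmp"] += 1
    return (a > b) - (a < b)   # same result the old cmp(a, b) would give

by_key = sorted(data, key=counting_key)
by_cmp = sorted(data, key=cmp_to_key(counting_cmp))

print(calls["key"])   # the key function runs once per item
print(calls["cmp"])   # the comparison function runs once per comparison
```

The key function is called exactly once per item, while the comparison function is called every time the sort compares two items, which should come out noticeably higher on scrambled input like this.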

> So another way to do a sort-by-author for your books would be:
> 
>   def compare_authors(book1, book2):
>       return cmp(book1.author, book2.author)
> 
>   books.sort(compare_authors)

This is definitely not a case where you want to use a comparison 
function.  It will be much more efficient to write:

def author_key(book):
     return book.author
books.sort(key=author_key)
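
If it helps, here is a small self-contained sketch of that. The Book class and the sample entries are hypothetical stand-ins for whatever your real objects look like. operator.attrgetter is a standard-library shortcut that builds the same kind of key function for you.

```python
from operator import attrgetter

class Book(object):
    # Hypothetical record type, just for this example.
    def __init__(self, author, title):
        self.author = author
        self.title = title

def author_key(book):
    return book.author

books = [
    Book("Pratchett", "Mort"),
    Book("Adams", "Mostly Harmless"),
    Book("Le Guin", "The Dispossessed"),
]

books.sort(key=author_key)
authors = [b.author for b in books]
print(authors)

# attrgetter("author") behaves like author_key above.
same_order = [b.author for b in sorted(books, key=attrgetter("author"))]
print(same_order)
```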

> A more complicated comparison function might nest two others:
> 
>   def compare_dates(book1, book2):
>       # Assuming that your dates are either numerical or are strings
>       # for which alphabetical sorting is identical to chronological...
>       return cmp(book1.date, book2.date)
> 
>   def compare_author_and_date(book1, book2):
>        different_authors = compare_authors(book1, book2)
>        if different_authors:  # different authors
>            return different_authors
>        else:  # same author.  sort by date.
>            return compare_dates(book1, book2)
>   
>   books.sort(compare_author_and_date)

Likewise, the above is basically just an inefficient way of writing:

def date_key(book):
     return book.date

def author_and_date_key(book):
     return (author_key(book), date_key(book))

books.sort(key=author_and_date_key)

Note that the thing I take advantage of here is that tuples are 
comparable, and compare as you'd expect them to (in lexicographic order).
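
A quick sketch of what that means in practice. The Book class and the (author, date) values below are invented sample data, with dates assumed to be plain integers. The point is only that the first elements are compared first, and the second elements are consulted only to break ties.

```python
# Tuples compare element by element, left to right.
author_decides = ("Adams", 1992) < ("Pratchett", 1987)
date_breaks_tie = ("Adams", 1979) < ("Adams", 1992)
print(author_decides, date_breaks_tie)

class Book(object):
    # Hypothetical record type, just for this example.
    def __init__(self, author, date):
        self.author = author
        self.date = date

def author_and_date_key(book):
    return (book.author, book.date)

books = [
    Book("Pratchett", 1987),
    Book("Adams", 1992),
    Book("Adams", 1979),
    Book("Pratchett", 1983),
]

books.sort(key=author_and_date_key)
ordered = [(b.author, b.date) for b in books]
print(ordered)
```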

STeVe


