list sort question
Alex Martelli
aleaxit at yahoo.com
Sat Oct 7 06:42:39 EDT 2000
"Erik Max Francis" <max at alcyone.com> wrote in message
news:39DE9174.9A4F4F93 at alcyone.com...
> Bill Tolbert wrote:
>
> > I was asked today to sort a list of names and email addresses by last
> > name. I ended up with a list of tuples:
> >
> > [('Bill Tolbert', 'bill_tolbert at bigfoot.com'), ...]
> >
> > I couldn't figure out the list.sort(somefunction) syntax and resorted
> > to all sorts of slicing before I gave up and solved it in a query
> > (outside of Python).
>
> The function passed to sort takes two arguments, and must return -1, 0,
> or 1 if the first is less than, equal to, or greater than the second
> (respectively).
Yes, this is true. As you remark, it only works well for sorting SMALL
sequences.
> To get the last name of one of your tuples, you'd want something like
>
> import string
>
> tuple = ('Bill Tolbert', 'bill_tolbert at bigfoot.com')
> string.split(tuple[0])[-1] # take the first element,
> # split it into words,
> # take the last word
>
> So to sort the list you could do something like:
>
> list.sort(lambda x, y: cmp(string.split(x[0])[-1],
> string.split(y[0])[-1]))
The problem of course is that, this way, the splits are done O(N log N)
times -- and the same number of times is paid the interpretation
overhead of lambda &c.
> Now this would be pretty unwieldy on anything but tiny lists. If you
> are expecting to have even relatively sizeable lists, you'd probably
> want to choose another data structure (another poster suggested a
> dictionary whose keys are the last name and whose values are the
> tuples).
That's one possibility. The alternative design pattern for sorting is:
zip the sortfields to your payload -- sortfields in front, in order;
just sort with no special comparison function;
extract the payloads.
First and third step are O(N), not O(N log N), so for a large enough
list it's not crucial that those be fast -- the middle step, which is
already as fast as can be, will dominate anyway (IT is O(N log N)).
Guttman and Rosler have an excellent paper on that at:
http://www.hpl.hp.com/personal/Larry_Rosler/sort/sorting.html
it's unfortunately in Perl terms, but it also applies to Python.
Maybe somebody should translate it...
Alex
More information about the Python-list
mailing list