sorting on keys in a list of dicts

Craig Ringer craig at postnewspapers.com.au
Fri Jan 7 12:05:40 EST 2005


On Sat, 2005-01-08 at 00:26, It's me wrote:
> What does it mean by "stability in sorting"?

If I understand correctly, it means that when two sorts are performed in
sequence, the keys that are equal to the second sort end up ordered the
way they were left by the first sort.

I'm far from certain of this, but at least I'm presenting an opportunity
for someone to yell "no, you're wrong!" and in the process definitively
answer the question.

For example, given the list:

.>>> l = [(1,2), (8,2), (2,2), (3,2), (4,3), (5,3), (8,9)]

if we sort by the first element of each tuple then the second (the
default), we get:

.>>> l.sort()
.>>> l
[(1, 2), (2, 2), (3, 2), (4, 3), (5, 3), (8, 2), (8, 9)]

Now, if we sort based on the second element we get:

.>>> def seconditem(x):
....     return x[1]
....
.>>> l.sort(key=seconditem)
.>>> l
[(1, 2), (2, 2), (3, 2), (8, 2), (4, 3), (5, 3), (8, 9)]

You'll note that there are several correct answers to the request "sort
the list 'l' by the second element of each item", including:

[(1, 2), (2, 2), (3, 2), (8, 2), (4, 3), (5, 3), (8, 9)]
[(2, 2), (1, 2), (8, 2), (3, 2), (4, 3), (5, 3), (8, 9)]
[(1, 2), (2, 2), (3, 2), (8, 2), (5, 3), (4, 3), (8, 9)]

and many others. Because we didn't specify that the first item in the
value tuples should be used in the sort key, so long as the second key
is equal for a group of items it doesn't matter what order items in that
group appear in.

Python (at least 2.4), however, returns those groups where the order
isn't defined in the same order they were before the sort. Look at this,
for example:

.>>> l.sort()
.>>> l.reverse()
.>>> l
[(8, 9), (8, 2), (5, 3), (4, 3), (3, 2), (2, 2), (1, 2)]
.>>> l.sort(key=seconditem)
.>>> l
[(8, 2), (3, 2), (2, 2), (1, 2), (5, 3), (4, 3), (8, 9)]

See how the exact same sort command was used this time around, but
because the list was reverse-sorted first, the elements are in reverse
order by first item when the second item is equal?

In the first case we used the same result as the stable sort could be
obtained with:

.>>> def revitem(x):
....     return (x[1], x[0])
>>> l.sort(key=revitem)
>>> l
[(1, 2), (2, 2), (3, 2), (8, 2), (4, 3), (5, 3), (8, 9)]

(in other words, saying "use the value tuple as the sort key, but sort
by the second element before the first")

That doesn't extend to more complex cases very well though. Imagine you
had 3-tuples not 2-tuples, and wanted to maintain the previous sort
order of equal groupings when re-sorting by a different key... but you
didn't know what key was last used for sorting. A stable sort algorithm
means you don't need to care, because the order will be maintained for
you not randomized.

Well, that's several hundred more words than were probably required, but
I hope I made sense.

--
Craig Ringer




More information about the Python-list mailing list