sorting with expensive compares?

Tim Peters tim.peters at gmail.com
Sat Dec 24 00:37:01 EST 2005


[Steven D'Aprano]
> ...
> As others have pointed out, Python's sort never compares the same objects
> more than once.

Others have pointed it out, and it's getting repeated now, but it's
not true.  Since I wrote Python's sorting implementation, it's
conceivable that I'm not just making this up ;-)

Here's the shortest counter-example:  [10, 30, 20].

The first thing sort() does is compute the length of the longest
natural run (whether increasing or decreasing) starting at index 0. 
It compares 10 and 30, sees that the list starts with an increasing
run, then compares 30 and 20 to see whether this run can be extended
to length 3.  It can't.  Since the list has only 3 elements, it
decides to use a binary insertion sort to move the 3rd element into
place.  That requires 2 comparisons, the first of which _again_
compares 30 and 20.

This doesn't bother me, and it would cost more to avoid this than it
could possibly save in general.  It's true that the workhorse binary
insertion and merge sorts never compare the same elements more than
once, but there is some potential duplication between those and
comparisons done to identify natural runs.



More information about the Python-list mailing list