Sorts

Tim Peters tim_one at email.msn.com
Mon Jun 12 13:29:51 EDT 2000


[posted & mailed]

[Robin Porter]
> Does anyone know what type of sort is used for the internal sort
> function in Python (quick, merge, insertion, etc.)?

I should know <wink>, but remain grateful to Peter Schneider-Kamp for
explaining it.  I'll just expand on the "why":

The current binary_insertion/samplesort hybrid was introduced in Python
1.5.2.  In earlier releases a variety of plain quicksorts were used,
including-- in the very early days --libc's qsort.  The latter had to be
dropped because it wasn't reentrant on some platforms (== bad results and/or
even segfaults), and ran horribly slowly on some platforms in some common
degenerate cases.

The point to the current hybrid is that it minimizes comparisons while
(believe it or not, and it's hard to believe if you study the code!) doing
much less data movement than a mergesort.  Comparisons are especially
expensive in Python, because lists are heterogenous and each time a pair of
elements gets compared the correct type-varying function for doing the
comparison needs to be figured out and then indirectly invoked.  So
minimizing compares is crucial for a Python sort.  I wasn't able to write a
mergesort (which also does few compares) as fast as the hybrid, because
mergesort shuffles data around a lot more -- there are few enough compares
now that data movement expense is no longer lost in the shuffle <ahem>.  The
current hybrid is also an in-place sort.  Its only drawbacks are the
complexity of the algorithm and that the resulting sort isn't stable.

a-technical-term-meaning-"not-unstable"<wink>-ly y'rs  - tim






More information about the Python-list mailing list