sort functions in python

Steven D'Aprano steve at REMOVE-THIS-cybersource.com.au
Sat Feb 9 17:10:05 EST 2008


On Sat, 09 Feb 2008 13:37:23 -0800, Jeff Schwab wrote:

> Carl Banks wrote:
>> On Feb 8, 10:09 pm, Jeff Schwab <j... at schwabcenter.com> wrote:
>>> If you expect your data to be pretty nearly sorted already, but you
>>> just want to make sure (e.g. because a small number of elements may
>>> have been inserted or removed since the last sort), bubble-sort is a
>>> good choice.
>> 
>> But if you're at that stage you probably were doing something wrong in
>> the first place.
> 
> How do you figure?  You don't always have control over the
> insertions/replacements, or where they happen.  As a silly example,
> assume you have a sorted list of children, by height.  Maybe you check
> your list once per school semester.  The new sort order for any given
> semester will likely be very close to the previous order; however, a few
>   swaps may be in order, according to the different speeds at which
> children have grown.

You check their heights once per semester, but you're worried about an 
extra ten or twenty microseconds to sort the data?

Frankly, if we're talking about sorting at the level of Python 
application programming, nothing you write is going to beat the speed of 
the built-in list.sort() and sorted() built-ins. Here's that bubble sort 
from yesterday again:

def bubble(A):
    # http://planetmath.org/encyclopedia/Bubblesort.html
    N = len(A)-1
    done = False
    while not done:
        done = True
        for i in xrange(N):
            if A[i+1] <= A[i]:
                A[i], A[i+1] = A[i+1], A[i]
                done = False
    return A


and some speed tests:

>>> L = [1, 2, 3, 4, 6, 5, 7, 9, 8]
>>> min(timeit.Timer('bubble(L)', 
... 'from __main__ import bubble, L').repeat(number=1000))
0.012052059173583984
>>> min(timeit.Timer('sorted(L)', 
... 'from __main__ import L').repeat(number=1000))
0.0055558681488037109
>>> min(timeit.Timer('bubble(L)', 
... 'from __main__ import bubble; L=range(5)').repeat(number=1000))
0.008541107177734375
>>> min(timeit.Timer('sorted(L)', 
... 'L=range(5)').repeat(number=1000))
0.0046191215515136719


If you're writing code in a low-level language, or a high-level language 
with no built-in sort (or a crappy one), your mileage may vary.



-- 
Steven



More information about the Python-list mailing list