sort functions in python

Steven D'Aprano steve at REMOVE-THIS-cybersource.com.au
Sat Feb 9 03:02:18 EST 2008


On Fri, 08 Feb 2008 19:09:06 -0800, Jeff Schwab wrote:

> Steven D'Aprano wrote:
>> On Fri, 08 Feb 2008 17:00:27 -0800, t3chn0n3rd wrote:
>> 
>>> Do you think it is relatively easy to write sort algorithms such as
>>> the common Bubble sort in Python as compared to other high level
>>> programming langauges
>> 
>> 
>> You realise that bubble sort is one of the worst possible sort
>> algorithms possible, particularly on modern hardware? It's not the
>> worst: bogo sort and bozo sort are worse, but bubble sort is still
>> pretty atrocious.
> 
> That's the conventional wisdom, mostly because CS professors need a "bad
> algorithm" to show undergrads, but it's not really accurate.  The truth
> is that bubble-sort's performance depends strongly on the input data. In
> the worst case, yes, bubble-sort is O(n^2); however, for almost-sorted
> data (only a few elements out of place), bubble-sort is actually one of
> the fastest known algorithms.

It depends on what you mean by "bubble sort". There are many different 
variations of bubble sort, that are sometimes described by names such as 
comb sort, cocktail sort, exchange sort, and sometimes merely referred to 
bubble sort. It's rather like any divide-and-conquer sorting algorithm 
being called quicksort.

I'm as guilty of that as anyone else -- the example code I posted, raided 
from Wikibooks is very different from this bubble sort algorithm from 
PlanetMath:

http://planetmath.org/encyclopedia/Bubblesort.html

def bubblesort(L):
    done = False
    while not done:
        done = True
        for i in xrange(len(L)-1):
            if L[i+1] <= L[i]:
                L[i], L[i+1] = L[i+1], L[i]
                done = False
    return L

This one runs significantly faster than the earlier one I posted.

Another vital factor is, what language are you implementing the sort 
routine in? The conventional wisdom for low-level languages like assembly 
and C doesn't necessarily hold for high-level languages like Python. 
Comparisons are slow in Python and fast in C; in C a good algorithm will 
minimize the amount of moves, while in Python a good algorithm will 
minimize the number of comparisons.

Finally, what hardware are you running on? There are interactions between 
hardware caches and pipes that can ruin the performance of an otherwise 
promising algorithm.


But all those factors aside, I fear that your attempt to rehabilitate the 
reputation of bubble sort is misplaced. Here's another person who wants 
to defend bubble sort:

"A fair number of algorithm purists (which means they've probably never 
written software for a living) claim that the bubble sort should never be 
used for any reason. Realistically, there isn't a noticeable performance 
difference between the various sorts for 100 items or less, and the 
simplicity of the bubble sort makes it attractive."

http://linux.wku.edu/~lamonml/algor/sort/bubble.html

But then on the same person's page on insertion sort:

"The insertion sort is a good middle-of-the-road choice for sorting lists 
of a few thousand items or less. The algorithm is significantly simpler 
than the shell sort, with only a small trade-off in efficiency. At the 
same time, the insertion sort is over twice as fast as the bubble sort 
and almost 40% faster than the selection sort."

http://linux.wku.edu/~lamonml/algor/sort/insertion.html

He gives sample C code for both, and the bubble sort has eight lines of 
code, versus nine for insertion sort (excluding bare braces).

Perhaps what he should have said is:

"Bubble sort is a tiny bit simpler than insertion sort, and almost twice 
as slow!"



-- 
Steven




More information about the Python-list mailing list