sort functions in python

Steven D'Aprano steve at REMOVE-THIS-cybersource.com.au
Sun Feb 10 00:06:29 EST 2008


On Sat, 09 Feb 2008 14:28:15 -0800, Jeff Schwab wrote:

> Steven D'Aprano wrote:
>> On Sat, 09 Feb 2008 13:37:23 -0800, Jeff Schwab wrote:
>> 
>>> Carl Banks wrote:
>>>> On Feb 8, 10:09 pm, Jeff Schwab <j... at schwabcenter.com> wrote:
>>>>> If you expect your data to be pretty nearly sorted already, but you
>>>>> just want to make sure (e.g. because a small number of elements may
>>>>> have been inserted or removed since the last sort), bubble-sort is a
>>>>> good choice.
>>>> But if you're at that stage you probably were doing something wrong
>>>> in the first place.
>>> How do you figure?  You don't always have control over the
>>> insertions/replacements, or where they happen.  As a silly example,
>>> assume you have a sorted list of children, by height.  Maybe you check
>>> your list once per school semester.  The new sort order for any given
>>> semester will likely be very close to the previous order; however, a
>>> few
>>>   swaps may be in order, according to the different speeds at which
>>> children have grown.
>> 
>> You check their heights once per semester, but you're worried about an
>> extra ten or twenty microseconds to sort the data?
> 
> Are you serious?
> 
> The fact that you wouldn't use a Python script for this is what makes it
> a "silly" example in the first place.


Okay, now you've broken my brain. Why on earth do you think Python isn't 
suitable for processing a list of a few hundred, or even thousand, items?


> The debate is whether Bubble Sort
> is so bad that someone should get slapped upside the head for even
> mentioning it.

Perhaps we have a different opinion about what "slapped upside the head" 
means. But if you care to point out precisely what it was that I -- or 
anyone else -- said that could be classified as abusive to the original 
poster for mentioning bubble sort, I'll be glad to apologize to him.

If not, perhaps you should chill out and stop accusing folks of bad 
behaviour that didn't happen.


> I think it's a tool, like any other, and that
> occasionally it might be the right tool for the job.

Sure its a tool. So is a stone axe, and if I were worried about the 
collapse of industrial civilization and the loss of metal-working skills, 
then I might advocate that engineering students learn how to make stone 
axes.

In the case of bubble sort, I think it's a perfectly fine algorithm to 
play around with *as a learning exercise*.


> Imagine that
> instead of children, we're keeping a sorted list of a very large number
> of crystals that may grow at varying speeds, and that the sizes of the
> individual crystals are polled at a rate limited by the time taken by
> the sort algorithm.

Bubble sort doesn't degrade gracefully. On "a very large" set of data, if 
your almost-sorted data is actually less sorted than you imagined, you 
could see your expected 10 seconds of processing time blow out to 35 
minutes.

I'd be more sympathetic to the idea of using bubble sort on tiny data 
sets, where if it goes bad, nobody is going to care that 0.01 second 
blows out to 3s. Who'd even notice?

But then, for such small amounts of data, why would you bother with 
bubble sort when insertion sort and Shell sort exist?


>> Frankly, if we're talking about sorting at the level of Python
>> application programming, nothing you write is going to beat the speed
>> of the built-in list.sort() and sorted() built-ins.
> 
> That's probably true with five-nines probability, but there may be
> exceptions.  Do you really not understand what I'm getting at, or do you
> just disagree?

It's not that bubble sort has no positives, but that they're so minor, 
and the negatives so serious, that I can't think of any situation where 
I'd recommend using bubble sort in production code -- or take somebody 
seriously if they suggested it, without a *whole* lot of actual timing 
results backing them up.


-- 
Steven



More information about the Python-list mailing list