outline-style sorting algorithm

Thu Apr 29 10:51:21 EDT 2004

* Delaney, Timothy C (Timothy) (2004-04-29 02:52 +0100)
> Thorsten Kampe wrote:
>>>>>> foo
>>> ['1.0', '1.0.1', '1.1.1', '1.2', '1.9', '1.10', '1.11', '1.20',
>>> '1.20.1', '1.30']
>>>>>> foo.sort()
>>>>>> foo
>>> ['1.0', '1.0.1', '1.1.1', '1.10', '1.11', '1.2', '1.20', '1.20.1',
>>> '1.30', '1.9'] 
>>> 
>>> Obviously 1.20.1 should be after 1.9 if we look at this as
>>> dot-delimited integers, not as decimal numbers.
>> 
>> You need some general approach to avoid the DSU thing:
>> 
>> def funcsort(seq, func):
>>     """ sort seq by func(item) """
>>     seq = seq[:]
>>     seq.sort(lambda x, y: cmp(func(x), func(y)))
>>     return seq
>> 
>> funcsort(foo, lambda x: map(int, x.split('.')))
> 
> I've seen you give this advice several times, today, and IMO it's
> completely wrong.
> 
> DSU is *exactly* what you want to do. If you want to wrap it inside a
> general function, that's fine, but DSU is in almost all cases preferred
> to passing a comparison function - it's much faster.

If made some tests with lists of one million items:

The first test was with a randomized sequence of numbers from 1 -
1000000. Duplicate numbers were allowed. The comparing function was
the identity (lambda x: x).

You DSU approach took about 43 s and mine took about 86 s.
Than I tested range(1000000) in reverse order:
Your sorted program took about 24 s and mine about 5!

So you cannot simply state that DSU is faster than passing a function
if the structure of the input you want to sort is unknown...

Thorsten