A Sort Optimization Technique: decorate-sort-dedecorate

Tim Peters tim.peters at gmail.com
Tue Aug 29 18:17:14 EDT 2006


[Joachim Durchholz]
>>>>> Wikipedia says it's going from 2NlogN to N. If a sort is massively
>>>>> dominated by the comparison, that could give a speedup of up to 100%
>>>>> (approximately - dropping the logN factor is almost irrelevant, what
>>>>> counts is losing that factor of 2).

[Gabriel Genellina]
>>>> In fact it's the other way - losing a factor of 2 is irrelevant,
>>>> O(2N)=O(N). The logN factor is crucial here.

[Joachim Durchholz]
>>> That's just a question of what you're interested in.
>>>
>>> If it's asymptotic behavior, then the O(logN) factor is a difference.
>>>
>>> If it's practical speed, a constant factor of 2 is far more relevant
>>> than any O(logN) factor.

[Tim Peters]
>> Nope.  Even if you're thinking of base 10 logarithms, log(N, 10) > 2
>> for every N > 100.  Base 2 logarithms are actually most appropriate
>> here, and log(N, 2) > 2 for every N > 4.  So even if the "2" made
>> sense here (it doesn't -- see next paragraph), the log(N) term
>> dominates it for even relatively tiny values of N.

[Joachim Durchholz]
> Whether this argument is relevant depends on the constant factors
> associated with each term.

I'm afraid you're still missing the point in this example:  it's not
just that Python's (& Perl's) current sorts do O(N*log(N)) worst-case
comparisons, it's that they /do/ N*log(N, 2) worst-case comparisons.
O() notation isn't being used, and there is no "constant factor" here:
 the count of worst-case comparisons made is close to exactly N*log(N,
2), not to some mystery-constant times N*log(N, 2).  For example,
sorting a randomly permuted array with a million distinct elements
will require nearly 1000000*log(1000000, 2) ~= 1000000 * 20 = 20
million comparisons, and DSU will save about 19 million key
computations in this case.  O() arguments are irrelevant to this, and
the Wikipedia page you started from wasn't making an O() argument
either:

    http://en.wikipedia.org/wiki/Schwartzian_transform

    For an efficient ordinary sort function, the number of invocations of the
    transform function goes from an average of 2nlogn to n;

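(As an aside, those counts are easy to check empirically.  Here's a
rough Python 3 sketch of my own -- not from the thread -- that wraps
the elements so __lt__ tallies every comparison list.sort() makes on
about a million distinct random elements:

    import random

    class Counted:
        # Wrapper whose __lt__ tallies every comparison the sort makes.
        count = 0

        def __init__(self, value):
            self.value = value

        def __lt__(self, other):
            Counted.count += 1
            return self.value < other.value

    data = [Counted(x) for x in random.sample(range(10 ** 7), 2 ** 20)]
    data.sort()
    print(Counted.count)  # close to 2**20 * 20 ~= 21 million

The printed total lands close to, and a bit under, N*log(N, 2).)
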
No O() in sight in that Wikipedia sentence, and no O() was intended
there either.  You do exactly
N key computations when using DSU, while the hypothetical "efficient
ordinary sort function" the author had in mind does about 2*N*log(N,
2) key computations when not using DSU.  That's overly pessimistic for
Python's & Perl's current sort functions, where no more than N*log(N,
2) key computations are done when not using DSU.  The /factor/ of key
computations saved is thus as large as N*log(N, 2) / N = log(N, 2).
O() behavior has nothing to do with this result, and the factor of
log(N, 2) is as real as it gets.  If key extraction is at all
expensive, and N isn't trivially small, saving a factor of log(N, 2)
key extractions is /enormously/ helpful.
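
For concreteness, here's a minimal sketch of both approaches in
Python (again my own illustration, not from the thread; it assumes
Python 3, where functools.cmp_to_key stands in for the old cmp-style
sort, and expensive_key is a made-up stand-in for a costly key
extraction):

    import random
    from functools import cmp_to_key

    calls = 0

    def expensive_key(item):
        # Hypothetical costly key extraction; here we just count calls.
        global calls
        calls += 1
        return item

    data = [random.random() for _ in range(2 ** 16)]  # N = 65536

    # No DSU: a cmp-style sort recomputes keys inside every comparison,
    # two per comparison (one for each operand).
    def compare(a, b):
        ka, kb = expensive_key(a), expensive_key(b)
        return (ka > kb) - (ka < kb)

    calls = 0
    sorted(data, key=cmp_to_key(compare))
    print(calls)  # about 2 * N * log(N, 2) ~= 2 million

    # DSU: decorate once, sort, undecorate.  The index i breaks ties,
    # so the undecorated items are never compared themselves.
    calls = 0
    decorated = [(expensive_key(item), i, item) for i, item in enumerate(data)]
    decorated.sort()
    result = [item for (k, i, item) in decorated]
    print(calls)  # exactly N = 65536

Note that sorted(data, key=expensive_key) performs the same
decoration internally, computing each key exactly once, which is why
the key= argument gives you DSU for free.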

If this is sinking in now, reread the rest of my last reply that got
snipped here.

> Roughly speaking, if the constant factor on the O(N) term is 100 and the
> constant factor on the O(logN) term is 1, then it's still irrelevant.

As above, it's talking about O() that's actually irrelevant in this
specific case.

> My point actually is this: big-Oh measures are fine for comparing
> algorithms in general, but when it comes to optimizing concrete
> implementations, its value greatly diminishes: you still have to
> investigate the constant factors and all the details that the big-Oh
> notation abstracts away.

That's true, although the argument here wasn't actually abstracting
away anything.  You've been adding abstraction to an argument that
didn't have any ;-)

> From that point of view, it's irrelevant whether some part of the
> algorithm contributes an O(1) or an O(logN) factor: the decision where
> to optimize is almost entirely dominated by the constant factors.

While true in some cases, it's irrelevant to this specific case.
More, in practice a factor of O(log(N)) is almost always more
important "for real" than a factor of O(1) anyway -- theoretical
algorithms hiding gigantic constant factors in O() notation are very
rarely used in real life.  For example, the best-known algorithm for
finding the number of primes <= x has O(sqrt(x)) time complexity, but
AFAIK has /never/ been implemented because the constant factor is
believed to be gigantic indeed.  Theoretical CompSci is full of
results like that, but they have little bearing on practical
programming.


