While we're talking about annoyances

Arnaud Delobelle arnodel at googlemail.com
Tue May 1 03:02:36 EDT 2007


On Apr 30, 3:51 pm, a... at mac.com (Alex Martelli) wrote:
> Michael Hoffman <cam.ac... at mh391.invalid> wrote:
>
>    ...
>
> > >> Well, counting the index() function that is called in both cases, the
> > >> original rank() had one sort, but my version has two sorts.
>
> > > That doesn't affect the big-O behavior -- O(N log N) holds whether you
> > > have one sort, or three, or twenty-seven.

Again we're not talking about big-O: we're talking about which one is
faster.

> > I've taught programming classes before, and I would have had to fail
> > anybody who misunderstood speed badly enough to claim that something
> > repeating an O(N log N) algorithm 27 times was no faster than doing it
> > once. ;-)
>
> As for me, I would fail somebody who thought they could compare speed
> this way -- if the O(N log N) executed 27 times took (on a given
> machine) 1 millisecond times N times log N, and the other one (on the
> same machine) 1 second times N times log N (and in the big-O notation
> you can NEVER infer what the multiplicative constant is), for example,
> such a comparison would be totally off base.

Of course you are right: this is basic asymptotic analysis.
You also know very well that this is not what Michael or I meant.  He
was simply saying that repeating the *same thing* 27 times takes more
time than doing it once, and my earlier post had made it clear that
his solution involved calling the same index() function twice.
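For concreteness, here is a minimal sketch of the two shapes of rank()
being compared.  The actual implementations are not quoted in this
thread, so the code below is an assumed reconstruction, not the OP's
code:

```python
# Assumed reconstruction -- the actual rank() implementations are not
# quoted in this thread.  Both return, for each element, its position
# in the sorted order.

def rank_two_sorts(xs):
    # The version with a redundant second sort: index() is called
    # against a freshly sorted copy, so the data gets sorted twice.
    # (list.index() is itself a linear scan on top of that.)
    sorted_once = sorted(xs)     # first O(n log n) sort
    sorted_twice = sorted(xs)    # second, redundant O(n log n) sort
    return [sorted_twice.index(x) for x in xs]

def rank_one_sort(xs):
    # The version with one sort: a plain O(n) loop builds a
    # value -> rank mapping, replacing the second sort entirely.
    order = sorted(xs)           # the only O(n log n) sort
    position = {}
    for i, x in enumerate(order):
        position.setdefault(x, i)   # keep the lowest rank on ties
    return [position[x] for x in xs]

print(rank_two_sorts([30, 10, 20]))   # [2, 0, 1]
print(rank_one_sort([30, 10, 20]))    # [2, 0, 1]
```

Both variants agree on the result; the point of contention is only how
many times the input gets sorted along the way.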

> > As Arnaud points out, asymptotic behavior is not the same as speed. His
> > original statement that the more recently proposed definitions of rank()
> > are slower than the OP's may be correct. And if it's not, it's not
> > because they're all O(N log N).
>
> And if it is, it's not because of the "one sort" versus "two sorts": by
> that sole criterion you just cannot guess (sensibly) at speed (measuring
> is much better, though it has its own pitfalls).

I'm afraid you can draw some conclusions in this case, precisely
*because* one version has one fewer sort than the other (which is what
I was hinting at in my first post).

Let's say the time complexity of the first sort is:
        f(n) ~ K·n·log(n)
The time complexity of the second sort is:
        g(n) ~ L·n·log(n)
The time complexity of the simple loop that can replace the second
sort is:
        h(n) ~ H·n

Now because the whole algorithm consists of doing one sort THEN the
second step (sort or loop), the time complexities of the two versions
are as follows:

Two-sort version:
        f(n)+g(n) ~ K·n·log(n) + L·n·log(n)
            ~ (K+L)·n·log(n)
One-sort version:
        f(n)+h(n) ~ K·n·log(n) + H·n
            ~ K·n·log(n)·(1 + H/(K·log(n)))
            ~ K·n·log(n)

So the two-sort version is asymptotically (K+L)/K times slower than
the one-sort version.
(In practice I reckon K = L, which makes it twice as slow.)
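A quick numeric check of that ratio, under the illustrative assumption
K = L = H = 1 (the constants are made up; only the trend matters):

```python
import math

# Illustrative constants only; K = L matches the guess above.
K, L, H = 1.0, 1.0, 1.0

def two_sorts(n):
    # f(n) + g(n) ~ (K+L)*n*log(n)
    return K * n * math.log(n) + L * n * math.log(n)

def one_sort(n):
    # f(n) + h(n) ~ K*n*log(n) + H*n
    return K * n * math.log(n) + H * n

for n in (10**3, 10**6, 10**9):
    print(n, two_sorts(n) / one_sort(n))
# The ratio climbs toward (K+L)/K = 2 as n grows.
```

With these constants the ratio is about 1.75 at n = 1000 and about
1.91 at n = 10**9, heading for the asymptotic factor of 2.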

--
Arnaud
"Errare humanum est, sed perseverare diabolicum"
