Comparing lists

Christian Stapfer nil at dev.nul
Mon Oct 17 12:35:58 EDT 2005


"Alex Martelli" <aleaxit at yahoo.com> wrote in message 
news:1h4knvy.41r7fw1k4bge7N%aleaxit at yahoo.com...
> Christian Stapfer <nil at dev.nul> wrote:
>
>>  This is why we would like to have a way of (roughly)
>> estimating the reasonableness of the outlines of a
>> program's design in "armchair fashion" - i.e. without
>> having to write any code and/or test harness.
>
> And we would also like to consume vast amounts of chocolate, while
> similarly reclining in comfortable armchairs,

Maybe some of my inclination towards design
based on suitable *theories* (instead of
self-conditioning through testing) goes back
to the fact that I tend to think about the
design of my programs when no computer happens
to be near at hand to do some such experimenting,
or self-conditioning...

> without getting all fat and flabby.

Well, thinking can be hard work. There is no need
to suggest an image of laziness. Thought experiments
are also quite often successful. Hardware engineers
can very often design entire gadgets without doing
a great deal of testing. They usually need to resort
to testing only if they know (or feel?) that they do
not have a sufficiently clear *theoretical* grasp of
the behavior of some part of their design.

>  Unfortunately, what we would like and what reality affords
> are often pretty uncorrelated.  No matter how much theoreticians may
> love big-O because it's (relatively) easy to compute, it still has two
> failings which are often sufficient to rule out its sufficiency for any
> "estimate [of] the reasonableness" of anything: [a] as we operate on
> finite machines with finite wordsize, we may never be able reach
> anywhere even remotely close to the "asymptotic" region where big-O has
> some relationship to reality; [b] in many important cases, the
> theoretical worst-case is almost impossible to characterize and hardly
> ever reached in real life, so big-O is of no earthly use (and much
> harder to compute measures such as big-Theta should be used for just
> about any practical purpose).

But the fact remains that programmers who are
somewhat experienced with the interface a module
offers have a *rough*idea* of what computational
complexity attaches to which operations of that
interface. And having such a *rough*idea* helps
them design reasonably performing programs much
more quickly.
  Big-Oh and other asymptotic complexity measures
really do have *this* advantage over a *rough*idea*
of computational complexity acquired merely by way
of conditioning experiences: they capture at least
some of that "rough idea" in a more easily
communicable and much more precise fashion.
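
To make that concrete, here is a little sketch (the
function names are mine, purely for illustration, and
I assume the usual CPython list and set implementations):
knowing merely that membership testing in a list is O(n)
while in a set it is roughly O(1) already settles a
design decision from the armchair, without any test
harness.

def common_elements_slow(xs, ys):
    # Each 'x in ys' scans the whole list ys:
    # roughly O(len(xs) * len(ys)) overall.
    return [x for x in xs if x in ys]

def common_elements_fast(xs, ys):
    # Build a set once, then each lookup is roughly O(1):
    # roughly O(len(xs) + len(ys)) overall.
    ys_set = set(ys)
    return [x for x in xs if x in ys_set]

No timing run is needed to predict which of the two
will scale better to long lists.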

  Maybe you and Steven prefer to be conditioned,
Pavlov style, by the wonderful experiences that
you get while testing? - This is perhaps really
one of my *worst* psychological handicaps, I must
admit: that I don't *like* to get conditioned
like that, no matter how good it feels, no matter
how effective it might be for "practical" work that
one has to do.
  I want to be able to really think *about* what
I am doing. And in order to be able to think about
it, one usually needs some information about the
implementation, performance-wise, of the language
features and the system modules that one might
want to use. If you happen to know of any *better*
way of offering the requisite information than
asymptotic complexity measures, then I would, of
course, be very grateful to hear more about it.

> Consider, for example, point [b].  Quicksort's big-O is N squared,
> suggesting that quicksort's no better than bubblesort or the like.  But
> such a characterization is absurd.  A very naive Quicksort, picking its
> pivot very systematically (e.g., always the first item), may hit its
> worst case just as systematically and in cases of practical importance
> (e.g., already-sorted data); but it takes just a little extra care (in
> the pivot picking and a few side issues) to make the worst-case
> occurrences into ones that will not occur in practice except when the
> input data has been deliberately designed to damage by a clever and
> determined adversary.
>
> Designing based on worst-case occurrences hardly ever makes
> sense in any field of engineering,

What's wrong with wanting to have a rough idea
of what might happen in the worst case? I believe
many engineers are actually expected to think
about at least some "worst-case" scenarios.
Think of nuclear reactors, airplanes, or
telephone exchanges (and don't think of Google
for a change). Don't you expect engineers
and scientists designing, for example, a nuclear
reactor, to think hard about what the worst-case
scenario might be? And how likely it might be to happen?
(And *no* testing whatsoever in that direction,
please!) Not thinking is, admittedly, a lot easier.

<snip/>

> Why bother doing
> (e.g.) random pivot selection in quicksort, when its big-O (i.e.,
> worst-case) behavior will remain N-squared, just like naive quicksort,
> or, for that matter, bubblesort?

Because worst-case is not the only measure of
computational complexity that one might be
interested in. For some applications one may
be able to accept relatively bad worst-case
behavior, if it doesn't happen too often.
This is why, for these people, we might provide
information about average-case behavior (and
there the difference between quicksort and
bubblesort shows up clearly).
  For others, such worst-case behavior may not
be acceptable. For those other applications,
a good worst-case may be what is required.
This is why this second category of programmers
needs to know about the worst-case. - But I am
certainly belaboring the obvious...
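
Just to sketch what I mean by this (toy functions of my
own, purely for illustration, not anyone's library code):
the naive first-element pivot hits its quadratic worst
case precisely on already-sorted input, while a random
pivot leaves that worst case theoretically possible but
practically irrelevant.

import random

def quicksort_naive(xs):
    # Always uses the first element as pivot: O(n log n)
    # on average, but O(n^2) on already-sorted input.
    if len(xs) <= 1:
        return xs
    pivot, rest = xs[0], xs[1:]
    return (quicksort_naive([x for x in rest if x < pivot])
            + [pivot]
            + quicksort_naive([x for x in rest if x >= pivot]))

def quicksort_random(xs):
    # Random pivot: the theoretical worst case is still
    # O(n^2), but no fixed input (such as sorted data)
    # triggers it reliably.
    if len(xs) <= 1:
        return xs
    pivot = xs[random.randrange(len(xs))]
    return (quicksort_random([x for x in xs if x < pivot])
            + [x for x in xs if x == pivot]
            + quicksort_random([x for x in xs if x > pivot]))

On already-sorted data the naive version also recurses to
a depth of about len(xs), so it will even hit Python's
recursion limit for quite modest inputs - another
worst-case fact one can know before running any test.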

Of course, the programs that have been designed
on the basis of such information can (and ought
to) be tested. Then some surprises (*not* predicted
by theory) might happen: but if they do not happen
too often, theory has done a good job - and so has
the programmer...
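
For instance (a minimal sketch; the sizes and the use of
the timeit module are my arbitrary choices), one can check
afterwards whether the built-in sort really behaves as the
roughly O(n log n) theory predicts:

import random, timeit

def time_sort(n):
    data = [random.random() for _ in range(n)]
    # Time 10 runs of sorting a fresh copy of the data.
    return timeit.timeit(lambda: sorted(data), number=10)

t1 = time_sort(100000)
t2 = time_sort(200000)
# For an O(n log n) algorithm, doubling n should give a
# ratio a bit above 2 (about 2 * log(2n)/log(n)), not 4.
print("ratio for doubled input size:", t2 / t1)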

Regards,
Christian




