Comparing lists

Alex Martelli aleaxit at yahoo.com
Mon Oct 17 16:20:49 EDT 2005


Christian Stapfer <nil at dev.nul> wrote:

> "Alex Martelli" <aleaxit at yahoo.com> wrote in message 
> news:1h4knvy.41r7fw1k4bge7N%aleaxit at yahoo.com...
> > Christian Stapfer <nil at dev.nul> wrote:
> >
> >>  This is why we would like to have a way of (roughly)
> >> estimating the reasonableness of the outlines of a
> >> program's design in "armchair fashion" - i.e. without
> >> having to write any code and/or test harness.
> >
> > And we would also like to consume vast amounts of chocolate, while
> > similarly reclining in comfortable armchairs,
> 
> Maybe some of my inclination towards design
> based on suitable *theories* (instead of
> self-conditioning through testing) goes back
> to the fact that I tend to think about the
> design of my programs when no computer happens
> to be near at hand to do some such experimenting,
> or self-conditioning...

Oh, I am as prone as anybody I know to do SW architecture and design in
bed when the lights are off and I'm sliding into sleep -- just about the
only case in which no computer is handy, or, rather, in which it's
generally unwise to turn the computer on (since it would interfere with
the sleep thing;-).  Back before laptops were really affordable and
usable, I used to have a long bus commute, and did a lot of design with
pen and paper; and whiteboards are a popular group-design tool at
Google, no matter how many laptops or desktops happen to be around --
whiteboards are simply more suitable for "socialization" around a draft
design's sketch than any computer-based tool I've ever seen.

But that's *design*, and most often in pretty early stages, too -- quite
a ways from *coding*.  At that stage, one doesn't even generally commit
to one specific programming language or another for the eventual
implementation of the components one's considering!  Rough ideas of
*EXPECTED* run-times (big-Theta) for various subcomponents one is
sketching are *MUCH* more interesting and important than "asymptotic 
worst-case for amounts of input tending to infinity" (big-O) -- for
example, where I sketch-in (mentally, on paper, or on whiteboard) a
"hash table" subcomponent, I consider the *expected* (Theta) performance
(constant-time lookups), definitely NOT the big-O "linear time" lookups
which just MIGHT occur (if, say, all inputs just happened to hash to the
same value)... otherwise, I'd never use hash tables, right?-)
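
To make that concrete, here's a rough, untested sketch (EvilKey is just
a name I'm making up for the illustration) of how an adversary turns a
hash table's expected constant-time lookups into its big-O linear ones:

    import timeit

    class EvilKey(object):
        # every instance hashes to the same bucket: the adversarial
        # worst case for any hash table
        def __init__(self, n):
            self.n = n
        def __hash__(self):
            return 42
        def __eq__(self, other):
            return self.n == other.n

    good = dict((i, None) for i in range(1000))
    evil = dict((EvilKey(i), None) for i in range(1000))

    # expected case: amortized constant-time lookups
    print(timeit.timeit(lambda: 999 in good, number=10000))
    # engineered worst case: every lookup walks the collision chain
    print(timeit.timeit(lambda: EvilKey(999) in evil, number=10000))

Both dicts have exactly the same big-O characterization; only the Theta
differs -- and the Theta is the number that matters for design.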


> > without getting all fat and flabby.
> 
> Well, thinking can be hard work. There is no need
> to suggest an image of laziness. Thought experiments
> are also quite often successful. Hardware engineers
> can very often design entire gadgets without doing
> a great deal of testing. They usually need to resort
> to testing only if they know (or feel?) that they lack
> a sufficiently clear *theoretical* grasp of the
> behavior of some part of their design.

Having been a hardware designer (of integrated circuits, for Texas
Instruments, and later briefly for IBM), before switching to software, I
can resolutely deny this assertion: only an utter madman would approve a
large production run of an IC that has not been EXTENSIVELY tested, in
simulations and quite possibly in breadboards and later in limited
pre-production runs.  And for any larger "gadget" USING ICs it would be
similarly crazy to skimp on prototyping, simulation, and other testing
-- because, as every HW engineer KNOWS (SW ones often have to learn the
hard way), the distance between theory and practice, in practice, is
much larger than the distance between practice and theory should be in
theory;-).


> >  Unfortunately, what we would like and what reality affords
> > are often pretty uncorrelated.  No matter how much theoreticians may
> > love big-O because it's (relatively) easy to compute, it still has two
> > failings which are often sufficient to rule it out as a basis for any
> > "estimate [of] the reasonableness" of anything: [a] as we operate on
> > finite machines with finite wordsize, we may never be able to reach
> > anywhere even remotely close to the "asymptotic" region where big-O has
> > some relationship to reality; [b] in many important cases, the
> > theoretical worst-case is almost impossible to characterize and hardly
> > ever reached in real life, so big-O is of no earthly use (and much
> > harder to compute measures such as big-Theta should be used for just
> > about any practical purpose).
> 
> But the fact remains that programmers, somewhat
> experienced with the interface a module offers,
> have a *rough*idea* of what computational complexity
> attaches to the operations of that interface.
> And having such a *rough*idea* helps them to
> design reasonably performing programs much more
> quickly.

A rough idea helps, particularly a rough idea of EXPECTED performance.

>   Big-Oh and other asymptotic complexity measures
> really do have *this* advantage over having
> acquired, by way of conditioning experiences,
> some such *rough*idea* of computational complexity:

No: big-O is NOT AT ALL a measure, rough or otherwise, of EXPECTED
performance.  It's *WORST-CASE*, by definition; and includes no
indications whatsoever of how close to the asymptote one can actually
get on a given finite-size machine.  On both counts, it can be totally
misleading -- and, "it's not what you don't know, that hurts... it's
what you know WHICH IS NOT SO".  By its MISLEADING characteristics,
big-O can be SERIOUSLY DAMAGING to the ability to "design on the back
of an envelope" (or in your head, or at a whiteboard).

> they capture at least some of that "rough idea"
> in a somewhat more easily communicable and much
> more precise fashion.

Entirely precise, and therefore, in such cases as quicksort and hash
tables (hardly "obscure corner cases" -- CENTRAL PILLARS of many
designs!!!), all the more misleading.
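
For instance, here's a deliberately naive, untested sketch showing how
quicksort's single big-O figure hides the chasm between a systematic
pivot and a randomized one, on the utterly ordinary case of
already-sorted input:

    import random, sys, time

    def quicksort(a, pick_pivot):
        # deliberately naive, out-of-place quicksort; only the
        # pivot-picking strategy varies between the two runs
        if len(a) <= 1:
            return a
        p = pick_pivot(a)
        less = [x for x in a if x < p]
        same = [x for x in a if x == p]
        more = [x for x in a if x > p]
        return quicksort(less, pick_pivot) + same + quicksort(more, pick_pivot)

    first_item = lambda a: a[0]               # quadratic on sorted input
    random_item = lambda a: random.choice(a)  # same big-O, sane in practice

    sys.setrecursionlimit(20000)  # the systematic pivot recurses N levels deep
    data = list(range(5000))      # already-sorted input: common in practice
    for name, picker in [('first-item pivot', first_item),
                         ('random pivot', random_item)]:
        start = time.time()
        quicksort(data, picker)
        print('%-17s %.3f seconds' % (name, time.time() - start))

Both variants are "N squared" in big-O terms; only one of them is
usable.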


>   Maybe you and Steven prefer to be conditioned,
> Pavlov style, by the wonderful experiences that
> you get while testing? - This is perhaps really
> one of my *worst* psychological handicaps, I must
> admit: that I don't *like* to get conditioned
> like that, no matter how good it feels, no matter
> how effective it might be for "practical" work that
> one has to do.

I like to be able to reason FIRSTLY on the basis of EXPECTED, NORMAL
behavior, corrected in the first order by reasonable prudence and
caution, in the second order by accurate experimentation, and only in
the third order by considerations of worst-case scenarios -- the latter
normally tempered by estimates of their likelihood... which also
requires a rough understanding of CORRELATION between the causes of such
scenarios, which may be present either directly or indirectly (assuming
that different components interacting in a system "may just happen" to
hit worst-case scenarios for each of them at once may be demonstrably
wrong in either direction -- and NO big-O analysis will ever help with
this crucial kind of task!).
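
Here's a toy Monte Carlo sketch (every number in it is invented, purely
for illustration) of what one shared cause does to the "may just
happen" assumption:

    import random

    # two subcomponents whose bad cases spring from a single shared cause
    TRIALS = 100000
    a_bad = b_bad = both_bad = 0
    for _ in range(TRIALS):
        pathological = random.random() < 0.01  # the one shared cause
        a = pathological                       # component A degenerates...
        b = pathological                       # ...and so does component B
        a_bad += a
        b_bad += b
        both_bad += a and b
    p_a = a_bad / float(TRIALS)
    p_b = b_bad / float(TRIALS)
    print('P(A bad) ~ %.4f   P(B bad) ~ %.4f' % (p_a, p_b))
    print('independence predicts P(both) ~ %.6f' % (p_a * p_b))
    print('observed              P(both) ~ %.4f' % (both_bad / float(TRIALS)))

Assume independence here and you're off by two orders of magnitude --
in the dangerous direction.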

>   I want to be able to really think *about* what
> I am doing. And in order to be able to think about
> it one usually needs some information about the
> implementation, performance wise, of the language
> features and the system modules that one might
> want to use. If you happen to know of any *better*
> way of offering the requisite information than
> asymptotic complexity measures then, of course,
> I am very grateful to hear more about it.

If you can't think about a design without knowing such details about one
specific implementation of one specific programming language in which
that design might get implemented, I think this strongly suggests you're
giving insufficient attention to the one crucial skill in a designer's
mental armory: *ABSTRACTION*.  There are many ways to cultivate one's
abstraction abilities.  One of my favorite pastimes is collecting
different renditions of Bach's "Art of the Fugue", for example: Bach
carefully abstracted away the information about which instrument(s) were
supposed to be playing which voice(s), as well as many other details --
because of this, the Art of the Fugue has the highest abstraction level
among all well-known musical compositions, and each rendition may take a
different concrete reading of it.  Learn to listen to them and be able
to tell what they have in common, and where they differ: it's a great
lesson in the understanding of abstraction.  But it's probably only
appropriate if you like music, and Baroque in particular; other arts and
disciplines may no doubt afford similarly good learning experiences, if
you know where to look for them.

Once you're confident about your abilities of abstraction, I suggest you
continue by the exercise of *characterization* (of A FEW
implementations, ideally) which Jon Bentley (the programmer, not the
jazz player) suggests pretty close to the start of his masterpiece
"Writing Efficient Programs" (that great book may be hard to come by
these days, but the "Programming Pearls" books are also excellent and
communicate mostly the same messages, quite effectively).  Get a sense
for the *EXPECTED* (NOT "asymptotic", for your inputs will NOT tend to
infinity; NOT "worst-case only", don't be THAT pessimistic) behavior of
some primitive operations - the couple of pages I devote to the subject
in the Nutshell chapter on optimization, profiling and testing may
suffice.  Then refine your instinct by taking ONE complicated case, such
as the natural mergesort variant known as the timsort, and delving as
deep into its analysis as you dare -- characterize it as best you can
manage WITHOUT testing, simply by reasoning on its code (Tim's essay,
part of the Python source distribution, will be a helpful addition to a
thorough grounding in Knuth's chapter on sorting, for this purpose)...
then see what a difference it makes, to BE able to experiment!
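
A first, crude experiment along those lines might look like this
(untested sketch; absolute timings will vary with machine and Python
version, the RATIOS between input shapes are what teach):

    import random, timeit

    N = 100000
    shapes = [
        ('random order',   lambda: random.sample(range(N), N)),
        ('already sorted', lambda: list(range(N))),
        ('reversed',       lambda: list(range(N, 0, -1))),
    ]
    for name, make in shapes:
        data = make()
        t = timeit.timeit(lambda: sorted(data), number=10)
        print('%-15s %.3f seconds for 10 sorts' % (name, t))

The already-sorted and reversed cases come out FAR below the random one
-- the timsort's adaptivity at work, which neither the worst-case
figure nor a naive "average over all permutations" would ever reveal.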

You'll personally traverse, through such exercises, a curve not too
dissimilar from what the whole of Western thought went through in the
discovery and refinement of the experimental method (to some extent it
can be considered in full bloom in the thoughts and works of Galilei):
not the blind flailing around of pure trial and error (which HAD,
however, proved extremely fruitful in eliciting just about all technical
progress up to that age, and later), much less the ungrounded
lucubration of pure theoreticism (which HAD, mind you, given great
results once in a while, e.g. in Euclid, Nagarjuna, Archimedes...) --
but the powerful, fruitful merging of both strands into the incredibly
productive golden braid which has pulled progress up during the last few
centuries.


> > Consider, for example, point [b].  Quicksort's big-O is N squared,
> > suggesting that quicksort's no better than bubblesort or the like.  But
> > such a characterization is absurd.  A very naive Quicksort, picking its
> > pivot very systematically (e.g., always the first item), may hit its
> > worst case just as systematically and in cases of practical importance
> > (e.g., already-sorted data); but it takes just a little extra care (in
> > the pivot picking and a few side issues) to make the worst-case
> > occurrences into ones that will not occur in practice except when the
> > input data has been deliberately designed to damage by a clever and
> > determined adversary.
> >
> > Designing based on worst-case occurrences hardly ever makes
> > sense in any field of engineering,
> 
> What's wrong with wanting to have a rough idea
> of what might happen in the worst case? I believe
> many engineers are actually expected to think
> about at least some "worst-case" scenarios.

Not extrapolating to infinity.

> Think of nuclear reactors, airplanes, or
> telephone exchanges (and don't think of Google
> for a change). Don't you expect engineers
> and scientists designing, for example, a nuclear
> reactor, to think hard about what the worst-case
> scenario might be? And how likely it might happen?

A square hit by an asteroid of mass tending to infinity?  No, I don't
expect nuclear reactors (nor anything else of human conception) to be
designed in consideration of what such an asteroid hit would do.  And
yet, that's *EXACTLY* what would be indicated by your theory of big-O as
a guide to design: consider the absolute worst that could conceivably
happen, with *NO* indications WHATSOEVER of how unlikely it might be
(because for simplicity of computation you take limits for misfortune
tending to infinity!!!), and design for THAT.

If our collective ancestors had taken this attitude, we'd still all be
huddling in deep caves (possibly a better protection against "dinosaurs'
killer" levels of asteroid hits!), shivering in the cold (fire is FAR
too dangerous to survive a worst-case analysis, particularly with
damaging elements all tending to infinity, as your love affair with
big-O based designs would certainly indicate!!!).  Animal skins?  Forget
it!!!  Do a perfectly pessimistic worst-case analysis with suitable
extrapolations to infinity and such skins would no doubt carry germs
enough to exterminate the budding human race (not that extinction might
not be preferable to the utterly miserable "lives" the few humans might
lead if "big-O" had guided their design concerns, mind you!-).


> (And *no* testing whatsoever in that direction,
> please!) Not thinking is, admittedly, a lot easier.

I would consider ANYBODY who built a nuclear reactor without AMPLE
testing dangerous enough for all of mankind to shoot on sight.  I would
expect HUGE amounts of simulation-based testing, followed and
interspersed by prototypes (smaller and simplified) to validate that the
simulations' tests do indeed correspond faithfully to actual reality.

I haven't seen anybody on this thread advocating "not thinking"; if
you're somehow implying that I'm trying to discourage people from
thinking IN USEFUL AND PRODUCTIVE WAYS, I challenge you to point to
anything I wrote here that could be construed that way.

> 
> <snip/>
> 
> > Why bother doing
> > (e.g.) random pivot selection in quicksort, when its big-O (i.e.,
> > worst-case) behavior will remain N-squared, just like naive quicksort,
> > or, for that matter, bubblesort?
> 
> Because worst-case is not the only measure of
> computational complexity that one might be
> interested in. For some applications one may
> be able to accept relatively bad worst-case
> behavior, if it doesn't happen too often.

But big-O, which is what you advocate, gives *NO* indication of how
likely or unlikely it might be, in particular -- a TERRIBLE failing.

> This is why for these people we might provide
> information about average case behavior (and
> then the difference between quicksort and
> bubblesort clearly shows).

Of COURSE you might!  Who, pray, is stopping you from so doing, except
perhaps your own laziness and the fact that you prefer to pontificate
about the work that OTHERS should (you believe) do for you FOR FREE, IN
ADDITION to the other work they're already doing, rather than
constructively participating in the collective effort yourself?

You argued for big-O information, and now you're arguing for more
information (that's much harder to provide in mathematically rigorous
form) regarding "averages" (over WHAT distribution of input
permutations?  Why would you think that all permutations are equally
likely?  In the real world, they're not, but they ARE incredibly hard to
characterize -- you'd better pick a very specific, tiny subfield of
application for sorting, to be able to supply AMPLE experimental support
for your theories... theory without any experimentation to support it
can be worse than worthless, it can be truly DIS-informing!).  Very
well, if you're so convinced this information will be precious (worth
more than, say, the optimizations and new components which people might
alternatively produce, investing as they prefer their own time and
efforts), LEAD BY EXAMPLE.  Pick ONE relatively simple issue of
performance, and explore it to the level of breadth and depth you
believe "analytical" (as opposed to *experimental*) performance
characterization should go.
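
As a sketch of the kind of experiment I mean (untested, and the "1%
perturbed" workload is just an assumption I'm making for illustration):
even for plain sorting, the "average" depends entirely on which
distribution you average over:

    import random, timeit

    N = 100000

    def uniform_permutation():
        data = list(range(N))
        random.shuffle(data)
        return data

    def nearly_sorted():
        # sorted except for 1% of items: closer to many real workloads
        data = list(range(N))
        for _ in range(N // 100):
            i = random.randrange(N)
            j = random.randrange(N)
            data[i], data[j] = data[j], data[i]
        return data

    for name, make in [('uniform permutations', uniform_permutation),
                       ('nearly sorted', nearly_sorted)]:
        total = 0.0
        for _ in range(5):
            data = make()              # build the input OUTSIDE the timing
            total += timeit.timeit(lambda: sorted(data), number=1)
        print('%-20s %.3f seconds over 5 sorts' % (name, total))

The very same routine yields dramatically different "averages" under
the two distributions -- which is exactly what any single quoted
average-case figure sweeps under the rug.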

If you're willing to DO SOME WORK rather than SPEND YOUR TIME WHINING,
you will either produce some beautiful results whose practical
importance will stun others into following your lead, or find out that,
in the real world, there are more things in heaven and earth etc --
e.g., that behavior of virtual memory implementations SWAMPS any subtle
theoretical effects you thought would matter, for containers big enough
to matter, and well before getting anywhere close to the "asymptote" of
big-O and perhaps even big-Theta.

One way or another, something useful will be achieved, which surely
cannot be said about the present thread.


>   For others, such worst-case behavior may not
> be acceptable. For those other applications,
> a good worst-case may be what is required.
> This is why this second category of programmers
> needs to know about the worst-case. - But I am
> certainly belaboring the obvious...

You are (perhaps without realizing it) pointing out that the big-O
characterization which you originally demanded is basically useless to
everybody (considering that a system has several subcomponents, and the
conditions under which each reaches worst-case are NOT necessarily
uncorrelated but might be positively or negatively correlated!).  Except
perhaps theoreticians needing to publish theoretical papers, of
course;-).

 
> Of course, the programs that have been designed
> on the basis of such information can (and ought
> to be) tested. Then some surprises (*not* predicted
> by theory) might happen: but if they do not happen
> too often, theory has done a good job - and so has
> the programmer...

That would follow only if the total amount of work (properly accounting
for the huge amounts needed to collect "theoretically" sound
characterizations) was somehow reduced compared to alternative
approaches, based more on sound engineering practice and less on
reminiscences of mathematics.

I believe that, out of all the iron suspension bridges designed and
built in the 19th century, ONE is standing -- the Brooklyn Bridge.
Others were designed with "sounder theory" and (although NOT real
"worst case analysis" -- none considered direct asteroid impacts, nor
did any designer "extrapolate to infinity"!!!-) tried to cover
"reasonable" worst cases as the best theory available to them afforded.
And they all went down in the following decades.

The designer of the Brooklyn Bridge followed sound engineering practice:
he designed based on TYPICAL (NOT worst-case) behavior, supported by
small-scale prototypes, THEN, knowing perfectly well that he couldn't be
sure he knew exactly WHAT worst-case he needed to guard against... he
doubled the thickness of all steel ropes and load-bearing beams.  So,
HIS bridge is still standing (as is, say, Renzo Piano's airport terminal
building in Japan, where Japanese buildings all around crumbled to dust
in a terrible earthquake... Piano may be an architect rather than
an engineer, but my respect for his craft knows no bounds).

I gather you want nuclear reactors, and programs, designed by the praxis
of all those OTHER suspension bridge designers of the 19th century; I
want them designed by the praxis by which the Brooklyn Bridge was
designed.  But in this case, you have an excellent chance to prove me
wrong: just put some of your work where your mouth is!  And I'll be
quite happy to help, because, even if (as I surmise) the attempt at
theoretical characterization proves pragmatically unsuccessful (in terms
of actual usefulness, and specifically of "bang for the buck"), even
then, if serious work has been put towards it, an empirically important
result is obtained.

I suspect you'll just find half-assed excuses to shirk the work which
your suggestions imply, but for once I'd be happy to be proved wrong...


Alex


