Language comparisons

Alex Martelli aleaxit at yahoo.com
Wed May 9 05:15:25 EDT 2001


"Grant Edwards" <grante at visi.com> wrote in message
news:slrn9fheu4.o0u.grante at isis.visi.com...
> In article <QM%J6.96455$HF.21774089 at news4.rdc1.on.home.com>, Nick Perkins
wrote:
>
> >Things can usually be done many ways in python, and it is
> >rarely obvious which will be fastest.
>
> That's an interesting observation.  I don't think it's true about C --
> but C is much lower level and an experienced programmer will have a pretty good
> idea what code the compiler will generate for various constructs (at least
> that's the case in the embedded world -- maybe it isn't for other
> environments).

Kernighan and Pike, in "The Practice of Programming" (great text, btw,
IMHO), argue otherwise, confirming instead the observation that I've
often seen argued (and made myself) that programmers' intuition is
surprisingly-rarely on-target about _where_ their program is in fact
spending its time.  I think the first time I saw that explicitly
observed was nearly 30 years ago, in a report or article by Per
Brinch Hansen about a "Concurrent Pascal" compiler -- after many
relatively-fruitless efforts at optimization, it turned out that
one big time-sink for the whole compiler was a small routine whose
job was removing trailing spaces from punched cards read in, if I
recall correctly.  The way they found this out was by instrumenting
the program to MEASURE where the time was being spent...
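(To make the same point concretely in today's Python: the standard
library's cProfile module lets you instrument a program and MEASURE
where the time goes, rather than guess.  The functions below are
hypothetical stand-ins -- a toy echo of the trailing-spaces anecdote,
not anything from the compiler in question:)

```python
# Sketch: measuring where a toy program spends its time with the
# standard-library cProfile module, instead of guessing.
import cProfile
import io
import pstats


def strip_trailing(lines):
    # The innocent-looking "small routine": strip trailing blanks.
    return [line.rstrip() for line in lines]


def process(lines):
    # Call the suspect routine many times, as a compiler pass might.
    for _ in range(1000):
        lines = strip_trailing(lines)
    return lines


def profile_process():
    lines = ["card image with trailing blanks   "] * 80
    profiler = cProfile.Profile()
    profiler.enable()
    process(lines)
    profiler.disable()
    # Report the most expensive functions by cumulative time.
    buf = io.StringIO()
    stats = pstats.Stats(profiler, stream=buf)
    stats.sort_stats("cumulative").print_stats(5)
    return buf.getvalue()


if __name__ == "__main__":
    print(profile_process())
```

The report makes the time-sink visible by name -- exactly the kind of
evidence that intuition so rarely supplies.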

It's not enough to know what "the compiler will generate" for a
construct (and if you can do that, your compiler is not a
well-optimizing one -- it IS quite possible for Python, but I'm
rather surprised that "in the embedded world" optimization, say
of C programs, is so scarce...?).  Indeed the observation fully
applies to machine-language programs!  This is pretty obvious
for today's complex machines -- heavily pipelined, with stalls
potentially coming who-knows-where, etc -- but it was already
quite true decades ago.  What tends to be non-obvious, and alas
often counter-intuitive, is HOW MANY TIMES a given 'construct'
(generated by the compiler, or hand-written) ends up being executed,
and secondarily how long each execution will take (what's the
time for a taken vs non-taken forwards vs backwards branch,
how does branch-target caching and non-deterministic speculative
execution fit in, will this memory-fetch stall the cache and
how often, ...).  *DON'T GUESS -- MEASURE*!  Instrumenting your
program, or "profiling" it with external tools, is not as hard
as all that, and there may be real performance to be gained that
way, and perhaps even more relevantly, from NOT expending huge
amounts of complication to optimize what doesn't need to be
("Premature optimization is the root of all evil", Knuth said...).


Alex

More information about the Python-list mailing list