Trivial performance questions

Peter Hansen peter at engcorp.com
Fri Oct 17 18:00:32 EDT 2003


Geoff Gerrietts wrote:
> 
> Quoting Peter Hansen (peter at engcorp.com):
> >
> > Not unless you add Alex' constraint that the two alternatives under
> > consideration are equally readable.  Otherwise the less readable one
> > is always going to cost you more at maintenance time.
> 
> Yes to your first sentence, not so sure to the second. The implication
> is the code will always be touched, and my contention is that if you
> don't pay at least trivial attention to writing something optimal --
> includes avoiding geometric algorithms -- then you're significantly
> increasing the amount of maintenance work necessary.

I won't disagree with most of that (we're rapidly reaching near total 
agreement here! :-) but I do think that assuming "the code will always
be touched" is a very healthy attitude, in the same way you think that
at least trivial attention to performance is a healthy attitude.

We certainly have code that hasn't been touched during maintenance,
but nobody could have predicted which areas of the code that would be.

> Capitalism has bred a real reliance on "good enough": when you hit
> your payoff point, you don't go any farther. It's a useful metric to
> apply, but a dangerous premise to base all your decisions on. "Good
> enough" needs to be critically evaluated for both the short term and
> the long term.

As an XP team, we tend to consider that critical evaluation to be 
the domain of the customer, so we basically don't worry about it 
until there is feedback that we're doing the wrong thing.  This,
in cooperation with the customer, makes the best use of our
resources (for which the customer is paying, in effect).  But,
yeah, that's just the XP view of things.

> A half-million micro-optimizations may not pay for themselves

Phew!  I seriously hope your group hasn't examined that many 
pieces of code with performance concerns in mind!  We don't have
even that many lines of code, let alone areas that could be
micro-optimized.

> individually. But in the long term, when confronted with a total
> system rewrite because the collected work can no longer perform
> adequately, and standard optimization techniques have met with
> diminishing returns, you're going to regret not having paid attention
> the first time through, 

There's some truth in that, but I can't shake the nagging feeling
that simply by using Python, we've moved into a realm where the
best way to optimize a serious problem area is to rewrite in C
or Pyrex, or get a faster processor.  (Like you, I can be 
persuaded, but this is what _my_ experience has taught me.)

> >   http://c2.com/cgi/wiki?MakeItWorkMakeItRightMakeItFast
> 
> It's an interesting formulation but it stinks of propaganda to me.
> When generic catchphrases are re-interpreted by almost every viewer
> it's a pretty fair bet they're not precise enough to be really useful.
> The discussion on this page makes me think of Biblical scholars
> debating the meaning of ambiguous passages.

Actually, it's probably just that re-interpretation and discussion
which proves so very useful, not the phrase itself.  Like a Zen
koan or something, it's too short (or ambiguous) to have direct,
hard meaning, but the meme it carries is a valuable one with which
to be infected. ;-)

The same probably holds true about ambiguous biblical passages, 
I hate to admit.

> My group has invested probably something like 15 person-years writing
> Python code in the last few years. We have probably put about one of
> those person years into trying to account for performance bottlenecks.
> Management is presently of the opinion that a drastic rewrite is the
> only way to resolve the remaining issues. Perhaps the most distinct
> difference between your group and mine is that many of our developers
> are fairly novice, and prone to select solutions that are not
> well-informed about performance issues and algorithm complexity. On
> the other hand, maybe our code is just more heavily used?

I'd vote for the latter.  My group has been heavily junior in flavour.
Perhaps another cause of the difference is our greater (?) emphasis
on XP and test-driven development?  I doubt anyone could say, but 
for sure your code is more heavily used.  I don't even need to know
what it does to say that. :-)

Maybe one example: we used += with strings a lot in the early days.
Partly that was down to junior developers, but more of it was
inexperience with Python.  I think only one or two bits of our code
have been re-written to use [].append() and ''.join() instead, because
only those bits came to the fore when performance was an issue.  The
rest is still merrily chewing up CPU time doing wasteful += on strings,
but nobody cares.  We refactor those bits (for consistency, mainly, I
think) when we get to them for other reasons, and new code probably
doesn't use += so much, but that's about the extent of it.
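
A minimal sketch of the two idioms, for illustration only: the function
names and sample data are made up, not taken from the code discussed
here.  How big the gap is depends on the interpreter and version, so it
is worth measuring before rewriting anything.

```python
import timeit


def build_with_plus_equals(pieces):
    """Accumulate the result by repeated string concatenation."""
    result = ''
    for piece in pieces:
        # Each += may copy everything accumulated so far.
        result += piece
    return result


def build_with_join(pieces):
    """Collect the pieces in a list, then join them once at the end."""
    parts = []
    for piece in pieces:
        parts.append(piece)
    return ''.join(parts)


if __name__ == '__main__':
    # Hypothetical workload: ten thousand short lines.
    pieces = ['line %d\n' % i for i in range(10000)]

    # Both idioms must produce the same string.
    assert build_with_plus_equals(pieces) == build_with_join(pieces)

    # Time each idiom; absolute numbers vary by machine and interpreter.
    for func in (build_with_plus_equals, build_with_join):
        seconds = timeit.timeit(lambda: func(pieces), number=20)
        print('%-24s %.4f s' % (func.__name__, seconds))
```

Both functions return the same string, so swapping one for the other
is a local change that can wait until a profile points at that spot.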

> At first blush, I thought "maybe there's an equilibrium that needs to
> be found". But I don't think so now. I think it's important for
> younger (intermediate?) developers to be obsessed with performance, so
> they can learn the dangers of bad algorithms, how to recognize them,
> how to avoid them. And it's worth building good habits where you
> choose an optimal idiom rather than a slower one.

I would agree that new developers would benefit from that kind of
experience.  One of the few reasons why a (good) university or
college education can be of value to a programmer.  So can critical
reading of some decent books or web pages on the topic.

> You can disagree, but I've done a lot of reading and thinking on the
> matter, in part because my experience and my beliefs have been at odds
> in the past. Consequently, you're going to hafta try harder than
> invoking the divine authority of Kent Beck (or even Knuth!) to
> persuade me. Still, I can yet be persuaded; my mind is quite
> tractable.

I think Kent is merely on a par with the Pope, but is not Himself
divine.  ;-)  Knuth is another story, perhaps.  :-)

-Peter



