Python performance notes...

Christian Tismer tismer at tismer.com
Thu May 25 12:36:22 EDT 2000


Courageous wrote:
...
> > Empty loops don't happen in reality, and it makes limited
> > sense to compare a Python loop against a compiler, which in
> > this case even has the chance to optimize the full thing
> > away.
> 
> Well, it didn't change it to i=i+100000, if that's
> what you mean. What I was really testing was the
> cost of the increment operation, establishing a worst
> case. We already know that the best cases are better,
> but doing the worst case allows us to compare to other
> worst-case situations, IMO. A friend of mine wanted
> to know what the loop cost was, to compare it to TCL
> as a watermark.

I see what you want, but not why. Comparing worst cases
from artificial constructs which don't occur in real
programs might be fun, but it might just as well be misleading.
If a decision maker reads "a Python loop is 100 times slower than C",
this is bad for everybody.
Comparing typical cases can lead to better decisions.
So rather than dwell on an "extreme case", let me turn to a
"realistic case".

So, a more typical case would be:
- a loop
- some operations on Python objects

Given that, the loop cannot be optimized away.
You have the following mapping then:

  Python script            C translation
  -------------            -------------
  loop construct           C loop
  loop jump                C jump
  interpreted object ops   C calls to object ops

Well, comparing these gives you a bit more than just moving
the loop from Python into C; it also translates some
other bytecode operations, but keeps using the high-level
objects. Pretty much what P2C would produce.
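To make the mapping concrete, here is a sketch of such a "realistic case" (the function name and the workload are my own choices for illustration, not from the measurements discussed above): a loop whose body performs operations on Python objects. A translation to C would turn the loop construct and jump into native C, but the body would still make the same calls into the object machinery (integer addition, list append), which is why the speedup stays bounded.

```python
# A "realistic case": the loop body manipulates Python objects,
# so a C translation still has to call the same object operations.
import time

def realistic_loop(n):
    total = 0               # a Python int object, rebound each iteration
    items = []
    for i in range(n):      # loop construct -> would become a C loop
        total = total + i   # interpreted object op -> C call to int addition
        items.append(i % 7) # another object op, unchanged by translation
    return total, items

start = time.perf_counter()
total, items = realistic_loop(100_000)
elapsed = time.perf_counter() - start
print(total, len(items), elapsed)
```

Only the two bookkeeping lines of the loop disappear into C; the two object operations per iteration remain, dominating the runtime.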

Measuring the exact loop overhead in both versions
could be accomplished quite easily by duplicating
the object operations.
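One way to carry out that duplication trick (a sketch; the helper names are mine): time a loop whose body does the object operations once, then a loop that does them twice. The difference between the two timings estimates the cost of the extra operations alone, and subtracting that from the first timing leaves the pure loop overhead. The numbers are noisy on a real machine, so treat them as estimates.

```python
import time

def body_once(n):
    acc = 0
    for i in range(n):
        acc = acc + i      # object op, once per iteration
    return acc

def body_twice(n):
    acc = 0
    for i in range(n):
        acc = acc + i      # object op ...
        acc = acc - i      # ... duplicated (net effect cancels out)
    return acc

def timed(fn, n):
    t0 = time.perf_counter()
    fn(n)
    return time.perf_counter() - t0

n = 200_000
t_once = timed(body_once, n)
t_twice = timed(body_twice, n)
ops_cost = t_twice - t_once        # estimated cost of n object operations
loop_overhead = t_once - ops_cost  # what remains: loop construct + jump
print(ops_cost, loop_overhead)
```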

> > The calling overhead of the objects in your code is still
> > there, and unless you leave Python totally and use native
> > C structures all the time, you will not gain much more
> > than 30, maybe 40 percent of speed by avoiding the interpreter.
> 
> Well, that's interesting, any pointers which prove this,
> or is this experience speaking?

This is experience, from lots of experiments with speeding
Python up, using the P2C translator, and so on. After it
became clear to me that there is this 40% limit unless
you do massive program analysis, I stopped thinking about
optimizing the current interpreter.

but-started-to-think-of-much-more-drastic-things-ly y'rs - chris

-- 
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com
