[Python-Dev] Bytecode idea

Jeremy Hylton jeremy@zope.com
Wed, 26 Feb 2003 11:55:29 -0500


Chris Tismer wrote:
> Oh, that was not what I meant. I also did this
> two years ago and tossed it. Function calls
> are too expensive.
> What I mean was to fold opcodes by common patterns.
> Unfortunately this is slower, too.
>
> Anyway, I didn't want to get too deep into this.
> Stopping wasting time now :-)

Chris already knows this, but it's worth repeating for people who don't.  A
function call isn't always too expensive; it depends on how much work the
opcode is doing.  It also depends on lots of other hard-to-predict effects
of the generated code and its interaction with the memory system.

The various function call opcodes regularly call out to separate functions.
I recall benchmarking various options; moving big chunks of code out of the
mainloop and into functions often improved performance slightly.
Except when it didn't <0.3 wink>.

If you are benchmarking various opcode effects, I'd recommend trying to
revive the simple cycle counter instrumentation I did for Python 2.2.  The
idea is to use the Pentium cycle counter to measure the number of cycles
spent on each trip through the mainloop.  A rough conclusion from the
previous measurements was that trivial opcodes like POP_TOP can execute in
fewer than 100 cycles, including opcode dispatch.  An opcode that involves
calling out to a C function never executes in fewer than 100 cycles, and
often takes hundreds of cycles.
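
To make the instrumentation idea concrete, here is a rough sketch in Python
of a toy interpreter loop that records how long each trip through the loop
takes, bucketed by opcode.  It is not the actual patch: the opcode numbers,
the run() and helper() functions, and the use of time.perf_counter_ns() as a
stand-in for the hardware cycle counter are all assumptions made for
illustration, and the numbers it produces say nothing about the real C
mainloop.

```python
import time
from collections import defaultdict

# Hypothetical toy opcodes; these are not CPython's real opcode numbers.
POP_TOP, LOAD_CONST, CALL_HELPER = range(3)


def helper(value):
    # Stands in for an opcode body that lives in a separate function.
    return value * 2


def run(code, consts):
    """Execute a toy program; return (stack, per-opcode timing stats)."""
    stack = []
    stats = defaultdict(lambda: [0, 0])  # opcode -> [trips, total ns]
    pc = 0
    while pc < len(code):
        # Assumption: a nanosecond clock replaces the cycle counter here.
        start = time.perf_counter_ns()
        op, arg = code[pc]
        pc += 1
        if op == LOAD_CONST:
            stack.append(consts[arg])
        elif op == POP_TOP:
            stack.pop()
        elif op == CALL_HELPER:
            stack.append(helper(stack.pop()))
        else:
            raise ValueError(f"unknown opcode {op}")
        elapsed = time.perf_counter_ns() - start
        entry = stats[op]
        entry[0] += 1        # one more trip through the loop for this opcode
        entry[1] += elapsed  # time for the trip, dispatch included
    return stack, dict(stats)


program = [
    (LOAD_CONST, 0),
    (CALL_HELPER, None),
    (LOAD_CONST, 1),
    (POP_TOP, None),
]
stack, stats = run(program, consts=[21, "unused"])
for op, (trips, total_ns) in sorted(stats.items()):
    print(op, trips, total_ns)
```

Because each measurement starts before the dispatch and ends after the
opcode body, the per-opcode totals include dispatch overhead, which matches
the "including opcode dispatch" figure quoted above.  The timer calls
themselves add overhead to every trip, so only relative comparisons between
opcodes are meaningful.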

There's a patch floating around SourceForge somewhere.

Jeremy