[Python-Dev] Speeding up CPython 5-10%

Damien George damien.p.george at gmail.com
Wed Jan 27 15:10:03 EST 2016


Hi Yuri,

I think these are great ideas to speed up CPython.  They are probably
the simplest yet most effective ways to get performance improvements
in the VM.

MicroPython has had LOAD_METHOD/CALL_METHOD from the start (inspired
by PyPy, and the main reason to have it is because you don't need to
allocate on the heap when doing a simple method call).  The specific
opcodes are:

LOAD_METHOD # same behaviour as you propose
CALL_METHOD # for calls with positional and/or keyword args
CALL_METHOD_VAR_KW # for calls with one or both of */**

We also have LOAD_ATTR, CALL_FUNCTION and CALL_FUNCTION_VAR_KW for
non-method calls.

MicroPython also has dictionary lookup caching, but it's a bit
different to your proposal.  We do something much simpler: each opcode
that has a cache ability (eg LOAD_GLOBAL, STORE_GLOBAL, LOAD_ATTR,
etc) includes a single byte in the opcode which is an offset-guess
into the dictionary to find the desired element.  Eg for LOAD_GLOBAL
we have (pseudo code):

CASE(LOAD_GLOBAL):
key = DECODE_KEY;
offset_guess = DECODE_BYTE;
if (global_dict[offset_guess].key == key) {
    // found the element straight away
} else {
    // not found, do a full lookup and save the offset
    offset_guess = dict_lookup(global_dict, key);
    UPDATE_BYTECODE(offset_guess);
}
PUSH(global_dict[offset_guess].elem);

We have found that such caching gives a massive performance increase,
on the order of 20%.  The issue (for us) is that it increases bytecode
size by a considerable amount, requires writeable bytecode, and can be
non-deterministic in terms of lookup time.  Those things are important
in the embedded world, but not so much on the desktop.

Good luck with it!

Regards,
Damien.


More information about the Python-Dev mailing list