[Python-ideas] optimized VM ideas

joe joeedh at gmail.com
Fri Jan 23 07:59:01 CET 2009


So, I've been kicking around some ideas for an optimized python VM.  I freely
admit I'm an amateur at this, but I find the problem of making python code
run faster fascinating.  My ideas arose from observing that google V8's JIT
compiler and type system are much simpler than TraceMonkey's, yet V8 is
also faster, and from learning that SquirrelFish is allegedly faster than V8,
even though it doesn't compile to native code at all (which V8 and, I
believe, TraceMonkey both do).

This leads me to believe that relatively simple, more general concepts in
VM design can have a bigger impact than specific, highly complicated JIT
solutions, in the context of dynamic languages that can't be easily typed
at compile time.

So I've thought of a few ideas for a new, more streamlined python VM:

* Simplify the cpython object model as much as possible, while still allowing
  most of the power of the current model.

* Either keep reference counting, or experiment with some of the newer
  techniques such as pointer escape analysis. Object models that rely
  exclusively on cyclic GCs have many issues and are hard to get right.

* Possibly modify the bytecode to be register-based, as in SquirrelFish.
  Not sure if this is worth it with python code.

* Use direct threading (which basically replaces the switch statement's
  dispatch with a jump of only one or two instructions) for the bytecode loop.

* Remove string lookups for member access entirely, and replace them with a
  system of unique identifiers.  The idea is you would use a hash in the
  types to map a member id to an index.  Hashing ints is faster than hashing
  strings, and I've even thought about experimenting with using collapsed
  arrays instead of hashes.  Of course, the design would still need to support
  string lookups when necessary.  I've thought about this a lot, and I think
  you'd need the same general idea as V8's hidden classes for this to work
  right (though instead of classes, it'd just be member/unique id lookup maps).
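
To make the last idea a bit more concrete, here is a rough sketch in plain
python.  It only illustrates the data structures, not the speed: every name
in it (member_id, Layout, Obj, EMPTY) is made up for this example, and the
shared "layout" maps are only loosely in the spirit of hidden classes, not a
description of how V8 actually implements them.

```python
import itertools

# Hypothetical global table: each member name is turned into a small
# integer id exactly once, so later lookups never hash the string again.
_member_ids = {}
_next_id = itertools.count()


def member_id(name):
    """Return the unique integer id for a member name."""
    try:
        return _member_ids[name]
    except KeyError:
        mid = _member_ids[name] = next(_next_id)
        return mid


class Layout:
    """Map from member id to slot index, shared between objects that
    gained the same members in the same order."""

    def __init__(self, slots=None):
        self.slots = slots or {}   # member id -> index into Obj.values
        self.transitions = {}      # member id -> Layout after adding it

    def with_member(self, mid):
        """Return the (cached) layout that has one extra member."""
        nxt = self.transitions.get(mid)
        if nxt is None:
            slots = dict(self.slots)
            slots[mid] = len(slots)  # new member takes the next free slot
            nxt = self.transitions[mid] = Layout(slots)
        return nxt


EMPTY = Layout()


class Obj:
    """Object whose members live in a flat list indexed via its layout."""

    def __init__(self):
        self.layout = EMPTY
        self.values = []

    def set(self, mid, value):
        index = self.layout.slots.get(mid)
        if index is None:
            # First store of this member: move to the extended layout.
            self.layout = self.layout.with_member(mid)
            self.values.append(value)
        else:
            self.values[index] = value

    def get(self, mid):
        return self.values[self.layout.slots[mid]]
```

A compiler would call member_id("x") once when it emits the bytecode, and the
interpreter would then only ever see the integer.  Two objects that are given
x and then y end up sharing a single Layout, which is what would make caching
the slot index per call site possible.  String lookups (getattr with a
computed name, say) would go through member_id at run time.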

I'm not sure I'll have the time anytime soon to prototype these ideas, but I
thought I'd kick them out there and see what people say.  Note, I'm in no way
suggesting any sort of change to the existing cpython VM (it's way, way too
early for that kind of talk).

references:

v8's design: http://code.google.com/apis/v8/design.html
squirrelfish's design:
http://blog.mozilla.com/dmandelin/2008/06/03/squirrelfish/

Joe


