[pypy-dev] Minimal VM

holger krekel hpk at trillke.net
Tue Jan 14 18:01:28 CET 2003


[Paul Boddie Tue, Jan 14, 2003 at 08:41:28AM -0800]
> Christian Tismer <tismer at tismer.com> wrote:
> >
> >There will be a very tiny, portable little virtual
> >machine written in C. It is not meant to be efficient,
> >just enough to implement the bytecode interpreter.
> >I'm right now tinkering with parts of such a beast.
> >It is getting very small, just a few kilobytes executable.
> 
> The most interesting parts, it seems to me, will be those 
> which implement the more complicated bytecode semantics (like LOAD_NAME, 
> for example, which seems to have the potential to be pretty deep).

that's actually pretty easy. it's basically (from eval_frame in ceval.c):

    case LOAD_NAME:
        w = GETNAMEV(oparg);
        if ((x = f->f_locals) == NULL) { /* exception ... */ }
        x = PyDict_GetItem(x, w);
        if (x == NULL) {
            x = PyDict_GetItem(f->f_globals, w);
            if (x == NULL) {
                x = PyDict_GetItem(f->f_builtins, w);
                if (x == NULL) { /* exception ... */ }
            }
        }
        Py_INCREF(x);
        PUSH(x);
        break;

so it looks in the local, then the global, and then the builtin namespace.
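
to see that lookup order without the rest of CPython around it, here is a
minimal standalone sketch. everything in it (Namespace, Frame, ns_set,
ns_get, load_name) is a hypothetical toy written for illustration, not
CPython code: the namespaces are small arrays instead of dict objects,
values are plain ints, and NULL stands in for the NameError path.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy namespace: a fixed-size array of name/value pairs.
   A real VM would use a hash table (CPython uses its dict object). */
#define NS_MAX 8

typedef struct {
    const char *names[NS_MAX];
    int values[NS_MAX];
    size_t count;
} Namespace;

/* Bind name to value, overwriting an existing binding if there is one. */
static void ns_set(Namespace *ns, const char *name, int value)
{
    size_t i;
    for (i = 0; i < ns->count; i++) {
        if (strcmp(ns->names[i], name) == 0) {
            ns->values[i] = value;
            return;
        }
    }
    assert(ns->count < NS_MAX);
    ns->names[ns->count] = name;
    ns->values[ns->count] = value;
    ns->count++;
}

/* Return a pointer to the value bound to name, or NULL if it is unbound. */
static const int *ns_get(const Namespace *ns, const char *name)
{
    size_t i;
    for (i = 0; i < ns->count; i++) {
        if (strcmp(ns->names[i], name) == 0)
            return &ns->values[i];
    }
    return NULL;
}

/* Hypothetical toy frame holding the three namespaces. */
typedef struct {
    Namespace *locals;
    Namespace *globals;
    Namespace *builtins;
} Frame;

/* LOAD_NAME-style lookup: locals first, then globals, then builtins.
   Returns NULL where the real interpreter would raise NameError. */
static const int *load_name(const Frame *f, const char *name)
{
    const int *x = ns_get(f->locals, name);
    if (x == NULL)
        x = ns_get(f->globals, name);
    if (x == NULL)
        x = ns_get(f->builtins, name);
    return x;
}
```

binding the same name in several namespaces and calling load_name shows the
shadowing: the local binding wins over the global one, which wins over the
builtin one.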

> Although you are bound to know much more about this than me - 
> I've never looked at the Python VM source code - 

btw, it's quite easy to read if you know C.  It's pythonic C mostly :-)

> there must surely be a huge chunk of C code implementing the 
> name lookup semantics.

not that much if you don't count the dict object.
some complexity comes in with optimizations and special cases:

    - local name-bindings are usually mapped to LOAD_FAST/STORE_FAST 
      which don't go through a dictionary lookup but use integer indexes
      into an array.  these are figured out at compile time. 

    - nested scopes (LOAD_DEREF), where a name refers to a binding in an
      enclosing function's scope and is reached through a cell object
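
the LOAD_FAST/STORE_FAST idea from the first item can be sketched like
this. again a hypothetical toy, not the CPython source: FastFrame,
store_fast and load_fast are made-up names, values are plain ints, and the
slot numbers are assumed to have been assigned by the compiler already.

```c
#include <assert.h>
#include <stddef.h>

#define NLOCALS 4

/* Hypothetical toy frame: locals live in a plain array, and each local
   name has already been turned into a slot number at compile time. */
typedef struct {
    int fastlocals[NLOCALS];
    int is_bound[NLOCALS];   /* 0 until the slot is first stored to */
} FastFrame;

/* STORE_FAST-style: write the value into slot oparg. */
static void store_fast(FastFrame *f, size_t oparg, int value)
{
    assert(oparg < NLOCALS);
    f->fastlocals[oparg] = value;
    f->is_bound[oparg] = 1;
}

/* LOAD_FAST-style: read slot oparg with a single array index.
   Returns NULL for an unbound slot, where the real interpreter
   would raise UnboundLocalError. */
static const int *load_fast(const FastFrame *f, size_t oparg)
{
    assert(oparg < NLOCALS);
    return f->is_bound[oparg] ? &f->fastlocals[oparg] : NULL;
}
```

no string comparison or hashing happens at run time here; the name only
exists at compile time, which is where the speedup over LOAD_NAME comes from.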

these two optimizations probably aren't needed for a Stage1-VM.  Another
big part is exception (and block) handling.
 
> >This thing will be able to interpret Python bytecode.
> 
> Without getting carried away by the promise of massive performance 
> increases, an interesting application of a simplified VM should be 
> the increased potential for reimplementations of the platform. It 
> would be most amusing to be able to reduce the scope of VM 
> operations such that they could be more easily implemented 
> on "really small" computing platforms.

There will be a memory-speed tradeoff, though.  So if
"small" means 1M of memory, the odds are that a CPython-based
approach is more effective.

> Of course, a side effect of having simpler VM operations (ie. 
> simpler bytecode semantics) is that Psyco (or its successor) 
> will have much more to play with, as more of the VM "magic" 
> moves out of the VM and into bytecode routines which can 
> then be specialised.

IMO the CPython VM is pretty straightforward.  But
the fact that Psyco hits the C-barrier too soon (with the
frame object and the eval_frame loop) stops it from
going/specializing deeper.

cheers,

    holger

