[Python-Dev] Accessing globals without dict lookup

Guido van Rossum guido@python.org
Fri, 08 Feb 2002 11:50:31 -0500


Inspired by talks by Jeremy and Skip on DevDay, here's a different
idea for speeding up access to globals.  It retain semantics but (like
Jeremy's proposal) changes the type of a module's __dict__.

- Let a cell be a really simple PyObject, containing a PyObject
  pointer and a cell pointer.  Both pointers may be NULL.  (It may
  have to be called PyGlobalCell since I believe there's already a
  PyCell object.)  (Maybe it doesn't even have to be an object -- it
  could just be a tiny struct.)

- Let a celldict be a mapping that is implemented using a dict of
  cells.  When you use its getitem method, the PyObject * in the cell
  is dereferenced, and if a NULL is found, getitem raises KeyError
  even if the cell exists.  Using setitem to add a new value creates a
  new cell and stores the value there; using setitem to change the
  value for an existing key stores the value in the existing cell for
  that key.  There's a separate API to access the cells.

- We change the module implementation to use a celldict for its
  __dict__.  The module's getattr and setattr operations now map to
  getitem and setitem on the celldict.  I think the type of
  <module>.__dict__ and globals() is the only backwards
  incompatibility.

- When a module is initialized, it gets its __builtins__ from the
  __builtin__ module, which is itself a celldict.  For each cell in
  __builtins__, the new module's __dict__ adds a cell with a NULL
  PyObject pointer, whose cell pointer points to the corresponding
  cell of __builtins__.

- The compiler generates LOAD_GLOBAL_CELL <i> (and STORE_GLOBAL_CELL
  <i> etc.) opcodes for references to globals, where <i> is a small
  index with meaning only within one code object like the const index
  in LOAD_CONST.  The code object has a new tuple, co_globals, giving
  the names of the globals referenced by the code indexed by <i>.  I
  think no new analysis is required to be able to do this.

- When a function object is created from a code object and a celldict,
  the function object creates an array of cell pointers by asking the
  celldict for cells corresponding to the names in the code object's
  co_globals.  If the celldict doesn't already have a cell for a
  particular name, it creates and an empty one.  This array of cell
  pointers is stored on the function object as func_cells.  When a
  function object is created from a regular dict instead of a
  celldict, func_cells is a NULL pointer.

- When the VM executes a LOAD_GLOBAL_CELL <i> instruction, it gets
  cell number <i> from func_cells.  It then looks in the cell's
  PyObject pointer, and if not NULL, that's the global value.  If it
  is NULL, it follows the cell's cell pointer to the next cell, if it
  is not NULL, and looks in the PyObject pointer in that cell.  If
  that's also NULL, or if there is no second cell, NameError is
  raised.  (It could follow the chain of cell pointers until a NULL
  cell pointer is found; but I have no use for this.)  Similar for
  STORE_GLOBAL_CELL <i>, except it doesn't follow the cell pointer
  chain -- it always stores in the first cell.

- There are fallbacks in the VM for the case where the function's
  globals aren't a celldict, and hence func_cells is NULL.  In that
  case, the code object's co_globals is indexed with <i> to find the
  name of the corresponding global and this name is used to index the
  function's globals dict.


I believe that's it.  I think this faithfully implements the current
semantics (where a global can shadow a builtin), but without the need
for any dict lookups when accessing globals, except in cases where an
explicit dict is passed to exec or eval().

Compare this to Jeremy's scheme using dlicts:

http://www.zope.org/Members/jeremy/CurrentAndFutureProjects/FastGlobals

- My approach doesn't require global agreement on the numbering of the
  globals; each code object has its own numbering.  This avoids the
  need for more global analysis, and allows adding code to a module
  using exec that introduces new globals without having to fall back
  on a less efficient scheme.

- Jeremy's approach might be a teensy bit faster because it may have
  to do less work in the LOAD_GLOBAL; but I'm not convinced.


Here's a implementation sketch for cell and celldict.  Note in
particular that keys() only returns the keys for which the cell's
objptr is not NULL.


NULL = object() # used as a token


class cell(object):

    def __init__(self):
        self.objptr = NULL
        self.cellptr = NULL


class celldict(object):

    def __init__(self):
        self.__dict = {} # dict of cells

    def getcell(self, key):
        c = self.__dict.get(key)
        if c is None:
            c = cell()
            self.__dict[key] = c
        return c

    def __getitem__(self, key):
        c = self.__dict.get(key)
        if c is None:
            raise KeyError, key
        value = c.objptr
        if value is NULL:
            raise KeyError, key
        else:
            return value

    def __setitem__(self, key, value):
        c = self.__dict.get(key)
        if c is None:
            c = cell()
            self.__dict[key] = c
        c.objptr = value

    def __delitem__(self, key):
        c = self.__dict.get(key)
        if c is None or c.objptr is NULL:
            raise KeyError, key
        c.objptr = NULL

    def keys(self):
        return [c.objptr for c in self.__dict.keys() if c.objptr is not NULL]

    def clear(self):
        for c in self.__dict.values():
            c.objptr = NULL

    # Etc.


--Guido van Rossum (home page: http://www.python.org/~guido/)