[Python-Dev] Accessing globals without dict lookup

Mon, 11 Feb 2002 11:28:59 -0500

> All right -- i have attempted to diagram a slightly more interesting
> example, using my interpretation of Guido's scheme.
[...]
> How does it look?  Guido, is it anything like what you have in mind?

Yes, exactly.  I've added pointers to your images to PEP 280.  Maybe
you can also create a diagram for Tim's "more aggressive" scheme?

> A couple of observations so far:
> 
>     1.  There are going to be lots of global-cell objects.
>         Perhaps they should get their own allocator and free list.

Yes.

>     2.  Maybe we don't have to change the module dict type.
>         We could just use regular dictionaries, with the special
>         case that if retrieving the value yields a cell object,
>         we then do the objptr/cellptr dance to find the value.
>         (The cell objects have to live outside the dictionaries
>         anyway, since we don't want to lose them on a rehashing.)

And who would do the special dance?  If PyDict_GetItem, it would add
an extra test to code whose speed is critical in lots of other cases
(plus it would be impossible to create a dictionary containing cells
without having unwanted special magic).  If in a wrapper, then
<module>.__dict__[<key>] would return a surprise cell instead of a
value.

>     3.  Could we change the name, please?  It would really suck
>         to have two kinds of things called "cell objects" in
>         the Python core.

Agreed.  Or we could add a cellptr to the existing cell objects; or
maybe a scheme could be devised that wouldn't need a cell to have a
cellptr, and then we could use the existing cell objects unchanged.

>     4.  I recall Tim asked something about the cellptr-points-to-itself
>         trick.  Here's what i make of it -- it saves a branch: instead of
> 
>             PyObject* cell_get(PyGlobalCell* c)
>             {
>                 if (c->cell_objptr) return c->cell_objptr;
>                 if (c->cell_cellptr) return c->cell_cellptr->cell_objptr;
>                 return NULL;  /* not found in either cell */
>             }
> 
>         it's
> 
>             PyObject* cell_get(PyGlobalCell* c)
>             {
>                 if (c->cell_objptr) return c->cell_objptr;
>                 return c->cell_cellptr->cell_objptr;
>             }

That's what my second "additional idea" in PEP 280 proposes:

|     - Make c.cellptr equal to c when a cell is created, so that
|       LOAD_GLOBAL_CELL can always dereference c.cellptr without a NULL
|       check.
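
A minimal sketch of that idea, for illustration only.  The Obj type
stands in for PyObject, and GlobalCell and cell_new() are hypothetical
names that are not taken from PEP 280; reference counting is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for PyObject; hypothetical, only to keep the sketch
   self-contained. */
typedef struct { int value; } Obj;

/* Hypothetical global cell, using the field names from this thread. */
typedef struct GlobalCell {
    Obj *cell_objptr;                 /* the value, or NULL if unbound here */
    struct GlobalCell *cell_cellptr;  /* the builtin's cell, or the cell itself */
} GlobalCell;

/* Create a cell whose cellptr points at itself, so it is never NULL. */
GlobalCell *cell_new(void)
{
    GlobalCell *c = malloc(sizeof *c);
    if (c == NULL)
        return NULL;
    c->cell_objptr = NULL;
    c->cell_cellptr = c;
    return c;
}

/* Load with one branch: cell_cellptr can be dereferenced unconditionally. */
Obj *cell_get(const GlobalCell *c)
{
    assert(c != NULL && c->cell_cellptr != NULL);
    if (c->cell_objptr)
        return c->cell_objptr;
    return c->cell_cellptr->cell_objptr;  /* NULL here means "not found" */
}
```

A NULL result from cell_get() is where the caller would fall back to
the slow path (or raise NameError).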

>         This makes no difference when c->cell_objptr is filled,
>         but it saves one check when c->cell_objptr is NULL in
>         a non-shadowed variable (e.g. after "del x").  I believe
>         that's the only case in which it matters, and it seems
>         fairly rare to me that a module function will attempt to
>         access a variable that's been deleted from the module.

Agreed.  When x is not defined, it doesn't matter how much extra code
we execute as long as we don't dereference NULL. :-)

>         Because the module can't know what new variables might
>         be introduced into __builtin__ after the module has been
>         loaded, a failed lookup must finally fall back to a lookup
>         in __builtin__.  Given that, it seems like a good idea to
>         set c->cell_cellptr = c when c->cell_objptr is set (for
>         both shadowed and non-shadowed variables).  In my picture,
>         this would change the cell that spam.max points to, so
>         that it points to itself instead of __builtin__.max's cell.
>         That is:
> 
>             void cell_set(PyGlobalCell* c, PyObject* v)
>             {
>                 c->cell_objptr = v;
>                 c->cell_cellptr = c;
>             }

But now you'd have to work harder when you delete the global again
(i.e. in cell_delete()); the shadowed built-in must be restored.
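
Roughly, the extra work would look like the sketch below.  This is
only an illustration under assumptions: Obj stands in for PyObject,
GlobalCell is a hypothetical type, and cell_delete() is handed the
built-in's cell by its caller (how the caller finds that cell is left
open).  Reference counting is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for PyObject; hypothetical. */
typedef struct { int value; } Obj;

/* Hypothetical global cell, using the field names from this thread. */
typedef struct GlobalCell {
    Obj *cell_objptr;
    struct GlobalCell *cell_cellptr;
} GlobalCell;

/* Setting a global makes the cell point at itself, hiding any built-in. */
void cell_set(GlobalCell *c, Obj *v)
{
    c->cell_objptr = v;
    c->cell_cellptr = c;
}

/* Deleting a global must re-expose the shadowed built-in, if any.
   builtin_cell is the cell for the same name in __builtin__, or NULL
   when there is none (assumption: the caller looks it up). */
void cell_delete(GlobalCell *c, GlobalCell *builtin_cell)
{
    assert(c != NULL);
    c->cell_objptr = NULL;
    c->cell_cellptr = builtin_cell ? builtin_cell : c;
}

/* With cellptr maintained by set and delete, the load needs no branch. */
Obj *cell_get(const GlobalCell *c)
{
    return c->cell_cellptr->cell_objptr;
}
```

The cost moves from every load to the much rarer delete.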

>         This simplifies things further:
> 
>             PyObject* cell_get(PyGlobalCell* c)
>             {
>                 return c->cell_cellptr->cell_objptr;
>             }
> 
>         This leaves us with no branches at all, which might be a
>         really good thing on today's speculatively executing processors.

Good idea!  (And earlier I *did* misread your followup, because I
hadn't fully digested this message.  I think you're right that we might
be able to use just a PyObject **; but I haven't fully digested Tim's
more aggressive idea yet.)

--Guido van Rossum (home page: http://www.python.org/~guido/)