[Python-Dev] Accessing globals without dict lookup
Guido van Rossum
guido@python.org
Mon, 11 Feb 2002 11:28:59 -0500
> All right -- i have attempted to diagram a slightly more interesting
> example, using my interpretation of Guido's scheme.
[...]
> How does it look? Guido, is it anything like what you have in mind?
Yes, exactly. I've added pointers to your images to PEP 280. Maybe
you can also create a diagram for Tim's "more aggressive" scheme?
> A couple of observations so far:
>
> 1. There are going to be lots of global-cell objects.
> Perhaps they should get their own allocator and free list.
Yes.
> 2. Maybe we don't have to change the module dict type.
> We could just use regular dictionaries, with the special
> case that if retrieving the value yields a cell object,
> we then do the objptr/cellptr dance to find the value.
> (The cell objects have to live outside the dictionaries
> anyway, since we don't want to lose them on a rehashing.)
And who would do the special dance? If PyDict_GetItem, it would add
an extra test to code whose speed is critical in lots of other cases
(plus it would be impossible to create a dictionary containing cells
without having unwanted special magic). If in a wrapper, then
<module>.__dict__[<key>] would return a surprise cell instead of a
value.
> 3. Could we change the name, please? It would really suck
> to have two kinds of things called "cell objects" in
> the Python core.
Agreed. Or we could add a cellptr to the existing cell objects; or
maybe a scheme could be devised that wouldn't need a cell to have a
cellptr, and then we could use the existing cell objects unchanged.
> 4. I recall Tim asked something about the cellptr-points-to-itself
> trick. Here's what i make of it -- it saves a branch: instead of
>
>     PyObject* cell_get(PyGlobalCell* c)
>     {
>         if (c->cell_objptr) return c->cell_objptr;
>         if (c->cell_cellptr) return c->cell_cellptr->cell_objptr;
>         return NULL;  /* not defined in either place */
>     }
>
> it's
>
>     PyObject* cell_get(PyGlobalCell* c)
>     {
>         if (c->cell_objptr) return c->cell_objptr;
>         return c->cell_cellptr->cell_objptr;
>     }
That's what my second "additional idea" in PEP 280 proposes:
| - Make c.cellptr equal to c when a cell is created, so that
| LOAD_GLOBAL_CELL can always dereference c.cellptr without a NULL
| check.
> This makes no difference when c->cell_objptr is filled,
> but it saves one check when c->cell_objptr is NULL in
> a non-shadowed variable (e.g. after "del x"). I believe
> that's the only case in which it matters, and it seems
> fairly rare to me that a module function will attempt to
> access a variable that's been deleted from the module.
Agreed. When x is not defined, it doesn't matter how much extra code
we execute as long as we don't dereference NULL. :-)
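
To make the self-pointing trick concrete, here is a minimal sketch under
stated assumptions; it is not code from PEP 280. `Obj` is a stand-in for
PyObject so the fragment compiles without the Python headers, and
`GlobalCell` and `cell_init` are hypothetical names. Only the field names
cell_objptr and cell_cellptr come from the discussion above. Reference
counting is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for PyObject (assumption, so no Python headers are needed). */
typedef struct Obj { int value; } Obj;

/* Hypothetical cell layout using the field names from the thread. */
typedef struct GlobalCell {
    Obj *cell_objptr;                 /* value bound in this namespace, or NULL */
    struct GlobalCell *cell_cellptr;  /* fallback cell; never NULL */
} GlobalCell;

/* A fresh cell points at itself, so cell_cellptr is always safe to follow. */
static void cell_init(GlobalCell *c)
{
    assert(c != NULL);
    c->cell_objptr = NULL;
    c->cell_cellptr = c;
}

/* One test on the fast path, and no NULL check before the fallback. */
static Obj *cell_get(const GlobalCell *c)
{
    if (c->cell_objptr)
        return c->cell_objptr;
    return c->cell_cellptr->cell_objptr;  /* NULL here means "not defined" */
}
```

A module cell that shadows nothing keeps pointing at itself, so a lookup
after "del x" simply yields NULL; a cell for a name such as max would
instead have cell_cellptr aimed at the corresponding __builtin__ cell.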
> Because the module can't know what new variables might
> be introduced into __builtin__ after the module has been
> loaded, a failed lookup must finally fall back to a lookup
> in __builtin__. Given that, it seems like a good idea to
> set c->cell_cellptr = c when c->cell_objptr is set (for
> both shadowed and non-shadowed variables). In my picture,
> this would change the cell that spam.max points to, so
> that it points to itself instead of __builtin__.max's cell.
> That is:
>
>     void cell_set(PyGlobalCell* c, PyObject* v)
>     {
>         c->cell_objptr = v;
>         c->cell_cellptr = c;
>     }
But now you'd have to work harder when you delete the global again
(i.e. in cell_delete()); the shadowed built-in must be restored.
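
A minimal sketch of that extra work, under stated assumptions: `Obj`
stands in for PyObject, `GlobalCell` is a hypothetical layout, and the
`builtin_cell` argument to cell_delete() is hypothetical too (the caller
is assumed to have looked it up in __builtin__, passing NULL when there
is no built-in of that name). Reference counting is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for PyObject (assumption, so no Python headers are needed). */
typedef struct Obj { int value; } Obj;

/* Hypothetical cell layout using the field names from the thread. */
typedef struct GlobalCell {
    Obj *cell_objptr;                 /* value bound in this namespace, or NULL */
    struct GlobalCell *cell_cellptr;  /* cell to read through; never NULL */
} GlobalCell;

/* Setting a global makes the cell point at itself, hiding any built-in. */
static void cell_set(GlobalCell *c, Obj *v)
{
    assert(c != NULL);
    c->cell_objptr = v;
    c->cell_cellptr = c;
}

/* Deleting a global must re-expose the built-in it was shadowing.
 * cell_set() overwrote the old link, so the built-in cell has to be
 * supplied again; with no built-in, the cell goes back to self-pointing. */
static void cell_delete(GlobalCell *c, GlobalCell *builtin_cell)
{
    assert(c != NULL);
    c->cell_objptr = NULL;
    c->cell_cellptr = builtin_cell ? builtin_cell : c;
}
```

The cost shows up in the second function: the delete path needs a way to
find the built-in cell again, which the set path threw away.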
> This simplifies things further:
>
>     PyObject* cell_get(PyGlobalCell* c)
>     {
>         return c->cell_cellptr->cell_objptr;
>     }
>
> This leaves no branches at all, which might be a really good
> thing on today's speculatively executing processors.
Good idea! (And before I *did* misread your followup, because I
hadn't fully digested this msg. I think you're right that we might be
able to use just a PyObject **; but I haven't fully digested Tim's
more aggressive idea.)
--Guido van Rossum (home page: http://www.python.org/~guido/)