[Python-Dev] Making builtins more efficient

Steven Elliott selliott4 at austin.rr.com
Thu Mar 9 15:50:06 CET 2006


On Thu, 2006-03-09 at 12:00 +0000, Paul Moore wrote:
> On 3/9/06, Nick Coghlan <ncoghlan at gmail.com> wrote:
> > Steven Elliott wrote:
> > > I'm interested in how builtins could be more efficient.  I've read over
> > > some of the PEPs having to do with making global variables more
> > > efficient (search for "global"):
> > >     http://www.python.org/doc/essays/pepparade.html
> > > But I think the problem can be simplified by focusing strictly on
> > > builtins.
> >
> > Unfortunately, builtins can currently be shadowed in the module global
> > namespace from outside the module (via constructs like "import mod; mod.str =
> > my_str"). Unless/until that becomes illegal, focusing solely on builtins
> > doesn't help - the difficulties lie in optimising builtin access while
> > preserving the existing name shadowing semantics.
> 
> Is there any practical way of detecting and flagging constructs like
> the above (remotely shadowing a builtin in another module)? I can't
> see a way of doing it (but I know very little about this area...).

It may be possible to flag it, or it may be possible to make it work.

In my post I mentioned one special case that needs to be addressed
(assigning to __builtins__).  What Nick mentioned in his post ("import
mod; mod.str = my_str") is another special case that needs to be
addressed.  If we can assume that all pyc files are compiled with the
same set of default builtins (which should be assured by the
version in the pyc file) then there are two ways that things like
"mod.str = my_str" could be handled.

I believe that currently "mod.str = my_str" alters the module's global
hash table (f->f_globals in the code).  One way of handling it is to
alter STORE_ATTR (the opcode for assigning to mod.str) to always check to
see if the key being assigned is one of the default builtins.  If it is,
then the module's indexed array of builtins is assigned to.
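
To make the first option concrete, here is a minimal Python sketch of the
check. It is only a model, not CPython's real layout: SketchModule,
store_attr, load_builtin, DEFAULT_BUILTINS and SLOT_OF are all hypothetical
names, and the slot order is assumed to be identical for every module.

```python
import builtins

# Assumption: every module agrees on one ordered list of default builtins,
# so a given name maps to the same slot index everywhere.
DEFAULT_BUILTINS = sorted(n for n in dir(builtins) if not n.startswith("_"))
SLOT_OF = {name: i for i, name in enumerate(DEFAULT_BUILTINS)}


class SketchModule:
    """Hypothetical stand-in for a module object with indexed builtins."""

    def __init__(self):
        self.globals = {}  # the usual per-module hash table
        # Per-module array of builtins, initially the defaults.
        self.builtin_slots = [getattr(builtins, n) for n in DEFAULT_BUILTINS]

    def store_attr(self, name, value):
        """Model of the altered STORE_ATTR: check for a default builtin first."""
        slot = SLOT_OF.get(name)
        if slot is not None:
            self.builtin_slots[slot] = value  # shadow the builtin by index
        else:
            self.globals[name] = value        # ordinary global assignment

    def load_builtin(self, slot):
        """Model of an indexed builtin load: no hashing at run time."""
        return self.builtin_slots[slot]


def my_str(obj):
    return "shadowed:" + repr(obj)


mod = SketchModule()
other = SketchModule()
STR_SLOT = SLOT_OF["str"]      # would be fixed when the code is compiled

mod.store_attr("str", my_str)  # models "mod.str = my_str"
mod.store_attr("answer", 42)   # not a builtin, goes to the hash table
```

After this, mod.load_builtin(STR_SLOT) returns my_str while
other.load_builtin(STR_SLOT) still returns the real str, so the shadowing
stays local to the one module, as it does today.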

Alternatively, if we also wanted to optimize "mod.str = my_str" then
there could be a new opcode like STORE_ATTR that would take an index
into the array of builtins instead of an index into the names.
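
A rough sketch of how such an opcode might be chosen and executed is below.
Everything in it is an assumption for illustration: STORE_BUILTIN_ATTR is a
made-up opcode name, BUILTIN_NAMES is a tiny subset standing in for the full
table, and the target is modelled as a plain dict. It also assumes the target
is known to be a module; a real opcode would presumably need a fallback for
other kinds of objects.

```python
import builtins

# Assumption: a fixed table of default builtins shared by all code objects.
BUILTIN_NAMES = ("len", "str", "int")  # tiny illustrative subset
STORE_ATTR = "STORE_ATTR"
STORE_BUILTIN_ATTR = "STORE_BUILTIN_ATTR"  # hypothetical new opcode


def compile_store(attr, co_names):
    """Pick the opcode up front: slot index for builtins, name index otherwise."""
    if attr in BUILTIN_NAMES:
        return (STORE_BUILTIN_ATTR, BUILTIN_NAMES.index(attr))
    if attr not in co_names:
        co_names.append(attr)
    return (STORE_ATTR, co_names.index(attr))


def execute(instr, target, value, co_names):
    """Run one store against a target holding 'globals' and 'slots'."""
    opcode, arg = instr
    if opcode == STORE_BUILTIN_ATTR:
        target["slots"][arg] = value              # direct index, no hashing
    else:
        target["globals"][co_names[arg]] = value  # normal lookup by name


co_names = []
target = {
    "globals": {},
    "slots": [getattr(builtins, n) for n in BUILTIN_NAMES],
}

instr_a = compile_store("str", co_names)      # builtin name, gets a slot index
instr_b = compile_store("counter", co_names)  # ordinary name, gets a name index

execute(instr_a, target, repr, co_names)      # models "mod.str = repr"
execute(instr_b, target, 0, co_names)         # models "mod.counter = 0"
```

Here instr_a carries the slot index of "str" in the builtins table, while
instr_b carries an index into co_names in the usual way.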

PEP 280, which Nick mentioned, talks about "cells", a hybrid data
structure that can do both hash table lookups and lookups by index
efficiently.  That's great, but I'm curious if additional gains can be
made by focusing just on builtins.

-- 
-----------------------------------------------------------------------
|          Steven Elliott          |      selliott4 at austin.rr.com     |
-----------------------------------------------------------------------



