[Python-Dev] Python Specializing Compiler
Armin Rigo
arigo@ulb.ac.be
Mon, 25 Jun 2001 15:45:20 +0200
Hello,
At 14:59 22.06.2001 +0200, Samuele Pedroni wrote:
>*: some possible useful hooks would be:
>- minimal profiling support in order to specialize only things called often
>- feedback for dynamic changing of methods, class hierarchy, ... if we want
>to optimize method lookup (which would make sense)
>- a mixed fixed slots/dict layout for instances.
There is one point that you didn't mention, which I believe is important:
how to handle global/builtin variables. First, a few words about the
current Python semantics.
* I am sorry if what follows has already been discussed; I am raising the
question again because it might be important for Psyco. If you feel this
would be better as a PEP, please just tell me so. *
Complete lexical scoping was recently added, implemented with "free" and
"cell" variables. These are only used for functions defined inside of other
functions; top-level functions use the opcode LOAD_GLOBAL for all non-local
variables. LOAD_GLOBAL performs one or two dictionary look-ups (two if the
variable is built-in). For simple built-ins like "len" this might be
expensive (has anyone measured such costs?).
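As a rough illustration of the current behaviour, the sketch below disassembles a top-level function and emulates the look-up order in plain Python. It assumes a present-day CPython; the helper names (count_items, opnames, lookup_like_load_global) are made up for this example, and the emulation only mirrors the order of the look-ups, not the interpreter's actual code.

```python
import builtins
import dis


def count_items(seq):
    # "len" is not local to this function, so it compiles to LOAD_GLOBAL.
    return len(seq)


def opnames(func):
    """Return the opcode names of a function's bytecode."""
    return [ins.opname for ins in dis.get_instructions(func)]


def lookup_like_load_global(name, module_globals):
    """Emulate the look-up order: module globals first, then built-ins.

    Returns the value and the number of dictionaries consulted.
    (Hypothetical helper, for illustration only.)
    """
    if name in module_globals:
        return module_globals[name], 1
    builtin_namespace = vars(builtins)
    if name in builtin_namespace:
        return builtin_namespace[name], 2
    raise NameError(name)


if __name__ == "__main__":
    print("LOAD_GLOBAL" in opnames(count_items))
    print(lookup_like_load_global("len", {}))
```

A built-in such as "len" is only found in the second dictionary, which is the case the paragraph above worries about.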
I suggest generalizing the compile-time lexical scoping rules. Let's
compile all functions' non-local variables (top-level and others) as "free"
variables. This means the corresponding module's global variables must be
"cell" variables. This is just what we would get if the module's code was
one big function enclosing the definition of all the other functions. Next,
the variables not defined in the module (the built-ins) are "free"
variables of the module, and the built-in module provides "cell" variables
for them. Remember that "free" and "cell" variables are linked together
when the function (or module in this case) is defined (for functions, when
"def" is executed; for modules, it would be at load-time).
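The "one big function" picture can be tried out with today's nested scopes. In the sketch below, module_like stands in for a module body (a hypothetical name, as are greet and set_greeting), and the "nonlocal" statement used to rebind the variable is a later addition to the language, so this is only an analogy for the proposal and not an implementation of it.

```python
def module_like():
    # Plays the role of the module body: "greeting" is a cell variable here.
    greeting = "hello"

    def greet(name):
        # "greeting" is a free variable of greet, read through the cell.
        return f"{greeting}, {name}"

    def set_greeting(value):
        # Rebinding goes through the same cell, with no dictionary involved.
        nonlocal greeting
        greeting = value

    return greet, set_greeting


greet, set_greeting = module_like()

if __name__ == "__main__":
    print(module_like.__code__.co_cellvars)  # cell variables of the "module"
    print(greet.__code__.co_freevars)        # free variables of the function
    print(greet.__closure__[0] is set_greeting.__closure__[0])
```

Both inner functions were linked to the same cell when their "def" statements ran, so a rebinding made through one is seen by the other.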
Benefit: not a single dictionary look-up any more; uniformity of treatment.
Potential code break: global variables shadowing built-ins would behave
like local variables shadowing globals, i.e. the mere presence of a global
"xyz=..." would forever hide the "xyz" built-in from the module, even
before the assignment or after a "del xyz". (cf. UnboundLocalError.)
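The difference can be seen by comparing the two existing behaviours side by side. In this sketch (all names are made up for the example), the function shows the local-variable rule that the proposal would extend to globals, and the exec'd source shows the current module-level rule, where the built-in is visible before the assignment and again after the "del".

```python
value = "global"


def shadowed_read():
    try:
        return value
    except UnboundLocalError:
        return "UnboundLocalError"
    value = "local"  # never runs, but makes "value" local at compile time


# Current module-level semantics, run in a fresh namespace.
module_source = """
before = len("abc")
len = lambda obj: -1
during = len("abc")
del len
after = len("abc")
"""
namespace = {}
exec(module_source, namespace)

if __name__ == "__main__":
    print(shadowed_read())
    print(namespace["before"], namespace["during"], namespace["after"])
```

Under the proposal, the "before" and "after" lines of such a module would fail the way shadowed_read does, instead of falling back to the built-in "len".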
To think about: what the "global" keyword would mean in this context.
Implementation problems: if we want to keep the module's dictionary of
global variables (and we certainly do) it would require changes to the
dictionary implementation (or the creation of a different kind of
dictionary). One solution is to automatically dereference cell objects and
raise exceptions upon reading empty cells. Another solution is to turn
dictionaries into collections of objects that all behave like cell objects
(so that if "d" is any dictionary, something like "d.ref(key)" would let us
get a cell object which could be read or written later to actually get or
set the value associated with "key", and "d[key]" would mean
"d.ref(key).cell_ref"). Well, these are just proposals; they might not be
good solutions.
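A minimal pure-Python sketch of the second idea might look as follows. The names "ref" and "cell_ref" come from the text above; the Cell and CellDict classes, the use of KeyError for empty cells, and the choice not to create a cell on a failed read are assumptions made for this example.

```python
class Cell:
    """Holds one value, or nothing at all (an empty cell)."""

    _EMPTY = object()

    def __init__(self):
        self._value = Cell._EMPTY

    def is_empty(self):
        return self._value is Cell._EMPTY

    @property
    def cell_ref(self):
        # Reading an empty cell raises, as suggested in the text.
        if self.is_empty():
            raise KeyError("empty cell")
        return self._value

    @cell_ref.setter
    def cell_ref(self, value):
        self._value = value

    def clear(self):
        self._value = Cell._EMPTY


class CellDict:
    """Dictionary-like object whose entries are all reachable as cells."""

    def __init__(self):
        self._cells = {}

    def ref(self, key):
        # Create an empty cell on demand so it can be linked before use.
        return self._cells.setdefault(key, Cell())

    def __getitem__(self, key):
        cell = self._cells.get(key)
        if cell is None or cell.is_empty():
            raise KeyError(key)
        return cell.cell_ref

    def __setitem__(self, key, value):
        self.ref(key).cell_ref = value

    def __delitem__(self, key):
        cell = self._cells.get(key)
        if cell is None or cell.is_empty():
            raise KeyError(key)
        cell.clear()  # keep the cell so existing references stay linked

    def __contains__(self, key):
        cell = self._cells.get(key)
        return cell is not None and not cell.is_empty()
```

Code that obtained a cell once through "d.ref(key)" can then read or write the value later without any further dictionary look-up, while "d[key]" keeps working for everything else.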
Why it is related to Psyco: the current treatment of globals/builtins makes
it hard for Psyco to statically tell what function we are calling when it
sees e.g. "len(a)" in the code. We would need some help from the
interpreter, at the very least hooks called when the module's globals()
dictionary changes. The above proposal might provide a more uniform solution.
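To make the kind of hook meant here more concrete, the sketch below models it outside the interpreter. WatchedGlobals and SpecializedCall are hypothetical names, not part of Psyco or Python, and only item assignment and deletion are intercepted; a real hook would need support from the interpreter itself.

```python
import builtins


class WatchedGlobals(dict):
    """A dict that calls registered hooks when an item is set or deleted."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._hooks = []

    def add_hook(self, hook):
        self._hooks.append(hook)

    def _notify(self, key):
        for hook in self._hooks:
            hook(key)

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self._notify(key)

    def __delitem__(self, key):
        super().__delitem__(key)
        self._notify(key)


class SpecializedCall:
    """Caches what a global name resolves to until that name changes."""

    _UNRESOLVED = object()

    def __init__(self, namespace, name):
        self.namespace = namespace
        self.name = name
        self.resolutions = 0
        self._target = SpecializedCall._UNRESOLVED
        namespace.add_hook(self._invalidate)

    def _invalidate(self, key):
        if key == self.name:
            self._target = SpecializedCall._UNRESOLVED

    def target(self):
        if self._target is SpecializedCall._UNRESOLVED:
            self.resolutions += 1
            if self.name in self.namespace:
                self._target = self.namespace[self.name]
            else:
                self._target = getattr(builtins, self.name)
        return self._target
```

With such a notification, a specializer could assume that "len" still means the built-in for as long as no hook has fired for that name, and resolve it again only afterwards.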
Thanks for your attention.
Armin.