[Python-Dev] Re: [Python-checkins] python/nondist/peps pep-0329.txt, 1.2, 1.3
Phillip J. Eby
pje at telecommunity.com
Tue Apr 20 14:05:44 EDT 2004
At 09:50 AM 4/20/04 -0700, Guido van Rossum wrote:
>It is quite the opposite of the PEP! The PEP proposes a quick, very
>visible hack that works only for one implementation; your proposal
>here lays the foundation for changing the language to enable the same
>kind of optimizations.
>
>I like that much better, but I doubt that it is doable in the
>timeframe for 2.4, nor do I think it is needed. Also, your 4th bullet
>proposes exactly (except for the __future__ statement) what was
>implemented in moduleobject.c in rev 2.46 and then withdrawn in rev
>2.47; it is not feasible for a number of reasons (see python-dev for
>the gory details; I don't recall what they were, just that they were
>convincing).
Reviewing the problems, I see that the issues are:
1) extension modules have a problem during initialization if they use setattr
2) modules and subpackages of a package that have names that shadow builtins
#1 is fixable, as long as there is some kind of flag for the module's
state, which is needed in order to support the __future__ statement
anyway. I don't know how this would affect Jython or IronPython, but I
don't think they really have "extension modules" as such.
#2 is harder, because it needs sane rules. I don't think the parser should
have to know about other modules. But banning modules named after builtins
isn't appropriate either.
OTOH, the only time this causes ambiguity is when the following conditions
*all* apply:
* The module is a package __init__
* The module contains functions or methods that reference the builtin name
* The module does not directly import the name
I'm tempted to say that this is broken code, except that it's possible for
the module to import a module that then imports the module that has the
conflicting name.
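
As a rough illustration of the ambiguity described above (not part of the original message), the sketch below builds a throwaway package whose __init__ uses the builtin 'list' without binding it, next to a submodule that is also named 'list'. The package name 'shadowpkg' and the function 'make' are hypothetical names chosen for the demonstration.

```python
import importlib
import os
import sys
import tempfile

# Build a temporary package 'shadowpkg' (hypothetical name) on disk.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "shadowpkg")
os.mkdir(pkg_dir)

# The package __init__ references the builtin 'list' but never binds or imports it.
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write(
        "def make():\n"
        "    return list('ab')\n"
    )

# A submodule whose name shadows the builtin.
with open(os.path.join(pkg_dir, "list.py"), "w") as f:
    f.write("# empty submodule named after a builtin\n")

sys.path.insert(0, root)
importlib.invalidate_caches()
import shadowpkg

# Before the submodule is imported, 'list' resolves to the builtin.
before = shadowpkg.make()

# Importing the submodule (possibly indirectly, from unrelated code) binds
# 'list' in the package namespace, which is also the globals of make().
importlib.import_module("shadowpkg.list")

try:
    shadowpkg.make()
    after = "ok"
except TypeError:
    # 'list' now names the submodule, and modules are not callable.
    after = "TypeError"
```

The point of the sketch is that the meaning of 'list' inside the package changes at runtime depending on whether anything has imported the submodule yet, which is what makes a compile-time optimization unsafe here.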
But I believe I have a solution. See below.
>The __future__ statement sounds like an excellent idea to me, as it
>enables experimentation with the new feature. One thing: we need to
>specify the future behavior very carefully so that other Python
>implementations will be able to do the right thing without having to
>reverse-engineer CPython.
Here is my proposal for the semantics of "optimized builtins":
* The compiler identifies names in a module that are builtin names (as
defined by the language version), but are never assigned to (or otherwise
declared global) in the module. It adds code at the beginning of the
module that sets a module-level variable, let's say '__builtins_used__', to
list those names whose use it *may* be able to optimize. (Note that this
step applies to all modules compiled, not just those with the __future__
statement, and there is some additional complexity needed in the generated
code to correctly handle being 'exec'-d in an existing namespace.)
* If the __future__ statement is in effect, also set a
'__builtins_optimized__' flag in the module dictionary, and actually
implement any desired optimizations.
* module.__setattr__ either warns or issues an error when setting a name
listed in '__builtins_used__', depending on the status of
'__builtins_optimized__'. If either is missing, the current
(backward-compatible) semantics of setattr should apply.
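
The first bullet above could be approximated today with the standard 'ast' module. This is only a sketch under simplifying assumptions: the function name 'builtins_used' is hypothetical, the analysis is deliberately conservative (a binding anywhere in the module, even a local one, disqualifies the name), and it ignores the 'exec' complication mentioned in the bullet.

```python
import ast
import builtins


def builtins_used(source):
    """Return builtin names that the source reads but never binds.

    Hypothetical helper: a conservative stand-in for the compiler step
    that would populate '__builtins_used__'.
    """
    tree = ast.parse(source)
    loaded, bound = set(), set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Load):
                loaded.add(node.id)
            else:
                # Store or Del context counts as "assigned to".
                bound.add(node.id)
        elif isinstance(node, ast.Global):
            bound.update(node.names)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            bound.add(node.name)
        elif isinstance(node, ast.alias):
            # 'import a.b' binds 'a'; 'import a as b' binds 'b'.
            bound.add((node.asname or node.name).split(".")[0])
        elif isinstance(node, ast.arg):
            bound.add(node.arg)
        elif isinstance(node, ast.ExceptHandler) and node.name:
            bound.add(node.name)
    return sorted(name for name in loaded - bound if hasattr(builtins, name))


example = "def f(x):\n    return len(x)\nlist = 3\nprint(list)\n"
candidates = builtins_used(example)
```

For the example source, 'len' and 'print' are candidates for optimization, while 'list' is excluded because the module assigns to it.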
Note that if an extension module uses setattr to initialize itself, it will
not break, because it does not have a '__builtins_used__' attribute. Also
note that mere container packages will not break because they contain
modules or packages named after builtins. Only packages which actually do
something with the contained module, while also failing to bind that name,
will receive warnings.
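
The setattr rule can be sketched in pure Python with a 'types.ModuleType' subclass. The class name 'CheckedModule', the choice of AttributeError, and the default warning category are assumptions made for illustration. The sketch also assumes one reading of the bullet: a true '__builtins_optimized__' flag means an error, a false one means a warning, and a missing attribute means current behavior.

```python
import types
import warnings


class CheckedModule(types.ModuleType):
    """Hypothetical module type that guards names in '__builtins_used__'."""

    def __setattr__(self, name, value):
        ns = self.__dict__
        used = ns.get("__builtins_used__")
        # If either marker is missing, fall through to ordinary setattr,
        # so extension modules and old code are unaffected.
        if used is not None and "__builtins_optimized__" in ns and name in used:
            if ns["__builtins_optimized__"]:
                raise AttributeError(
                    "cannot rebind optimized builtin name %r" % name
                )
            warnings.warn(
                "setting %r shadows a builtin used by this module" % name,
                stacklevel=2,
            )
        super().__setattr__(name, value)


# A module without the markers behaves exactly as before.
plain = CheckedModule("plain")
plain.len = 1

# A module that opted in refuses to rebind a listed name.
strict = CheckedModule("strict")
strict.__builtins_used__ = ["len"]
strict.__builtins_optimized__ = True
try:
    strict.len = 1
    strict_result = "set"
except AttributeError:
    strict_result = "error"
```

Names that are not listed in '__builtins_used__' remain freely assignable in every case, so only the specific conflict is affected.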
Such modules can then simply add an explicit declaration, e.g. 'import .list' or
'global list', in the appropriate function(s), or use some similar approach
to clarify that the named item is a module. There would be some potential
pain here when new builtins are added, however, since previously-working
code could break.
There's a way to fix that too, but it may be a bit harsh. Issue a warning
for ambiguous use of global names that are *not* builtin, but are not
explicitly bound by the module. That is, if I use the name 'foo' in a
function, and it is not a local, and is not declared 'global' or explicitly
bound by module level code (i.e. because I am hacking globals() or because
the name is a submodule), I should be warned that I should explicitly
declare my intended usage. E.g. "Name 'foo' is never assigned a value:
perhaps you're missing a 'global' declaration or 'import' statement?"
The warning could be introduced as a PendingDeprecationWarning, upgraded to
a warning for modules using the __future__ statement. This would then
discourage writing such ambiguous code in future. (Oh, and speaking of
ambiguity, use of 'import *' would either have to be forbidden in optimized
modules, disable all optimization, or else use setattr and thus break at
runtime if there's a conflict with an optimized name.)
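
The ambiguous-name warning could be prototyped with the standard 'symtable' module, which reports whether a name in a nested scope is an implicit global. This is a sketch only: the function names 'ambiguous_globals' and 'warn_ambiguous' are hypothetical, and the check covers every nested scope (functions and class bodies alike) rather than functions alone.

```python
import builtins
import symtable
import warnings


def ambiguous_globals(source):
    """Return global names used in nested scopes that are neither builtin,
    declared 'global', nor bound by module-level code (hypothetical helper)."""
    top = symtable.symtable(source, "<string>", "exec")
    module_bound = {
        sym.get_name()
        for sym in top.get_symbols()
        if sym.is_assigned() or sym.is_imported()
    }
    found = set()

    def visit(table):
        for sym in table.get_symbols():
            implicit = sym.is_global() and not sym.is_declared_global()
            if sym.is_referenced() and implicit:
                name = sym.get_name()
                if name not in module_bound and not hasattr(builtins, name):
                    found.add(name)
        for child in table.get_children():
            visit(child)

    for child in top.get_children():
        visit(child)
    return sorted(found)


def warn_ambiguous(source):
    """Emit the suggested warning for each ambiguous name."""
    for name in ambiguous_globals(source):
        warnings.warn(
            "Name %r is never assigned a value: perhaps you're missing "
            "a 'global' declaration or 'import' statement?" % name,
            PendingDeprecationWarning,
        )


example = (
    "import os\n"
    "X = 1\n"
    "def f():\n"
    "    return foo + X + len(os.sep)\n"
    "def g():\n"
    "    global bar\n"
    "    return bar\n"
)
flagged = ambiguous_globals(example)
```

In the example, only 'foo' is flagged: 'X' and 'os' are bound at module level, 'len' is a builtin, and 'bar' is explicitly declared global.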
Whew. That's quite a list of things that would have to be done, but
presumably we'll have to pay the piper (pyper?) sometime if we want to get
to optimized builtin land someday.