[Python-Dev] Re: [Python-checkins] python/nondist/peps pep-0329.txt, 1.2, 1.3

Tue Apr 20 14:05:44 EDT 2004

At 09:50 AM 4/20/04 -0700, Guido van Rossum wrote:

>It is quite the opposite of the PEP!  The PEP proposes a quick, very
>visible hack that works only for one implementation; your proposal
>here lays the foundation for changing the language to enable the same
>kind of optimizations.
>
>I like that much better, but I doubt that it is doable in the
>timeframe for 2.4, nor do I think it is needed.  Also, your 4th bullet
>proposes exactly (except for the __future__ statement) what was
>implemented in moduleobject.c in rev 2.46 and then withdrawn in rev
>2.47; it is not feasible for a number of reasons (see python-dev for
>the gory details; I don't recall what they were, just that they were
>convincing).

Reviewing the problems, I see that the issues are with:

1) extension modules have a problem during initialization if they use setattr

2) modules and subpackages of a package that have names that shadow builtins

#1 is fixable, as long as there is some kind of flag for the module's 
state, which is needed in order to support the __future__ statement 
anyway.  I don't know how this would affect Jython or IronPython, but I 
don't think they really have "extension modules" as such.

#2 is harder, because it needs sane rules.  I don't think the parser should 
have to know about other modules.  But banning modules named after builtins 
isn't appropriate either.

OTOH, the only time this causes ambiguity is when the following conditions 
*all* apply:

  * The module is a package __init__

  * The module contains functions or methods that reference the builtin name

  * The module does not directly import the name

I'm tempted to say that this is broken code, except that it's possible for 
the module to import a module that then imports the module that has the 
conflicting name.
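
To make the failure mode concrete, here is a small sketch that builds a 
throwaway package on disk and shows a builtin being shadowed once a 
same-named submodule gets imported.  The package name 'shadowpkg' and the 
function 'make' are made up for illustration, and the code assumes a 
present-day Python with importlib and tempfile available.

```python
import importlib
import os
import sys
import tempfile


def demo():
    """Show a package __init__ losing the builtin 'list' to a submodule."""
    with tempfile.TemporaryDirectory() as root:
        pkg_dir = os.path.join(root, "shadowpkg")  # hypothetical package name
        os.mkdir(pkg_dir)
        # The package __init__ uses the builtin 'list' without binding it.
        with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
            f.write("def make(items):\n    return list(items)\n")
        # A submodule whose name happens to match the builtin.
        with open(os.path.join(pkg_dir, "list.py"), "w") as f:
            f.write("# empty submodule\n")

        sys.path.insert(0, root)
        try:
            importlib.invalidate_caches()
            pkg = importlib.import_module("shadowpkg")
            before = pkg.make((1, 2))  # resolves to the builtin
            # Importing the submodule binds it as the attribute
            # 'list' on the package, i.e. in make()'s globals.
            importlib.import_module("shadowpkg.list")
            try:
                pkg.make((1, 2))
                after = "still the builtin"
            except TypeError:  # a module object is not callable
                after = "now the submodule"
            return before, after
        finally:
            sys.path.remove(root)
            sys.modules.pop("shadowpkg.list", None)
            sys.modules.pop("shadowpkg", None)


print(demo())
```

Note that the import of the submodule can happen anywhere in the program, 
which is why the package author may never see the conflict in testing.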

But I believe I have a solution.  See below.

>The __future__ statement sounds like an excellent idea to me, as it
>enables experimentation with the new feature.  One thing: we need to
>specify the future behavior very carefully so that other Python
>implementations will be able to do the right thing without having to
>reverse-engineer CPython.

Here is my proposal for the semantics of "optimized builtins":

* The compiler identifies names in a module that are builtin names (as 
defined by the language version), but are never assigned to (or otherwise 
declared global) in the module.  It adds code at the beginning of the 
module that sets a module-level variable, let's say '__builtins_used__', to 
list those names whose use it *may* be able to optimize.  (Note that this 
step applies to all modules compiled, not just those with the __future__ 
statement, and there is some additional complexity needed in the generated 
code to correctly handle being 'exec'-d in an existing namespace.)

* If the __future__ statement is in effect, also set a 
'__builtins_optimized__' flag in the module dictionary, and actually 
implement any desired optimizations.

* module.__setattr__ either warns or issues an error when setting a name 
listed in '__builtins_used__', depending on the status of 
'__builtins_optimized__'.  If either is missing, the current 
(backward-compatible) semantics of setattr should apply.
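
As a rough illustration of the first bullet, the name analysis could look 
something like the following.  This is only a sketch using the ast module 
of present-day Python, not a description of any actual compiler change; the 
function name 'builtins_used' is invented here, and the analysis is 
deliberately conservative: a builtin name bound anywhere in the source, even 
as a local, is left out of the list.  It also ignores a few binding forms 
(e.g. 'nonlocal' and pattern matching).

```python
import ast
import builtins


def builtins_used(source):
    """Return builtin names the source reads but never binds or declares."""
    tree = ast.parse(source)
    builtin_names = set(dir(builtins))
    loaded = set()
    bound = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Load):
                loaded.add(node.id)
            else:  # Store or Del context
                bound.add(node.id)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef,
                               ast.ClassDef)):
            bound.add(node.name)
        elif isinstance(node, ast.arg):
            bound.add(node.arg)
        elif isinstance(node, ast.Global):
            bound.update(node.names)
        elif isinstance(node, ast.ExceptHandler) and node.name:
            bound.add(node.name)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                # 'import a.b' binds 'a'; 'import a as c' binds 'c'.
                bound.add((alias.asname or alias.name).split(".")[0])
    return sorted((loaded & builtin_names) - bound)


example = (
    "def f(x):\n"
    "    return len(x) + max(x)\n"
    "list = 3\n"
    "def g():\n"
    "    return list\n"
)
print(builtins_used(example))  # ['len', 'max']
```

Here 'list' is excluded because the module assigns to it, so only 'len' and 
'max' would be candidates for '__builtins_used__'.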

Note that if an extension module uses setattr to initialize itself, it will 
not break, because it does not have a '__builtins_used__' attribute.  Also 
note that mere container packages will not break because they contain 
modules or packages named after builtins.  Only packages which actually do 
something with the contained module, while also failing to bind that name, 
will receive warnings.
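
The setattr rule can be mocked up in pure Python with a module subclass.  
This is a sketch of one reading of the rule, not the moduleobject.c change 
itself: the class name 'CheckedModule' is hypothetical, and it assumes that a 
true '__builtins_optimized__' means "error", a false one means "warn", and a 
missing flag or list means today's behavior.

```python
import types
import warnings


class CheckedModule(types.ModuleType):
    """Module type that guards names listed in __builtins_used__."""

    def __setattr__(self, name, value):
        used = self.__dict__.get("__builtins_used__")
        optimized = self.__dict__.get("__builtins_optimized__")
        # If either marker is missing, fall through to plain setattr.
        if used is not None and optimized is not None and name in used:
            msg = "module %r rebinds builtin name %r" % (self.__name__, name)
            if optimized:
                raise AttributeError(msg)
            warnings.warn(msg, stacklevel=2)
        super().__setattr__(name, value)


ext = CheckedModule("ext")         # stands in for an extension module
ext.len = 1                        # no __builtins_used__: allowed silently

mod = CheckedModule("mod")
mod.__builtins_used__ = ["len"]
mod.__builtins_optimized__ = False
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    mod.len = 2                    # warns, but the assignment still happens

mod.__builtins_optimized__ = True
try:
    mod.len = 3                    # optimized module: rejected
    rejected = False
except AttributeError:
    rejected = True
```

A module object without the two marker attributes never reaches the check, 
which is the property that keeps self-initializing extension modules working.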

Such modules can then simply add an explicit declaration, e.g. 'import .list' 
or 'global list', in the appropriate function(s), or use some similar approach 
to clarify that the named item is a module.  There would be some potential 
pain here when new builtins are added, however, since previously-working 
code could break.

There's a way to fix that too, but it may be a bit harsh.  Issue a warning 
for ambiguous use of global names that are *not* builtin, but are not 
explicitly bound by the module.  That is, if I use the name 'foo' in a 
function, and it is not a local, and is not declared 'global' or explicitly 
bound by module level code (i.e. because I am hacking globals() or because 
the name is a submodule), I should be warned that I should explicitly 
declare my intended usage.  E.g. "Name 'foo' is never assigned a value: 
perhaps you're missing a 'global' declaration or 'import' statement?"
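
A check along these lines can be approximated today with the symtable 
module.  The sketch below is only meant to show which names would trigger 
the warning; the helper names 'ambiguous_globals' and 'warning_text' are 
invented, and treating only module-level assignments and imports as 
"explicitly bound" is an assumption about how the rule would be drawn.

```python
import builtins
import symtable


def ambiguous_globals(source, filename="<module>"):
    """Find non-builtin global names that nested scopes use but that the
    module never binds at top level or declares 'global'."""
    top = symtable.symtable(source, filename, "exec")
    explicit = {s.get_name() for s in top.get_symbols()
                if s.is_assigned() or s.is_imported()}
    builtin_names = set(dir(builtins))
    found = set()

    def visit(table):
        for child in table.get_children():
            for sym in child.get_symbols():
                name = sym.get_name()
                if (sym.is_referenced() and sym.is_global()
                        and not sym.is_declared_global()
                        and name not in explicit
                        and name not in builtin_names):
                    found.add(name)
            visit(child)

    visit(top)
    return sorted(found)


def warning_text(name):
    return ("Name %r is never assigned a value: perhaps you're missing "
            "a 'global' declaration or 'import' statement?" % name)


example = (
    "import os\n"
    "limit = 10\n"
    "def f():\n"
    "    return foo.bar(limit, os.sep, len('x'))\n"
    "def g():\n"
    "    global baz\n"
    "    return baz\n"
)
for name in ambiguous_globals(example):
    print(warning_text(name))
```

In the example only 'foo' is reported: 'limit' and 'os' are bound at module 
level, 'len' is a builtin, and 'baz' carries an explicit 'global'.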

The warning could be introduced as a PendingDeprecationWarning, upgraded to 
a warning for modules using the __future__ statement.  This would then 
discourage writing such ambiguous code in future.  (Oh, and speaking of 
ambiguity, 'import *' would have to be forbidden in optimized modules, or 
disable all optimization, or else use setattr and thus break at runtime if 
there's a conflict with an optimized name.)

Whew.  That's quite a list of things that would have to be done, but 
presumably we'll have to pay the piper (pyper?) sometime if we want to get 
to optimized builtin land someday.