[Python-ideas] Optimizing builtins
Guido van Rossum
guido at python.org
Fri Dec 31 22:51:41 CET 2010
On Fri, Dec 31, 2010 at 11:59 AM, Michael Foord
<fuzzyman at voidspace.org.uk> wrote:
>
>
> On 31 December 2010 18:49, Guido van Rossum <guido at python.org> wrote:
>>
>> [Changed subject *and* list]
>>
>> > 2010/12/31 Maciej Fijalkowski <fijall at gmail.com>
>> >> How do you know that range is a builtin you're thinking
>> >> about and not some other object?
>>
>> On Fri, Dec 31, 2010 at 7:02 AM, Cesare Di Mauro
>> <cesare.di.mauro at gmail.com> wrote:
>> > By a special opcode which could do this work. ]:-)
>>
>> That can't be the answer, because then the question would become "how
>> does the compiler know it can use the special opcode". This particular
>> issue (generating special opcodes for certain builtins) has actually
>> been discussed many times before. Alas, given Python's extremely
>> dynamic promises it is very hard to do it in a way that is
>> *guaranteed* not to change the semantics. For example, I could have
>> replaced builtins['range'] with something else; or I could have
>> inserted a variable named 'range' into the module's __dict__. (Note
>> that I am not talking about just creating a global variable named
>> 'range' in the module; those the compiler could recognize. I am
>> talking about interceptions that a compiler cannot see, assuming it
>> compiles each module independently, i.e. without whole-program
>> optimizations.)
>>
>> Now, *in practice* such manipulations are rare
>
> Actually range is the one I've seen *most* overridden, not in order to
> replace functionality but because range is such a useful (or relevant)
> variable name in all sorts of circumstances...
No, you're misunderstanding. I was not referring to overriding a
name using Python's regular syntax for defining names. If you set a
(global or local) variable named 'range', the compiler is perfectly
capable of noticing. E.g.:

    range = 42
    def foo():
        for i in range(10): print(i)

While this will of course fail with a TypeError if you try to execute
it, a (hypothetical) optimizing compiler would have no trouble
noticing that the 'range' in the for-loop must refer to the global
variable of that name, not to the builtin of the same name.
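To make that concrete, here is a minimal sketch of the kind of check such a hypothetical compiler could do, using the standard library's symtable module. The helper name bound_at_module_level and the two source strings are made up for illustration; a real optimizer would need to handle far more cases than this.

```python
import symtable

# The example above: 'range' is rebound with ordinary syntax.
SHADOWED = "range = 42\ndef foo():\n    for i in range(10): print(i)\n"

# A module that merely *uses* the builtin.
INNOCENT = "def f():\n    for i in range(10): print(i)\n"


def bound_at_module_level(source, name):
    """Return True if `name` is assigned at the top level of `source`.

    Hypothetical helper: it only inspects the one module it is given,
    which is all a per-module compiler can see.
    """
    top = symtable.symtable(source, "<example>", "exec")
    try:
        return top.lookup(name).is_assigned()
    except KeyError:
        # The name does not appear in the module-level symbol table at all.
        return False


shadowed_result = bound_at_module_level(SHADOWED, "range")
innocent_result = bound_at_module_level(INNOCENT, "range")
print(shadowed_result, innocent_result)
```

The check is purely static: it says nothing about what other modules may do to this module's namespace later.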
I was referring to an innocent module containing a use of the builtin
range function, e.g.

    # a.py
    def f():
        for i in range(10): print(i)

which is imported by another module which manipulates a's globals, for example:

    # b.py
    import a
    a.range = 42
    a.f()

The compiler has no way to notice this when a.py is being compiled.
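As a rough illustration of why, the sketch below compiles the body of a.py from a string (A_SOURCE is just a stand-in for the file) and inspects the resulting code object. All that is recorded for 'range' is a name to be looked up when f() runs; nothing ties it to the builtin.

```python
# Stand-in for the contents of a.py.
A_SOURCE = "def f():\n    for i in range(10): print(i)\n"

namespace = {}
exec(compile(A_SOURCE, "a.py", "exec"), namespace)
f_code = namespace["f"].__code__

# Names resolved at run time (module globals first, then builtins).
print(f_code.co_names)
# Names the compiler knows to be local to f().
print(f_code.co_varnames)
```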
Variants of "hiding" a mutation like this include:

    a.__dict__['range'] = 42

or

    import builtins
    builtins.range = 42

and of course for more fun you can make it more dynamic (think
obfuscated code contests).
--
--Guido van Rossum (python.org/~guido)