[Python-ideas] Optimizing builtins

Guido van Rossum guido at python.org
Fri Dec 31 22:51:41 CET 2010


On Fri, Dec 31, 2010 at 11:59 AM, Michael Foord
<fuzzyman at voidspace.org.uk> wrote:
>
>
> On 31 December 2010 18:49, Guido van Rossum <guido at python.org> wrote:
>>
>> [Changed subject *and* list]
>>
>> > 2010/12/31 Maciej Fijalkowski <fijall at gmail.com>
>> >> How do you know that range is a builtin you're thinking
>> >> about and not some other object?
>>
>> On Fri, Dec 31, 2010 at 7:02 AM, Cesare Di Mauro
>> <cesare.di.mauro at gmail.com> wrote:
>> > By a special opcode which could do this work. ]:-)
>>
>> That can't be the answer, because then the question would become "how
>> does the compiler know it can use the special opcode". This particular
>> issue (generating special opcodes for certain builtins) has actually
>> been discussed many times before. Alas, given Python's extremely
>> dynamic promises it is very hard to do it in a way that is
>> *guaranteed* not to change the semantics. For example, I could have
>> replaced builtins['range'] with something else; or I could have
>> inserted a variable named 'range' into the module's __dict__. (Note
>> that I am not talking about just creating a global variable named
>> 'range' in the module; those the compiler could recognize. I am
>> talking about interceptions that a compiler cannot see, assuming it
>> compiles each module independently, i.e. without whole-program
>> optimizations.)
>>
>> Now, *in practice* such manipulations are rare
>
> Actually range is the one I've seen *most* overridden, not in order to
> replace functionality but because range is such a useful (or relevant)
> variable name in all sorts of circumstances...

No, you're misunderstanding. I was not referring to overriding a
name using Python's regular syntax for defining names. If you set a
(global or local) variable named 'range', the compiler is perfectly
capable of noticing. E.g.:

  range = 42
  def foo():
    for i in range(10): print(i)

While this will of course fail with a TypeError if you try to execute
it, a (hypothetical) optimizing compiler would have no trouble
noticing that the 'range' in the for-loop must refer to the global
variable of that name, not to the builtin of the same name.
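
A minimal sketch of that kind of check, using the standard ast module.
The helper name visibly_binds is hypothetical, and the check is much
cruder than a real compiler's symbol table: it ignores scoping and
"from m import *", and only asks whether the name is bound anywhere in
the source text.

```python
import ast

def visibly_binds(source, name):
    """Return True if `source` contains a syntactic binding of `name`.

    Hypothetical helper: it ignores scoping and `from m import *`,
    so it only approximates what a compiler could see.
    """
    for node in ast.walk(ast.parse(source)):
        # Assignment, for-loop and with-statement targets.
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            if node.id == name:
                return True
        # def / class statements bind their own name.
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            if node.name == name:
                return True
        # Function parameters.
        elif isinstance(node, ast.arg):
            if node.arg == name:
                return True
        # "import x.y" binds "x"; "import x as z" binds "z".
        elif isinstance(node, ast.alias):
            if (node.asname or node.name.split(".")[0]) == name:
                return True
    return False

SHADOWED = "range = 42\ndef foo():\n    for i in range(10): print(i)\n"
INNOCENT = "def f():\n    for i in range(10): print(i)\n"

shadowed_result = visibly_binds(SHADOWED, "range")
innocent_result = visibly_binds(INNOCENT, "range")
```

Here shadowed_result is True and innocent_result is False: the
rebinding is visible in the first source and absent from the second.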

I was referring to an innocent module containing a use of the builtin
range function, e.g.

  # a.py
  def f():
    for i in range(10): print(i)

which is imported by another module that manipulates a's globals, for example:

  # b.py
  import a
  a.range = 42
  a.f()

The compiler has no way to notice this when a.py is being compiled.
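
The a.py / b.py scenario can be reproduced in a single script. This is
only a sketch: the module object is built by hand with types.ModuleType
so that no files are needed, and f returns a list instead of printing
so the result can be inspected.

```python
import types

# Stand-in for a.py: an innocent use of the builtin range.
A_SOURCE = """
def f():
    return list(range(3))
"""

# Compile "a.py" on its own; nothing in its source rebinds range.
a = types.ModuleType("a")
exec(compile(A_SOURCE, "a.py", "exec"), a.__dict__)

before = a.f()  # range resolves to the builtin here

# Stand-in for b.py: rebind range in a's globals from the outside.
a.range = 42

error = None
try:
    a.f()  # range now resolves to 42, which is not callable
except TypeError as exc:
    error = exc
```

The first call returns [0, 1, 2]; the second raises TypeError, even
though the compiled code of f is identical in both calls.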

Variants of "hiding" a mutation like this include:

  a.__dict__['range'] = 42

or

  import builtins
  builtins.range = 42

and of course for more fun you can make it more dynamic (think
obfuscated code contests).
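
The builtins variant can be tried in the same way. A minimal sketch:
the helper call_with_patched_range is hypothetical, and it restores the
original builtin afterwards so the rest of the process is unaffected.

```python
import builtins

def f():
    return list(range(3))

def call_with_patched_range(replacement):
    """Call f() while builtins.range is temporarily replaced."""
    original = builtins.range
    builtins.range = replacement
    try:
        return f()
    finally:
        # Always put the real builtin back.
        builtins.range = original

# The replacement must not use range itself, so slice a string instead.
patched = call_with_patched_range(lambda n: "xyz"[:n])
restored = f()
```

Here patched is ['x', 'y', 'z'] and restored is [0, 1, 2]; nothing in
the source of f gives a hint that the two calls behave differently.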

-- 
--Guido van Rossum (python.org/~guido)


