Question about math.pi is mutable

Mon Nov 9 21:37:38 EST 2015

On Tue, 10 Nov 2015 06:45 am, Ben Finney wrote:

> Steven D'Aprano <steve at pearwood.info> writes:
> 
>> The compiler doesn't need to decide in advance whether or not the
>> module attributes have been changed. It can decide that at runtime,
>> just before actually looking up the attribute. In pseudo-code:
>>
>>     if attribute might have changed:
>>         use the slow path just like today
>>     else:
>>         use the optimized fast path
> 
> As you have pointed out earlier, the “attribute might have changed”
> condition is set by *any* non-trivial code — notably, a function
> call, though that doesn't exhaust the ways of setting that condition.

Ben, I fear that you are not paying attention to me :-)

The compiler doesn't need to decide *in advance* whether the attribute might
have changed. It knows whether it has changed or not *at runtime*.

I'm not a compiler writer, but I pretend to be one on Usenet *wink* so don't
take this as gospel. Treat it as a simple-minded illustration of what sort
of thing a JIT compiler could do.

It's one thing to say that *in principle* any function might modify or
shadow builtins. That's true, because we don't know what's inside the
function. But the compiler knows, because it actually executes the code
inside the function and can see what happens when it does. It doesn't have
to predict in advance whether or not calling `func(x)` shadows the builtin
`len` function, *it can see for itself* whether it did or not.

At compile time, `func(x)` might do anything. But at runtime, we know
exactly what it did, because it just did it.

Imagine that the compiler keeps track of whether or not builtins has been
modified. Think of it as a simple "dirty" flag that says "yes, builtins is
still pristine" or "no, something may have shadowed or modified the
builtins". That's fairly straightforward: the builtins namespace is a
dict, and the compiler can tell whether or not __setitem__ etc. has been
called on that dict. Likewise, it can keep track of whether or not a global
has been created that shadows a builtin: some of that can be done
statically, at compile-time, but most of it needs to be done dynamically,
at runtime.
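
To make the "dirty" flag concrete, here is a minimal pure-Python sketch of a
dict that records whether anything has written to it. The class name
TrackedDict and the fake_builtins namespace are hypothetical, invented for
illustration; a real implementation would do this inside the interpreter, not
in a dict subclass, and this sketch does not claim to cover every mutating
method.

```python
class TrackedDict(dict):
    """A dict that sets a 'dirty' flag on mutation (hypothetical helper)."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.dirty = False  # pristine until something writes to it

    def _mark(self):
        self.dirty = True

    def __setitem__(self, key, value):
        self._mark()
        super().__setitem__(key, value)

    def __delitem__(self, key):
        self._mark()
        super().__delitem__(key)

    def update(self, *args, **kwargs):
        self._mark()
        super().update(*args, **kwargs)

    def pop(self, *args):
        # Conservative: marks dirty even if the key turns out to be missing.
        self._mark()
        return super().pop(*args)

    def popitem(self):
        self._mark()
        return super().popitem()

    def setdefault(self, key, default=None):
        self._mark()
        return super().setdefault(key, default)

    def clear(self):
        self._mark()
        super().clear()


# A stand-in for the builtins namespace (an assumption, not the real one).
fake_builtins = TrackedDict(len=len, abs=abs)
lookup = fake_builtins["len"]              # reads do not dirty the flag
assert not fake_builtins.dirty
fake_builtins["len"] = lambda obj: 42      # a write does
assert fake_builtins.dirty
```

The flag only ever goes from clean to dirty here, which errs on the side of
the slow path: a false "dirty" costs some speed, never correctness.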

If the flag is set, the compiler knows that the optimization is unsafe and
it has to follow the standard name lookup, and you lose nothing: the
standard Python semantics are still followed. But if the flag is clear, the
compiler knows that nothing has shadowed or modified builtins, and a whole
class of optimizations is safe. It can replace a call to (say) len(x) with
a fast jump, avoiding an unnecessary name lookup in globals, and another
unnecessary name lookup in builtins. Or it might even inline the call to
len. Since *most* code doesn't play tricks with builtins, the overhead of
tracking these changes pays off *most* of the time -- and when it doesn't,
the penalty is very small.
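
A toy model of that guarded call site, again in plain Python. Everything here
is an assumption made for illustration: the GuardedLen class, its invalidate()
method and the two namespace dicts are hypothetical names, and the guard is
flipped by hand rather than by the namespace itself. It is not how PyPy or any
real JIT is implemented, only the shape of the idea.

```python
class GuardedLen:
    """Toy call site for len(x) with a runtime guard (hypothetical)."""

    def __init__(self, global_ns, builtin_ns):
        self.global_ns = global_ns
        self.builtin_ns = builtin_ns
        self.cached = builtin_ns["len"]  # resolved once, while pristine
        self.dirty = False
        self.fast_hits = 0
        self.slow_hits = 0

    def invalidate(self):
        # Something may have shadowed or modified len: stop trusting the cache.
        self.dirty = True

    def __call__(self, obj):
        if self.dirty:
            # Slow path: the standard lookup, globals first, then builtins.
            self.slow_hits += 1
            try:
                func = self.global_ns["len"]
            except KeyError:
                func = self.builtin_ns["len"]
            return func(obj)
        # Fast path: no name lookups at all, just call the cached function.
        self.fast_hits += 1
        return self.cached(obj)


globals_ns = {}
builtins_ns = {"len": len}
call_len = GuardedLen(globals_ns, builtins_ns)

assert call_len([1, 2, 3]) == 3          # fast path

globals_ns["len"] = lambda obj: -1       # shadow the builtin...
call_len.invalidate()                    # ...and trip the guard
assert call_len([1, 2, 3]) == -1         # slow path, normal semantics
```

The check costs one boolean test per call, which is the "very small penalty"
paid by code that does play tricks with builtins.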

Depending on how smart the compiler is, there are all sorts of things it can
do. The state of the art (not bleeding edge) for JIT compilers is pretty
smart these days. CPython is a simple-minded, dumb compiler, and that's the
way Guido likes it (it's the reference implementation, not the fastest or
most memory efficient implementation). But PyPy can approach the speed of
statically optimized C, at least sometimes, and certainly can beat CPython
by an order of magnitude. Likewise JavaScript's V8 compiler.

> So the remaining space of code that is safe for the proposed
> optimisation is trivially small. Why bother with such optimisations, if
> the only code that can benefit is *already* small and simple?

That is absolutely not correct.

-- 
Steven