[Python-ideas] Really support custom types for global namespace

Thu Jun 19 21:26:00 CEST 2014

On Wed, Jun 18, 2014 at 5:25 AM, Robert Lehmann <mail at robertlehmann.de> wrote:
> The interpreter currently supports setting a custom type for globals() and
> overriding __getitem__.  The same is not true for __setitem__:
>
> import dis
>
> class Namespace(dict):
>     def __getitem__(self, key):
>         print("getitem", key)
>     def __setitem__(self, key, value):
>         print("setitem", key, value)
>
> def fun():
>     global x, y
>     x  # should call globals.__getitem__
>     y = 1  # should call globals.__setitem__
>
> dis.dis(fun)
> #  3           0 LOAD_GLOBAL              0 (x)
> #              3 POP_TOP
> #
> #  4           4 LOAD_CONST               1 (1)
> #              7 STORE_GLOBAL             1 (y)
> #             10 LOAD_CONST               0 (None)
> #             13 RETURN_VALUE
>
> exec(fun.__code__, Namespace())
> # => getitem x
> # no setitem :-(
>
> I find it odd that reading global variables goes through the usual
> magic methods just fine, while writing does not.  The behaviour seems to
> have been introduced in Python 3.3.x (commit e3ab8aa) to support custom
> __builtins__.  The documentation is fuzzy on this issue:
>
>> If only globals is provided, it must be a dictionary, which will be used
>> for both the global and the local variables. If globals and locals are
>> given, they are used for the global and local variables, respectively. If
>> provided, locals can be any mapping object.

"it must be a dictionary" implies to me the exclusion of subclasses.
Keep in mind that subclassing core builtin types (like dict) is
generally not a great idea and overriding methods there is definitely
a bad idea.  A big part of this is due to an implementation detail of
CPython: the use of the concrete C API, especially for dict.  The
concrete API is useful for performance, but it isn't subclass-friendly
(re: overridden methods) in the least.
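
As a small illustration of that last point (a sketch, not something from
the original report): on CPython, the dict constructor, dict.update() and
dict.setdefault() all fill the mapping without going through an
overridden __setitem__.  The LoggingDict name is made up for the example.

```python
class LoggingDict(dict):
    """Hypothetical dict subclass that records every __setitem__ call."""

    def __init__(self, *args, **kwargs):
        self.log = []
        super().__init__(*args, **kwargs)

    def __setitem__(self, key, value):
        self.log.append(key)
        super().__setitem__(key, value)


d = LoggingDict(a=1)      # constructor: the override is not called
d["b"] = 2                # subscript assignment: goes through the override
d.update(c=3)             # dict.update: bypasses the override on CPython
d.setdefault("d", 4)      # dict.setdefault: bypasses it as well

print(d.log)              # only the subscript assignment is recorded
print(sorted(d))          # all four keys are present regardless
```

All four keys end up in the dict, but only the explicit d["b"] = 2 is
seen by the override.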

> People at python-list were at odds over whether this was a bug,
> unspecified/unsupported behaviour, or a deliberate design decision.

I'd lean toward unspecified behavior, though (again) the docs imply to
me that using anything other than dict isn't guaranteed to work right.

So I'd consider this a proposal to add a slow path to STORE_GLOBAL
that supports dict subclasses with overridden __setitem__() and to
explicitly indicate support for get/set in the docs for exec().
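
To make the idea concrete, here is a rough Python model of what such a
slow path would do.  This is only a sketch: the real change would live in
the C code of the eval loop, and store_global() is a hypothetical helper
name, not an existing API.

```python
def store_global(globals_ns, name, value):
    """Hypothetical Python model of STORE_GLOBAL with a slow path."""
    if type(globals_ns) is dict:
        # Fast path: an exact dict, analogous to PyDict_CheckExact().
        dict.__setitem__(globals_ns, name, value)
    else:
        # Slow path: honor an overridden __setitem__ on a dict subclass.
        globals_ns[name] = value


class Namespace(dict):
    """Dict subclass that records the keys passed to __setitem__."""

    def __init__(self):
        super().__init__()
        self.calls = []

    def __setitem__(self, key, value):
        self.calls.append(key)
        super().__setitem__(key, value)


plain = {}
custom = Namespace()
store_global(plain, "y", 1)    # exact dict: stored directly
store_global(custom, "y", 1)   # subclass: routed through __setitem__

print(plain, dict(custom), custom.calls)
```

The exact-type check keeps the common case cheap; only namespaces that
are not plain dicts pay for the generic item assignment.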

To be honest, I'm not sold on the idea.  There are subtleties involved
here that make messing around with exec a high-risk endeavor,
requiring sufficient justification.  What's the use case here?

Also, is this exec-specific?  Consider the case of class definitions
and that the namespace in which they are executed can be customized
via __prepare__() on the metaclass.  I could be wrong, but I'm
pretty sure you don't run into the problem there.  So there may be
more to the story here.
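
For comparison, a minimal sketch of the class-body case, assuming a
metaclass whose __prepare__() returns a dict subclass.  The names
RecordingNamespace and Meta are invented for the example.  Names bound
in the class body do reach the overridden __setitem__ here.

```python
class RecordingNamespace(dict):
    """Hypothetical class-body namespace that records assigned names."""

    def __init__(self):
        super().__init__()
        self.assigned = []

    def __setitem__(self, key, value):
        self.assigned.append(key)
        super().__setitem__(key, value)


class Meta(type):
    @classmethod
    def __prepare__(mcls, name, bases, **kwargs):
        # The class body is executed with this mapping as its namespace.
        return RecordingNamespace()

    def __new__(mcls, name, bases, namespace, **kwargs):
        cls = super().__new__(mcls, name, bases, dict(namespace))
        # Keep only the user-level names; the interpreter also stores
        # dunder entries such as __module__ and __qualname__.
        cls.assigned = [k for k in namespace.assigned
                        if not k.startswith("__")]
        return cls


class Example(metaclass=Meta):
    a = 1
    b = 2


print(Example.assigned)
```

Assignments in a class body do not use STORE_GLOBAL, which would explain
why the override is honored there.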

>  If it
> is just unsupported, I don't think the asymmetry makes it any better.  If it
> is deliberate, I don't understand why dispatching on the dictness of globals
> (PyDict_CheckExact(f_globals)) is good enough for LOAD_GLOBAL, but not for
> STORE_GLOBAL in terms of performance.
>
> I have a patch (+ tests) to the current default branch straightening out
> this asymmetry and will happily open a ticket if you think this is indeed a
> bug.

Definitely open a ticket (and reply here with a link).

-eric