[Cython] Speedup module-level lookup

Vitja Makarov vitja.makarov at gmail.com
Sat Jan 21 19:43:28 CET 2012


2012/1/21 Chris Colbert <sccolbert at gmail.com>:
>
>
> On Sat, Jan 21, 2012 at 2:35 AM, Vitja Makarov <vitja.makarov at gmail.com> wrote:
>>
>> 2012/1/21 Stefan Behnel <stefan_ml at behnel.de>:
>> > Chris Colbert, 19.01.2012 09:18:
>> >> If it doesn't pass PyDict_CheckExact you won't be able to use it as the
>> >> globals to eval or exec.
>> >
>> > What makes you say that? I tried and it worked for me, all the way back
>> > to Python 2.4:
>> >
>> > --------------------
>> > Python 2.4.6 (#2, Jan 21 2010, 23:45:25)
>> > [GCC 4.4.1] on linux2
>> > Type "help", "copyright", "credits" or "license" for more information.
>> >>>> class MyDict(dict): pass
>> >>>> eval('1+1', MyDict())
>> > 2
>> >>>> exec '1+1' in MyDict()
>> >>>>
>> > --------------------
>> >
>> > I only see a couple of calls to PyDict_CheckExact() in CPython's sources
>> > and they usually seem to be related to special casing for performance
>> > reasons. Nothing that should impact a module's globals.
>> >
>> > Besides, Cython controls its own language usages of eval and exec.
>> >
>>
>> Cool!
>> It seems that Python internally uses PyObject_GetItem() for module-level
>> lookups and not PyDict_GetItem().
>> By the way, we use __Pyx_GetName(), which calls PyObject_GetAttr(), and
>> that isn't exactly the same as a module-level lookup:
>>
>> # Works in Cython and doesn't work in Python
>> print __class__
>>
>> So we can override __getitem__() and __setitem__():
>>
>> class MyDict(dict):
>>    # Keep the data in a separate plain dict so that every access
>>    # has to go through the overridden methods.
>>    def __init__(self):
>>        self._dict = {}
>>
>>    def __getitem__(self, key):
>>        print '__getitem__', key
>>        return self._dict[key]
>>
>>    def __setitem__(self, key, value):
>>        print '__setitem__', key, value
>>        self._dict[key] = value
>>
>>    def __getattr__(self, key):
>>        # Only reached for attributes not found the normal way.
>>        print '__getattr__'
>>
>> d = MyDict()
>> exec('x = 1; print x', d)
>> eval('x', d)
>>
>> $ python foo.py
>> __setitem__ x 1
>> __getitem__ x
>> 1
>> __getitem__ x
>>
>>
>> So we can make globals() return a special dict with custom
>> __setitem__()/__getitem__(). But it seems that we'll have to override
>> many of dict's standard methods, such as values() and update(), as well.
>> That would be hard.
>>
>>
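
The point about having to override many standard methods can be seen in
a small sketch. It is written for Python 3 (an assumption; the code in
this thread is Python 2), and TracingDict is a hypothetical name used
only for illustration. It suggests that dict.update(), dict.setdefault()
and dict.get() do not route through an overridden __setitem__() or
__getitem__():

```python
# Sketch for Python 3 (assumption: the thread itself uses Python 2).
# TracingDict is a hypothetical helper, not part of Cython or CPython.
class TracingDict(dict):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.log = []  # records which overridden methods were hit

    def __getitem__(self, key):
        self.log.append(('get', key))
        return super().__getitem__(key)

    def __setitem__(self, key, value):
        self.log.append(('set', key))
        super().__setitem__(key, value)


d = TracingDict()
d['x'] = 1             # goes through __setitem__
d.update(y=2)          # stored by dict.update() without calling __setitem__
d.setdefault('z', 3)   # stored by dict.setdefault() without calling __setitem__
_ = d['x']             # goes through __getitem__
_ = d.get('y')         # dict.get() does not call __getitem__

trace = list(d.log)
print(trace)
print(sorted(d))
```

Only the two subscript operations show up in the log, although all three
keys end up stored, so each of those methods would need its own override.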
>
> Be careful. That only works because your dict subclass is being used as the
> locals as well. The LOAD_NAME opcode does a PyDict_CheckExact on the locals
> and calls PyDict_GetItem if the check passes, PyObject_GetItem otherwise:
>
> case LOAD_NAME:
>             w = GETITEM(names, oparg);
>             if ((v = f->f_locals) == NULL) {
>                 PyErr_Format(PyExc_SystemError,
>                              "no locals when loading %s",
>                              PyObject_REPR(w));
>                 why = WHY_EXCEPTION;
>                 break;
>             }
>             if (PyDict_CheckExact(v)) {
>                 x = PyDict_GetItem(v, w);
>                 Py_XINCREF(x);
>             }
>             else {
>                 x = PyObject_GetItem(v, w);
>                 if (x == NULL && PyErr_Occurred()) {
>                     if (!PyErr_ExceptionMatches(
>                                     PyExc_KeyError))
>                         break;
>                     PyErr_Clear();
>                 }
>             }
>
>
> You can see that the dict subclassing breaks down when you pass an empty
> dict as the locals:
>
> In [1]: class Foo(dict):
>    ...:     def __getitem__(self, name):
>    ...:         print 'get', name
>    ...:         return super(Foo, self).__getitem__(name)
>    ...:
>
> In [2]: f = Foo(a=42)
>
> In [3]: eval('a', f)
> get a
> Out[3]: 42
>
> In [4]: eval('a', f, {})
> Out[4]: 42
>
>

Nice catch! It seems that globals MUST be a real dict.
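
For reference, a rough Python 3 version of the experiment quoted above
(Python 3 is an assumption here, the session in the quote is Python 2;
LoggingDict and the two helper functions are hypothetical names). It
records lookups in a list instead of printing them:

```python
# Sketch for Python 3 (assumption: the quoted session is Python 2).
# LoggingDict and both helper functions are hypothetical names.
class LoggingDict(dict):
    """dict subclass that records every key requested via __getitem__."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.seen = []

    def __getitem__(self, key):
        self.seen.append(key)
        return super().__getitem__(key)


def lookup_via_locals():
    # Only globals is given, so locals defaults to the same object and
    # the name lookup goes through the mapping protocol on the subclass.
    g = LoggingDict(a=42)
    value = eval('a', g)
    return value, list(g.seen)


def lookup_via_globals_only():
    # Locals is a separate empty dict, so the name is found by the
    # fallback lookup in globals, which does not call __getitem__.
    g = LoggingDict(a=42)
    value = eval('a', g, {})
    return value, list(g.seen)


print(lookup_via_locals())
print(lookup_via_globals_only())
```

Both calls return 42, but only the first one records a lookup of 'a'.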

>>> help(eval)
eval(...)
    eval(source[, globals[, locals]]) -> value

    Evaluate the source in the context of globals and locals.
    The source may be a string representing a Python expression
    or a code object as returned by compile().
    The globals must be a dictionary and locals can be any mapping,
    defaulting to the current globals and locals.
    If only globals is given, locals defaults to it.
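
The "globals must be a dictionary and locals can be any mapping" part
can be checked with a short sketch. This again assumes Python 3, and
collections.ChainMap is just one convenient example of a mapping that is
not a dict subclass:

```python
# Sketch for Python 3 (assumption: the help text above is from Python 2).
from collections import ChainMap

# ChainMap is a mapping but not a dict subclass.
local_names = ChainMap({'a': 40}, {'b': 2})

# Accepted as locals: names are resolved through the mapping protocol.
result = eval('a + b', {}, local_names)

# Rejected as globals: eval() insists on a dict there.
try:
    eval('a + b', local_names)
except TypeError as exc:
    globals_error = type(exc).__name__
else:
    globals_error = None

print(result, globals_error)
```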


-- 
vitja.

