[Cython] Speedup module-level lookup

Chris Colbert sccolbert at gmail.com
Sat Jan 21 19:08:26 CET 2012


On Sat, Jan 21, 2012 at 2:35 AM, Vitja Makarov <vitja.makarov at gmail.com> wrote:

> 2012/1/21 Stefan Behnel <stefan_ml at behnel.de>:
> > Chris Colbert, 19.01.2012 09:18:
> >> If it doesn't pass PyDict_CheckExact you won't be able to use it as the
> >> globals to eval or exec.
> >
> > What makes you say that? I tried and it worked for me, all the way back to
> > Python 2.4:
> >
> > --------------------
> > Python 2.4.6 (#2, Jan 21 2010, 23:45:25)
> > [GCC 4.4.1] on linux2
> > Type "help", "copyright", "credits" or "license" for more information.
> >>>> class MyDict(dict): pass
> >>>> eval('1+1', MyDict())
> > 2
> >>>> exec '1+1' in MyDict()
> >>>>
> > --------------------
> >
> > I only see a couple of calls to PyDict_CheckExact() in CPython's sources
> > and they usually seem to be related to special casing for performance
> > reasons. Nothing that should impact a module's globals.
> >
> > Besides, Cython controls its own language usages of eval and exec.
> >
>
> Cool!
> It seems that Python internally uses PyObject_GetItem() for module-level
> lookups and not PyDict_GetItem().
> Btw, we use __Pyx_GetName(), which calls PyObject_GetAttr(); that isn't
> exactly the same for module lookups:
>
> # Works in Cython and doesn't work in Python
> print __class__
>
> So we can override __getitem__() and __setitem__():
> class MyDict(dict):
>    def __init__(self):
>        self._dict = {}
>
>    def __getitem__(self, key):
>        print '__getitem__', key
>        return self._dict[key]
>
>    def __setitem__(self, key, value):
>        print '__setitem__', key, value
>        self._dict[key] = value
>
>    def __getattr__(self, key):
>        print '__getattr__'
>
> d = MyDict()
> exec('x = 1; print x', d)
> eval('x', d)
> $ python foo.py
> __setitem__ x 1
> __getitem__ x
> 1
> __getitem__ x
>
>
> So we can make globals() return a special dict with custom
> __setitem__()/__getitem__(). But it seems that we'll have to override
> many of dict's standard methods, like values(), update() and so on. That
> would be hard.
>
>
>
Be careful. That only works because your dict subclass is being used as the
locals as well. The LOAD_NAME opcode does a PyDict_CheckExact on the locals
and calls PyDict_GetItem if the check passes, PyObject_GetItem if it fails:

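case LOAD_NAME:
    w = GETITEM(names, oparg);
    if ((v = f->f_locals) == NULL) {
        PyErr_Format(PyExc_SystemError,
                     "no locals when loading %s",
                     PyObject_REPR(w));
        why = WHY_EXCEPTION;
        break;
    }
    if (PyDict_CheckExact(v)) {
        /* exact dict: direct lookup, bypasses any __getitem__ override */
        x = PyDict_GetItem(v, w);
        Py_XINCREF(x);
    }
    else {
        /* anything else: generic lookup, which calls __getitem__ */
        x = PyObject_GetItem(v, w);
        if (x == NULL && PyErr_Occurred()) {
            if (!PyErr_ExceptionMatches(PyExc_KeyError))
                break;
            PyErr_Clear();
        }
    }
    /* Excerpt ends here. In CPython 2.7 the opcode then falls back to
       f->f_globals and f->f_builtins, both through PyDict_GetItem. */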
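
As a rough pure-Python analogy (not the interpreter's actual code path), calling dict.__getitem__ directly on a subclass instance skips the override, much as PyDict_GetItem does at the C level, while ordinary subscripting dispatches to the override the way PyObject_GetItem does. The LoggingDict class and its seen attribute are hypothetical names used only for this sketch, which is written for Python 3:

```python
class LoggingDict(dict):
    """Hypothetical dict subclass that records keys requested via __getitem__."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.seen = []

    def __getitem__(self, key):
        self.seen.append(key)
        return super().__getitem__(key)


d = LoggingDict(a=42)

# Ordinary subscripting dispatches through the type, so the override runs
# (comparable to PyObject_GetItem).
via_override = d["a"]
seen_after_subscript = list(d.seen)

# Calling the base-class method directly never consults the subclass
# (comparable to PyDict_GetItem reading the dict storage itself).
via_base = dict.__getitem__(d, "a")
seen_after_base = list(d.seen)

print(via_override, seen_after_subscript)
print(via_base, seen_after_base)
```

Both lookups return the stored value, but only the first one is recorded, which is the same asymmetry the PyDict_CheckExact branch introduces.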


You can see that the dict subclassing breaks down when you pass an empty
dict as the locals:

In [1]: class Foo(dict):
   ...:     def __getitem__(self, name):
   ...:         print 'get', name
   ...:         return super(Foo, self).__getitem__(name)
   ...:

In [2]: f = Foo(a=42)

In [3]: eval('a', f)
get a
Out[3]: 42

In [4]: eval('a', f, {})
Out[4]: 42
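
The same experiment can be written as a standalone script. This is a sketch for Python 3 (the session above is Python 2) that records lookups in a list instead of printing them; the calls attribute is an addition for illustration. Only the first case is expected to hit the override on every version. Whether the second case does depends on how the running interpreter falls back to globals, so the script reports it rather than assuming a result:

```python
class Foo(dict):
    """dict subclass that records keys requested through __getitem__."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.calls = []  # illustrative: replaces the print in the session

    def __getitem__(self, name):
        self.calls.append(name)
        return super().__getitem__(name)


# Case 1: the subclass serves as both globals and locals.
f = Foo(a=42)
both_result = eval("a", f)
both_calls = list(f.calls)

# Case 2: the subclass is the globals only; locals is a plain, empty dict.
g = Foo(a=42)
split_result = eval("a", g, {})
split_calls = list(g.calls)

print("globals and locals:", both_result, both_calls)
print("globals only:      ", split_result, split_calls)
```

If the second list comes back empty, the globals lookup went straight to the dict storage and never consulted __getitem__.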