[Cython] Speedup module-level lookup
Vitja Makarov
vitja.makarov at gmail.com
Sat Jan 21 19:43:28 CET 2012
2012/1/21 Chris Colbert <sccolbert at gmail.com>:
>
>
> On Sat, Jan 21, 2012 at 2:35 AM, Vitja Makarov <vitja.makarov at gmail.com>
> wrote:
>>
>> 2012/1/21 Stefan Behnel <stefan_ml at behnel.de>:
>> > Chris Colbert, 19.01.2012 09:18:
>> >> If it doesn't pass PyDict_CheckExact you won't be able to use it as the
>> >> globals to eval or exec.
>> >
>> > What makes you say that? I tried and it worked for me, all the way back to
>> > Python 2.4:
>> >
>> > --------------------
>> > Python 2.4.6 (#2, Jan 21 2010, 23:45:25)
>> > [GCC 4.4.1] on linux2
>> > Type "help", "copyright", "credits" or "license" for more information.
>> > >>> class MyDict(dict): pass
>> > >>> eval('1+1', MyDict())
>> > 2
>> > >>> exec '1+1' in MyDict()
>> > >>>
>> > --------------------
>> >
>> > I only see a couple of calls to PyDict_CheckExact() in CPython's sources
>> > and they usually seem to be related to special casing for performance
>> > reasons. Nothing that should impact a module's globals.
>> >
>> > Besides, Cython controls its own language usages of eval and exec.
>> >
>>
>> Cool!
>> It seems that Python internally uses PyObject_GetItem() for module-level
>> lookups and not PyDict_GetItem().
>> Btw, we use __Pyx_GetName(), which calls PyObject_GetAttr(), and that isn't
>> exactly the same for module lookups:
>>
>> # Works in Cython and doesn't work in Python
>> print __class__
>>
>> So we can override __getitem__() and __setitem__():
>> class MyDict(dict):
>>     def __init__(self):
>>         self._dict = {}
>>
>>     def __getitem__(self, key):
>>         print '__getitem__', key
>>         return self._dict[key]
>>
>>     def __setitem__(self, key, value):
>>         print '__setitem__', key, value
>>         self._dict[key] = value
>>
>>     def __getattr__(self, key):
>>         print '__getattr__'
>>
>> d = MyDict()
>> exec('x = 1; print x', d)
>> eval('x', d)
>>
>> $ python foo.py
>> __setitem__ x 1
>> __getitem__ x
>> 1
>> __getitem__ x
>>
>>
>> So we can make globals() return a special dict with custom
>> __setitem__()/__getitem__(). But it seems that we'll have to override
>> many of dict's standard methods, like values(), update() and so on. That
>> would be hard.
>>
>>
>
> Be careful. That only works because your dict subclass is being used as the
> locals as well. The LOAD_NAME opcode does a PyDict_CheckExact on the locals
> and will call PyDict_GetItem if true, PyObject_GetItem if False:
>
> case LOAD_NAME:
>     w = GETITEM(names, oparg);
>     if ((v = f->f_locals) == NULL) {
>         PyErr_Format(PyExc_SystemError,
>                      "no locals when loading %s",
>                      PyObject_REPR(w));
>         why = WHY_EXCEPTION;
>         break;
>     }
>     if (PyDict_CheckExact(v)) {
>         x = PyDict_GetItem(v, w);
>         Py_XINCREF(x);
>     }
>     else {
>         x = PyObject_GetItem(v, w);
>         if (x == NULL && PyErr_Occurred()) {
>             if (!PyErr_ExceptionMatches(
>                             PyExc_KeyError))
>                 break;
>             PyErr_Clear();
>         }
>
>     }
>
>
> You can see that the dict subclassing breaks down when you pass an empty
> dict as the locals:
>
> In [1]: class Foo(dict):
>    ...:     def __getitem__(self, name):
>    ...:         print 'get', name
>    ...:         return super(Foo, self).__getitem__(name)
>    ...:
>
> In [2]: f = Foo(a=42)
>
> In [3]: eval('a', f)
> get a
> Out[3]: 42
>
> In [4]: eval('a', f, {})
> Out[4]: 42
>
>
Nice catch! It seems that globals MUST be a real dict.
>>> help(eval)
eval(...)
    eval(source[, globals[, locals]]) -> value

    Evaluate the source in the context of globals and locals.
    The source may be a string representing a Python expression
    or a code object as returned by compile().
    The globals must be a dictionary and locals can be any mapping,
    defaulting to the current globals and locals.
    If only globals is given, locals defaults to it.
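
For anyone who wants to reproduce this without reading the interpreter loop, here is a minimal sketch of the same experiment. It is written for Python 3 (the sessions quoted above are Python 2), and LoggingDict and lookups are names made up for this illustration, not anything from Cython or CPython. It assumes CPython's current behaviour, where the globals lookup goes through the C-level dict API and skips an overridden __getitem__:

```python
class LoggingDict(dict):
    """Hypothetical dict subclass that records every __getitem__ call."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.seen = []

    def __getitem__(self, name):
        self.seen.append(name)
        return super().__getitem__(name)


def lookups(use_separate_locals):
    """Evaluate the name 'a' and report which keys hit __getitem__."""
    d = LoggingDict(a=42)
    if use_separate_locals:
        # Locals is an exact dict, so the name is missed there and then
        # fetched from the globals at C level, bypassing the override.
        value = eval('a', d, {})
    else:
        # Locals defaults to the globals, i.e. the subclass instance,
        # so the name lookup goes through the overridden __getitem__.
        value = eval('a', d)
    return value, d.seen


if __name__ == '__main__':
    print(lookups(False))  # (42, ['a'])
    print(lookups(True))   # (42, [])
```

Both calls return 42, but only the first one ever reaches the override, which matches Chris's In [3] / In [4] session.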
--
vitja.