Building CPython

Fri May 15 08:43:09 EDT 2015

On Fri, 15 May 2015 08:50 pm, Marko Rauhamaa wrote:

> Chris Angelico <rosuav at gmail.com>:
> 
>> On Fri, May 15, 2015 at 6:59 PM, Marko Rauhamaa <marko at pacujo.net> wrote:
>>> Must a method lookup necessarily involve object creation?
>>
>> Actually, no.
>> [...]
>> a particular Python implementation is most welcome to notice the
>> extremely common situation of method calls and optimize it.
> 
> I'm not sure that is feasible given the way it has been specified. You'd
> have to prove the class attribute lookup produces the same outcome in
> consecutive method references.

Sure. But some implementations may have a more, um, flexible approach to
correctness, and offer more aggressive optimizations which break the letter
of Python's semantics but work for 90% of cases. Just because CPython
doesn't do so, doesn't mean that some new implementation might not offer a
series of aggressive optimizations which the caller (or maybe the module?)
can turn on as needed, e.g.:

- assume methods never change;
- assume classes are static;
- assume built-in names always refer to the known built-in;

etc. Such an optimized Python, when running with those optimizations turned
on, is not *strictly* Python, but "buyer beware" applies here. If the
optimizations break your code or make testing hard, don't use them.
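
To make the risk concrete, here is a small sketch of ordinary, legal Python
that such assumptions could break. The names Greeter, shout and count_items
are made up for illustration; nothing here describes any particular
implementation's optimizer.

```python
class Greeter:
    def greet(self):
        return "hello"

g = Greeter()
before = g.greet()

# Replace the method on the class at runtime. An implementation that
# assumed "methods never change" might keep calling the old function.
def shout(self):
    return "HELLO"

Greeter.greet = shout
after = g.greet()

def count_items(seq):
    # A local name shadows the built-in. An implementation that assumed
    # len always means the built-in would ignore this binding.
    len = lambda obj: 42
    return len(seq)

shadowed = count_items("abc")
real = len("abc")   # outside the function, len is still the built-in
```

Under standard semantics the second call to g.greet() sees the replacement,
and count_items uses the local len rather than the built-in.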

> Also:
> 
>    >>> class X:
>    ...   def f(self): pass
>    ...
>    >>> x = X()
>    >>> f = x.f
>    >>> ff = x.f
>    >>> f is ff
>    False
> 
> Would a compliant Python implementation be allowed to respond "True"?

Certainly.
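
As a rough illustration of what portable code can rely on instead of
identity, the sketch below reuses the class X from the quoted example and
compares the two bound methods by equality and by their parts:

```python
class X:
    def f(self):
        pass

x = X()
f = x.f
ff = x.f

# Identity of the two method objects is implementation-dependent,
# so this may be True or False.
same_object = f is ff

# Equality is the safer test: bound methods compare equal when they
# wrap the same function and the same instance.
equal = f == ff
same_function = f.__func__ is ff.__func__
same_instance = f.__self__ is x and ff.__self__ is x
```

Code that needs to know whether two method references "mean the same
thing" is better off using == than is.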

When you retrieve x.f, Python applies the usual "attribute lookup" code,
which, simplified, looks like this:

if 'f' in x.__dict__:
    attr = x.__dict__['f']
else:
    for K in type(x).__mro__:
        # Walk the class of x and its parents in method resolution order
        if 'f' in K.__dict__:
            attr = K.__dict__['f']
            break
    else:  # no break
        raise AttributeError
# if we get here, we know x.f exists and is bound to attr
# now apply the descriptor protocol (simplified: real Python skips this
# step for attributes found in x.__dict__)
if hasattr(attr, '__get__'):
    attr = attr.__get__(x, type(x))
# Finally we can call x.f(); a bound method already carries x
return attr(*args)
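
The steps above can be wrapped in a function and checked against the real
lookup. This is only a sketch: simple_getattr, Base and Child are
hypothetical names, and the function ignores data descriptors, __getattr__,
__slots__ and metaclasses. It returns instance attributes unbound, as real
Python does.

```python
def simple_getattr(obj, name):
    """Simplified attribute lookup: instance dict, then MRO, then binding."""
    inst_dict = getattr(obj, '__dict__', {})
    if name in inst_dict:
        # Instance attributes are returned as-is, without binding.
        return inst_dict[name]
    for K in type(obj).__mro__:
        if name in K.__dict__:
            attr = K.__dict__[name]
            break
    else:  # no break
        raise AttributeError(name)
    # Descriptor protocol: look __get__ up on the type of the attribute.
    getter = getattr(type(attr), '__get__', None)
    if getter is not None:
        attr = getter(attr, obj, type(obj))
    return attr

class Base:
    def f(self):
        return 'Base.f'

class Child(Base):
    pass

c = Child()
c.colour = 'red'
m = simple_getattr(c, 'f')   # found on Base, bound to c
```

Calling m() then behaves like c.f(), and m compares equal to c.f.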

Functions have a __get__ method which returns the method object! Imagine
they look something like this:

class FunctionType:
    def __call__(self, *args, **kwargs):
        # Actually call the code that does stuff
        ...

    def __get__(self, instance, cls):
        if instance is None:
            # Looked up on the class, so there is no instance to bind
            return self
        return MethodType(self, instance)  # self is the function
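
You can watch real functions do this by calling __get__ by hand. A short
sketch, using the same X as in the quoted example but with a return value
added so the call is visible:

```python
import types

class X:
    def f(self):
        return 'called'

x = X()
func = X.__dict__['f']              # the plain function stored on the class
bound = func.__get__(x, X)          # roughly what x.f does behind the scenes
via_class = func.__get__(None, X)   # lookup on the class: no instance to bind
result = bound()
```

Here bound is a method object wrapping func and x, while via_class is just
the function itself.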

This implementation creates a new method object every time you look it up.
But functions *could* do this:

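    def __get__(self, instance, cls):
        if instance is None:
            # Looked up on the class, so there is no instance to bind
            return self
        # Cache per instance; a bound method belongs to one instance.
        key = id(instance)
        if key not in self._methods:
            self._methods[key] = MethodType(self, instance)
        return self._methods[key]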
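
The same idea can be tried in pure Python with a custom descriptor. This is
a sketch under stated assumptions, not how any real function type works:
CachingFunction is a hypothetical name, and it stores the bound method in
the instance's __dict__ so that later lookups find it there first.

```python
import types

class CachingFunction:
    """Hypothetical stand-in for a function type that caches bound methods."""

    def __init__(self, func):
        self.func = func
        self.name = None

    def __set_name__(self, owner, name):
        # Called at class creation; records the attribute name.
        self.name = name

    def __get__(self, instance, cls=None):
        if instance is None:
            return self
        method = types.MethodType(self.func, instance)
        # This is a non-data descriptor (no __set__), so an entry in the
        # instance dict takes priority on every later lookup.
        instance.__dict__[self.name] = method
        return method

class X:
    @CachingFunction
    def f(self):
        return 'called'

x = X()
y = X()
f = x.f
ff = x.f
```

With this descriptor, f is ff gives True, while x.f and y.f remain distinct
objects. The price is a reference cycle between each instance and its
cached method.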

What's more, a compliant implementation could reach the "if we get here"
point in the lookup procedure above, and do this:

# if we get here, we know attr exists
if type(attr) is FunctionType:  # Fast pointer comparison.
    return attr(x, *args)
else:
    # do the descriptor protocol thing, and then call attr

It can only do this if it knows that attr is a real function, not some
other sort of callable or a function subclass, because otherwise who knows
what side-effects the __get__ method might have.
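
Here is a minimal sketch of such a callable. CountingCallable is a made-up
name; its __get__ has a visible side effect, so skipping the descriptor
protocol would change what the program observes.

```python
class CountingCallable:
    """Hypothetical callable whose __get__ counts every lookup."""
    lookups = 0

    def __get__(self, instance, cls=None):
        CountingCallable.lookups += 1   # the side effect
        return self

    def __call__(self, *args):
        return args

class X:
    f = CountingCallable()

x = X()
x.f()
x.f()
count = CountingCallable.lookups   # one increment per x.f lookup
```

An implementation that called attr directly, without going through __get__,
would leave the counter at zero.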

How much time would it save? Probably very little. After all, unless the
method call itself did bugger-all work, the time to create the method
object is probably insignificant. But it's a possible optimization.
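
If you want to measure it yourself, something like the following rough
sketch with timeit will do. The method body (summing a small range) is an
arbitrary stand-in for "some work", and the numbers will vary by machine
and Python version, so no expected output is shown.

```python
import timeit

class X:
    def f(self):
        return sum(range(100))   # arbitrary stand-in for real work

x = X()
bound = x.f   # look the method up once, outside the timed code

# Time the lookup alone (which creates a bound method in CPython)...
lookup_only = timeit.timeit('x.f', globals={'x': x}, number=50_000)
# ...and the call alone, using the method object created above.
call_only = timeit.timeit('bound()', globals={'bound': bound}, number=50_000)

# Note: CPython 3.7 and later avoid creating the bound method for the
# common x.f() call form, so timing 'x.f()' directly would hide the cost.
ratio = lookup_only / call_only
```

A small ratio means the lookup cost is dwarfed by the work done in the
method.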

-- 
Steven