[Python-Dev] proposed attribute lookup optimization

Sun Jul 8 19:23:19 CEST 2007

Hi,

I would like to propose what I believe is an optimization of the
way attributes are looked up.  Currently, lookup works roughly like this:

	return value of attribute in instance.__dict__ if present
	for type in instance.__class__.__mro__:
	    return value of attribute in type.__dict__ if present
	raise AttributeError

I propose adding to each type a C-implementation-private dictionary
of attribute-name => type-in-which-defined.  Then, it will not be
necessary to traverse __mro__ on each attribute lookup for names
which are present in this lookup dictionary.

This optimization will have no effect on attributes defined on the
instance.  It will, however, help for type attributes, most notably
methods.  It will most likely slow down lookup of attributes that
are defined directly on self.__class__ rather than on any of its
bases.  However, I believe it will be a net benefit for all but
extremely shallow inheritance trees, especially those that involve
multiple inheritance.
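
To make the trade-off above concrete, here is a rough sketch that counts type-dictionary probes with and without a name => defining-type map. It is written in Python 3 syntax, the helper names (probes_mro_walk, build_defining_type_map, probes_cached) and the small class chain are made up for illustration, and the instance __dict__ probe is left out because both schemes pay it equally.

```python
def probes_mro_walk(cls, name):
    """Count the type-dict probes a plain MRO walk needs to find `name`."""
    for count, klass in enumerate(cls.__mro__, start=1):
        if name in vars(klass):
            return count
    raise AttributeError(name)


def build_defining_type_map(cls):
    """Map each attribute name to the first type in the MRO that defines it."""
    mapping = {}
    for klass in cls.__mro__:
        for name in vars(klass):
            mapping.setdefault(name, klass)
    return mapping


def probes_cached(cls, name, mapping):
    """With the map: one probe in the map, one in the defining type's dict."""
    if name not in mapping:
        raise AttributeError(name)
    return 2


# Hypothetical four-level chain: Leaf -> Mid2 -> Mid1 -> Base -> object
class Base:
    def method(self):
        return "base"

class Mid1(Base):
    pass

class Mid2(Mid1):
    pass

class Leaf(Mid2):
    def own(self):
        return "leaf"


mapping = build_defining_type_map(Leaf)
# Inherited from the far end of the chain: the walk needs 4 probes, the map 2.
print(probes_mro_walk(Leaf, "method"), probes_cached(Leaf, "method", mapping))
# Defined directly on the class: the walk needs 1 probe, the map still 2.
print(probes_mro_walk(Leaf, "own"), probes_cached(Leaf, "own", mapping))
```

Probe counts are only a proxy for real cost, but they show why the map should lose slightly for attributes defined on the class itself and win as the defining type moves further down the MRO.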

One open question is what to do when an attribute on a type is set
or deleted, since the lookup dictionaries of that type and of its
subclasses would then be out of date.
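
One possible answer, sketched here purely as an assumption and not as part of the proposal, is to rebuild the map of a type and of all its subclasses whenever a type attribute is set or deleted. The sketch uses Python 3 syntax; CachingMeta, _lookup_cache, _rebuild_lookup_cache and _invalidate are hypothetical names.

```python
class CachingMeta(type):
    """Hypothetical metaclass keeping a name -> defining-type map per class."""

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        cls._rebuild_lookup_cache()

    def _rebuild_lookup_cache(cls):
        lookup = {}
        for klass in cls.__mro__:
            for attr in vars(klass):
                if attr != "_lookup_cache":      # do not cache the cache itself
                    lookup.setdefault(attr, klass)
        # Use type.__setattr__ directly so that storing the cache does not
        # trigger our own __setattr__ and recurse.
        type.__setattr__(cls, "_lookup_cache", lookup)

    def _invalidate(cls):
        # Rebuild this class's map, then every subclass's map, recursively.
        cls._rebuild_lookup_cache()
        for sub in cls.__subclasses__():
            sub._invalidate()

    def __setattr__(cls, name, value):
        type.__setattr__(cls, name, value)
        cls._invalidate()

    def __delattr__(cls, name):
        type.__delattr__(cls, name)
        cls._invalidate()


class A(metaclass=CachingMeta):
    x = 1

class B(A):
    pass

print(B._lookup_cache["x"].__name__)   # x is inherited from A
B.x = 2                                # x is now defined on B itself
print(B._lookup_cache["x"].__name__)
del B.x                                # the lookup falls back to A again
print(B._lookup_cache["x"].__name__)
```

Rebuilding eagerly keeps lookups simple but makes every type attribute assignment cost time proportional to the size of the subclass tree; dropping the map and rebuilding it lazily on the next lookup would be another option.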

Python example:

class Current (type):

    @staticmethod
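    def getattribute (self, name):
        # Instance attributes take precedence.
        dict = object.__getattribute__(self, '__dict__')
        if name in dict:
            return dict[name]

        # Otherwise walk the MRO and return the first match.
        # (Descriptors are not handled in this simplified model.)
        mro = object.__getattribute__ (self, '__class__').__mro__
        for type in mro:
            dict = type.__dict__
            if name in dict:
                return dict[name]

        raise AttributeError (name)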

    def __init__(self, name, bases, dict):
        super (Current, self).__init__(name, bases, dict)
        self.__getattribute__ = Current.getattribute

class Optimized (type):

    @staticmethod
    def getattribute (self, name):
        # Instance attributes take precedence.
        dict = object.__getattribute__(self, '__dict__')
        if name in dict:
            return dict[name]

        # <possible optimization>
        # One probe in the cache, then one in the defining type's dict.
        lookup = object.__getattribute__ (self, '__class__').__lookup_cache__
        if name in lookup:
            return lookup[name].__dict__[name]
        # </possible optimization>

        # Fallback for names missing from the cache: the same MRO walk
        # as in Current.
        mro = object.__getattribute__ (self, '__class__').__mro__
        for type in mro:
            dict = type.__dict__
            if name in dict:
                return dict[name]

        raise AttributeError (name)

    # <possible optimization>
    def build_lookup_cache (self):
        lookup = {}
        for type in self.__mro__:
            for name in type.__dict__:
                if name not in lookup:
                    lookup[name] = type

        return lookup
    # </possible optimization>

    def __init__(self, name, bases, dict):
        super (Optimized, self).__init__(name, bases, dict)
        # <possible optimization>
        self.__lookup_cache__ = self.build_lookup_cache ()
        # </possible optimization>
        self.__getattribute__ = Optimized.getattribute

class A (object):
    __metaclass__ = Optimized
    x = 1

class B (A):
    pass

class C (B):
    pass

class D (C):
    pass

class E (D):
    pass

t = E ()

for k in xrange (100000):
    t.x

Try swapping the metaclass of A from Optimized to Current and compare
the execution times.
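
For the measurement itself, a small harness along the following lines may help. It is only a sketch in Python 3 syntax: time_attribute_lookup is a hypothetical helper, it builds the same kind of linear class chain as A..E above for whatever metaclass it is given, and it is shown here with the built-in type so that the block runs on its own.

```python
import timeit


def time_attribute_lookup(metaclass, depth=5, loops=100000):
    """Build a linear chain of `depth` classes and time `loops` reads of `x`.

    `x` is defined only on the root of the chain, so every read has to
    resolve the attribute through the full inheritance chain.
    """
    cls = metaclass("A", (object,), {"x": 1})
    for i in range(1, depth):
        cls = metaclass("Level%d" % i, (cls,), {})
    instance = cls()
    # The lambda adds a constant call overhead to every measurement.
    return timeit.timeit(lambda: instance.x, number=loops)


baseline = time_attribute_lookup(type)
print("built-in lookup: %.4f s" % baseline)
```

Passing Current or Optimized in place of type gives the two numbers to compare; increasing depth should show how each variant scales with the length of the inheritance chain.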

Paul