New PEP: Attribute Access Handlers

Sat Jul 22 00:11:56 EDT 2000

http://python.sourceforge.net/peps/pep-0213.html

PEP: 213
Title: Attribute Access Handlers
Version: $Revision: 1.3 $
Owner: paul at prescod.net (Paul Prescod)
Python-Version: 2.0
Status: Incomplete


Introduction

     It is possible (and even relatively common) in Python code and 
     in extension modules to "trap" when an object's client code 
     attempts to set an attribute and execute code instead. In other 
     words it is possible to allow users to use attribute assignment/
     retrieval/deletion syntax even though the underlying implementation 
     is doing some computation rather than directly modifying or
     reporting a binding.

     This PEP describes a feature that makes it easier, more efficient
     and safer to implement these handlers in Python class instances.

Justification

    Scenario 1:

        You have a deployed class that works on an attribute named
        "stdout". After a year, you think it would be better to
        check that stdout is really an object with a "write" method
        at the moment of assignment. Rather than change to a
        setstdout method (which would be incompatible with deployed
        code) you would rather trap the assignment and check the
        object's type.

    Scenario 2:

        You want to be as compatible as possible with an object 
        model that has a concept of attribute assignment. It could
        be the W3C Document Object Model or a particular COM 
        interface (e.g. the PowerPoint interface). In that case
        you may well want attributes in the model to show up as
        attributes in the Python interface, even though the 
        underlying implementation may not use attributes at all.

    Scenario 3:

        A user wants to make an attribute read-only.

    In short, this feature allows programmers to separate the 
    interface of their module from the underlying implementation
    for whatever purpose. Again, this is not a new feature but
    merely a new and more robust syntax for something that can
    already be implemented.

Current Solution

    To make some attributes read-only:

    class foo:
       def __setattr__( self, name, val ):
          if name=="readonlyattr":
             raise TypeError
          elif name=="readonlyattr2":
             raise TypeError
          ...
          else:
             self.__dict__[name]=val

     This has the following problems:

     1. The creator of the method must be intimately aware of whether
        somewhere else in the class hierarchy __setattr__ has also been
        trapped for any particular purpose. If so, she must specifically
        call that method rather than assigning to the dictionary. There
        are many different reasons to overload __setattr__ so there is a
        decent potential for clashes. For instance object database
        implementations often overload __setattr__ for an entirely unrelated
        purpose.

     2. The string-based switch statement forces all attribute handlers 
        to be specified in one place in the code. They may then dispatch
        to task-specific methods (for modularity) but this could cause
        performance problems.

     3. Logic for the setting, getting and deleting must live in 
        __getattr__, __setattr__ and __delattr__. Once again, this can
        be mitigated through an extra level of method call but this is 
        inefficient.
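
    Problem 1 can be made concrete with a small sketch in present-day
    Python. The Persistent base class and its _dirty bookkeeping are
    hypothetical stand-ins for an object database and are not taken
    from any real library. The sketch only assumes that some base
    class already traps __setattr__ for its own purposes.

```python
class Persistent:
    """Hypothetical object-database base class (an assumption for this sketch)."""

    def __setattr__(self, name, val):
        # Record which attributes were modified, then store the value.
        self.__dict__.setdefault("_dirty", set()).add(name)
        self.__dict__[name] = val


class BrokenAccount(Persistent):
    def __setattr__(self, name, val):
        if name == "account_id":
            raise TypeError("account_id is read-only")
        # Writing straight to the dictionary skips Persistent.__setattr__.
        self.__dict__[name] = val


class Account(Persistent):
    def __setattr__(self, name, val):
        if name == "account_id":
            raise TypeError("account_id is read-only")
        # The author has to know about the base class hook and call it.
        Persistent.__setattr__(self, name, val)


broken = BrokenAccount()
broken.balance = 10
print("_dirty" in broken.__dict__)  # the base class bookkeeping was bypassed

ok = Account()
ok.balance = 10
print(ok._dirty)                    # the base class bookkeeping ran
```

    Both subclasses make account_id read-only, but only the one whose
    author knew about the base class hook keeps the bookkeeping intact.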

Proposed Syntax
 
    Special methods should declare themselves with definitions of the
    following form:

    class x:
        def __attr_XXX__(self, op, val ):
            if op=="get":
                return someComputedValue(self.internal)
            elif op=="set":
                self.internal=someComputedValue(val)
            elif op=="del":
                del self.internal

    Client code looks like this:

    inst=x()
    fooval=inst.XXX
    inst.XXX=fooval+5
    del inst.XXX
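
    The proposed protocol can be approximated in present-day Python to
    get a feel for it. The AttrHandlers mixin below is a hypothetical
    emulation written for this sketch, not the mechanism this PEP
    proposes; it simply looks for a method named __attr_<name>__ on
    the class and passes it "get", "set" or "del". The doubling and
    halving stand in for someComputedValue.

```python
class AttrHandlers:
    """Hypothetical mixin emulating the __attr_XXX__ protocol (illustration only)."""

    def __getattribute__(self, name):
        # Look on the class, not the instance, for __attr_<name>__.
        handler = getattr(type(self), "__attr_%s__" % name, None)
        if handler is not None:
            return handler(self, "get", None)
        return object.__getattribute__(self, name)

    def __setattr__(self, name, val):
        handler = getattr(type(self), "__attr_%s__" % name, None)
        if handler is not None:
            handler(self, "set", val)
        else:
            object.__setattr__(self, name, val)

    def __delattr__(self, name):
        handler = getattr(type(self), "__attr_%s__" % name, None)
        if handler is not None:
            handler(self, "del", None)
        else:
            object.__delattr__(self, name)


class X(AttrHandlers):
    def __init__(self):
        self.internal = 0

    def __attr_XXX__(self, op, val):
        if op == "get":
            return self.internal * 2    # stand-in for someComputedValue
        elif op == "set":
            self.internal = val // 2    # stand-in for someComputedValue
        elif op == "del":
            del self.internal


inst = X()
inst.XXX = 10
fooval = inst.XXX
inst.XXX = fooval + 5
del inst.XXX
```

    Note that this emulation pays for an extra class lookup on every
    attribute access, which is exactly the kind of overhead the
    proposal wants to avoid.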

Semantics
 
     Attribute references of all three kinds should call the method.
     The op parameter can be "get"/"set"/"del". Of course this string
     will be interned so the actual checks for the string will be
     very fast.
 
     It is disallowed to actually have an attribute named XXX in the
     same instance dictionary as a method named __attr_XXX__.
     If both are declared in a class then the last one declared 
     replaces the other one.

     An implementation of __attr_XXX__ takes precedence over an
     implementation of __getattr__ based on the principle that
     __getattr__ is supposed to be invoked only as a last resort
     after an appropriate attribute lookup has failed.

     An implementation of __attr_XXX__ takes precedence over an
     implementation of __setattr__ in order to be consistent. The
     opposite choice seems fairly feasible also, however. The same
     goes for deletion and __delattr__.

Proposed Implementation
 
    There is a new object type called an attribute access handler. 
    Objects of this type have the following attributes:

       name (e.g. XXX, not __attr_XXX__)
       method (pointer to a method object)

    Class construction
   
        In PyClass_New, methods of the appropriate form will be
        detected and converted into objects (just like unbound method
        objects). These are stored in the class __dict__ under the name
        XXX. The original method is stored as an unbound method under
        its original name.

        If there are any attribute access handlers in a class at all,
        a flag is set. Let's call it "I_have_computed_attributes" for
        now. Derived classes inherit the flag from base classes. 

    Instance construction

        Instances inherit the "I_have_computed_attributes" flag from 
        classes. No other changes to PyInstance_New are anticipated.

    Property fetching

        At the C layer, Python instance attribute lookup is always done
        through PyInstance_getattr.
 
        A "get" proceeds as usual until just before the object is
        returned.  In addition to the current check whether the returned
        object is a method it would also check whether a returned object
        is an access handler. If so, it would invoke the getter method
        and return the value. To remove an attribute access handler you
        could directly fiddle with the dictionary.  Gets should be more
        efficient than they are today with __getattr__ and no different
        for classes that do not use the feature at all.
 
        A set proceeds by checking the "I_have_computed_attributes"
        flag. If it is not set, everything proceeds as it does today. If
        it is set then we must do a dictionary get on the requested
        attribute name. If it returns an attribute access handler then 
        we call the setter function with the value. If it returns any 
        other object then we discard the result and continue as we do 
        today. Note that having an attribute access handler will mildly 
        affect attribute "setting" performance for all sets on a 
        particular instance, but no more so than today, using 
        __setattr__. 
 
        The I_have_computed_attributes flag is intended to eliminate the
        performance degradation of an extra "get" per "set" for objects
        not using this feature. Checking this flag should have minuscule
        performance implications for all objects.

        The implementation of delete is basically identical to the
        implementation of set.
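
    The steps above can be modelled in present-day Python as a rough
    sketch. Everything here is an assumption made for illustration:
    the metaclass HandlerMeta stands in for PyClass_New, the base
    class Instance stands in for the instance get/set/delete code, and
    AccessHandler is a plain Python object rather than a new built-in
    type. Only the flag name is taken from the text above.

```python
class AccessHandler:
    """Stand-in for the attribute access handler object."""

    def __init__(self, name, method):
        self.name = name      # e.g. "XXX", not "__attr_XXX__"
        self.method = method  # the original __attr_XXX__ function


class HandlerMeta(type):
    """Plays the role of PyClass_New in this sketch."""

    def __new__(mcls, clsname, bases, namespace):
        namespace = dict(namespace)
        for key, value in list(namespace.items()):
            if key.startswith("__attr_") and key.endswith("__") and len(key) > 9:
                attr = key[len("__attr_"):-2]
                # Stored under XXX; the method also stays under its own name.
                namespace[attr] = AccessHandler(attr, value)
                namespace["I_have_computed_attributes"] = True
        return super().__new__(mcls, clsname, bases, namespace)


def _raw_lookup(obj, name):
    """Ordinary lookup that returns None instead of raising."""
    try:
        return object.__getattribute__(obj, name)
    except AttributeError:
        return None


class Instance(metaclass=HandlerMeta):
    I_have_computed_attributes = False  # inherited by derived classes

    def __getattribute__(self, name):
        # A get proceeds as usual until just before the object is returned.
        value = object.__getattribute__(self, name)
        if isinstance(value, AccessHandler):
            return value.method(self, "get", None)
        return value

    def __setattr__(self, name, val):
        # Only classes with the flag set pay for the extra lookup.
        if type(self).I_have_computed_attributes:
            found = _raw_lookup(self, name)
            if isinstance(found, AccessHandler):
                found.method(self, "set", val)
                return
        object.__setattr__(self, name, val)

    def __delattr__(self, name):
        if type(self).I_have_computed_attributes:
            found = _raw_lookup(self, name)
            if isinstance(found, AccessHandler):
                found.method(self, "del", None)
                return
        object.__delattr__(self, name)


class Point(Instance):
    def __init__(self):
        self._x = 0

    def __attr_x__(self, op, val):
        if op == "get":
            return self._x
        elif op == "set":
            self._x = int(val)
        elif op == "del":
            self._x = 0


class Plain(Instance):
    pass


p = Point()
p.x = "7"
print(p.x, Point.I_have_computed_attributes, Plain.I_have_computed_attributes)
```

    In this model, as in the proposal, writing directly into the
    instance dictionary under the handler's name hides the handler
    for that instance.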

Caveats

    1. You might note that I have not proposed any logic to keep
       the I_have_computed_attributes flag up to date as attributes
       are added and removed from the instance's dictionary. This is
       consistent with current Python. If you add a __setattr__ method
       to an object after it is in use, that method will not behave as
       it would if it were available at "compile" time. The dynamism is
       arguably not worth the extra implementation effort. This snippet
       demonstrates the current behavior:

        >>> def prn(*args):print args
        >>> class a:
        ...    __setattr__=prn
        >>> a().foo=5
        (<__main__.a instance at 882890>, 'foo', 5)

        >>> class b: pass
        >>> bi=b()
        >>> bi.__setattr__=prn
        >>> bi.foo=5

    2. Assignment to __dict__["XXX"] can overwrite the attribute
       access handler for __attr_XXX__. Typically the access handlers
       will store information away in private __XXX variables.

    3. An attribute access handler that attempts to call setattr or
       getattr on the object itself can cause an infinite loop (as with
       __getattr__). Once again, the solution is to use a special
       (typically private) variable such as __XXX.
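
    The loop described in caveat 3 can already be reproduced with
    __getattr__ in present-day Python, which gives a sense of what a
    careless access handler would do. This is only an analogy under
    that assumption; the class names are made up for the sketch.

```python
class Looping:
    def __getattr__(self, name):
        # Normal lookup of `name` fails again, which re-enters __getattr__.
        return getattr(self, name)


class Safe:
    def __init__(self):
        self.__size = 3  # private variable holding the real value

    def __getattr__(self, name):
        if name == "size":
            # Reads a differently named attribute, so there is no re-entry.
            return self.__size
        raise AttributeError(name)


try:
    Looping().size
    looped = False
except RecursionError:
    looped = True

print(looped, Safe().size)
```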

-- 
 Paul Prescod - Not encumbered by corporate consensus
New from Computer Associates: "Software that can 'think', sold by 
marketers who choose not to."