Jeremy Hylton : weblog : 2003-04-25

A Descriptor Tutorial

Friday, April 25, 2003

It is funny that Guido's new-style class tutorial, which is called descrintro.html, does not describe descriptors. It mentions them only at the end, in the list of additional topics that should be discussed. Descriptors are discussed in detail in PEP 252, but that document is hard to read.

A descriptor is an object in a class dictionary. It describes how to find an attribute on an instance. For example, a property descriptor refers to functions that should be called when code attempts to get or set the property. A descriptor allows arbitrary code to be executed when a specific attribute is accessed or set. You can think of it as a getattr or setattr hook for a specific attribute.
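As a rough illustration of the "hook for a specific attribute" idea, here is a minimal sketch written for current Python 3. The names LoggedAttribute and Point are made up for this example, and storing the value in the instance __dict__ is just one possible choice.

```python
class LoggedAttribute:
    """Hypothetical descriptor that records each get and set of one attribute."""

    def __init__(self, name):
        self.name = name          # key used in the instance __dict__
        self.log = []             # record of accesses, kept only for illustration

    def __get__(self, obj, objtype=None):
        if obj is None:           # attribute was looked up on the class itself
            return self
        self.log.append(("get", self.name))
        return obj.__dict__[self.name]

    def __set__(self, obj, value):
        self.log.append(("set", self.name))
        obj.__dict__[self.name] = value


class Point:
    x = LoggedAttribute("x")      # the descriptor lives in the class dictionary

    def __init__(self, x):
        self.x = x                # routed through LoggedAttribute.__set__


p = Point(3)
value = p.x                       # routed through LoggedAttribute.__get__
log = Point.x.log                 # class access returns the descriptor itself
```

Only the attribute x is hooked here; every other attribute of Point follows the normal lookup rules.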

I have written a few descriptors. The most interesting hack is a descriptor used to support persistent code in ZODB4. It allows classes and instances to have separate namespaces for the same attribute. That is, a class and its instances can both have attributes named "foo" and have different values for them.
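The ZODB4 code is not reproduced here, but the general trick of giving a class and its instances different values under one name can be sketched as follows. This is an assumption about one way to do it in current Python 3, not the ZODB4 implementation; SplitAttribute and Widget are hypothetical names.

```python
class SplitAttribute:
    """Hypothetical descriptor: one value for the class, another per instance."""

    def __init__(self, name, class_value):
        self.name = name
        self.class_value = class_value

    def __get__(self, obj, objtype=None):
        if obj is None:
            # Looked up on the class: answer from the class-level namespace.
            return self.class_value
        try:
            # Looked up on an instance: answer from the instance namespace.
            return obj.__dict__[self.name]
        except KeyError:
            raise AttributeError(self.name) from None

    def __set__(self, obj, value):
        obj.__dict__[self.name] = value


class Widget:
    foo = SplitAttribute("foo", "class-level value")


w = Widget()
w.foo = "instance-level value"
class_foo = Widget.foo            # "class-level value"
instance_foo = w.foo              # "instance-level value"
```

Assigning to Widget.foo would replace the descriptor itself. Intercepting that assignment takes a metaclass, which this sketch leaves out.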

Samuele Pedroni wrote some impressively arcane rexec attacks that used descriptors to trick trusted code into executing code from the untrusted environment. Sometimes descriptors seem too powerful. They provide unlimited control over what happens when you look up an attribute on an object.

What seems particularly surprising about descriptors is that many simple Python expressions can end up calling descriptors. They can be used for special methods like __call__ or __getitem__, so even obj() or obj[key] can run descriptor code.
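To make that concrete, here is a small sketch in current Python 3 where a plain subscript expression goes through a descriptor. Doubler and Table are hypothetical names, and the counter exists only to show that __get__ really ran.

```python
class Doubler:
    """Hypothetical descriptor standing in for a __getitem__ method."""

    calls = 0                     # how many times __get__ has been invoked

    def __get__(self, obj, objtype=None):
        Doubler.calls += 1
        # Return the callable that will actually receive the subscript key.
        return lambda key: key * 2


class Table:
    __getitem__ = Doubler()       # special method supplied by a descriptor


t = Table()
result = t[21]                    # looks up __getitem__ on the type, then binds it
```

Nothing in the expression t[21] hints that arbitrary code in Doubler.__get__ runs first, which is the surprising part.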

A descriptor describes how to get and set a specific attribute on an object. Its two interesting methods are __get__() and __set__(). It also has attributes __name__, __doc__, and __objclass__.
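These methods and attributes can be inspected on the descriptors that Python creates itself. The sketch below assumes current CPython 3 and fetches the raw descriptors from the class dictionary so that no binding happens along the way. Pair is a hypothetical class used only to get a slot descriptor.

```python
# A built-in method descriptor, taken straight from str's class dictionary.
upper = str.__dict__["upper"]
name = upper.__name__                 # the attribute name it describes
owner = upper.__objclass__            # the class it belongs to
doc = upper.__doc__                   # its documentation string
has_set = hasattr(upper, "__set__")   # method descriptors only implement __get__

# Calling __get__ by hand does the same binding that "abc".upper does.
shouted = upper.__get__("abc", str)()


class Pair:
    __slots__ = ("left",)             # each slot is stored behind a descriptor


# A slot descriptor supports both __get__ and __set__.
left = Pair.__dict__["left"]
p = Pair()
left.__set__(p, 10)                   # same effect as p.left = 10
got = left.__get__(p, Pair)           # same effect as p.left
```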

The __get__ attribute is a good place to start. It is a method called with one or two arguments to retrieve a value. It is called when the attribute is accessed on the class as well as on an instance. Whatever __get__() returns becomes the value of the attribute access. The documentation from PEP 252 says this:

__get__(): a function callable with one or two arguments that retrieves the attribute value from an object. This is also referred to as a "binding" operation, because it may return a "bound method" object in the case of method descriptors. The first argument, X, is the object from which the attribute must be retrieved or to which it must be bound. When X is None, the optional second argument, T, should be meta-object and the binding operation may return an *unbound* method restricted to instances of T. When both X and T are specified, X should be an instance of T. Exactly what is returned by the binding operation depends on the semantics of the descriptor; for example, static methods and class methods (see below) ignore the instance and bind to the type instead.
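The binding operation described in that passage can be tried by calling __get__ directly. This is a sketch for current Python 3, where behavior differs slightly from the PEP text: there are no unbound methods any more, so passing None for X simply returns the function. Greeter is a hypothetical class.

```python
class Greeter:
    def hello(self):
        return "hello from " + type(self).__name__

    @staticmethod
    def plain():
        return "no instance needed"

    @classmethod
    def make(cls):
        return cls()


g = Greeter()
func = Greeter.__dict__["hello"]          # plain function, fetched without binding

# X is the instance and T is the class: the result is a bound method.
bound = func.__get__(g, Greeter)
greeting = bound()

# With X set to None, Python 3 hands back the function itself
# (Python 2 returned an unbound method here instead).
unbound = func.__get__(None, Greeter)

# staticmethod and classmethod ignore the instance when binding.
static_result = Greeter.__dict__["plain"].__get__(g, Greeter)()
class_bound = Greeter.__dict__["make"].__get__(g, Greeter)
made = class_bound()
```

The class method ends up bound to Greeter rather than to g, which matches the PEP's remark that class methods bind to the type instead of the instance.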

There's probably a nice, short essay to be written on descriptors, but no time to write that now.