Title: Simpler customisation of class creation
Last-Modified: 2013-09-16 00:36:52 +1000 (Mon, 16 Sep 2013)
Author: Nick Coghlan <ncoghlan at gmail.com>, Daniel Urban <urban.dani+py at gmail.com>
- PEP Deferral
- Key Benefits
- New Ways of Using Classes
- Rejected Design Options
- Reference Implementation
Currently, customising class creation requires the use of a custom metaclass. This custom metaclass then persists for the entire lifecycle of the class, creating the potential for spurious metaclass conflicts.
This PEP proposes to instead support a wide range of customisation scenarios through a new namespace parameter in the class header, and a new __init_class__ hook in the class body.
The new mechanism is also much easier to understand and use than implementing a custom metaclass, and thus should provide a gentler introduction to the full power of Python's metaclass machinery.
Deferred until 3.5 at the earliest. The last review raised a few interesting points that I (Nick) need to consider further before proposing it for inclusion, and that's not going to happen in the 3.4 timeframe.
For an already created class cls, the term "metaclass" has a clear meaning: it is the value of type(cls).
During class creation, it has another meaning: it is also used to refer to the metaclass hint that may be provided as part of the class definition. While in many cases these two meanings end up referring to one and the same object, there are two situations where that is not the case:
- If the metaclass hint refers to an instance of type, then it is considered as a candidate metaclass along with the metaclasses of all of the parents of the class being defined. If a more appropriate metaclass is found amongst the candidates, then it will be used instead of the one given in the metaclass hint.
- Otherwise, an explicit metaclass hint is assumed to be a factory function and is called directly to create the class object. In this case, the final metaclass will be determined by the factory function definition. In the typical case (where the factory function just calls type, or, in Python 3.3 or later, types.new_class) the actual metaclass is then determined based on the parent classes.
It is notable that only the actual metaclass is inherited - a factory function used as a metaclass hook sees only the class currently being defined, and is not invoked for any subclasses.
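The difference between the metaclass hint and the actual metaclass can be seen in a short sketch. The names Meta, SubMeta, and factory are illustrative only and do not come from the PEP.

```python
class Meta(type):
    pass


class SubMeta(Meta):
    pass


class Base(metaclass=SubMeta):
    pass


# The hint is Meta, but the parent's metaclass (SubMeta) is more derived,
# so SubMeta becomes the actual metaclass.
class Child(Base, metaclass=Meta):
    pass


calls = []  # records every class name the factory function is asked to build


def factory(name, bases, ns):
    # A hint that is not an instance of type is simply called directly.
    calls.append(name)
    return type(name, bases, ns)


class Made(metaclass=factory):
    pass


# Only the actual metaclass (type) is inherited; the factory is not re-run.
class MadeChild(Made):
    pass
```

Here type(Child) ends up as SubMeta rather than the hinted Meta, and the factory function is called for Made but not for MadeChild.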
In Python 3, the metaclass hint is provided using the metaclass=Meta keyword syntax in the class header. This allows the __prepare__ method on the metaclass to be used to create the locals() namespace used during execution of the class body (for example, specifying the use of collections.OrderedDict instead of a regular dict).
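As a minimal sketch of the __prepare__ mechanism described above, the following metaclass supplies an ordered mapping for the class body and records the definition order. OrderedMeta, Fields, and the _order attribute are hypothetical names chosen for this illustration.

```python
import collections


class OrderedMeta(type):
    @classmethod
    def __prepare__(meta, name, bases, **kwds):
        # The returned mapping is used as the namespace while the class
        # body executes.
        return collections.OrderedDict()

    def __new__(meta, name, bases, ns, **kwds):
        cls = super().__new__(meta, name, bases, ns, **kwds)
        # Remember the order in which non-dunder names were defined.
        cls._order = [key for key in ns if not key.startswith("__")]
        return cls


class Fields(metaclass=OrderedMeta):
    x = 1
    y = 2
    z = 3
```

Note that the custom metaclass stays attached to Fields (and to every subclass) even though it was only needed while the class was being created.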
In Python 2, there was no __prepare__ method (that API was added for Python 3 by PEP 3115). Instead, a class body could set the __metaclass__ attribute, and the class creation process would extract that value from the class namespace to use as the metaclass hint. There is published code that makes use of this feature.
Another new feature in Python 3 is the zero-argument form of the super() builtin, introduced by PEP 3135. This feature uses an implicit __class__ reference to the class being defined to replace the "by name" references required in Python 2. Just as code invoked during execution of a Python 2 metaclass could not call methods that referenced the class by name (as the name had not yet been bound in the containing scope), similarly, Python 3 metaclasses cannot call methods that rely on the implicit __class__ reference (as it is not populated until after the metaclass has returned control to the class creation machinery).
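The "by name" limitation can be illustrated with a small sketch. It only demonstrates the name-binding half of the paragraph above; the timing of the implicit __class__ reference depends on the interpreter version and is not shown. EagerMeta, describe, and early_error are hypothetical names.

```python
class EagerMeta(type):
    def __init__(cls, name, bases, ns):
        super().__init__(name, bases, ns)
        try:
            # Runs before the class statement has bound the name "Example".
            cls.describe()
        except NameError as exc:
            cls.early_error = str(exc)


class Example(metaclass=EagerMeta):
    @classmethod
    def describe(cls):
        # Refers to the class by name rather than through cls.
        return Example.__name__


# Once the class statement has completed, the same call succeeds.
late_result = Example.describe()
```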
Finally, when a class uses a custom metaclass, it can pose additional challenges to the use of multiple inheritance, as a new class cannot inherit from parent classes with unrelated metaclasses. This means that it is impossible to add a metaclass to an already published class: such an addition is a backwards incompatible change due to the risk of metaclass conflicts.
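A minimal sketch of such a conflict, using the hypothetical names MetaA, MetaB, A, B, and C:

```python
class MetaA(type):
    pass


class MetaB(type):
    pass


class A(metaclass=MetaA):
    pass


class B(metaclass=MetaB):
    pass


try:
    # Neither MetaA nor MetaB is a subclass of the other, so no single
    # metaclass can serve both parents.
    class C(A, B):
        pass
except TypeError as exc:
    conflict = str(exc)
```

If A had originally been published without a metaclass, code that defined class C(A, B) would have worked, which is why adding MetaA later is a backwards incompatible change.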
This PEP proposes that a new mechanism to customise class creation be added to Python 3.4 that meets the following criteria:
- Integrates nicely with class inheritance structures (including mixins and multiple inheritance)
- Integrates nicely with the implicit __class__ reference and zero-argument super() syntax introduced by PEP 3135
- Can be added to an existing base class without a significant risk of introducing backwards compatibility problems
- Restores the ability for class namespaces to have some influence on the class creation process (above and beyond populating the namespace itself), but potentially without the full flexibility of the Python 2 style __metaclass__ hook
One mechanism that can achieve this goal is to add a new class initialisation hook, modelled directly on the existing instance initialisation hook, but with the signature constrained to match that of an ordinary class decorator.
Specifically, it is proposed that class definitions be able to provide a class initialisation hook as follows:
    class Example:
        def __init_class__(cls):
            # This is invoked after the class is created, but before any
            # explicit decorators are called
            # The usual super() mechanisms are used to correctly support
            # multiple inheritance. The class decorator style signature helps
            # ensure that invoking the parent class is as simple as possible.
            pass
If present on the created object, this new hook will be called by the class creation machinery after the __class__ reference has been initialised. For types.new_class(), it will be called as the last step before returning the created class object. __init_class__ is implicitly converted to a class method when the class is created (prior to the hook being invoked).
If a metaclass wishes to block class initialisation for some reason, it must arrange for cls.__init_class__ to trigger AttributeError.
Note that when __init_class__ is called, the name of the class is not yet bound to the new class object. As a consequence, the two argument form of super() cannot be used to call methods (e.g., super(Example, cls) wouldn't work in the example above). However, the zero argument form of super() works as expected, since the __class__ reference is already initialised.
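The intended call sequence for types.new_class() can be approximated today with a wrapper function. This is only a rough emulation under stated assumptions, not the proposed implementation: new_class_with_hook, base_body, and registry are hypothetical names, and the wrapper has to be called explicitly for every class.

```python
import types


def new_class_with_hook(name, bases=(), kwds=None, exec_body=None):
    def body(ns):
        if exec_body is not None:
            exec_body(ns)
        hook = ns.get("__init_class__")
        # Emulate the implicit conversion to a class method.
        if isinstance(hook, types.FunctionType):
            ns["__init_class__"] = classmethod(hook)

    cls = types.new_class(name, bases, kwds, body)
    try:
        hook = cls.__init_class__
    except AttributeError:
        # No hook defined, or a metaclass has blocked class initialisation.
        return cls
    hook()  # last step before returning the created class object
    return cls


registry = []


def base_body(ns):
    def __init_class__(cls):
        registry.append(cls.__name__)

    ns["__init_class__"] = __init_class__


Base = new_class_with_hook("Base", exec_body=base_body)
Child = new_class_with_hook("Child", (Base,))
```

Because the hook is found through ordinary attribute lookup, Child inherits it from Base and is registered as well.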
This general proposal is not a new idea (it was first suggested for inclusion in the language definition more than 10 years ago, and a similar mechanism has long been supported by Zope's ExtensionClass), but the situation has changed sufficiently in recent years that the idea is worth reconsidering.
In addition, the introduction of the metaclass __prepare__ method in PEP 3115 allows a further enhancement that was not possible in Python 2: this PEP also proposes that type.__prepare__ be updated to accept a factory function as a namespace keyword-only argument. If present, the value provided as the namespace argument will be called without arguments to create the result of type.__prepare__ instead of using a freshly created dictionary instance. For example, the following will use an ordered dictionary as the class namespace:
    class OrderedExample(namespace=collections.OrderedDict):
        def __init_class__(cls):
            # cls.__dict__ is still a read-only proxy to the class namespace,
            # but the underlying storage is an OrderedDict instance
            pass
This PEP, along with the existing ability to use __prepare__ to share a single namespace amongst multiple class objects, highlights a possible issue with attribute lookup caching: when the underlying mapping is updated by other means, the attribute lookup cache is not invalidated correctly (this is a key part of the reason class __dict__ attributes produce a read-only view of the underlying storage).
Since the optimisation provided by that cache is highly desirable, the use of a preexisting namespace as the class namespace may need to be declared as officially unsupported (since the observed behaviour is rather strange when the caches get out of sync).
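The read-only view mentioned above is easy to observe. A minimal sketch, with Example, view, and wrote as illustrative names:

```python
import types


class Example:
    a = 1


view = Example.__dict__  # a read-only proxy, not the underlying storage

try:
    view["b"] = 2  # direct writes through the proxy are rejected
    wrote = True
except TypeError:
    wrote = False

# Updates have to go through the type, which can keep its caches consistent.
setattr(Example, "b", 2)
```

The proxy is a live view, so the attribute set through setattr() is visible as view["b"] afterwards.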
Currently, to use a different type (such as collections.OrderedDict) for a class namespace, or to use a pre-populated namespace, it is necessary to write and use a custom metaclass. With this PEP, using a custom namespace becomes as simple as specifying an appropriate factory function in the class header.
Understanding Python's metaclasses requires a deep understanding of the type system and the class construction process. This is legitimately seen as challenging, due to the need to keep multiple moving parts (the code, the metaclass hint, the actual metaclass, the class object, instances of the class object) clearly distinct in your mind. Even when you know the rules, it's still easy to make a mistake if you're not being extremely careful. An earlier version of this PEP actually included such a mistake: it stated "subclass of type" for a constraint that is actually "instance of type".
Understanding the proposed class initialisation hook only requires understanding decorators and ordinary method inheritance, which isn't quite as daunting a task. The new hook provides a more gradual path towards understanding all of the phases involved in the class definition process.
One of the big issues that makes library authors reluctant to use metaclasses (even when they would be appropriate) is the risk of metaclass conflicts. These occur whenever two unrelated metaclasses are used by the desired parents of a class definition. This risk also makes it very difficult to add a metaclass to a class that has previously been published without one.
By contrast, adding an __init_class__ method to an existing type poses a similar level of risk to adding an __init__ method: technically, there is a risk of breaking poorly implemented subclasses, but when that occurs, it is recognised as a bug in the subclass rather than the library author breaching backwards compatibility guarantees. In fact, due to the constrained signature of __init_class__, the risk in this case is actually even lower than in the case of __init__.
Unlike code that runs as part of the metaclass, code that runs as part of the new hook will be able to freely invoke class methods that rely on the implicit __class__ reference introduced by PEP 3135, including methods that use the zero argument form of super().
For use cases that don't involve completely replacing the defined class, Python 2 code that dynamically set __metaclass__ can now dynamically set __init_class__ instead. For more advanced use cases, introduction of an explicit metaclass (possibly made available as a required base class) will still be necessary in order to support Python 3.
All of the examples below are actually possible today through the use of a custom metaclass:
    class CustomNamespace(type):
        @classmethod
        def __prepare__(meta, name, bases, *, namespace=None, **kwds):
            parent_namespace = super().__prepare__(name, bases, **kwds)
            return namespace() if namespace is not None else parent_namespace

        def __new__(meta, name, bases, ns, *, namespace=None, **kwds):
            return super().__new__(meta, name, bases, ns, **kwds)

        def __init__(cls, name, bases, ns, *, namespace=None, **kwds):
            return super().__init__(name, bases, ns, **kwds)
The advantage of implementing the new keyword directly in type.__prepare__ is that the only persistent effect is then the change in the underlying storage of the class attributes. The metaclass of the class remains unchanged, eliminating many of the drawbacks typically associated with these kinds of customisations.
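To make the contrast concrete, here is a self-contained sketch of the metaclass-based approach in use. NamespaceMeta is a hypothetical stand-in for a metaclass of this kind, and Prepopulated and seed_data are illustrative names.

```python
class NamespaceMeta(type):
    @classmethod
    def __prepare__(meta, name, bases, *, namespace=None, **kwds):
        if namespace is not None:
            return namespace()  # call the factory to get a fresh mapping
        return super().__prepare__(name, bases, **kwds)

    def __new__(meta, name, bases, ns, *, namespace=None, **kwds):
        return super().__new__(meta, name, bases, ns, **kwds)

    def __init__(cls, name, bases, ns, *, namespace=None, **kwds):
        super().__init__(name, bases, ns, **kwds)


seed_data = dict(a=1, b=2, c=3)


# Passing the bound method seed_data.copy as the factory gives each class
# its own copy of the seed mapping.
class Prepopulated(metaclass=NamespaceMeta, namespace=seed_data.copy):
    pass
```

The class picks up the seeded attributes and seed_data itself is left untouched, but the custom metaclass persists as type(Prepopulated), which is exactly the side effect the proposed type.__prepare__ change would avoid.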
    class OrderedClass(namespace=collections.OrderedDict):
        a = 1
        b = 2
        c = 3

    seed_data = dict(a=1, b=2, c=3)

    class PrepopulatedClass(namespace=seed_data.copy):
        pass

    class NewClass(namespace=Prototype.__dict__.copy):
        pass
Just because the PEP makes it possible to do this relatively cleanly doesn't mean anyone should do this!
    from collections.abc import MutableMapping

    # The MutableMapping + dict combination should give something that
    # generally behaves correctly as a mapping, while still being accepted
    # as a class namespace
    class ClassNamespace(MutableMapping, dict):
        def __init__(self, cls):
            self._cls = cls
        def __len__(self):
            return len(dir(self._cls))
        def __iter__(self):
            for attr in dir(self._cls):
                yield attr
        def __contains__(self, attr):
            return hasattr(self._cls, attr)
        def __getitem__(self, attr):
            return getattr(self._cls, attr)
        def __setitem__(self, attr, value):
            setattr(self._cls, attr, value)
        def __delitem__(self, attr):
            delattr(self._cls, attr)

    def extend(cls):
        return lambda: ClassNamespace(cls)

    class Example:
        pass

    class ExtendedExample(namespace=extend(Example)):
        a = 1
        b = 2
        c = 3

    >>> Example.a, Example.b, Example.c
    (1, 2, 3)
Calling the new hook automatically from type.__init__ would achieve most of the goals of this PEP. However, using that approach would mean that __init_class__ implementations would be unable to call any methods that relied on the __class__ reference (or used the zero-argument form of super()), and could not make use of those features themselves.
Originally, this PEP required the explicit use of @classmethod on the __init_class__ method. It was made implicit since there's no sensible interpretation for leaving it out, and that case would need to be detected anyway in order to give a useful error message.
This decision was reinforced after noticing that the user experience of defining __prepare__ and forgetting the @classmethod method decorator is singularly incomprehensible (particularly since PEP 3115 documents it as an ordinary method, and the current documentation doesn't explicitly say anything one way or the other).
At one point, this PEP proposed that the class namespace be passed directly as a keyword argument, rather than passing a factory function. However, this encourages an unsupported behaviour (that is, passing the same namespace to multiple classes, or retaining direct write access to a mapping used as a class namespace), so the API was switched to the factory function version.
- address the 5 points in http://mail.python.org/pipermail/python-dev/2013-February/123970.html
This document has been placed in the public domain.