[Python-ideas] Simpler Customization of Class Creation - PEP 487

Martin Teichmann lkb.teichmann at gmail.com
Fri Feb 5 16:20:46 EST 2016


Hi List,

about a year ago I started a discussion on how to simplify metaclasses,
which led to PEP 487. I got some good ideas from this list, but couldn't
follow up on this because I was tied up in other projects.

In short, metaclasses are often not used as they are considered very
complicated. Indeed they are, especially if you need to use two of them
at the same time in a multiple inheritance context.

Most metaclasses, however, serve only some of the following three
purposes: a) run some code after a class is created b) initialize descriptors
of a class or c) keep the order in which class attributes have been defined.

PEP 487 now proposes to put a metaclass into the standard library, which
can be used for all three of those purposes. If libraries now start to use this
metaclass, we won't need any metaclass mixing anymore.

What has changed since the last time I posted PEP 487? Firstly, I re-wrote
large parts of the PEP to make it easier to read. For those who liked the
old text, it still exists in PEP 422.

Secondly, I modified the proposal following suggestions from this list:
I added the descriptor initialization (purpose b)), as this was considered
particularly useful, even if it could in principle be done using purpose a) from
above. The order-keeping of the class attributes is the leftover from a much
more ambitious previous idea that would have allowed for custom namespaces
during class creation. But this additional feature would have rendered the
most common use case - getting the order of attributes - much more
complicated, so I opted for usability over flexibility.

I have put the new version of the PEP here:

https://github.com/tecki/metaclasses/blob/pep487/pep-0487.txt

and also added it to this posting. An implementation of this PEP can
be found at:

https://pypi.python.org/pypi/metaclass

Greetings

Martin

PEP: 487
Title: Simpler customisation of class creation
Version: $Revision$
Last-Modified: $Date$
Author: Martin Teichmann <lkb.teichmann at gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 27-Feb-2015
Python-Version: 3.6
Post-History: 27-Feb-2015, 5-Feb-2016
Replaces: 422


Abstract
========

Currently, customising class creation requires the use of a custom metaclass.
This custom metaclass then persists for the entire lifecycle of the class,
creating the potential for spurious metaclass conflicts.

This PEP proposes to instead support a wide range of customisation
scenarios through a new ``__init_subclass__`` hook in the class body,
a hook to initialize descriptors, and a way to keep the order in which
attributes are defined.

Those hooks should at first be defined in a metaclass in the standard
library, with the option that this metaclass eventually becomes the
default ``type`` metaclass.

The new mechanism should be easier to understand and use than
implementing a custom metaclass, and thus should provide a gentler
introduction to the full power of Python's metaclass machinery.


Background
==========

Metaclasses are a powerful tool to customize class creation. They have,
however, the problem that there is no automatic way to combine metaclasses.
If one wants to use two metaclasses for a class, a new metaclass combining
those two needs to be created, typically manually.

This need often occurs as a surprise to a user: inheriting from two base
classes coming from two different libraries suddenly raises the necessity
to manually create a combined metaclass, where typically one is not
interested in those details about the libraries at all. This becomes
even worse if one library starts to use a metaclass where it did not
use one before. While the library itself continues to work perfectly,
suddenly all code combining those classes with classes from another library
fails.
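
The conflict described above can be reproduced in a few lines. The
following is only an illustration: ``MetaA``, ``MetaB`` and the class
names are hypothetical stand-ins for two unrelated libraries.

```python
class MetaA(type):
    """Metaclass used by a first, hypothetical library."""


class MetaB(type):
    """Metaclass used by a second, unrelated hypothetical library."""


class FromLibraryA(metaclass=MetaA):
    pass


class FromLibraryB(metaclass=MetaB):
    pass


# Combining the two base classes fails, because Python cannot pick
# a metaclass that is a subclass of both MetaA and MetaB.
try:
    class Combined(FromLibraryA, FromLibraryB):
        pass
except TypeError as exc:
    conflict_message = str(exc)


# The manual workaround: the user has to write the combined metaclass.
class MetaAB(MetaA, MetaB):
    pass


class Combined(FromLibraryA, FromLibraryB, metaclass=MetaAB):
    pass
```

The user of the two libraries ends up writing ``MetaAB`` even though
neither metaclass is of any interest to them.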

Proposal
========

While there are many possible ways to use a metaclass, the vast majority
of use cases falls into just three categories: some initialization code
running after class creation, the initialization of descriptors and
keeping the order in which class attributes were defined.

Those three use cases can easily be performed by just one metaclass. If
this metaclass is put into the standard library, and all libraries that
wish to customize class creation use this very metaclass, no combination
of metaclasses is necessary anymore.

The three use cases are achieved as follows:

1. The metaclass contains an ``__init_subclass__`` hook that initializes
   all subclasses of a given class,
2. the metaclass calls an ``__init_descriptor__`` hook for all descriptors
   defined in the class, and
3. an ``__attribute_order__`` tuple is left in the class in order to inspect
   the order in which attributes were defined.

For ease of use, a base class ``SubclassInit`` is defined, which uses said
metaclass and contains an empty stub for the hook described for use case 1.

As an example, the first use case looks as follows::

   class SpamBase(SubclassInit):
       # this is implicitly a @classmethod
       def __init_subclass__(cls, **kwargs):
           # This is invoked after a subclass is created, but before
           # explicit decorators are called.
           # The usual super() mechanisms are used to correctly support
           # multiple inheritance.
           # **kwargs are the keyword arguments to the subclasses'
           # class creation statement
           super().__init_subclass__(**kwargs)

   class Spam(SpamBase):
       pass
   # the new hook is called on Spam

The base class ``SubclassInit`` contains an empty ``__init_subclass__``
method which serves as an endpoint for cooperative multiple inheritance.
Note that this method has no keyword arguments, meaning that all
methods which are more specialized have to process all keyword
arguments.
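
As a minimal sketch of how keyword arguments travel through the hook,
consider the following. It assumes an interpreter in which ``type``
already implements the hook (Python 3.6 or later), so an empty class
stands in for ``SubclassInit``. The names ``Configurable`` and
``backend`` are made up for illustration.

```python
class SubclassInit:
    """Stand-in for the proposed base class (assumption: the hook is
    already provided by the interpreter)."""


class Configurable(SubclassInit):
    def __init_subclass__(cls, *, backend="memory", **kwargs):
        # Consume the keyword this class knows about and pass the
        # remaining ones up the inheritance chain.
        super().__init_subclass__(**kwargs)
        cls.backend = backend


class Disk(Configurable, backend="disk"):
    pass


class Default(Configurable):
    pass
```

Because the endpoint of the chain accepts no keyword arguments, a
keyword that no class consumes (for example a misspelled one) ends in
a ``TypeError`` at class creation time instead of being silently
ignored.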

This general proposal is not a new idea (it was first suggested for
inclusion in the language definition `more than 10 years ago`_, and a
similar mechanism has long been supported by `Zope's ExtensionClass`_),
but the situation has changed sufficiently in recent years that
the idea is worth reconsidering for inclusion.

The second part of the proposal adds an ``__init_descriptor__``
initializer for descriptors.  Descriptors are defined in the body of a
class, but they do not know anything about that class; they do not
even know the name they are accessed with. They do get to know their
owner once ``__get__`` is called, but still they do not know their
name. This is unfortunate: for example, they cannot put their
associated value into their object's ``__dict__`` under their name,
since they do not know that name.  This problem has been solved many
times, and is one of the most important reasons to have a metaclass in
a library. While it would be easy to implement such a mechanism using
the first part of the proposal, it makes sense to have one solution
for this problem for everyone.

To give an example of its usage, imagine a descriptor representing weakly
referenced values (this is an insanely simplified, yet working example)::

    import weakref

    class WeakAttribute:
        def __get__(self, instance, owner):
            # call the stored weak reference to get the referent back
            return instance.__dict__[self.name]()

        def __set__(self, instance, value):
            instance.__dict__[self.name] = weakref.ref(value)

        # this is the new initializer:
        def __init_descriptor__(self, owner, name):
            self.name = name
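
As noted above, the descriptor initialization could in principle be
built on the first part of the proposal. The following sketch does
exactly that, so that the descriptor can be tried out. It assumes an
interpreter that already calls ``__init_subclass__`` (Python 3.6 or
later). ``DescriptorInit`` and ``Node`` are hypothetical names, and the
getter here dereferences the stored weak reference.

```python
import weakref


class DescriptorInit:
    """Hypothetical helper: calls __init_descriptor__ on every
    descriptor defined in the body of a subclass."""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # vars(cls) only contains what was defined in this class body.
        for name, value in vars(cls).items():
            hook = getattr(type(value), "__init_descriptor__", None)
            if hook is not None:
                hook(value, cls, name)


class WeakAttribute:
    def __get__(self, instance, owner):
        # Call the weak reference to obtain the referent.
        return instance.__dict__[self.name]()

    def __set__(self, instance, value):
        instance.__dict__[self.name] = weakref.ref(value)

    def __init_descriptor__(self, owner, name):
        self.name = name


class Node(DescriptorInit):
    parent = WeakAttribute()


root = Node()
child = Node()
child.parent = root  # stored as a weak reference under the key "parent"
```

The descriptor never had to be told its name explicitly; it learned it
when ``Node`` was created.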

The third part of the proposal is to leave a tuple called
``__attribute_order__`` in the class that contains the order in which
the attributes were defined. This is a very common use case: many
libraries use an ``OrderedDict`` namespace to store this order. The
tuple is a very simple way to achieve the same goal.
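
A rough sketch of how a metaclass could record the order is shown
below. ``OrderMeta`` and ``Record`` are hypothetical names, and
filtering out dunder names such as ``__module__`` is a choice made for
this sketch only, not something the text above specifies.

```python
from collections import OrderedDict


class OrderMeta(type):
    """Hypothetical metaclass that records attribute definition order."""

    @classmethod
    def __prepare__(mcls, name, bases, **kwargs):
        # The class body is executed in this ordered namespace.
        return OrderedDict()

    def __new__(mcls, name, bases, namespace, **kwargs):
        cls = super().__new__(mcls, name, bases, namespace, **kwargs)
        # Assumption of this sketch: skip names like __module__ and
        # __qualname__ that the interpreter adds on its own.
        cls.__attribute_order__ = tuple(
            key for key in namespace
            if not (key.startswith("__") and key.endswith("__"))
        )
        return cls


class Record(metaclass=OrderMeta):
    zeta = 1
    alpha = 2

    def method(self):
        pass

    beta = 3
```

A user of ``Record`` only has to read the tuple and never needs to
know that an ordered namespace was involved.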


Key Benefits
============


Easier inheritance of definition time behaviour
-----------------------------------------------

Understanding Python's metaclasses requires a deep understanding of
the type system and the class construction process. This is legitimately
seen as challenging, due to the need to keep multiple moving parts (the code,
the metaclass hint, the actual metaclass, the class object, instances of the
class object) clearly distinct in your mind. Even when you know the rules,
it's still easy to make a mistake if you're not being extremely careful.

Understanding the proposed implicit class initialization hook only requires
ordinary method inheritance, which isn't quite as daunting a task. The new
hook provides a more gradual path towards understanding all of the phases
involved in the class definition process.


Reduced chance of metaclass conflicts
-------------------------------------

One of the big issues that makes library authors reluctant to use metaclasses
(even when they would be appropriate) is the risk of metaclass conflicts.
These occur whenever two unrelated metaclasses are used by the desired
parents of a class definition. This risk also makes it very difficult to
*add* a metaclass to a class that has previously been published without one.

By contrast, adding an ``__init_subclass__`` method to an existing type poses
a similar level of risk to adding an ``__init__`` method: technically, there
is a risk of breaking poorly implemented subclasses, but when that occurs,
it is recognised as a bug in the subclass rather than the library author
breaching backwards compatibility guarantees.


A path of introduction into Python
==================================

Most of the benefits of this PEP can already be implemented using
a simple metaclass. For the ``__init_subclass__`` hook this works
all the way down to Python 2.7, while the attribute order needs Python 3.0
to work. Such a class has been `uploaded to PyPI`_.

The only drawback of such a metaclass is the aforementioned problem with
metaclasses and multiple inheritance. Two classes using such a
metaclass can only be combined if they use exactly the same
metaclass. This fact calls for the inclusion of such a class into the
standard library, let's call it ``SubclassMeta``, with the base class
using it called ``SubclassInit``. Once all users use this standard
library metaclass, classes from different packages can easily be
combined.

But still such classes cannot be easily combined with other classes
using other metaclasses. Authors of metaclasses should bear that in
mind and inherit from the standard metaclass if it seems useful
for users of the metaclass to add more functionality. Ultimately,
if the need for combining with other metaclasses is strong enough,
the proposed functionality may be introduced into Python's ``type``.

Those arguments strongly suggest the following procedure for including
the proposed functionality in Python:

1. The metaclass implementing this proposal is put onto PyPI, so that
   it can be used and scrutinized.
2. Once the code is properly mature, it can be added to the Python
   standard library. There should be a new module called
   ``metaclass`` which collects tools for metaclass authors, as well
   as documentation of best practices for writing
   metaclasses.
3. If the need to combine this metaclass with other metaclasses is
   strong enough, it may be included into Python itself.

While the metaclass is still in the standard library and not in the
language, it may still clash with other metaclasses.  The most
prominent metaclass in use is probably ABCMeta.  It is also a
particularly good example of the need to combine metaclasses. For
users who want to define an ABC with subclass initialization, we should
support an ``ABCSubclassInit`` class, or let ABCMeta inherit from this
PEP's metaclass.

Extensions written in C or C++ also often define their own metaclass.
It would be very useful if those could also inherit from the metaclass
defined here, but this is probably not possible.

New Ways of Using Classes
=========================

This proposal has many use cases like the following. In the examples,
we still inherit from the ``SubclassInit`` base class. This would
become unnecessary once this PEP is included in Python directly.

Subclass registration
---------------------

Especially when writing a plugin system, one often wants to register new
subclasses of a plugin base class. This can be done as follows::

   class PluginBase(SubclassInit):
       subclasses = []

       def __init_subclass__(cls, **kwargs):
           super().__init_subclass__(**kwargs)
           cls.subclasses.append(cls)

One should note that this also works nicely as a mixin class.
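
A short usage sketch follows. It assumes an interpreter that already
provides the hook (Python 3.6 or later), so an empty class stands in
for ``SubclassInit``. The plugin class names and ``Logging`` are
invented for the example.

```python
class SubclassInit:
    """Stand-in for the proposed base class (assumption: the hook is
    already provided by the interpreter)."""


class PluginBase(SubclassInit):
    subclasses = []

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        cls.subclasses.append(cls)


class JsonPlugin(PluginBase):
    pass


class YamlPlugin(PluginBase):
    pass


class Logging:
    """An unrelated base class."""


# Used as a mixin: PluginBase does not have to come first.
class LoggingJsonPlugin(Logging, PluginBase):
    pass
```

``PluginBase`` itself is not registered, since the hook only runs for
subclasses of the class that defines it.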

Trait descriptors
-----------------

There are many designs of Python descriptors in the wild which, for
example, check boundaries of values. Often those "traits" need some support
from a metaclass to work. This is what they would look like with this
PEP::

   class Trait:
       def __get__(self, instance, owner):
           return instance.__dict__[self.key]

       def __set__(self, instance, value):
           instance.__dict__[self.key] = value

       def __init_descriptor__(self, owner, name):
           self.key = name

   class Int(Trait):
       def __set__(self, instance, value):
           # some boundary check code here
           super().__set__(instance, value)
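
To make the trait example concrete, the sketch below fills in a
boundary check and uses the traits in a class. ``SubclassMeta`` and
``SubclassInit`` are reduced here to minimal stand-ins that only
perform the descriptor initialization. The ``minimum`` and ``maximum``
parameters and the ``Point`` class are assumptions made for
illustration.

```python
class SubclassMeta(type):
    """Minimal stand-in: only the __init_descriptor__ part."""

    def __init__(cls, name, bases, namespace, **kwargs):
        super().__init__(name, bases, namespace)
        for attr, value in namespace.items():
            hook = getattr(type(value), "__init_descriptor__", None)
            if hook is not None:
                hook(value, cls, attr)


class SubclassInit(metaclass=SubclassMeta):
    pass


class Trait:
    def __get__(self, instance, owner):
        return instance.__dict__[self.key]

    def __set__(self, instance, value):
        instance.__dict__[self.key] = value

    def __init_descriptor__(self, owner, name):
        self.key = name


class Int(Trait):
    def __init__(self, minimum, maximum):
        self.minimum = minimum
        self.maximum = maximum

    def __set__(self, instance, value):
        # Hypothetical boundary check.
        if not self.minimum <= value <= self.maximum:
            raise ValueError(
                "%s must be between %d and %d"
                % (self.key, self.minimum, self.maximum))
        super().__set__(instance, value)


class Point(SubclassInit):
    x = Int(0, 100)
    y = Int(0, 100)


p = Point()
p.x = 5
```

Assigning a value outside the range, for example ``p.y = 101``, raises
a ``ValueError`` that can name the offending attribute because the
trait knows its own key.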


Rejected Design Options
=======================


Calling the hook on the class itself
------------------------------------

Adding an ``__autodecorate__`` hook that would be called on the class
itself was the proposed idea of PEP 422.  Most examples work the same
way or even better if the hook is called on the subclass. In general,
it is much easier to explicitly call the hook on the class in which it
is defined (to opt-in to such a behavior) than to opt-out, meaning
that one does not want the hook to be called on the class it is
defined in.

This becomes most evident if the class in question is designed as a
mixin: it is very unlikely that the code of the mixin is to be
executed for the mixin class itself, as it is not supposed to be a
complete class on its own.

The original proposal also made major changes in the class
initialization process, rendering it impossible to back-port the
proposal to older Python versions.


Other variants of calling the hook
----------------------------------

Other names for the hook were presented, namely ``__decorate__`` or
``__autodecorate__``. This proposal opts for ``__init_subclass__`` as
it is very close to the ``__init__`` method, just for the subclass,
while it is not very close to decorators, as it does not return the
class.


Requiring an explicit decorator on ``__init_subclass__``
--------------------------------------------------------

One could require the explicit use of ``@classmethod`` on the
``__init_subclass__`` method. It was made implicit since there's no
sensible interpretation for leaving it out, and that case would need
to be detected anyway in order to give a useful error message.

This decision was reinforced after noticing that the user experience of
defining ``__prepare__`` and forgetting the ``@classmethod`` method
decorator is singularly incomprehensible (particularly since PEP 3115
documents it as an ordinary method, and the current documentation doesn't
explicitly say anything one way or the other).


Defining arbitrary namespaces
-----------------------------

PEP 422 defined a generic way to add arbitrary namespaces for class
definitions. This approach is much more flexible than just leaving
the definition order in a tuple. The ``__prepare__`` method in a metaclass
supports exactly this behavior. But given that effectively
the only use case that could be found in the wild was the
``OrderedDict`` way of determining the attribute order, it seemed
reasonable to only support this special case.

The metaclass described in this PEP has been designed to be very simple
such that it could be reasonably made the default metaclass. This was
especially important when designing the attribute order functionality:
This was a highly demanded feature and has been enabled through the
``__prepare__`` method of metaclasses. This method can be abused in
very weird ways, making it hard to correctly maintain this feature in
CPython. This is why it has been proposed to deprecate this feature,
and instead use ``OrderedDict`` as the standard namespace, supporting
the most important feature while dropping most of the complexity. But
this would have meant that ``OrderedDict`` becomes a language builtin
like dict and set, and not just a standard library class. The choice
of the ``__attribute_order__`` tuple is a much simpler solution to the
problem.

A more ``__new__``-like hook
----------------------------

In PEP 422 the hook worked more like the ``__new__`` method than the
``__init__`` method, meaning that it returned a class instead of
modifying one. This allows a bit more flexibility, but at the cost
of much harder implementation and undesired side effects.


History
=======

This used to be a competing proposal to PEP 422 by Nick Coghlan and
Daniel Urban. It shares both most of the PEP text and proposed code, but
has major differences in how to achieve its goals. In the meantime, PEP 422
has been withdrawn in favour of this approach.

References
==========

.. _published code:
   http://mail.python.org/pipermail/python-dev/2012-June/119878.html

.. _more than 10 years ago:
   http://mail.python.org/pipermail/python-dev/2001-November/018651.html

.. _Zope's ExtensionClass:
   http://docs.zope.org/zope_secrets/extensionclass.html

.. _uploaded to PyPI:
   https://pypi.python.org/pypi/metaclass

Copyright
=========

This document has been placed in the public domain.



..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:

