Notice: While Javascript is not essential for this website, your interaction with the content will be limited. Please turn Javascript on for the full experience.

PEP 560 -- Core support for generic types

PEP:560
Title:Core support for generic types
Author:Ivan Levkivskyi <levkivskyi at gmail.com>
Status:Draft
Type:Standards Track
Created:03-Sep-2017
Python-Version:3.7
Post-History:09-Sep-2017

Abstract

Initially PEP 484 was designed in such way that it would not introduce any changes to the core CPython interpreter. Now type hints and the typing module are extensively used by the community, e.g. PEP 526 and PEP 557 extend the usage of type hints, and the backport of typing on PyPI has 1M downloads/month. Therefore, this restriction can be removed. It is proposed to add two special methods __class_getitem__ and __subclass_base__ to the core CPython for better support of generic types.

Rationale

The restriction to not modify the core CPython interpreter lead to some design decisions that became questionable when the typing module started to be widely used. There are three main points of concerns: performance of the typing module, metaclass conflicts, and the large number of hacks currently used in typing.

Performance:

The typing module is one of the heaviest and slowest modules in the standard library even with all the optimizations made. Mainly this is because subscripted generic types (see PEP 484 for definition of terms used in this PEP) are class objects (see also [1]). The three main ways how the performance can be improved with the help of the proposed special methods:

  • Creation of generic classes is slow since the GenericMeta.__new__ is very slow; we will not need it anymore.
  • Very long MROs for generic classes will be twice shorter; they are present because we duplicate the collections.abc inheritance chain in typing.
  • Time of instantiation of generic classes will be improved (this is minor however).

Metaclass conflicts:

All generic types are instances of GenericMeta, so if a user uses a custom metaclass, then it is hard to make a corresponding class generic. This is particularly hard for library classes that a user doesn't control. A workaround is to always mix-in GenericMeta:

class AdHocMeta(GenericMeta, LibraryMeta):
    pass

class UserClass(LibraryBase, Generic[T], metaclass=AdHocMeta):
    ...

but this is not always practical or even possible. With the help of the proposed special attributes the GenericMeta metaclass will not be needed.

Hacks and bugs that will be removed by this proposal:

  • _generic_new hack that exists since __init__ is not called on instances with a type differing form the type whose __new__ was called, C[int]().__class__ is C.
  • _next_in_mro speed hack will be not necessary since subscription will not create new classes.
  • Ugly sys._getframe hack, this one is particularly nasty, since it looks like we can't remove it without changes outside typing.
  • Currently generics do dangerous things with private ABC caches to fix large memory consumption that grows at least as O(N2), see [2]. This point is also important because it was recently proposed to re-implement ABCMeta in C.
  • Problems with sharing attributes between subscripted generics, see [3]. Current solution already uses __getattr__ and __setattr__, but it is still incomplete, and solving this without the current proposal will be hard and will need __getattribute__.
  • _no_slots_copy hack, where we clean-up the class dictionary on every subscription thus allowing generics with __slots__.
  • General complexity of the typing module, the new proposal will not only allow to remove the above mentioned hacks/bugs, but also simplify the implementation, so that it will be easier to maintain.

Specification

The idea of __class_getitem__ is simple: it is an exact analog of __getitem__ with an exception that it is called on a class that defines it, not on its instances, this allows us to avoid GenericMeta.__getitem__ for things like Iterable[int]. The __class_getitem__ is automatically a class method and does not require @classmethod decorator (similar to __init_subclass__) and is inherited like normal attributes. For example:

class MyList:
    def __getitem__(self, index):
        return index + 1
    def __class_getitem__(cls, item):
        return f"{cls.__name__}[{item.__name__}]"

class MyOtherList(MyList):
    pass

assert MyList()[0] == 1
assert MyList[int] == "MyList[int]"

assert MyOtherList()[0] == 1
assert MyOtherList[int] == "MyOtherList[int]"

Note that this method is used as a fallback, so if a metaclass defines __getitem__, then that will have the priority.

If an object that is not a class object appears in the bases of a class definition, the __subclass_base__ is searched on it. If found, it is called with the original tuple of bases as an argument. If the result of the call is not None, then it is substituted instead of this object. Otherwise (if the result is None), the base is just removed. This is necessary to avoid inconsistent MRO errors, that are currently prevented by manipulations in GenericMeta.__new__. After creating the class, the original bases are saved in __orig_bases__ (currently this is also done by the metaclass).

NOTE: These two method names are reserved for exclusive use by the typing module and the generic types machinery, and any other use is strongly discouraged. The reference implementation (with tests) can be found in [4], the proposal was originally posted and discussed on the typing tracker, see [5].

Backwards compatibility and impact on users who don't use typing:

This proposal may break code that currently uses the names __class_getitem__ and __subclass_base__.

This proposal will support almost complete backwards compatibility with the current public generic types API; moreover the typing module is still provisional. The only two exceptions are that currently issubclass(List[int], List) returns True, with this proposal it will raise TypeError. Also issubclass(collections.abc.Iterable, typing.Iterable) will return False, which is probably desirable, since currently we have a (virtual) inheritance cycle between these two classes.

With the reference implementation I measured negligible performance effects (under 1% on a micro-benchmark) for regular (non-generic) classes.

References

[1]Discussion following Mark Shannon's presentation at Language Summit (https://github.com/python/typing/issues/432)
[2]Pull Request to implement shared generic ABC caches (https://github.com/python/typing/pull/383)
[3]An old bug with setting/accessing attributes on generic types (https://github.com/python/typing/issues/392)
[4]The reference implementation (https://github.com/ilevkivskyi/cpython/pull/2/files)
[5]Original proposal (https://github.com/python/typing/issues/468)
Source: https://github.com/python/peps/blob/master/pep-0560.txt