[Python-checkins] CVS: python/nondist/peps pep-0253.txt,1.1,1.2

Guido van Rossum gvanrossum@users.sourceforge.net
Mon, 14 May 2001 18:36:48 -0700


Update of /cvsroot/python/python/nondist/peps
In directory usw-pr-cvs1:/tmp/cvs-serv20830

Modified Files:
	pep-0253.txt 
Log Message:
Add a lot of text.  A looooooot of text.  Way too much rambling.  And
it isn't even finished.  I'll do that later.  But at least there's
some text here now...


Index: pep-0253.txt
===================================================================
RCS file: /cvsroot/python/python/nondist/peps/pep-0253.txt,v
retrieving revision 1.1
retrieving revision 1.2
diff -C2 -r1.1 -r1.2
*** pep-0253.txt	2001/05/14 13:43:23	1.1
--- pep-0253.txt	2001/05/15 01:36:46	1.2
***************
*** 12,20 ****
  
      This PEP proposes ways for creating subtypes of existing built-in
!     types, either in C or in Python.
  
! Introduction
  
!     [XXX to be done.]
  
  
--- 12,375 ----
  
      This PEP proposes ways for creating subtypes of existing built-in
!     types, either in C or in Python.  The text is currently long and
!     rambling; I'll go over it again later to make it shorter.
  
!     Traditionally, types in Python have been created statically, by
!     declaring a global variable of type PyTypeObject and initializing
!     it with a static initializer.  The fields in the type object
!     describe all aspects of a Python object that are relevant to the
!     Python interpreter.  A few fields contain dimensional information
!     (e.g. the basic allocation size of instances), others contain
!     various flags, but most fields are pointers to functions to
!     implement various kinds of behaviors.  A NULL pointer means that
!     the type does not implement the specific behavior; in that case
!     the system may provide a default behavior or raise an exception
!     when the behavior is invoked.  Some collections of function
!     pointers that are usually defined together are obtained
!     indirectly via a pointer to an additional structure containing
!     more function pointers.
  
!     While the details of initializing a PyTypeObject structure haven't
!     been documented as such, they are easily gleaned from the examples
!     in the source code, and I am assuming that the reader is
!     sufficiently familiar with the traditional way of creating new
!     Python types in C.
! 
!     This PEP will introduce the following optional features to types:
! 
!     - create an instance of a type by calling it
! 
!     - create a subtype in C by specifying a base type pointer
! 
!     - create a subtype in Python using a class statement
! 
!     - multiple inheritance
! 
!     This PEP builds on PEP 252, which adds standard introspection to
!     types; in particular, types are assumed to have e.g. a __hash__
!     method when the type object defines the tp_hash slot.  PEP 252 also
!     adds a dictionary to type objects which contains all methods.  At
!     the Python level, this dictionary is read-only; at the C level, it
!     is accessible directly (but modifying it is not recommended except
!     as part of initialization).
! 
! 
! Metatypes
! 
!     Inevitably the following discussion will come to mention metatypes
!     (or metaclasses).  Metatypes are nothing new in Python: Python has
!     always been able to talk about the type of a type:
! 
!     >>> a = 0
!     >>> type(a)
!     <type 'int'>
!     >>> type(type(a))
!     <type 'type'>
!     >>> type(type(type(a)))
!     <type 'type'>
!     >>> 
! 
!     In this example, type(a) is a "regular" type, and type(type(a)) is
!     a metatype.  While as distributed all types have the same metatype
!     (which is also its own metatype), this is not a requirement, and
!     in fact a useful 3rd party extension (ExtensionClasses by Jim
!     Fulton) creates an additional metatype.  A related feature is the
!     "Don Beaudry hook", which says that if a metatype is callable, its
!     instances (which are regular types) can be subclassed (really
!     subtyped) using a Python class statement.  We will use this rule
!     to support subtyping of built-in types, and in the process we will
!     introduce some additional metatypes, and a "metametatype". (The
!     metametatype is nothing unusual; Python's type system allows any
!     number of metalevels.)
! 
!     Note that Python uses the concept of metatypes or metaclasses in a
!     different way than Smalltalk.  In Smalltalk-80, there is a
!     hierarchy of metaclasses that mirrors the hierarchy of regular
!     classes; metaclasses map 1-1 to classes (except for some funny
!     business at the root of the hierarchy); and each class statement
!     creates both a regular class and its metaclass, putting class
!     methods in the metaclass and instance methods in the regular
!     class.
! 
!     Nice though this may be in the context of Smalltalk, it's not
!     compatible with the traditional use of metatypes in Python, and I
!     prefer to continue in the Python way.  This means that Python
!     metatypes are typically written in C, and may be shared between
!     many regular types. (It will be possible to subtype metatypes in
!     Python, so it won't be absolutely necessary to write C in order to
!     use metatypes; but the power of Python metatypes will be limited,
!     e.g. Python code will never be allowed to allocate raw memory and
!     initialize it at will.)
! 
! 
! Instantiation by calling the type object
! 
!     Traditionally, for each type there is at least one C function that
!     creates instances of the type.  This function has to take care of
!     both allocating memory for the object and initializing that
!     memory.  As of Python 2.0, it also has to interface with the
!     garbage collection subsystem, if the type chooses to participate
!     in garbage collection (which is optional, but strongly recommended
!     for so-called "container" types: types that may contain arbitrary
!     references to other objects, and hence may participate in
!     reference cycles).
! 
!     If we're going to implement subtyping, we must separate allocation
!     and initialization: typically, the most derived subtype is in
!     charge of allocation (and hence deallocation!), but in most cases
!     each base type's initializer (constructor) must still be called,
!     from the "most base" type to the most derived type.
! 
!     But let's first get the interface for instantiation right.  If we
!     call an object, the tp_call slot of its type gets invoked.  Thus,
!     if we call a type, this invokes the tp_call slot of the type's
!     type: in other words, the tp_call slot of the metatype.
!     Traditionally this has been a NULL pointer, meaning that types
!     can't be called.  Now we're adding a tp_call slot to the metatype,
!     which makes all types "callable" in a trivial sense.  But
!     obviously the metatype's tp_call implementation doesn't know how
!     to initialize individual types.  So the type defines a new slot,
!     tp_construct, which is invoked by the metatype's tp_call slot.  If
!     the tp_construct slot is NULL, the metatype's tp_call issues a
!     nice error message: the type isn't callable.
! 
!     We already know that tp_construct is responsible for initializing
!     the object (this will be important for subtyping too).  Who should
!     be responsible for allocation of the new object? Either the
!     metatype's tp_call can allocate the object, or the type's
!     tp_construct can allocate it.  The solution is copied from typical
!     C++ implementations: if the metatype's tp_call allocates storage
!     for the object it passes the storage as a pointer to the type's
!     tp_construct; if the metatype's tp_call does not allocate storage,
!     it passes a NULL pointer to the type's tp_construct, in which case the
!     type allocates the storage itself.  This moves the policy decision
!     to the metatype, and different metatypes may have different
!     policies.  The mechanisms are fixed though: either the metatype's
!     tp_call allocates storage, or the type's tp_construct allocates.
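! 
!     The calling protocol described above can be modeled in a few
!     lines of plain Python.  This is only an illustrative sketch, not
!     the C implementation: ToyType, point_construct and metatype_call
!     are hypothetical names, a dict stands in for raw storage, and
!     None plays the role of a NULL pointer.
! 
```python
class ToyType:
    """Stand-in for a type object (hypothetical, for illustration)."""

    def __init__(self, name, construct=None):
        self.name = name
        # Models the proposed tp_construct slot; None models NULL.
        self.construct = construct


def point_construct(storage, x, y):
    """Models a type's tp_construct: allocate only if handed None."""
    if storage is None:
        storage = {}  # the type allocates the storage itself
    storage["x"] = x
    storage["y"] = y
    return storage


def metatype_call(tp, args, preallocate):
    """Models the metatype's tp_call; 'preallocate' is its policy."""
    if tp.construct is None:
        raise TypeError("type '%s' is not callable" % tp.name)
    storage = {} if preallocate else None
    return tp.construct(storage, *args)


Point = ToyType("Point", point_construct)
Opaque = ToyType("Opaque")  # no construct slot, so not callable

# Both allocation policies yield an equally initialized object.
by_metatype = metatype_call(Point, (1, 2), preallocate=True)
by_type = metatype_call(Point, (1, 2), preallocate=False)
```
! 
!     In this model the policy lives entirely in metatype_call, while
!     point_construct only has to cope with both kinds of argument.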
! 
!     The deallocation mechanism chosen should match the allocation
!     mechanism: an allocation policy should prescribe both the
!     allocation and deallocation mechanism.  And again, planning ahead
!     for subtyping would be nice.  But the available mechanisms are
!     different.  The deallocation function has always been part of the
!     type structure, as tp_dealloc, which combines the
!     "uninitialization" with deallocation.  This was good enough for
!     the traditional situation, where it matched the combined
!     allocation and initialization of the creation function.  But now
!     imagine a type whose creation function uses a special free list
!     for allocation.  Its deallocation function puts the object's
!     memory back on the same free list.  But when allocation and
!     creation are separate, the object may have been allocated from the
!     regular heap, and it would be wrong (in some cases disastrous) if
!     it were placed on the free list by the deallocation function.
! 
!     A solution would be for the tp_construct function to somehow mark
!     whether the object was allocated from the special free list, so
!     that the tp_dealloc function can choose the right deallocation
!     method (assuming that the only two alternatives are a special free
!     list or the regular heap).  A variant that doesn't require space
!     for an allocation flag bit would be to have two type objects,
!     identical in the contents of all their slots except for their
!     deallocation slot.  But this requires that all type-checking code
!     (e.g. PyDict_Check()) recognizes both types.  We'll come back
!     to this solution in the context of subtyping.  Another alternative
!     is to require the metatype's tp_call to leave the allocation to
!     the tp_construct method, by passing in a NULL pointer.  But this
!     doesn't work once we allow subtyping.
! 
!     Eventually, when we add any form of subtyping, we'll have to
!     separate deallocation from uninitialization.  The way to do this
!     is to add a separate slot to the type object that does the
!     uninitialization without the deallocation.  Fortunately, there is
!     already such a slot: tp_clear, currently used by the garbage
!     collection subsystem.  A simple rule makes this slot reusable for
!     uninitialization: for types that support separate allocation
!     and initialization, tp_clear must be defined (even if the object
!     doesn't support garbage collection) and it must DECREF all
!     contained objects and FREE all other memory areas the object owns.
!     It must also be reentrant: it must be possible to clear an already
!     cleared object.  The easiest way to do this is to replace all
!     pointers DECREFed or FREEd with NULL pointers.
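! 
!     The reentrancy rule can be sketched in Python as follows.  This
!     is a model under stated assumptions, not real C API code:
!     ToyObject and release() are hypothetical, release() stands in for
!     DECREF or FREE, and None stands in for a NULL pointer.
! 
```python
released = []  # records every release, standing in for DECREF/FREE


def release(resource):
    """Hypothetical stand-in for Py_DECREF() or free()."""
    released.append(resource)


class ToyObject:
    """Models an object that owns one reference and one buffer."""

    def __init__(self, child, buffer):
        self.child = child
        self.buffer = buffer

    def clear(self):
        """Models tp_clear: release what is owned, then forget it."""
        if self.child is not None:
            child, self.child = self.child, None  # NULL it out first
            release(child)
        if self.buffer is not None:
            buffer, self.buffer = self.buffer, None
            release(buffer)


obj = ToyObject("child object", "raw buffer")
obj.clear()
obj.clear()  # clearing an already cleared object is harmless
```
! 
!     Because each field is set to None before it is released, the
!     second call finds nothing left to release.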
! 
! 
! Subtyping in C
! 
!     The simplest form of subtyping is subtyping in C.  It is the
!     simplest form because we can require the C code to be aware of the
!     various problems, and it's acceptable for C code that doesn't
!     follow the rules to dump core; while for Python subtyping we would
!     need to catch all errors before they become core dumps.
! 
!     The idea behind subtyping is very similar to that of single
!     inheritance in C++.  A base type is described by a structure
!     declaration plus a type object.  A derived type can extend the
!     structure (but must leave the names, order and type of the fields
!     of the base structure unchanged) and can override certain slots in
!     the type object, leaving others the same.
! 
!     Not every type can serve as a base type.  The base type must
!     support separation of allocation and initialization by having a
!     tp_construct slot that can be called with a preallocated object,
!     and it must support uninitialization without deallocation by
!     having a tp_clear slot as described above.  The base type must
!     also export the structure declaration for its instances through a
!     header file, as it is needed in order to derive a subtype.  The
!     type object for the base type must also be exported.
! 
!     If the base type has a type-checking macro (e.g. PyDict_Check()),
!     this macro may be changed to recognize subtypes.  This can be done
!     by using the new PyObject_TypeCheck(object, type) macro, which
!     calls a function that follows the base class links.  There are
!     arguments for and against changing the type-checking macro in this
!     way.  The argument for the change should be clear: it allows
!     subtypes to be used in places where the base type is required,
!     which is often the prime attraction of subtyping (as opposed to
!     sharing implementation).  An argument against changing the
!     type-checking macro could be that the type check is used
!     frequently and a function call would slow things down too much
!     (hard to believe); or one could fear that a subtype might break an
!     invariant assumed by the support functions of the base type.
!     Sometimes it would be wise to change the base type to remove this
!     reliance; other times, it would be better to require that derived
!     types (implemented in C) maintain the invariants.
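! 
!     The idea of a check that follows the base class links can be
!     sketched in Python.  This is a model, not the C macro: type_check
!     and SpamList are hypothetical names, and the __base__ attribute
!     used here is how present-day CPython exposes a type's single
!     base link, which is an assumption relative to this draft.
! 
```python
def type_check(obj, tp):
    """Return True if type(obj) is tp or derives from it via __base__."""
    t = type(obj)
    while t is not None:
        if t is tp:
            return True
        t = t.__base__  # follow the single base link, like tp_base
    return False


class SpamList(list):
    """Hypothetical subtype of the built-in list."""


exact = type_check([], list)
derived = type_check(SpamList(), list)
unrelated = type_check({}, list)
```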
! 
!     The derived type begins by declaring a type structure which
!     contains the base type's structure.  For example, here's the type
!     structure for a subtype of the built-in list type:
! 
!     typedef struct {
!         PyListObject list;
!         int state;
!     } spamlistobject;
! 
!     Note that the base type structure field (here PyListObject) must
!     be the first field in the structure; any following fields are
!     extension fields.  Also note that the base type is not referenced
!     via a pointer; the actual contents of its structure must be
!     included! (The goal is for the memory layout of the beginning of
!     the subtype instance to be the same as that of the base type
!     instance.)
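! 
!     The layout requirement can be checked experimentally with the
!     ctypes module.  This is a sketch only: ToyListObject is a
!     simplified, hypothetical stand-in and does not reproduce the
!     real PyListObject fields.
! 
```python
import ctypes


class ToyListObject(ctypes.Structure):
    """Simplified stand-in; NOT the real PyListObject layout."""
    _fields_ = [
        ("refcnt", ctypes.c_ssize_t),
        ("length", ctypes.c_ssize_t),
        ("items", ctypes.c_void_p),
    ]


class SpamListObject(ctypes.Structure):
    """Mirrors spamlistobject: base structure first, then extensions."""
    _fields_ = [
        ("list", ToyListObject),  # embedded by value, not by pointer
        ("state", ctypes.c_int),
    ]


spam = SpamListObject()
spam.list.length = 3
spam.state = 42

# Reinterpret the subtype instance as a base instance, as C code
# written for the base type would do with a pointer cast.
as_base = ctypes.cast(ctypes.pointer(spam),
                      ctypes.POINTER(ToyListObject)).contents
```
! 
!     Since the embedded base structure sits at offset zero, code that
!     only knows about the base layout still reads the right fields.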
! 
!     Next, the derived type must declare a type object and initialize
!     it.  Most of the slots in the type object may be initialized to
!     zero, which is a signal that the base type slot must be copied
!     into it.  Some fields that must be initialized properly:
! 
!     - the object header must be filled in as usual; the type should be
!       PyType_Type
! 
!     - the tp_basicsize field must be set to the size of the subtype
!       instances
! 
!     - the tp_base field must be set to the address of the base type's
!       type object
! 
!     - the tp_dealloc slot function must be a deallocation function for
!       the subtype
! 
!     - the tp_flags field must be set to the usual Py_TPFLAGS_DEFAULT
!       value
! 
!     - the tp_name field must be set (otherwise it will be inherited,
!       which is wrong)
! 
!     Exception: if the subtype defines no additional fields in its
!     structure (i.e., it only defines new behavior, no new data), the
!     tp_basicsize and the tp_dealloc fields may be set to zero.  In
!     order to complete the initialization of the type,
!     PyType_InitDict() must be called.  This replaces zero slots in the
!     subtype with the value of the corresponding base type slots.  It
!     also fills in tp_dict, the type's dictionary; this is more a
!     matter of PEP 252.
! 
!     The subtype's tp_dealloc slot deserves special attention.  It must
!     uninitialize and deallocate the object in an orderly manner: first
!     it must uninitialize the fields added by the extension type; then
!     it must call the base type's tp_clear function; finally it must
!     deallocate the memory of the object.  Usually, the base type's
!     tp_clear function has no global name; it is permissible to call it
!     via the base type's tp_clear slot, e.g. PyList_Type.tp_clear(obj).
!     Only if it is known that the base type uses the same allocation
!     method as the subtype and the subtype requires no uninitialization
!     (e.g. it adds no data fields or all its data fields are numbers)
!     is it permissible to leave tp_dealloc set to zero in the subtype's
!     type object; it will be copied from the base type.
! 
!     A subtype is not usable until PyType_InitDict() is called for it;
!     this is best done during module initialization, assuming the
!     subtype belongs to a module.  An alternative for subtypes added to
!     the Python core (which don't live in a particular module) would be
!     to initialize the subtype in their constructor function.  It is
!     allowed to call PyType_InitDict() more than once; the second and
!     further calls have no effect.  In order to avoid unnecessary
!     calls, a test for tp_dict==NULL can be made.
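! 
!     The slot inheritance and the "second call has no effect" rule can
!     be modeled in Python.  This is a rough sketch, not the actual
!     PyType_InitDict(): type objects are modeled as dicts, None models
!     a zero slot, toy_init_dict is a hypothetical name, the slot
!     values are made up, and it is assumed here that the base type
!     has already been initialized.
! 
```python
def toy_init_dict(tp):
    """Model of PyType_InitDict(): inherit empty slots, build tp_dict."""
    if tp["tp_dict"] is not None:
        return  # already initialized; further calls have no effect
    base = tp["tp_base"]
    if base is not None:
        for slot, value in base.items():
            if slot in ("tp_base", "tp_dict"):
                continue  # assumed: these are never copied from the base
            if tp.get(slot) is None:
                tp[slot] = value  # empty slot: copy the base's value
    tp["tp_dict"] = {}  # the real function fills in methods (PEP 252)


# Made-up slot values, for illustration only.
toy_list = {"tp_name": "list", "tp_base": None, "tp_dict": {},
            "tp_basicsize": 40, "tp_hash": None, "tp_clear": "list_clear"}
toy_spam = {"tp_name": "spamlist", "tp_base": toy_list, "tp_dict": None,
            "tp_basicsize": None, "tp_hash": None, "tp_clear": None}

toy_init_dict(toy_spam)
snapshot = dict(toy_spam)
toy_init_dict(toy_spam)  # second call is a no-op
```
! 
!     Note that tp_name survives because it was set explicitly; had it
!     been left empty, the model would inherit "list", which mirrors
!     the warning about tp_name above.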
! 
!     If the subtype itself should be subtypable (usually desirable), it
!     should follow the same rules as given above for base types: have
!     a tp_construct that accepts a preallocated object and calls the
!     base type's tp_construct, and have a tp_clear that calls the base
!     type's tp_clear.
! 
! 
! Subtyping in Python
! 
!     The next step is to allow subtyping of selected built-in types
!     through a class statement in Python.  Limiting ourselves to single
!     inheritance for now, here is what happens for a simple class
!     statement:
! 
!     class C(B):
!         var1 = 1
!         def method1(self): pass
!         # etc.
! 
!     The body of the class statement is executed in a fresh environment
!     (basically, a new dictionary used as local namespace), and then C
!     is created.  The following explains how C is created.
! 
!     Assume B is a type object.  Since type objects are objects, and
!     every object has a type, B has a type.  B's type is accessible via
!     type(B) or B.__class__ (the latter notation is new for types; it
!     is introduced in PEP 252).  Let's say B's type is M (for
!     Metatype).  The class statement will create a new type, C.  Since
!     C will be a type object just like B, we view the creation of C as
!     an instantiation of the metatype, M.  The information that needs
!     to be provided for the creation of C is: its name (in this example
!     the string "C"); the list of base classes (a singleton tuple
!     containing B); and the results of executing the class body, in the
!     form of a dictionary (e.g. {"var1": 1, "method1": <function...>,
!     ...}).
! 
!     According to the Don Beaudry hook, the following call is made:
! 
!     C = M("C", (B,), dict)
! 
!     (where dict is the dictionary resulting from execution of the
!     class body).  In other words, the metatype (M) is called.  Note
!     that even though we currently require there to be exactly one base
!     class, we still pass in a (singleton) sequence of base classes;
!     this makes it possible to support multiple inheritance later (or
!     for types with a different metaclass!) without changing this
!     interface.
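! 
!     The call to the metatype can be tried directly.  The sketch below
!     runs on present-day Python 3, where calling a metatype with a
!     name, a tuple of bases and a namespace creates a class; the
!     internals proposed in this draft (such as tp_construct) may
!     differ from what that interpreter does.  The names namespace and
!     instance are illustrative only.
! 
```python
def method1(self):
    return "spam"


# What executing the class body would leave behind in its namespace.
namespace = {"var1": 1, "method1": method1}

B = list        # a built-in type used as the base
M = type(B)     # B's metatype

# Equivalent of:  class C(B): var1 = 1; def method1(self): ...
C = M("C", (B,), namespace)

instance = C([1, 2, 3])
```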
! 
!     Note that calling M requires that M itself has a type: the
!     meta-metatype.  In the current implementation, I have introduced a
!     new type object for this purpose, named turtle because of my
!     fondness for the phrase "turtles all the way down".  However I now
!     believe that it would be better if M were its own metatype, just
!     like before.  This can be accomplished by making M's tp_call slot
!     slightly more flexible.
! 
!     In any case, the work for creating C is done by M's tp_construct
!     slot.  It allocates space for an "extended" type structure, which
!     contains space for: the type object; the auxiliary structures
!     (as_sequence etc.); the string object containing the type name (to
!     ensure that this object isn't deallocated while the type object is
!     still referencing it); and some more auxiliary storage (to be
!     described later).  It initializes this storage to zeros except for
!     a few crucial slots (e.g. tp_name is set to point to the type
!     name) and then sets the tp_base slot to point to B.  Then
!     PyType_InitDict() is called to inherit B's slots.  Finally, C's
!     tp_dict slot is updated with the contents of the namespace
!     dictionary (the third argument to the call to M).