Notice: While Javascript is not essential for this website, your interaction with the content will be limited. Please turn Javascript on for the full experience.

PEP 3123 -- Making PyObject_HEAD conform to standard C

PEP: 3123
Title: Making PyObject_HEAD conform to standard C
Author: Martin von Löwis <martin at v.loewis.de>
Status: Final
Type: Standards Track
Created: 27-Apr-2007
Python-Version: 3.0
Post-History:

Abstract

Python currently relies on undefined C behavior, with its usage of PyObject_HEAD . This PEP proposes to change that into standard C.

Rationale

Standard C defines that an object must be accessed only through a pointer of its type, and that all other accesses are undefined behavior, with a few exceptions. In particular, the following code has undefined behavior:

struct FooObject{
  PyObject_HEAD
  int data;
};

PyObject *foo(struct FooObject*f){
 return (PyObject*)f;
}

int bar(){
 struct FooObject *f = malloc(sizeof(struct FooObject));
 struct PyObject *o = foo(f);
 f->ob_refcnt = 0;
 o->ob_refcnt = 1;
 return f->ob_refcnt;
}

The problem here is that the storage is both accessed as if it where struct PyObject , and as struct FooObject .

Historically, compilers did not have any problems with this code. However, modern compilers use that clause as an optimization opportunity, finding that f->ob_refcnt and o->ob_refcnt cannot possibly refer to the same memory, and that therefore the function should return 0, without having to fetch the value of ob_refcnt at all in the return statement. For GCC, Python now uses -fno-strict-aliasing to work around that problem; with other compilers, it may just see undefined behavior. Even with GCC, using -fno-strict-aliasing may pessimize the generated code unnecessarily.

Specification

Standard C has one specific exception to its aliasing rules precisely designed to support the case of Python: a value of a struct type may also be accessed through a pointer to the first field. E.g. if a struct starts with an int , the struct * may also be cast to an int * , allowing to write int values into the first field.

For Python, PyObject_HEAD and PyObject_VAR_HEAD will be changed to not list all fields anymore, but list a single field of type PyObject / PyVarObject :

typedef struct _object {
  _PyObject_HEAD_EXTRA
  Py_ssize_t ob_refcnt;
  struct _typeobject *ob_type;
} PyObject;

typedef struct {
  PyObject ob_base;
  Py_ssize_t ob_size;
} PyVarObject;

#define PyObject_HEAD        PyObject ob_base;
#define PyObject_VAR_HEAD    PyVarObject ob_base;

Types defined as fixed-size structure will then include PyObject as its first field, PyVarObject for variable-sized objects. E.g.:

typedef struct {
  PyObject ob_base;
  PyObject *start, *stop, *step;
} PySliceObject;

typedef struct {
  PyVarObject ob_base;
  PyObject **ob_item;
  Py_ssize_t allocated;
} PyListObject;

The above definitions of PyObject_HEAD are normative, so extension authors MAY either use the macro, or put the ob_base field explicitly into their structs.

As a convention, the base field SHOULD be called ob_base. However, all accesses to ob_refcnt and ob_type MUST cast the object pointer to PyObject* (unless the pointer is already known to have that type), and SHOULD use the respective accessor macros. To simplify access to ob_type, ob_refcnt, and ob_size, macros:

#define Py_TYPE(o)    (((PyObject*)(o))->ob_type)
#define Py_REFCNT(o)  (((PyObject*)(o))->ob_refcnt)
#define Py_SIZE(o)    (((PyVarObject*)(o))->ob_size)

are added. E.g. the code blocks

#define PyList_CheckExact(op) ((op)->ob_type == &PyList_Type)

return func->ob_type->tp_name;

needs to be changed to:

#define PyList_CheckExact(op) (Py_TYPE(op) == &PyList_Type)

return Py_TYPE(func)->tp_name;

For initialization of type objects, the current sequence

PyObject_HEAD_INIT(NULL)
0, /* ob_size */

becomes incorrect, and must be replaced with

PyVarObject_HEAD_INIT(NULL, 0)

Compatibility with Python 2.6

To support modules that compile with both Python 2.6 and Python 3.0, the Py_* macros are added to Python 2.6. The macros Py_INCREF and Py_DECREF will be changed to cast their argument to PyObject * , so that module authors can also explicitly declare the ob_base field in modules designed for Python 2.6.

Source: https://hg.python.org/peps/file/tip/pep-3123.txt