
PEP 670 -- Convert macros to functions in the Python C API

PEP:670
Title:Convert macros to functions in the Python C API
Author:Erlend Egeberg Aasland <erlend.aasland at protonmail.com>, Victor Stinner <vstinner at python.org>
Status:Draft
Type:Standards Track
Created:19-Oct-2021
Python-Version:3.11

Abstract

Convert macros to static inline functions or regular functions to avoid macro pitfalls.

Convert macros and static inline functions to regular functions to make them usable by Python extensions which cannot use macros or static inline functions, like extensions written in programming languages other than C or C++.

Remove the return value of macros that have one but should not, to help detect bugs in C extensions when the C API is misused.

Some function arguments are still cast to PyObject* to prevent emitting new compiler warnings.

Macros which can be used as l-value in an assignment are not converted to functions to avoid introducing incompatible changes.

Rationale

The use of macros may have unintended adverse effects that are hard to avoid, even for experienced C developers. Some issues have been known for years, while others have been discovered recently in Python. Working around macro pitfalls makes the macro code harder to read and to maintain.

Converting macros to functions has multiple advantages:

  • By design, functions don't have macro pitfalls.
  • Argument types and return types are well defined.
  • Debuggers and profilers can retrieve the name of inlined functions.
  • Debuggers can put breakpoints on inlined functions.
  • Variables have a well defined scope.
  • Code is usually easier to read and to maintain than similar macro code. Functions don't need the following workarounds for macro pitfalls:
    • Add parentheses around arguments.
    • Use line continuation characters if the function is written on multiple lines.
    • Add commas to execute multiple expressions.
    • Use do { ... } while (0) to write multiple statements.
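
As a rough illustration of these workarounds, the sketch below compares a multi-statement macro with an equivalent function. DEMO_SWAP() and demo_swap() are hypothetical names used only for this example; they are not part of the Python C API.

```c
/* Hypothetical macro: it needs parentheses around its arguments, line
   continuation characters, and do { ... } while (0) to behave like a
   single statement. */
#define DEMO_SWAP(a, b) \
    do {                \
        int tmp_ = (a); \
        (a) = (b);      \
        (b) = tmp_;     \
    } while (0)

/* Hypothetical function equivalent: plain statements, typed arguments,
   and a local variable with a well defined scope. */
static inline void
demo_swap(int *a, int *b)
{
    int tmp = *a;
    *a = *b;
    *b = tmp;
}
```

The macro is called as DEMO_SWAP(x, y), the function as demo_swap(&x, &y).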

Converting macros and static inline functions to regular functions makes these regular functions accessible to projects which use Python but cannot use macros and static inline functions.

Macro Pitfalls

The GCC documentation lists several common macro pitfalls:

  • Misnesting
  • Operator precedence problems
  • Swallowing the semicolon
  • Duplication of side effects
  • Self-referential macros
  • Argument prescan
  • Newlines in arguments
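
A minimal sketch of one of these pitfalls, the operator precedence problem, may help. DEMO_DOUBLE() and the demo_*() functions are hypothetical names chosen for this example only.

```c
/* Hypothetical macro with missing parentheses around its expansion. */
#define DEMO_DOUBLE(x) x + x

/* Hypothetical function equivalent. */
static inline int
demo_double(int x)
{
    return x + x;
}

/* DEMO_DOUBLE(3) * 2 expands to 3 + 3 * 2, which is 9, not 12. */
static int
demo_macro_result(void)
{
    return DEMO_DOUBLE(3) * 2;
}

/* The function call is evaluated first: (3 + 3) * 2 is 12. */
static int
demo_function_result(void)
{
    return demo_double(3) * 2;
}
```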

Performance and inlining

Static inline functions are a feature added in the C99 standard. Modern C compilers have efficient heuristics to decide if a function should be inlined or not.

When a C compiler decides not to inline a function, there is likely a good reason. For example, inlining could require reusing a register, so the register value would have to be saved to and restored from the stack, which increases stack memory usage or makes the code less efficient.

Debug build

Benchmarks must not be run on a Python debug build, only on a release build. Moreover, using LTO and PGO optimizations is recommended for best performance and reliable benchmarks. PGO helps the compiler decide whether a function should be inlined or not.

./configure --with-pydebug uses the -Og compiler option if it's supported by the compiler (GCC and LLVM Clang support it): optimize debugging experience. Otherwise, the -O0 compiler option is used: disable most optimizations.

With GCC 11, gcc -Og can inline static inline functions, whereas gcc -O0 does not inline static inline functions. Examples:

  • Call Py_INCREF() in PyBool_FromLong():
    • gcc -Og: inlined
    • gcc -O0: not inlined, call Py_INCREF() function
  • Call _PyErr_Occurred() in _Py_CheckFunctionResult():
    • gcc -Og: inlined
    • gcc -O0: not inlined, call _PyErr_Occurred() function

On Windows, when Python is built in debug mode by Visual Studio, static inline functions are not inlined.

Force inlining

The Py_ALWAYS_INLINE macro can be used to force inlining. This macro uses __attribute__((always_inline)) with GCC and Clang, and __forceinline with MSC.
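
A minimal sketch of such a compatibility macro is shown below, based only on the description above. DEMO_ALWAYS_INLINE and demo_add() are hypothetical names, and the real Py_ALWAYS_INLINE definition may differ (for example, it may handle more compilers or build modes).

```c
/* Hypothetical macro modeled on the description above. */
#if defined(__GNUC__) || defined(__clang__)
#  define DEMO_ALWAYS_INLINE __attribute__((always_inline))
#elif defined(_MSC_VER)
#  define DEMO_ALWAYS_INLINE __forceinline
#else
#  define DEMO_ALWAYS_INLINE  /* unknown compiler: no attribute */
#endif

/* Ask the compiler to inline this function at every call site. */
static inline DEMO_ALWAYS_INLINE int
demo_add(int a, int b)
{
    return a + b;
}
```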

Previous attempts to use Py_ALWAYS_INLINE didn't show any benefit, and were abandoned. See for example: bpo-45094: "Consider using __forceinline and __attribute__((always_inline)) on static inline functions (Py_INCREF, Py_TYPE) for debug build".

When the Py_INCREF() macro was converted to a static inline function in 2018 (commit), it was decided not to force inlining. The machine code was analyzed with multiple C compilers and compiler options: Py_INCREF() was always inlined without having to force inlining. The only case where it was not inlined was the debug build. See the discussion in bpo-35059: "Convert Py_INCREF() and PyObject_INIT() to inlined functions".

Disable inlining

Conversely, the Py_NO_INLINE macro can be used to disable inlining. It can be used to reduce stack memory usage, or to prevent inlining in LTO+PGO builds, which generally inline code more aggressively: see bpo-33720. The Py_NO_INLINE macro uses __attribute__ ((noinline)) with GCC and Clang, and __declspec(noinline) with MSC.
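
The sketch below shows how such a macro could be defined and used, based only on the description above. DEMO_NO_INLINE and demo_format_length() are hypothetical names, and the real Py_NO_INLINE definition may differ.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical macro modeled on the description above. */
#if defined(__GNUC__) || defined(__clang__)
#  define DEMO_NO_INLINE __attribute__((noinline))
#elif defined(_MSC_VER)
#  define DEMO_NO_INLINE __declspec(noinline)
#else
#  define DEMO_NO_INLINE  /* unknown compiler: no attribute */
#endif

/* This function uses a large local buffer. Keeping it out of line may
   avoid growing the stack frame of every caller by the buffer size. */
static DEMO_NO_INLINE size_t
demo_format_length(int value)
{
    char buffer[4096];
    snprintf(buffer, sizeof(buffer), "value=%d", value);
    return strlen(buffer);
}
```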

Specification

Convert macros to static inline functions

Most macros should be converted to static inline functions to prevent macro pitfalls.

The following macros should not be converted:

  • Empty macros. Example: #define Py_HAVE_CONDVAR.
  • Macros only defining a number, even if a constant with a well defined type would be better. Example: #define METH_VARARGS 0x0001.
  • Compatibility layer for different C compilers, C language extensions, or recent C features. Example: #define Py_ALWAYS_INLINE __attribute__((always_inline)).
  • Macros that need C preprocessor features, like stringification and concatenation. Example: Py_STRINGIFY().
  • Macros which can be used as l-value in an assignment. This change is an incompatible change and is out of the scope of this PEP. Example: PyBytes_AS_STRING().

Convert static inline functions to regular functions

The performance impact of converting static inline functions to regular functions should be measured with benchmarks. If there is a significant slowdown, there should be a good reason to do the conversion. One reason can be hiding implementation details.

To avoid any risk of performance slowdown on Python built without LTO, it is possible to keep a private static inline function in the internal C API and use it in Python, but expose a regular function in the public C API.
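
This pattern can be sketched in a single file as follows. The names demo_is_even_inline() and Demo_IsEven() are hypothetical; in a real project the two parts would live in separate internal and public headers.

```c
/* "Internal API" part: hypothetical private static inline function.
   Code that includes the internal header can have it inlined even
   without LTO. */
static inline int
demo_is_even_inline(long value)
{
    return (value % 2) == 0;
}

/* "Public API" part: hypothetical regular function. Only this
   declaration would be visible to extensions. */
int Demo_IsEven(long value);

/* The public function reuses the private implementation, so the two
   cannot diverge. */
int
Demo_IsEven(long value)
{
    return demo_is_even_inline(value);
}
```

With this layout, the implementation detail stays hidden behind a normal symbol, which is also callable from languages that cannot use static inline functions.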

Using static inline functions in the internal C API is fine: the internal C API exposes implementation details by design and should not be used outside Python.

Cast to PyObject*

When a macro is converted to a function and the macro casts its arguments to PyObject*, the new function comes with a new macro which casts the arguments to PyObject* to prevent emitting new compiler warnings. This implies that a converted function will accept pointers to structures inheriting from PyObject (e.g. PyTupleObject).

For example, the Py_TYPE(obj) macro casts its obj argument to PyObject*:

#define _PyObject_CAST_CONST(op) ((const PyObject*)(op))

static inline PyTypeObject* _Py_TYPE(const PyObject *ob) {
    return ob->ob_type;
}
#define Py_TYPE(ob) _Py_TYPE(_PyObject_CAST_CONST(ob))

The undocumented private _Py_TYPE() function must not be called directly. Only the documented public Py_TYPE() macro must be used.

Later, the cast can be removed on a case by case basis, but that is out of scope for this PEP.
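
The same pattern can be tried outside of Python with stand-in types. In the sketch below, DemoObject, DemoTupleObject, demo_type_impl() and DEMO_TYPE() are hypothetical names that only mimic the shape of PyObject, PyTupleObject, _Py_TYPE() and Py_TYPE().

```c
/* Hypothetical stand-in for PyObject. */
typedef struct {
    int type_id;
} DemoObject;

/* Hypothetical structure "inheriting" from DemoObject by embedding it
   as its first member. */
typedef struct {
    DemoObject base;
    int length;
} DemoTupleObject;

#define DEMO_OBJECT_CAST_CONST(op) ((const DemoObject *)(op))

/* Private function with a well defined argument type. */
static inline int
demo_type_impl(const DemoObject *ob)
{
    return ob->type_id;
}

/* Public macro: it casts its argument, so passing a DemoTupleObject*
   does not emit an incompatible pointer type warning. */
#define DEMO_TYPE(ob) demo_type_impl(DEMO_OBJECT_CAST_CONST(ob))
```

Calling demo_type_impl() directly with a DemoTupleObject* would typically trigger a compiler warning, which is what the wrapper macro avoids.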

Remove the return value

When a macro is implemented as an expression, it has an implicit return value. This return value can be misused in third party C extensions. See bpo-30459 regarding the misuse of the PyList_SET_ITEM() and PyCell_SET() macros.

Such issues are hard to catch while reviewing macro code. Removing the return value helps detect bugs in C extensions when the C API is misused.

The issue has already been fixed in public C API macros by bpo-30459 in Python 3.10: a (void) cast was added to the affected macros. Example of the PyTuple_SET_ITEM() macro:

#define PyTuple_SET_ITEM(op, i, v) ((void)(_PyTuple_CAST(op)->ob_item[i] = v))

Example of macros currently using a (void) cast to have no return value:

  • PyCell_SET()
  • PyList_SET_ITEM()
  • PyTuple_SET_ITEM()
  • Py_BUILD_ASSERT()
  • _PyGCHead_SET_FINALIZED()
  • _PyGCHead_SET_NEXT()
  • _PyObject_ASSERT_FROM()
  • _Py_atomic_signal_fence()
  • _Py_atomic_store_64bit()
  • asdl_seq_SET()
  • asdl_seq_SET_UNTYPED()
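
The effect of the (void) cast can be sketched with a stand-in type. DemoList, DEMO_SET_ITEM_OLD(), DEMO_SET_ITEM() and demo_misuse() are hypothetical names used only for illustration.

```c
typedef struct {
    int items[4];
} DemoList;

/* Expression macro: the assignment has a value, so the macro can be
   misused as if it returned something. */
#define DEMO_SET_ITEM_OLD(list, i, v) ((list)->items[i] = (v))

/* Same macro with a (void) cast: the result can no longer be used. */
#define DEMO_SET_ITEM(list, i, v) ((void)((list)->items[i] = (v)))

static int
demo_misuse(DemoList *list)
{
    /* Compiles, although a "return value" was never intended. */
    int result = DEMO_SET_ITEM_OLD(list, 0, 42);

    /* The following line would be rejected by the compiler, since a
       void expression has no value:
           int broken = DEMO_SET_ITEM(list, 1, 43);            */
    DEMO_SET_ITEM(list, 1, 43);
    return result;
}
```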

Backwards Compatibility

Removing the return value of macros is an incompatible API change made on purpose: see the Remove the return value section.

Some function arguments are still cast to PyObject* to prevent emitting new compiler warnings.

Macros which can be used as l-value in an assignment are not modified by this PEP to avoid incompatible changes.

Rejected Ideas

Keep macros, but fix some macro issues

Converting macros to functions is not needed to remove the return value: adding a (void) cast is enough. For example, the PyList_SET_ITEM() macro was already fixed like that.

Macros are always "inlined" with any C compiler.

The duplication of side effects can be worked around in the caller of the macro.

People using macros should be considered "consenting adults". People who feel unsafe with macros should simply not use them.

These ideas are rejected because macros _are_ error prone, and it is too easy to miss a macro pitfall when writing and reviewing macro code. Moreover, macros are harder to read and maintain than functions.

Examples of Macro Pitfalls

Duplication of side effects

Macros:

#define PySet_Check(ob) \
    (Py_IS_TYPE(ob, &PySet_Type) \
     || PyType_IsSubtype(Py_TYPE(ob), &PySet_Type))

#define Py_IS_NAN(X) ((X) != (X))

If the ob or the X argument has a side effect, the side effect is duplicated: it is executed twice by PySet_Check() and Py_IS_NAN().

For example, the pos++ argument in the PyUnicode_WRITE(kind, data, pos++, ch) code has a side effect. This code is safe because the PyUnicode_WRITE() macro only uses its 3rd argument once and so does not duplicate pos++ side effect.
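
The duplication can be observed with a small self-contained sketch. DEMO_IS_NAN() has the same shape as Py_IS_NAN(); all demo_*() names are hypothetical helpers written for this example.

```c
/* Same shape as the Py_IS_NAN() macro: X appears twice. */
#define DEMO_IS_NAN(X) ((X) != (X))

/* Function equivalent: the argument is evaluated once by the caller. */
static inline int
demo_is_nan(double x)
{
    return x != x;
}

static int demo_calls = 0;

/* Helper with a side effect: it counts how often it is called. */
static double
demo_next_value(void)
{
    demo_calls++;
    return 1.5;
}

/* Number of demo_next_value() calls caused by one macro use. */
static int
demo_calls_with_macro(void)
{
    demo_calls = 0;
    (void)DEMO_IS_NAN(demo_next_value());
    return demo_calls;
}

/* Number of demo_next_value() calls caused by one function call. */
static int
demo_calls_with_function(void)
{
    demo_calls = 0;
    (void)demo_is_nan(demo_next_value());
    return demo_calls;
}
```

Here demo_calls_with_macro() returns 2, because the macro expands its argument twice, while demo_calls_with_function() returns 1.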

Misnesting

Example from bpo-43181: Python macros don't shield arguments. The PyObject_TypeCheck() macro before it was fixed:

#define PyObject_TypeCheck(ob, tp) \
    (Py_IS_TYPE(ob, tp) || PyType_IsSubtype(Py_TYPE(ob), (tp)))

C++ usage example:

PyObject_TypeCheck(ob, U(f<a,b>(c)))

The preprocessor first expands it:

(Py_IS_TYPE(ob, f<a,b>(c)) || ...)

C++ "<" and ">" characters are not treated as brackets by the preprocessor, so the Py_IS_TYPE() macro is invoked with 3 arguments:

  • ob
  • f<a
  • b>(c)

The compilation fails with an error on Py_IS_TYPE() which only takes 2 arguments.

The bug is that the ob and tp arguments of PyObject_TypeCheck() must be put between parentheses: replace Py_IS_TYPE(ob, tp) with Py_IS_TYPE((ob), (tp)). In regular C code, these parentheses are redundant, can be seen as a bug, and so are often forgotten when writing macros.

To avoid Macro Pitfalls, the PyObject_TypeCheck() macro has been converted to a static inline function: commit.

Examples of hard to read macros

PyObject_INIT()

Example showing the usage of commas in a macro which has a return value.

Python 3.7 macro:

#define PyObject_INIT(op, typeobj) \
    ( Py_TYPE(op) = (typeobj), _Py_NewReference((PyObject *)(op)), (op) )

Python 3.8 function (simplified code):

static inline PyObject*
_PyObject_INIT(PyObject *op, PyTypeObject *typeobj)
{
    Py_TYPE(op) = typeobj;
    _Py_NewReference(op);
    return op;
}

#define PyObject_INIT(op, typeobj) \
    _PyObject_INIT(_PyObject_CAST(op), (typeobj))

  • The function doesn't need the line continuation character "\".
  • It has an explicit "return op;" rather than the surprising ", (op)" syntax at the end of the macro.
  • It uses short statements on multiple lines, rather than being written as a single long line.
  • Inside the function, the op argument has the well defined type PyObject* and so doesn't need casts like (PyObject *)(op).
  • Arguments don't need to be put inside parentheses: use typeobj, rather than (typeobj).

_Py_NewReference()

Example showing the usage of an #ifdef inside a macro.

Python 3.7 macro (simplified code):

#ifdef COUNT_ALLOCS
#  define _Py_INC_TPALLOCS(OP) inc_count(Py_TYPE(OP))
#  define _Py_COUNT_ALLOCS_COMMA  ,
#else
#  define _Py_INC_TPALLOCS(OP)
#  define _Py_COUNT_ALLOCS_COMMA
#endif /* COUNT_ALLOCS */

#define _Py_NewReference(op) (                   \
    _Py_INC_TPALLOCS(op) _Py_COUNT_ALLOCS_COMMA  \
    Py_REFCNT(op) = 1)

Python 3.8 function (simplified code):

static inline void _Py_NewReference(PyObject *op)
{
    _Py_INC_TPALLOCS(op);
    Py_REFCNT(op) = 1;
}

PyUnicode_READ_CHAR()

This macro reuses arguments, and possibly calls PyUnicode_KIND multiple times:

#define PyUnicode_READ_CHAR(unicode, index) \
(assert(PyUnicode_Check(unicode)),          \
 assert(PyUnicode_IS_READY(unicode)),       \
 (Py_UCS4)                                  \
    (PyUnicode_KIND((unicode)) == PyUnicode_1BYTE_KIND ? \
        ((const Py_UCS1 *)(PyUnicode_DATA((unicode))))[(index)] : \
        (PyUnicode_KIND((unicode)) == PyUnicode_2BYTE_KIND ? \
            ((const Py_UCS2 *)(PyUnicode_DATA((unicode))))[(index)] : \
            ((const Py_UCS4 *)(PyUnicode_DATA((unicode))))[(index)] \
        ) \
    ))

Possible implementation as a static inline function:

static inline Py_UCS4
PyUnicode_READ_CHAR(PyObject *unicode, Py_ssize_t index)
{
    assert(PyUnicode_Check(unicode));
    assert(PyUnicode_IS_READY(unicode));

    switch (PyUnicode_KIND(unicode)) {
    case PyUnicode_1BYTE_KIND:
        return (Py_UCS4)((const Py_UCS1 *)(PyUnicode_DATA(unicode)))[index];
    case PyUnicode_2BYTE_KIND:
        return (Py_UCS4)((const Py_UCS2 *)(PyUnicode_DATA(unicode)))[index];
    case PyUnicode_4BYTE_KIND:
    default:
        return (Py_UCS4)((const Py_UCS4 *)(PyUnicode_DATA(unicode)))[index];
    }
}

Macros converted to functions since Python 3.8

The following macros were converted to functions between Python 3.8 and Python 3.11. These conversions did not impact Python performance and did not break backward compatibility, even though some of the converted macros, like Py_INCREF(), are very commonly used by C extensions.

Macros converted to static inline functions

Python 3.8:

  • Py_DECREF()
  • Py_INCREF()
  • Py_XDECREF()
  • Py_XINCREF()
  • PyObject_INIT()
  • PyObject_INIT_VAR()
  • _PyObject_GC_UNTRACK()
  • _Py_Dealloc()

Macros converted to regular functions

Python 3.9:

  • PyIndex_Check()
  • PyObject_CheckBuffer()
  • PyObject_GET_WEAKREFS_LISTPTR()
  • PyObject_IS_GC()
  • PyObject_NEW(): alias to PyObject_New()
  • PyObject_NEW_VAR(): alias to PyObject_NewVar()

To avoid any risk of performance slowdown on Python built without LTO, private static inline functions have been added to the internal C API:

  • _PyIndex_Check()
  • _PyObject_IS_GC()
  • _PyType_HasFeature()
  • _PyType_IS_GC()

Static inline functions converted to regular functions

Python 3.11:

  • PyObject_CallOneArg()
  • PyObject_Vectorcall()
  • PyVectorcall_Function()
  • _PyObject_FastCall()

To avoid any risk of performance slowdown on Python built without LTO, a private static inline function has been added to the internal C API:

  • _PyVectorcall_FunctionInline()

Incompatible changes

While other converted macros didn't break the backward compatibility, there is an exception.

The 3 macros Py_REFCNT(), Py_TYPE() and Py_SIZE() have been converted to static inline functions in Python 3.10 and 3.11 to disallow using them as l-value in assignment. It is an incompatible change made on purpose: see bpo-39573 for the rationale.

This PEP does not convert macros which can be used as l-value to avoid introducing incompatible changes.
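
Why a conversion breaks l-value usage can be sketched with a stand-in type. DemoObject, DEMO_REFCNT_MACRO(), demo_refcnt(), demo_set_refcnt() and demo_usage() are hypothetical names; the setter is only similar in spirit to functions such as Py_SET_REFCNT().

```c
typedef struct {
    long refcnt;
} DemoObject;

/* Macro expanding to a member access: usable as an l-value. */
#define DEMO_REFCNT_MACRO(ob) ((ob)->refcnt)

/* Function returning the value: its result is not an l-value. */
static inline long
demo_refcnt(const DemoObject *ob)
{
    return ob->refcnt;
}

/* With a function, assignment needs a dedicated setter. */
static inline void
demo_set_refcnt(DemoObject *ob, long refcnt)
{
    ob->refcnt = refcnt;
}

static long
demo_usage(void)
{
    DemoObject ob = { 0 };
    DEMO_REFCNT_MACRO(&ob) = 1;   /* accepted: assigns the member */
    /* demo_refcnt(&ob) = 2;         rejected: not an l-value */
    demo_set_refcnt(&ob, 2);
    return demo_refcnt(&ob);
}
```

Existing code written like the DEMO_REFCNT_MACRO(&ob) = 1 line stops compiling once the macro becomes a function, which is why such conversions are incompatible changes.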

Benchmark comparing macros and static inline functions

Benchmark run on Fedora 35 (Linux) with GCC 11 on a laptop with 8 logical CPUs (4 physical CPU cores).

PR 29728 replaces the following existing static inline functions with macros:

  • PyObject_TypeCheck()
  • PyType_Check(), PyType_CheckExact()
  • PyType_HasFeature()
  • PyVectorcall_NARGS()
  • Py_DECREF(), Py_XDECREF()
  • Py_INCREF(), Py_XINCREF()
  • Py_IS_TYPE()
  • Py_NewRef()
  • Py_REFCNT(), Py_TYPE(), Py_SIZE()

When static inline functions are inlined: Release build

Benchmark of the ./python -m test -j5 command on Python built in release mode with gcc -O3, LTO and PGO:

  • Macros (PR 29728): 361 sec ± 1 sec
  • Static inline functions (reference): 361 sec ± 1 sec

There is no significant performance difference between macros and static inline functions when static inline functions are inlined.

When static inline functions are not inlined: Debug build and -O0

Benchmark of the ./python -m test -j10 command on Python built in debug mode with gcc -O0 (explicitly disable compiler optimizations):

  • Macros (PR 29728): 345 sec ± 5 sec
  • Static inline functions (reference): 360 sec ± 6 sec

Replacing macros with static inline functions makes Python 1.04x slower (360 sec / 345 sec ≈ 1.04) when the compiler does not inline static inline functions.

Post History

References

  • bpo-45490: [meta][C API] Avoid C macro pitfalls and usage of static inline functions (October 2021).
  • What to do with unsafe macros (March 2021).
  • bpo-43502: [C-API] Convert obvious unsafe macros to static inline functions (March 2021).
Source: https://github.com/python/peps/blob/master/pep-0670.rst