[Python-Dev] Python initialization and embedded Python

Victor Stinner victor.stinner at gmail.com
Fri Nov 17 19:01:47 EST 2017


Hi,

The CPython internals evolved during the Python 3.7 cycle. I would like
to know whether we broke the C API.

Nick Coghlan and Eric Snow are working on cleaning up the Python
initialization with the ongoing PEP 432:
https://www.python.org/dev/peps/pep-0432/

Many global variables used by the "Python runtime" were moved into a
new single "_PyRuntime" variable (a big structure made of
sub-structures). See Include/internal/pystate.h.

A side effect of moving variables from random files into header files
is that it's no longer possible to fully initialize _PyRuntime at
"compilation time". For example, previously, it was possible to refer
to local C functions (functions declared with "static", so only
visible in the current file). Now a new "initialization function"
must be called.

In short, it means that using the "Python runtime" before it's
initialized by _PyRuntime_Initialize() is now likely to crash. For
example, calling PyMem_RawMalloc() before calling
_PyRuntime_Initialize() now calls through a NULL function pointer,
and so immediately crashes with a segmentation fault.
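
To illustrate the failure mode, here is a minimal sketch in plain C.
It is not CPython code: demo_runtime, demo_runtime_initialize(),
demo_raw_malloc() and demo_raw_free() are hypothetical names, and the
sketch assumes the state structure cannot name the "static" allocator
functions in a compile-time initializer, so its function pointers stay
NULL until an explicit initialization call. The wrapper checks for
NULL where the real runtime would crash.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a runtime-wide state structure. */
typedef struct {
    void *(*raw_malloc)(size_t size);   /* NULL until initialized */
    void (*raw_free)(void *ptr);        /* NULL until initialized */
} demo_runtime_state;

/* Objects with static storage duration are zero-initialized, so both
   function pointers start out as NULL. */
demo_runtime_state demo_runtime;

/* File-local allocator functions. Assumption for this sketch: the
   structure above is declared elsewhere and cannot refer to them in a
   compile-time initializer. */
static void *demo_default_malloc(size_t size)
{
    return malloc(size ? size : 1);
}

static void demo_default_free(void *ptr)
{
    free(ptr);
}

/* Explicit initialization step: fills in the function pointers. */
void demo_runtime_initialize(void)
{
    demo_runtime.raw_malloc = demo_default_malloc;
    demo_runtime.raw_free = demo_default_free;
}

/* Allocation wrapper. Calling demo_runtime.raw_malloc(size) without
   the check below would jump through a NULL function pointer when the
   runtime has not been initialized yet. */
void *demo_raw_malloc(size_t size)
{
    if (demo_runtime.raw_malloc == NULL) {
        return NULL;   /* report failure instead of crashing */
    }
    return demo_runtime.raw_malloc(size);
}

void demo_raw_free(void *ptr)
{
    /* Freeing before initialization is a programming error here. */
    assert(demo_runtime.raw_free != NULL);
    demo_runtime.raw_free(ptr);
}
```

With these definitions, demo_raw_malloc(16) returns NULL until
demo_runtime_initialize() has run, and a usable block afterwards.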

I'm writing this email to ask whether this change is an issue for
embedded Python and the Python C API. Is it still possible to call
"all" functions of the C API before calling Py_Initialize()?

I was bitten by this bug while reworking the Py_Main() function to
split it into subfunctions and clean up the code that handles the
command line arguments and environment variables. I fixed the issue in
main() by calling _PyRuntime_Initialize() as soon as possible: it's now
the first instruction of main() :-) (See Programs/python.c)

To give a more concrete example: Py_DecodeLocale() is the recommended
function to decode bytes from the operating system, but this function
calls PyMem_RawMalloc(), which crashes if it is called before
_PyRuntime_Initialize(). Is Py_DecodeLocale() used to initialize
Python?

For example, "void Py_SetProgramName(wchar_t *);" expects a text
string, whereas main() gives argv as bytes. Calling
Py_SetProgramName() from argv requires decoding bytes... So you use
Py_DecodeLocale()...
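
To show why such a decoding step needs a working allocator, here is a
rough sketch using only the standard C library. decode_locale_sketch()
is a hypothetical helper, not part of the Python C API: it uses
malloc() where Py_DecodeLocale() uses PyMem_RawMalloc(), and it simply
fails on undecodable bytes, whereas Py_DecodeLocale() is documented to
use the surrogateescape error handler.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: decode a NUL-terminated byte string from the
   current C locale into a newly allocated wide string.
   Returns NULL on allocation or decoding failure; the caller releases
   the result with free(). */
wchar_t *decode_locale_sketch(const char *bytes)
{
    assert(bytes != NULL);

    /* Each wide character consumes at least one byte, so
       strlen(bytes) + 1 wide characters are always enough. */
    size_t max_chars = strlen(bytes) + 1;

    /* The result must live on the heap: this is the allocation that
       makes the decoder depend on the memory allocator. */
    wchar_t *text = malloc(max_chars * sizeof(wchar_t));
    if (text == NULL) {
        return NULL;
    }

    size_t converted = mbstowcs(text, bytes, max_chars);
    if (converted == (size_t)-1) {
        free(text);    /* invalid multibyte sequence */
        return NULL;
    }
    return text;
}
```

A caller would pass argv[0] to the helper, hand the resulting wchar_t
string to a function expecting text, and free it once it is no longer
needed.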

Should we do something in Py_DecodeLocale()? Maybe crash if
_PyRuntime_Initialize() wasn't called yet?

Maybe the minimum change is to expose _PyRuntime_Initialize() in the
public C API?

Victor

