[issue46070] [subinterpreters] crash when importing _sre in subinterpreters in parallel (Python 3.9 regression)

STINNER Victor report at bugs.python.org
Thu Jan 13 10:32:44 EST 2022


STINNER Victor <vstinner at python.org> added the comment:

When the crash occurs, the _sre.compile function is not destroyed in the interpreter which created the function.



The crash is related to _sre.compile method. This method is created in PyInit__sre() called by "import _sre".

On Windows, the _sre module is not imported at startup. So it's imported first in a subinterpreter.

In Python 3.9, the _sre module doesn't use the multiphase initialization API and PyModuleDef.m_size = -1. When the module is imported, _PyImport_FixupExtensionObject() copies the module dictionary into PyModuleDef.m_copy.

In Py_Finalize() and Py_EndInterpreter(), _PyImport_Cleanup() does two things:

* (1) set _sre.__dict__['compile'] to None -> kill the first reference to the function
* (2) call _PyInterpreterState_ClearModules() which does Py_CLEAR(def->m_base.m_copy), clear the cached copy of the _sre module dict -> kill the second reference

I modified Python to add an "ob_interp" member to PyObject to log in which interpreter an object is created. I also modified meth_dealloc() to log when _sre.compile function is deleted.

Extract of the reformatted output to see what's going on:
---
(...)

(1)
fixup: COPY _sre ModuleDef copy: def=00007FFF19209810 interp=000001EC1846F2A0

    (2)
    import: UPDATE(_sre ModuleDef copy): interp=000001EC184AB790

(3)
_PyImport_Cleanup: interp=000001EC1846F2A0
_PyInterpreterState_ClearModules: PY_CLEAR _sre ModuleDef m_copy: def=00007FFF19209810 interp=000001EC1846F2A0

    (4)
    _PyImport_Cleanup: interp=000001EC184AB790
    meth_dealloc(compile): m->ob_interp=000001EC1846F2A0, interp=000001EC184AB790

    Windows fatal exception: access violation
    (...)
---

Steps:

* (1)

  * interpreter #1 (000001EC1846F2A0) creates the _sre.compile function
  * interpreter #1 (000001EC1846F2A0) copies _sre module dict into PyModuleDef.m_copy
  * at this point, _sre.compile should have 2 references

* (2)

  * interpreter #2 (000001EC184AB790) imports _sre: it creates a new module object and copies the function from PyModuleDef.m_copy
  * at this point, _sre.compile should have 3 references

* (3)

  * interpreter #1 exit: Py_EndInterpreter() calls _PyImport_Cleanup()
  * at this point, _sre.compile should have 1 reference

* (4)

  * interpreter #2 exit: Py_EndInterpreter() calls _PyImport_Cleanup()
  * the last reference to _sre.compile is deleted: 0 reference
  * meth_dealloc() is called

The first problem is that the function was created in the interpreter #1 but deleted in the interpreter #2.

The second problem is that the function is tracked by the GC and it is part of the GC list of the interpreter #1. When the interpreter #2 destroys the function, the GC list of interpreter #1 is already freed: PyGC_Head contains dangling pointers.

----------

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue46070>
_______________________________________


More information about the Python-bugs-list mailing list