[Python-checkins] [3.12] gh-104372: Drop the GIL around the vfork() call. (GH-104782) (#104942)

gpshead webhook-mailer at python.org
Thu May 25 16:44:48 EDT 2023


https://github.com/python/cpython/commit/5c2971b78f7e2bdf8ed6073c7470cdfe2a4b7a69
commit: 5c2971b78f7e2bdf8ed6073c7470cdfe2a4b7a69
branch: 3.12
author: Miss Islington (bot) <31488909+miss-islington at users.noreply.github.com>
committer: gpshead <greg at krypto.org>
date: 2023-05-25T20:44:29Z
summary:

[3.12] gh-104372: Drop the GIL around the vfork() call. (GH-104782) (#104942)

gh-104372: Drop the GIL around the vfork() call. (GH-104782)

On Linux where the `subprocess` module can use the `vfork` syscall for
faster spawning, prevent the parent process from blocking other threads
by dropping the GIL while it waits for the vfork'ed child process `exec`
outcome.  This prevents spawning a binary from a slow filesystem from
blocking the rest of the application.

Fixes GH-104372.
(cherry picked from commit d08679212d9af52dd074cd4a6abb440edb944c9c)

Co-authored-by: Gregory P. Smith <gps at python.org>

files:
A Misc/NEWS.d/next/Library/2023-05-22-18-39-53.gh-issue-104372.7tDRaK.rst
M Doc/library/subprocess.rst
M Modules/_posixsubprocess.c

diff --git a/Doc/library/subprocess.rst b/Doc/library/subprocess.rst
index 53dfbf827260..738e611c05ad 100644
--- a/Doc/library/subprocess.rst
+++ b/Doc/library/subprocess.rst
@@ -57,10 +57,13 @@ underlying :class:`Popen` interface can be used directly.
    and combine both streams into one, use ``stdout=PIPE`` and ``stderr=STDOUT``
    instead of *capture_output*.
 
-   The *timeout* argument is passed to :meth:`Popen.communicate`. If the timeout
-   expires, the child process will be killed and waited for.  The
-   :exc:`TimeoutExpired` exception will be re-raised after the child process
-   has terminated.
+   A *timeout* may be specified in seconds, it is internally passed on to
+   :meth:`Popen.communicate`. If the timeout expires, the child process will be
+   killed and waited for. The :exc:`TimeoutExpired` exception will be
+   re-raised after the child process has terminated. The initial process
+   creation itself cannot be interrupted on many platform APIs so you are not
+   guaranteed to see a timeout exception until at least after however long
+   process creation takes.
 
    The *input* argument is passed to :meth:`Popen.communicate` and thus to the
    subprocess's stdin.  If used it must be a byte sequence, or a string if
@@ -734,7 +737,7 @@ arguments.
 code.
 
 All of the functions and methods that accept a *timeout* parameter, such as
-:func:`call` and :meth:`Popen.communicate` will raise :exc:`TimeoutExpired` if
+:func:`run` and :meth:`Popen.communicate` will raise :exc:`TimeoutExpired` if
 the timeout expires before the process exits.
 
 Exceptions defined in this module all inherit from :exc:`SubprocessError`.
diff --git a/Misc/NEWS.d/next/Library/2023-05-22-18-39-53.gh-issue-104372.7tDRaK.rst b/Misc/NEWS.d/next/Library/2023-05-22-18-39-53.gh-issue-104372.7tDRaK.rst
new file mode 100644
index 000000000000..ea13ec85543c
--- /dev/null
+++ b/Misc/NEWS.d/next/Library/2023-05-22-18-39-53.gh-issue-104372.7tDRaK.rst
@@ -0,0 +1,5 @@
+On Linux where :mod:`subprocess` can use the ``vfork()`` syscall for faster
+spawning, prevent the parent process from blocking other threads by dropping
+the GIL while it waits for the vfork'ed child process ``exec()`` outcome.
+This prevents spawning a binary from a slow filesystem from blocking the
+rest of the application.
diff --git a/Modules/_posixsubprocess.c b/Modules/_posixsubprocess.c
index 63403795569a..36470804c6a1 100644
--- a/Modules/_posixsubprocess.c
+++ b/Modules/_posixsubprocess.c
@@ -559,7 +559,7 @@ reset_signal_handlers(const sigset_t *child_sigmask)
  * required by POSIX but not supported natively on Linux. Another reason to
  * avoid this family of functions is that sharing an address space between
  * processes running with different privileges is inherently insecure.
- * See bpo-35823 for further discussion and references.
+ * See https://bugs.python.org/issue35823 for discussion and references.
  *
  * In some C libraries, setrlimit() has the same thread list/signalling
  * behavior since resource limits were per-thread attributes before
@@ -798,6 +798,7 @@ do_fork_exec(char *const exec_array[],
     pid_t pid;
 
 #ifdef VFORK_USABLE
+    PyThreadState *vfork_tstate_save;
     if (child_sigmask) {
         /* These are checked by our caller; verify them in debug builds. */
         assert(uid == (uid_t)-1);
@@ -805,7 +806,22 @@ do_fork_exec(char *const exec_array[],
         assert(extra_group_size < 0);
         assert(preexec_fn == Py_None);
 
+        /* Drop the GIL so that other threads can continue execution while this
+         * thread in the parent remains blocked per vfork-semantics on the
+         * child's exec syscall outcome. Exec does filesystem access which
+         * can take an arbitrarily long time. This addresses GH-104372.
+         *
+         * The vfork'ed child still runs in our address space. Per POSIX it
+         * must be limited to nothing but exec, but the Linux implementation
+         * is a little more usable. See the child_exec() comment - The child
+         * MUST NOT re-acquire the GIL.
+         */
+        vfork_tstate_save = PyEval_SaveThread();
         pid = vfork();
+        if (pid != 0) {
+            // Not in the child process, reacquire the GIL.
+            PyEval_RestoreThread(vfork_tstate_save);
+        }
         if (pid == (pid_t)-1) {
             /* If vfork() fails, fall back to using fork(). When it isn't
              * allowed in a process by the kernel, vfork can return -1
@@ -819,6 +835,7 @@ do_fork_exec(char *const exec_array[],
     }
 
     if (pid != 0) {
+        // Parent process.
         return pid;
     }
 



More information about the Python-checkins mailing list