need to safely spawn subshells from multithreaded server - how?
Thilo Ernst
Thilo.Ernst at dlr.de
Mon Mar 4 12:40:25 EST 2002
Hello c.l.py,
I'm building something best described as a 'job dispatcher'. This program is intended as a
long-running process which takes requests via an easy-to-talk-to remote object protocol such
as XML-RPC or PYRO (the requests come from another Python interpreter process), and
according to these requests starts, stops and otherwise supervises lots of child processes.
The child processes will be Unix command-line applications (the platform for the whole thing
is Linux/x86), started in subshells. As each child process will typically run for a few seconds
or even minutes, and there will be lots of concurrent requests which should not block each other,
I want the dispatcher server to be multithreaded. This in principle is no problem - multithreading
is supported by available Python implementations of the protocols above. Using one of these,
the situation will be that the dispatcher server has a "main thread" listening for requests, and as
soon as a request comes in it is handed over to a "handler thread" created for the purpose,
freeing the main thread to listen for further requests.
However, after searching the web for similar problems/approaches I came to suspect I might
be in for unpleasant surprises with the above design. Python multithreading combined with
the spawning of child processes is said to be dangerous. All popen() variants, os.system(), and
pty.spawn() rely on fork(), which produces a full-blown clone of my server process. As long
as I cannot make 100% sure that only the intended thread - the "handler thread" for the current
request - continues to run in the child process, it seems that I might end up with two competing server
threads (one in parent, one in child).
Now my questions. Executive summary: "can it be done, and how?"
1. Did I get something utterly wrong here? Is the danger of getting two interfering server threads
a real one at all? This isn't the type of question a quick prototype reliably answers. Interestingly,
from Zope (which is a multithreaded server, too), I've been fork()ing subshells happily and ignorantly
for years already with no apparent problems, and presumably so have other people. But I'd rather
be on the safe side. I don't want inexplicable failures due to rare race conditions or something
similar later on.
2. Can I enforce (from Python) that only the handler thread continues to run in the child process?
My current idea is to somehow block the server main thread right before the fork(), and
unblock it immediately afterwards - but only in the parent process. Would that be safe? And
could I perhaps reach that goal without having to substantially change the implementation
of my chosen remote object system?
3. Any other ideas? Is there any other, fork()-less way to spawn a subshell (from Python, on Linux)?
The following Tim Peters quote (from <mailman.1001479048.15356.python-list at python.org>)
doesn't sound too encouraging...
> The Python C API defines a PyOS_AfterFork() function, which platforms can
> fill with whatever crud they need to do after a fork. On all platforms to
> date, it calls (in both parent and child) PyEval_ReInitThreads(), and resets
> the Python signal module's notion of what the current pid is. However,
> mixing threads with fork is a frigging mess on the best of platforms, and it
> generally takes a bona fide platform expert to guess what happens in the end
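To make the idea in question 2 concrete, here is a minimal sketch under stated assumptions, not a
tested design. Instead of blocking the main thread, it only serializes fork() with a lock and calls
exec in the child straight away, so that no server code is meant to run in the clone. The names
_fork_lock and spawn_subshell are hypothetical; the sketch relies only on os.fork(), os.execv(),
os.waitpid() and os._exit(). Whether this is actually sufficient is part of what I am asking.

```python
import os
import threading

# Hypothetical module-level lock: every handler thread that wants to
# fork takes it, so at most one fork() is in progress at any time.
_fork_lock = threading.Lock()


def spawn_subshell(command):
    """Run `command` via /bin/sh in a child process and wait for it.

    Returns the exit code, or the negated signal number if the child
    was killed by a signal.
    """
    with _fork_lock:
        pid = os.fork()
        if pid == 0:
            # Child: replace the process image immediately so that no
            # Python-level server code continues to run here.
            try:
                os.execv("/bin/sh", ["sh", "-c", command])
            finally:
                # Reached only if execv() failed; leave without running
                # the parent's cleanup handlers.
                os._exit(127)
    # Parent: the lock is released again; wait for this child only.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)
```

A handler thread would then call something like spawn_subshell("ls -l /tmp") and block only itself
while the child runs; the main thread keeps listening.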
Any hints much appreciated.
Best regards, Thilo Ernst