need to safely spawn subshells from multithreaded server - how?

Mon Mar 4 12:40:25 EST 2002

Hello c.l.py,

I'm building something best described as a 'job dispatcher'. This program is intended as a
long-running process which takes requests via an easy-to-talk-to remote object protocol such
as XML-RPC or PYRO (the requests come from another Python interpreter process) and,
according to these requests, starts, stops and otherwise supervises lots of child processes.

The child processes will be Unix command line applications started in subshells (the platform
for the whole thing is Linux/x86). As each child process will typically run for a few seconds
or even minutes, and there will be lots of concurrent requests which should not block each other,
I want the dispatcher server to be multithreaded. In principle this is no problem: multithreading
is supported by the available Python implementations of the protocols above. Using one of these,
the dispatcher server has a "main thread" listening for requests, and as
soon as a request comes in it is handed over to a "handler thread" created for the purpose,
freeing the main thread to listen for further requests.
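
To make the design concrete, here is a minimal sketch of the main-thread/handler-thread
split. It leaves out the remote object protocol entirely; the class JobDispatcher and its
methods submit() and wait() are names made up for illustration, and it assumes Python 3
with the standard threading and subprocess modules.

```python
import subprocess
import threading


class JobDispatcher:
    """Hypothetical dispatcher: one handler thread per incoming request."""

    def __init__(self):
        self._lock = threading.Lock()
        self._threads = {}   # job id -> handler thread
        self._results = {}   # job id -> exit status of the subshell
        self._next_id = 0

    def submit(self, command):
        """Called from the main thread; returns a job id without blocking."""
        with self._lock:
            job_id = self._next_id
            self._next_id += 1
            thread = threading.Thread(target=self._handle,
                                      args=(job_id, command))
            self._threads[job_id] = thread
        thread.start()
        return job_id

    def _handle(self, job_id, command):
        # Handler thread: run the command in a subshell and wait for it.
        status = subprocess.call(command, shell=True)
        with self._lock:
            self._results[job_id] = status

    def wait(self, job_id):
        """Block until the job's handler thread is done; return exit status."""
        self._threads[job_id].join()
        with self._lock:
            return self._results[job_id]
```

In the real server, submit() would be what the XML-RPC or PYRO request handler calls,
so a long-running child never holds up the listening thread.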

However, after searching the web for similar problems and approaches, I came to suspect I might
be in for unpleasant surprises with the above design. Python multithreading combined with
the spawning of child processes is said to be dangerous. All popen() variants, os.system(), and
pty.spawn() rely on fork(), which produces a full-blown clone of my server process. Unless
I can make 100% sure that only the intended thread (the "handler thread" for the current
request) continues to run in the child process, it seems that I might end up with two competing server
threads (one in the parent, one in the child).

Now my questions. Executive summary: "can it be done, and how?"

1. Did I get something utterly wrong here? Is the danger of getting two interfering server threads
    a real one at all? This isn't the type of question a quick prototype reliably answers. Interestingly,
    from Zope (which is a multithreaded server, too), I've been fork()ing subshells happily and ignorantly
    for years with no apparent problems, and other people will have done the same. But I'd rather
    be on the safe side. I don't want inexplicable failures due to rare race conditions or something
    similar later on.

2. Can I enforce (from Python) that only the handler thread continues to run in the child process?
    My current idea is to somehow block the server main thread right before the fork(), and
    unblock it immediately afterwards, but only in the parent process. Would that be safe? And
    could I perhaps reach that goal without having to substantially change the implementation
    of my chosen remote object system?

3. Any other ideas? Is there any other, fork()-less way to spawn a subshell (from Python, on Linux)?
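
The idea in question 2 could look roughly like the sketch below. It is only an
illustration, not a claim that this is safe: fork_lock and spawn_subshell() are
hypothetical names, and the scheme assumes the server's main thread can be made to take
the same lock around each step of its work, which is exactly the kind of change to the
remote object system I would like to avoid.

```python
import os
import threading

# Hypothetical lock: the main thread would hold it while doing server work,
# so a handler thread that owns it knows the main thread is parked.
fork_lock = threading.Lock()


def spawn_subshell(command):
    """Fork with fork_lock held; return the child's pid in the parent."""
    fork_lock.acquire()
    try:
        pid = os.fork()
        if pid == 0:
            # Child: replace the process image at once, run no server code.
            try:
                os.execv("/bin/sh", ["sh", "-c", command])
            finally:
                # Only reached if exec failed; leave without any cleanup
                # handlers inherited from the server.
                os._exit(127)
    finally:
        # Parent only: the child has either exec'ed or exited by now.
        fork_lock.release()
    return pid
```

The handler thread would later collect the child with os.waitpid(pid, 0).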

The following Tim Peters quote (from <mailman.1001479048.15356.python-list at python.org>)
doesn't sound too encouraging...

> The Python C API defines a PyOS_AfterFork() function, which platforms can
> fill with whatever crud they need to do after a fork.  On all platforms to
> date, it calls (in both parent and child) PyEval_ReInitThreads(), and resets
> the Python signal module's notion of what the current pid is.  However,
> mixing threads with fork is a frigging mess on the best of platforms, and it
> generally takes a bona fide platform expert to guess what happens in the end

Any hints much appreciated.

Best regards, Thilo Ernst