parallel computations: subprocess.Popen(...).communicate()[0] does not work with multiprocessing.Pool

Chris Torek nospam at torek.net
Sun Jun 12 18:00:58 EDT 2011


In article <mailman.105.1307737402.11593.python-list at python.org>
Hseu-Ming Chen  <hseuming at gmail.com> wrote:
>I am having an issue when making a shell call from within a
>multiprocessing.Process().  Here is the story: i tried to parallelize
>the computations in 800-ish Matlab scripts and then save the results
>to MySQL.   The non-parallel/serial version has been running fine for
>about 2 years.  However, in the parallel version via multiprocessing
>that i'm working on, it appears that the Matlab scripts have never
>been kicked off and nothing happened with subprocess.Popen.  The debug
>printing below does not show up either.

I obviously do not have your code, and have not even tried this as
an experiment in a simplified environment, but:

>import subprocess
>from multiprocessing import Pool
>
>def worker(DBrow,config):
>   #  run one Matlab script
>   cmd1 = "/usr/local/bin/matlab  ...  myMatlab.1.m"
>   subprocess.Popen([cmd1], shell=True, stdout=subprocess.PIPE).communicate()[0]
>   print "this does not get printed"
 ...
># kick off parallel processing
>pool = Pool()
>for DBrow in DBrows: pool.apply_async(worker,(DBrow,config))
>pool.close()
>pool.join()

The multiprocessing code makes use of pipes to communicate between
the various subprocesses it creates.  I suspect these "extra" pipes
are interfering with your subprocesses, when the pool.close() /
pool.join() sequence waits for the Matlab script to do something
with its copy of the pipes.  (Strictly, pool.close() only stops new
work from being submitted and returns at once; the blocking wait is
in pool.join().)  To make the subprocess module close them -- so
that Matlab does not have them in the first place and hence the
shutdown cannot get stuck there -- add "close_fds=True" to the
Popen() call.
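
As a minimal sketch of that change (untested against your setup,
and written for a current Python 3 on a Unix-like system rather
than the Python 2 of the quoted code): run_script() and demo_cmd
are names I made up for illustration, and the child here is just
the Python interpreter standing in for the real Matlab command
line, which I do not have.

```python
import subprocess
import sys


def run_script(cmd):
    """Run one external command without leaking inherited descriptors.

    close_fds=True asks subprocess to close every file descriptor other
    than stdin/stdout/stderr in the child, so the child does not hold
    copies of the pipes that multiprocessing created.
    """
    proc = subprocess.Popen(
        cmd,                     # argument list, so no shell is needed
        stdout=subprocess.PIPE,  # capture the child's standard output
        close_fds=True,
    )
    output = proc.communicate()[0]  # read all output, then reap the child
    return proc.returncode, output


# Hypothetical stand-in for the real Matlab invocation.
demo_cmd = [sys.executable, "-c", "print('hello from child')"]
returncode, output = run_script(demo_cmd)
print(returncode, output)
```

Note that on POSIX systems close_fds has defaulted to True since
Python 3.2, so spelling it out mainly matters on the older 2.x
releases, where the default was False.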

There could still be issues with competing wait() and/or waitpid()
calls (assuming you are using a Unix-like system, or whatever the
equivalent is for Windows) "eating" the wrong subprocess completion
notifications, but that one is harder to solve in general :-) so
if close_fds fixes things, it was just the pipes.  If close_fds
does not fix things, you will probably need to defer the pool.close()
step until after all the subprocesses complete.
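
One way to defer the pool.close() step, sketched under the same
assumptions as before (current Python 3, Unix-like system, and the
Python interpreter standing in for Matlab): keep the objects that
apply_async() returns and call get() on each one before shutting
the pool down.  The names run_all() and commands are mine, not
from your code, and worker() here takes a command list instead of
your (DBrow, config) pair.

```python
import subprocess
import sys
from multiprocessing import Pool


def worker(cmd):
    """Run one external command; return (exit status, captured stdout)."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, close_fds=True)
    output = proc.communicate()[0]
    return proc.returncode, output


def run_all(commands):
    """Submit every command, wait for each result, then close the pool."""
    pool = Pool()
    try:
        pending = [pool.apply_async(worker, (cmd,)) for cmd in commands]
        # get() blocks until that task has finished.  It also re-raises
        # any exception raised inside worker(), which apply_async()
        # would otherwise keep to itself.
        results = [task.get() for task in pending]
    finally:
        # Only now, with every task accounted for, shut the pool down.
        pool.close()
        pool.join()
    return results


if __name__ == "__main__":
    # Hypothetical stand-in commands, one per "script".
    commands = [[sys.executable, "-c", "print(%d)" % n] for n in range(4)]
    for returncode, output in run_all(commands):
        print(returncode, output)
```

A side benefit of calling get() is that a worker which dies with
an exception before reaching its debug print becomes visible in
the parent, instead of the pool simply finishing with nothing
printed.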
-- 
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W)  +1 801 277 2603
email: gmail (figure it out)      http://web.torek.net/torek/index.html


