[SciPy-User] Distributed computing: running embarrassingly parallel (python/c++) codes over a cluster

Anne Archibald peridot.faceted at gmail.com
Mon Nov 9 13:18:46 EST 2009


2009/11/9 Rohit Garg <rpg.314 at gmail.com>:
> Hi all,
>
> I have an embarrassingly parallel problem, very nicely suited to
> parallelization. I am looking for community feedback on how best to
> approach it. Basically, I just set up a bunch of tasks, and
> the various CPUs pull data, process it, and send it back. Out-of-order
> arrival of results is no problem. The processing times involved
> are so large that the communication is effectively free, and hence I
> don't care how fast or slow the communication is. I thought I'd ask in
> case somebody has done this before, to avoid reinventing the
> wheel. Any other suggestions are welcome too.
>
> My only constraint is that it should be able to run a Python extension
> (C++) with a minimum of fuss. I want to minimize the headaches involved
> in setting up and writing the boilerplate code. Which
> framework/approach/library would you recommend?

For our pulsar searches, we use about the simplest possible method.
Each job is set up so that you run it from a UNIX shell in a directory
containing all the needed files, and it saves any output to a common
directory. We then submit the jobs to the PBS batch system. There is one
minor complication: copying the input data is quite network-intensive,
so we make sure only one job starts at a time. Other than that, the
jobs have no interaction at all.
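
As a rough illustration of that layout, here is a minimal sketch of what
one such job could look like in Python. It is not our actual pipeline
code: the names run_job and process, the file-lock approach to
serializing the input copy, and the ".out" naming are all assumptions
made for the example, and process() just stands in for the real work
(for instance a call into a C++ extension). File locks can also be
unreliable on some network filesystems, so treat that part as a
placeholder for whatever your cluster supports.

```python
import fcntl
import os
import shutil
import tempfile


def process(input_path):
    """Hypothetical stand-in for the real work (e.g. a C++ extension call)."""
    with open(input_path) as f:
        return sum(float(line) for line in f if line.strip())


def run_job(job_name, source_path, output_dir, lock_path):
    """Run one independent job: stage the input, process it, save the result."""
    work_dir = tempfile.mkdtemp(prefix=job_name + "_")
    try:
        local_input = os.path.join(work_dir, os.path.basename(source_path))

        # Assumption: an exclusive lock on a shared file is held only while
        # copying, so at most one job at a time pulls input over the network.
        with open(lock_path, "w") as lock_file:
            fcntl.flock(lock_file, fcntl.LOCK_EX)
            try:
                shutil.copy(source_path, local_input)
            finally:
                fcntl.flock(lock_file, fcntl.LOCK_UN)

        result = process(local_input)

        # Every job writes its own file into the common output directory,
        # so the jobs never need to communicate with each other.
        os.makedirs(output_dir, exist_ok=True)
        out_path = os.path.join(output_dir, job_name + ".out")
        with open(out_path, "w") as out:
            out.write("%r\n" % result)
        return out_path
    finally:
        # Clean up the per-job scratch directory whether or not the job failed.
        shutil.rmtree(work_dir, ignore_errors=True)
```

Each batch job would then call run_job with its own job name and input
file; submitting those jobs to the batch system is left out here, since
it depends on the local installation.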

Anne

> There is one method mentioned at [1], and of course, one could resort
> to something like mpi4py.
>
> [1] http://docs.python.org/library/multiprocessing.html   {see the last example}
>
> --
> Rohit Garg
>
> http://rpg-314.blogspot.com/
>
> Senior Undergraduate
> Department of Physics
> Indian Institute of Technology
> Bombay


