[SciPy-User] Distributed computing: running embarrassingly parallel (python/c++) codes over a cluster

Mon Nov 9 13:18:03 EST 2009

Rohit Garg wrote:
> Hi all,
>
> I have an embarrassingly parallel problem, very nicely suited to
> parallelization. I am looking for community feedback on how best to
> approach it. Basically, I just set up a bunch of tasks, and
> the various CPUs pull data, process it, and send it back. Out-of-order
> arrival of results is no problem. The processing times involved
> are so large that the communication is effectively free, and hence I
> don't care how fast or slow the communication is. I thought I'd ask in
> case somebody has done this before, to avoid reinventing the
> wheel. Any other suggestions are welcome too.
>
> My only constraint is that it should be able to run a Python extension
> (C++) with a minimum of fuss. I want to minimize the headaches involved
> in setting up and writing the boilerplate code. Which
> framework/approach/library would you recommend?
>
> There is one method mentioned at [1], and of course, one could resort
> to something like mpi4py.
>
> [1] http://docs.python.org/library/multiprocessing.html (see the last example)
>
Hi,

I've never done any parallel processing myself, but you might consider
Shed Skin, a Python-to-C++ compiler that makes it easy to convert
Python functions into fast C++ extension modules and offers some
support for parallel processing:

http://code.google.com/p/shedskin/
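
For the pull/process/send-back pattern described in the question, a
minimal sketch using the standard-library multiprocessing module (the
one linked at [1]) might look like the following. It only covers the
cores of a single machine, not a cluster. The names process_task,
collect and run_tasks are made up for illustration, and the squaring
is a placeholder for the real call into the C++ extension. It assumes
Python 3.3 or later, where Pool can be used as a context manager.

```python
import multiprocessing as mp


def process_task(task_id):
    """Stand-in for the expensive call into the C++ extension (hypothetical)."""
    # A real worker would call the extension module here; squaring the
    # task id is only a placeholder for that long-running computation.
    return task_id, task_id * task_id


def collect(results):
    """Store results keyed by task id, so arrival order does not matter."""
    return {task_id: value for task_id, value in results}


def run_tasks(task_ids, workers=None):
    """Hand tasks to a pool of worker processes and gather the results."""
    # workers=None lets the pool default to one process per CPU.
    with mp.Pool(processes=workers) as pool:
        # imap_unordered yields each result as soon as some worker has
        # finished it, which suits out-of-order arrival.
        return collect(pool.imap_unordered(process_task, task_ids))


if __name__ == "__main__":
    print(run_tasks(range(8), workers=4))
```

Each result carries its task id, so the caller can match results to
inputs without relying on arrival order. Spreading the same pattern
over several machines would need something beyond this sketch, such
as the mpi4py route mentioned in the question.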

Best,

James

-- 
-------------------------------------------------------
James Coughlan, Ph.D., Scientist                     

The Smith-Kettlewell Eye Research Institute

Email: coughlan at ski.org
URL: http://www.ski.org/Rehab/Coughlan_lab/
Phone: 415-345-2146 
Fax: 415-345-8455
-------------------------------------------------------