client-server parallelised number crunching

Hans Georg Schaathun georg at schaathun.net
Wed Apr 27 01:58:41 EDT 2011


On Tue, 26 Apr 2011 14:31:59 -0700, geremy condra
  <debatem1 at gmail.com> wrote:
:  Without knowledge of what you're doing it's hard to comment
:  intelligently, 

I need to calculate map( foobar, L ), where foobar() is a pure function
with no dependency on global state and L is a list of tuples, each
containing two numpy arrays (currently 500-1000 floats each) plus a
scalar or two.  The result of each call is a pair of floats.

The foobar() function is demonstrably heavy enough to merit
parallelisation.
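
To make the shape of the problem concrete, here is a minimal sketch of
such a map on a single machine.  Everything in it is an assumption made
for illustration: the body of foobar() is a made-up stand-in, plain
lists of floats stand in for the numpy arrays, and make_inputs() and
parallel_map() are hypothetical helpers.  It only shows that, because
foobar() is pure, the items can be handed to any worker in any order.

```python
import math
from multiprocessing import Pool


def foobar(item):
    """Hypothetical stand-in for the real foobar(): pure, returns two floats."""
    xs, ys, scale = item
    dot = sum(x * y for x, y in zip(xs, ys))
    norm = math.sqrt(sum(x * x for x in xs))
    return (scale * dot, norm)


def make_inputs(n_items, n_floats):
    """Build a toy L: a list of (sequence, sequence, scalar) tuples."""
    return [
        ([float(i + j) for j in range(n_floats)], [1.0] * n_floats, 0.5)
        for i in range(n_items)
    ]


def parallel_map(func, items, workers=2):
    """Evaluate func over items in worker processes, preserving input order."""
    with Pool(processes=workers) as pool:
        return pool.map(func, items)


if __name__ == "__main__":
    L = make_inputs(n_items=8, n_floats=500)
    results = parallel_map(foobar, L)
    # A pure function gives the same answers serially and in parallel.
    assert results == list(map(foobar, L))
```

Pool.map() only covers the cores of one host; spreading the same map
over several unreliable machines is the part that needs more thought.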

The CPUs I have available to spread the load further are not clustered.
They are prone to crash without warning and I do not have root access.
I don't have exclusive use.  I do not even have physical access, so I 
cannot use a liveCD.  (They are, however, equipped with a batch queue
system (torque).)

:                 but I'd try something like CHAOS or OpenSSI to see if
:  you can't get what you need for free, if that doesn't do it then try
:  dropping a liveCD with Hadoop on it in each machine and running it
:  that way.  If that can't work, try MPI. If you've gotten that far and
:  nothing does the trick then you're probably going to have to give
:  more details.

TANSTAAFL :-)
There is always the learning curve.

If I understand it correctly, OpenSSI requires root access; is that
right?  For CHAOS I need more details to be able to google; I found
a fractals toolbox, but that did not seem relevant :-)

MPI I have tried before.  Unless there is a new, massively more
sophisticated MPI library around now, I would certainly have to
write my own code to cope with lost clients.
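
One way to picture the bookkeeping for lost clients is a lease per work
item: the server hands out an item, and if no result arrives before the
lease expires, the item goes back into the pool for another client.
The sketch below is only that idea in plain Python, with no networking
and no MPI; TaskLedger and its methods are hypothetical names, and the
lease length would have to be tuned to how long foobar() really takes.
Re-running an item is harmless here only because foobar() is pure.

```python
import time


class TaskLedger:
    """Tracks which work items are pending, leased to a client, or done."""

    def __init__(self, n_tasks, lease_seconds, clock=time.monotonic):
        self.pending = list(range(n_tasks))  # task ids not currently leased
        self.leased = {}                     # task id -> lease expiry time
        self.results = {}                    # task id -> result
        self.lease_seconds = lease_seconds
        self.clock = clock                   # injectable, so tests can fake time

    def _reclaim_expired(self):
        """Move tasks whose lease ran out back to the pending list."""
        now = self.clock()
        expired = [t for t, deadline in self.leased.items() if deadline <= now]
        for t in expired:
            del self.leased[t]
            self.pending.append(t)

    def checkout(self):
        """Return a task id for a client to work on, or None if none is free."""
        self._reclaim_expired()
        if not self.pending:
            return None
        task = self.pending.pop(0)
        self.leased[task] = self.clock() + self.lease_seconds
        return task

    def submit(self, task, result):
        """Record a result; a late duplicate of a finished task is ignored."""
        if task in self.results:
            return
        self.results[task] = result
        self.leased.pop(task, None)
        if task in self.pending:
            self.pending.remove(task)

    def finished(self):
        """True once nothing is pending and no lease is outstanding."""
        return not self.pending and not self.leased
```

A server loop would call checkout() whenever a client asks for work and
submit() whenever a result comes back; a client that crashes simply
never submits, and its item is handed out again after the lease runs
out.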

Hadoop sounds interesting.  I had encountered it before, but had not
considered it for this.  However, the liveCD is clearly not an option.  Thanks
for the tip; I'll read up on map-reduce at least.

:-- Hans Georg


