[SciPy-User] using multiple processors for particle filtering

Robin robince at gmail.com
Thu Jun 3 06:31:05 EDT 2010


On Thu, May 27, 2010 at 10:37 PM, Andy Fraser <afraser at lanl.gov> wrote:
>
> #Multiprocessing version:
>
>        noise = numpy.random.standard_normal((N_particles,noise_df))
>        jobs = zip(self.particles,noise)
>        self.particles = self.pool.map(func, jobs, self.chunk_size)
>        return (m,v)

What platform are you on? I often forget that multiprocessing works
quite differently on Windows than on Unix platforms (and is much less
useful there). On Unix platforms the child processes are created with
fork(), which means they share all the memory state of the parent
process, with copy-on-write if they make changes. On Windows separate
processes are spawned and all the state has to be passed through the
serialiser (I think). So on Unix you can share large quantities of
(read-only) data very cheaply by making it accessible before the fork.
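As a minimal sketch of what that buys you (the array name big_data,
the worker row_sum and the pool size are all made up for
illustration, not taken from your code):

import numpy
import multiprocessing

big_data = numpy.random.standard_normal((100000, 50))  # created before the fork

def row_sum(i):
    # on Unix the child sees big_data through the fork (copy-on-write);
    # only the integer index and the returned float get pickled
    return big_data[i].sum()

if __name__ == '__main__':
    pool = multiprocessing.Pool(4)
    sums = pool.map(row_sum, range(100000))
    pool.close()
    pool.join()

On Windows the same script runs, but each worker re-imports the module
and so builds a fresh, different copy of big_data rather than sharing
the parent's array.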

So if you are on Mac/Linux and the slowdown is caused by passing the
large noise array, you could get around this by making the array a
module-level global before the fork, when you create the pool, e.g.

import mymodule
mymodule.noise = numpy.random.standard_normal((N_particles,noise_df))

then use this in func and don't pass the noise array in the map call.
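Something like this, as a rough sketch (update_particle is a
placeholder for whatever per-particle work your func actually does,
and mymodule stands for whichever module holds it):

import numpy
import multiprocessing
import mymodule

N_particles, noise_df = 10000, 3

# set the shared state *before* the pool is created, so the fork copies it
mymodule.noise = numpy.random.standard_normal((N_particles, noise_df))

def func(i):
    # the worker receives only the index; the big array comes in
    # through the fork, not through the pickled job
    return mymodule.update_particle(i, mymodule.noise[i])

pool = multiprocessing.Pool()   # fork happens here, after the global exists
particles = pool.map(func, range(N_particles))

The important part is just the ordering: the pool (and therefore the
fork) must be created after the array is assigned, otherwise the
children never see it.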

But I agree with Zachary about using arrays of object parameters
rather than lists of objects each with their own parameter variables.
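For a linear state update the vectorised version would look roughly
like this (A and B are made-up transition and noise-mixing matrices;
the point is only that the whole particle population is updated in one
numpy call, with no per-particle Python loop and no multiprocessing):

import numpy

N_particles, state_dim, noise_df = 10000, 4, 3

particles = numpy.zeros((N_particles, state_dim))   # one row per particle
A = numpy.eye(state_dim)                            # placeholder dynamics
B = 0.1 * numpy.ones((state_dim, noise_df))         # placeholder noise mixing

noise = numpy.random.standard_normal((N_particles, noise_df))
particles = numpy.dot(particles, A.T) + numpy.dot(noise, B.T)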

Cheers

Robin


