`high overhead of multiple Python processes' (was: Will multithreading make python less popular?)

Hendrik van Rooyen mail at microcorp.co.za
Sun Feb 22 03:09:39 EST 2009


"Paul Rubin" <http://phr.cx@NOSPAM.invalid> wrote:

> The cost of messing with the multiprocessing module instead of having
> threads work properly, and the overhead of serializing Python data
> structures to send to another process by IPC, instead of using the
> same object in two threads.  Also, one way I often use threading is by
> sending function objects to another thread through a Queue, so the
> other thread can evaluate the function.  I don't think multiprocessing
> gives a way to serialize functions, though maybe something like it
> can be done at higher nuisance using classes.
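
For concreteness, the Queue pattern described in the quote might look
roughly like the sketch below. The names (tasks, results, worker) are
made up for illustration and are not from the quoted post. The last
few lines show why the same trick does not carry over directly to
processes: pickle stores a function by its name only, so an anonymous
function cannot be serialised.

```python
import pickle
import queue
import threading

tasks = queue.Queue()      # carries function objects to the worker thread
results = queue.Queue()    # carries return values back

def worker():
    """Evaluate callables taken from `tasks` until a None sentinel arrives."""
    while True:
        func = tasks.get()
        if func is None:
            break
        results.put(func())

t = threading.Thread(target=worker)
t.start()
tasks.put(lambda: 6 * 7)   # the function object itself is shared, nothing is copied
tasks.put(None)            # sentinel: tell the worker to stop
t.join()
answer = results.get()

# Crossing a process boundary needs serialisation, and pickle stores
# functions by name only, so an anonymous function is rejected.
try:
    pickle.dumps(lambda: 6 * 7)
    lambda_pickles = True
except (pickle.PicklingError, AttributeError):
    lambda_pickles = False
```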

There are also Pyro, xmlrpc and shm, all of them more apparent
hassle than threads, and all of them better able to exploit parallelism.

That said, this has made me think the following:

<conjecture>
It is an error to pass anything but plain data between processes,
as anything else does not scale easily.

Passing plain data between processes means either serialising
the data and using channels such as pipes or sockets, or 
passing a pointer to a shared memory block through a similar
channel.

Following this leads to a clean design, while attempting to pass
higher order stuff quickly leads to convoluted code when
you try to make things operate in parallel.
<!conjecture>

The above can be crudely summed up as:

You can share and pass data, but you should not pass 
or share code between processes.

Now the above is congruent with my own experience, 
but I wonder if it is generally applicable.
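
A minimal sketch of the first option in the conjecture (serialise the
data and send it through a pipe) could look like this. JSON as the
wire format, the throwaway child program and the name
summarise_in_child are all assumptions made for the example. The code
stays put in each process, and only plain data crosses the pipe.

```python
import json
import subprocess
import sys

# Child program: reads one JSON document from stdin, writes one to stdout.
# Its code lives entirely in the child; the parent never sends code.
CHILD_SOURCE = """
import json, sys
numbers = json.load(sys.stdin)
json.dump({"count": len(numbers), "total": sum(numbers)}, sys.stdout)
"""

def summarise_in_child(numbers):
    """Serialise `numbers`, pipe them to a child process, decode the reply."""
    completed = subprocess.run(
        [sys.executable, "-c", CHILD_SOURCE],
        input=json.dumps(numbers),   # plain data out through the pipe
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(completed.stdout)   # plain data back

result = summarise_in_child([1, 2, 3, 4])
print(result)
```

Because only lists, numbers and strings travel over the channel, the
child could just as well be on another machine or written in another
language, which is the scaling argument in a nutshell.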

- Hendrik




