Parallel Python

robert no-spam at no-spam-no-spam.invalid
Thu Jan 11 09:17:49 EST 2007


sturlamolden wrote:
> robert wrote:
> 
>> Thus communicated data is "serialized" - not directly used as with threads or with custom shared memory techniques like POSH object sharing.
> 
> Correct, and that is precisely why MPI code is a lot easier to write
> and debug than thread code. The OP used a similar technique in his
> 'parallel python' project.

Thus there are different levels of parallelization:

1 file/database based; multiple batch jobs
2 Message Passing, IPC, RPC, ...
3 Object Sharing 
4 Sharing of global data space (Threads)
5 Local parallelism / Vector computing, MMX, 3DNow,...

There are good reasons for all of these levels.
Yet "parallel python" to me pretends to be on level 3 or 4 (or even 5 :-) ), while it's just a level 2 system, where "passing", "remote", "inter-process" ... are the right terms.
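The practical difference between level 2 and level 4 can be shown in a few lines. This is an illustrative sketch (using pickle to stand in for whatever wire format a message-passing system uses): at level 2 the receiver works on a deserialized copy, while at level 4 threads mutate the very same object.

```python
import pickle
import threading

# Level 2 (message passing): data crossing the process boundary is
# serialized -- the receiver works on a reconstructed copy.
original = {"touched": False}
wire = pickle.dumps(original)        # what a queue/socket would transmit
received = pickle.loads(wire)        # what the other side would see
received["touched"] = True
print(original["touched"])           # False: sender's object unchanged

# Level 4 (threads): all threads see the same object space.
def worker(d):
    d["touched"] = True

t = threading.Thread(target=worker, args=(original,))
t.start()
t.join()
print(original["touched"])           # True: same dict, directly mutated
```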

With all these fakes popping up, a GIL-free CPython is a major feature request for Py3K - a name that at least promises to run 3rd-millennium CPUs ...


> This does not mean that MPI is inherently slower than threads however,
> as there is overhead associated with thread synchronization as well.

Level 2 communication is slower. Just for selected apps it won't matter much.
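A rough micro-benchmark makes the cost concrete. This sketch approximates level-2 transport with a pickle round-trip (the dominant per-message cost) and compares it against simply handing a reference to another thread; absolute numbers will vary by machine, but the ordering should not.

```python
import pickle
import timeit

data = list(range(100_000))

# Level 4: handing a reference to another thread costs essentially nothing.
ref_cost = timeit.timeit(lambda: data, number=100)

# Level 2: every hand-over pays serialize + copy + deserialize.
ipc_cost = timeit.timeit(lambda: pickle.loads(pickle.dumps(data)), number=100)

print(f"reference pass: {ref_cost:.4f}s   pickled round-trip: {ipc_cost:.4f}s")
```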

> With 'shared memory' between threads, a lot more fine grained
> synchronization and scheduling is needed, which impairs performance and
> often introduces obscure bugs.

It's a question of chances, costs, and the nature of the application.
Yet one can easily restrict inter-thread communication to be as simple and modular as IPC, or even simpler. Search e.g. "Python CallQueue" and "BackgroundCall" on Google.
Thread programming is less complicated than it seems. (It's just that Python's stdlib offers cumbersome 'non-functional' classes.)
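The CallQueue idea mentioned above can be sketched in a few lines; this is a hypothetical minimal reconstruction (class and method names are my own, not the recipe's), showing how inter-thread traffic can be reduced to whole function calls handed over a queue, which keeps it as modular as IPC message passing.

```python
import queue
import threading

class CallQueue:
    """Hand whole function calls from any thread to a serving thread."""

    def __init__(self):
        self._q = queue.Queue()

    def call(self, func, *args):
        # Enqueue a call; the returned one-slot queue acts as a result box.
        result = queue.Queue(maxsize=1)
        self._q.put((func, args, result))
        return result

    def serve_one(self):
        # Execute one pending call inside the serving thread.
        func, args, result = self._q.get()
        result.put(func(*args))

cq = CallQueue()
server = threading.Thread(target=cq.serve_one)
server.start()
box = cq.call(sum, [1, 2, 3])   # hand the call over to the server thread
server.join()
print(box.get())                # 6
```

The caller never touches the server thread's data directly; everything crosses the boundary as an explicit call, which is what keeps the obscure shared-state bugs out.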


Robert
