Parallelization in Python 2.6

Wed Aug 19 08:27:43 EDT 2009

Hendrik van Rooyen wrote:
> On Tuesday 18 August 2009 22:45:38 Robert Dailey wrote:
>
>   
>> Really, all I'm trying to do is the most trivial type of
>> parallelization. Take two functions, execute them in parallel. This
>> type of parallelization is called "embarrassingly parallel", and is
>> the simplest form. There are no dependencies between the two
>> functions. They do require read-only access to shared data, though.
>> And if they are being spawned as sub-processes this could cause
>> problems, unless the multiprocessing module creates pipelines or other
>> means to handle this situation.
>>     
>
> Just use thread then and thread.start_new_thread.
> It just works.
>
> - Hendrik
>
>   
But if you do it that way with CPU-bound functions, it's slower than
running them sequentially.  And if you have a multi-core processor, or
two processors, or ...   then it gets slower still, and slows down other
tasks as well.

With the current GIL implementation, for two CPU-bound tasks, you either 
do them sequentially, or run them in separate processes.
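
As a rough illustration of the separate-process route, here is a minimal
sketch using the standard multiprocessing module.  The two task functions
(count_primes and sum_squares) and the helpers worker and run_both are
made-up stand-ins, not anything from the original poster's code; the only
assumption is that each task is pure CPU work with no dependency on the
other.

```python
import multiprocessing


def count_primes(limit):
    """Hypothetical CPU-bound task: count primes below limit by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def sum_squares(limit):
    """A second, unrelated hypothetical CPU-bound task."""
    return sum(i * i for i in range(limit))


def worker(func, arg, queue):
    # Runs in the child process and sends the result back to the parent.
    queue.put((func.__name__, func(arg)))


def run_both(limit):
    """Run both tasks at once, each in its own process."""
    queue = multiprocessing.Queue()
    procs = [multiprocessing.Process(target=worker, args=(func, limit, queue))
             for func in (count_primes, sum_squares)]
    for p in procs:
        p.start()
    # Fetch the results before join() so the children are never left
    # waiting on a queue that nobody is reading.
    results = dict(queue.get() for _ in procs)
    for p in procs:
        p.join()
    return results


if __name__ == "__main__":
    print(run_both(20000))
```

Each process has its own interpreter and its own GIL, so the two loops
can genuinely occupy two cores.  The price is process start-up time and
the cost of shipping results back through the queue.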

Now, you can share data between separate processes, and if the data is 
truly going to be read-only, you shouldn't have any locking issues.
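
One way to hand read-only data to the workers, sketched under the
assumption that the data can be pickled and is small enough to copy: pass
it once to each worker through a Pool initializer.  The names _shared,
init_worker, total_for_key and lookup_totals are invented for this
example.  Each worker ends up with its own private copy, which is why no
locking is needed as long as nobody expects writes to propagate.

```python
import multiprocessing

_shared = None  # set once per worker process, treated as read-only afterwards


def init_worker(data):
    # Called once in every worker when the pool starts up.
    global _shared
    _shared = data


def total_for_key(key):
    # Only reads the shared table; never modifies it.
    return sum(_shared[key])


def lookup_totals(data, keys):
    """Compute total_for_key for each key using two worker processes."""
    pool = multiprocessing.Pool(processes=2,
                                initializer=init_worker,
                                initargs=(data,))
    try:
        return pool.map(total_for_key, keys)
    finally:
        pool.close()
        pool.join()


if __name__ == "__main__":
    table = {"a": [1, 2, 3], "b": [10, 20]}
    print(lookup_totals(table, ["a", "b"]))
```

For very large data the copy per worker may be too expensive, in which
case something like multiprocessing's shared-memory types would be worth
a look instead.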

Naturally you should do your own timings.  Maybe your particular CPU and 
OS will have different results.
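
If it helps, a small harness along these lines would let you compare the
three approaches on your own machine.  It is only a sketch: burn is a
made-up CPU-bound loop, the problem size is arbitrary, and time.time() is
a coarse clock, so treat the numbers as a rough guide.

```python
import multiprocessing
import threading
import time


def burn(n):
    """Hypothetical CPU-bound task: a plain Python loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def run_sequential(n):
    burn(n)
    burn(n)


def run_threads(n):
    workers = [threading.Thread(target=burn, args=(n,)) for _ in range(2)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()


def run_processes(n):
    workers = [multiprocessing.Process(target=burn, args=(n,)) for _ in range(2)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()


def timed(func, n):
    """Return the wall-clock seconds taken by func(n)."""
    start = time.time()
    func(n)
    return time.time() - start


if __name__ == "__main__":
    n = 1000000
    for func in (run_sequential, run_threads, run_processes):
        print("%-16s %.2f s" % (func.__name__, timed(func, n)))
```

Run it a few times and with a few values of n; process start-up overhead
dominates for small jobs.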

DaveA