multiprocessing shows no benefit

Ian Kelly ian.g.kelly at gmail.com
Wed Oct 18 12:31:22 EDT 2017


On Wed, Oct 18, 2017 at 10:13 AM, Ian Kelly <ian.g.kelly at gmail.com> wrote:
> On Wed, Oct 18, 2017 at 9:46 AM, Jason <jasonhihn at gmail.com> wrote:
>> # When I change line 19 to True to use the multiprocessing stuff, it all slows down.
>>
>> from multiprocessing import Process, Manager, Pool, cpu_count
>> from timeit import default_timer as timer
>>
>> def f(a, b):
>>     return dict_words[a] - b
>
> Since the computation is so simple, my suspicion is that the run time
> is dominated by IPC; in other words, the cost of sending objects back
> and forth outweighs the gains you get from parallelization.
>
> What happens if you remove dict_words from the Manager and pass
> dict_words[a] across instead of just a? Also, I'm not sure why
> dict_keys is a managed list to begin with, since it only appears to be
> handled by the main process.
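The IPC cost being described can be seen directly by timing lookups on a plain dict against a Manager-proxied dict: every access to the proxy is a pickled request/response round trip to the manager process. A minimal sketch (names are illustrative, not from the original code):

```python
from multiprocessing import Manager
from timeit import default_timer as timer

def time_lookups(d, n=10000):
    """Time n repeated lookups on a dict-like object."""
    start = timer()
    for i in range(n):
        d[i % 100]
    return timer() - start

if __name__ == '__main__':
    plain = {i: i * 2 for i in range(100)}
    with Manager() as manager:
        shared = manager.dict(plain)  # proxy: each lookup is an IPC round trip
        print('plain dict:  ', time_lookups(plain))
        print('manager dict:', time_lookups(shared))
```

On a typical machine the proxied lookups are orders of magnitude slower, which matches the Manager timings below.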

Timings from my system:

# Original code without using Manager
$ python test.py
CPUs: 12
<built-in function map> 0.0757319927216 100000 1320445.9094
<bound method Pool.map of <multiprocessing.pool.Pool object at
0x7fb11f93a390>> 0.143120765686 100000 698710.627495

# Original code with Manager
$ python test.py
CPUs: 12
<built-in function map> 5.5354039669 100000 18065.5288391
<bound method Pool.map of <multiprocessing.pool.Pool object at
0x7fdc61f07490>> 4.3253660202 100000 23119.4307101

# Modified code without Manager and avoiding sharing the dict
$ python test.py
CPUs: 12
<built-in function map> 0.0657241344452 100000 1521511.09854
<bound method Pool.map of <multiprocessing.pool.Pool object at
0x7ff29c636350>> 0.0966320037842 100000 1034853.83811
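The modified approach that produced the last set of numbers can be sketched as follows: instead of workers reaching into a shared (Manager-proxied) dict, the already-looked-up values are sent to the workers directly. This is a sketch under assumed names (f, bench, words); it is not the poster's exact script:

```python
from multiprocessing import Pool, cpu_count
from timeit import default_timer as timer

def f(pair):
    # Receive the value directly rather than looking it up in a
    # shared dict inside the worker process.
    a, b = pair
    return a - b

def bench(map_func, args, n=5):
    """Time n passes of map_func(f, args)."""
    start = timer()
    for _ in range(n):
        list(map_func(f, args))
    return timer() - start

if __name__ == '__main__':
    words = {i: i * 2 for i in range(10000)}
    # Pass values, not keys, so no shared state is needed.
    args = [(words[k], 1) for k in words]
    print('CPUs:', cpu_count())
    print('builtin map:', bench(map, args))
    with Pool() as pool:
        print('Pool.map:   ', bench(pool.map, args))
```

Even with sharing avoided, Pool.map still trails the builtin map here because pickling each argument and result costs more than the subtraction it parallelizes.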



More information about the Python-list mailing list