multiprocessing shows no benefit

Thomas Nyberg tomuxiong at gmx.com
Fri Oct 20 04:33:45 EDT 2017


Correct me if I'm wrong, but at a high level you appear to just have a
mapping of strings to values, and you are then shifting all of those
values by a fixed constant (in this case, `z = 5`). Why are you using a
dict at all? It would likely be faster to use something like a numpy
array or a pandas Series, since the subtraction then runs as a single
vectorized operation instead of a Python-level loop. E.g. something like
this without multiprocessing:

-----------------------------------------
import pandas as pd
from timeit import default_timer as timer

# Series of 100000 integers, indexed by their string representations.
s = pd.Series(
        range(100000),
        index=[str(val) for val in range(100000)])

z = 5
start = timer()
x = s - z  # vectorized: shifts every value in one operation
duration = timer() - start
print(duration, len(x), len(x) / duration)
-----------------------------------------

Then if you wanted to multiprocess it, you could basically just split
the series into num_cpu pieces and then concatenate results afterwards.

Though I do agree with others here that the operation itself is so
simple that IPC might be a drag no matter what.
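
For what it's worth, here is a rough sketch of the split-and-concatenate
idea, written for Python 3. The helper names (`split_series`, `shift`,
`parallel_shift`) are made up for illustration, and I'm assuming the shift
constant can live in a module-level `Z`. Each chunk is pickled to a worker
and the result is pickled back, so for a subtraction this cheap I would not
be surprised if it ends up slower than the single-process version; it is
meant to show the shape of the approach, not a recommended implementation.

```python
import multiprocessing as mp

import numpy as np
import pandas as pd

Z = 5  # the fixed shift (assumed to be a module-level constant here)


def shift(chunk):
    """Subtract the constant from one piece of the series."""
    return chunk - Z


def split_series(s, num_pieces):
    """Cut the series into contiguous, nearly equal slices."""
    bounds = np.linspace(0, len(s), num_pieces + 1).astype(int)
    return [s.iloc[lo:hi] for lo, hi in zip(bounds[:-1], bounds[1:])]


def parallel_shift(s, num_procs):
    """Shift each slice in a worker process, then reassemble in order."""
    chunks = split_series(s, num_procs)
    with mp.Pool(num_procs) as pool:
        # pool.map preserves the order of the chunks in its result list.
        parts = pool.map(shift, chunks)
    return pd.concat(parts)


if __name__ == "__main__":
    s = pd.Series(range(100000), index=[str(v) for v in range(100000)])
    x = parallel_shift(s, mp.cpu_count())
    print(len(x), x.equals(s - Z))
```

Timing `parallel_shift` against the plain `s - Z` on your own data would
tell you quickly whether the extra processes are worth it.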

Cheers,
Thomas


