Using multiprocessing
Jesse Noller
jnoller at gmail.com
Fri Oct 10 22:01:24 EDT 2008
On Fri, Oct 10, 2008 at 4:32 PM, nhwarriors <edward.reed at gmail.com> wrote:
> I am attempting to use the (new in 2.6) multiprocessing package to
> process 2 items in a large queue of items simultaneously. I'd like to
> be able to print to the screen the results of each item before
> starting the next one. I'm having trouble with this so far.
>
> Here is some (useless) example code that shows how far I've gotten by
> reading the documentation:
>
> from multiprocessing import Process, Queue, current_process
>
> def main():
>     facs = []
>     for i in range(50000,50005):
>         facs.append(i)
>
>     tasks = [(fac, (i,)) for i in facs]
>     task_queue = Queue()
>     done_queue = Queue()
>
>     for task in tasks:
>         task_queue.put(task)
>
>     for i in range(2):
>         Process(target = worker, args = (task_queue, done_queue)).start()
>
>     for i in range(len(tasks)):
>         print done_queue.get()
>
>     for i in range(2):
>         task_queue.put('STOP')
>
> def worker(input, output):
>     for func, args in iter(input.get, 'STOP'):
>         result = func(*args)
>         output.put(result)
>
> def fac(n):
>     f = n
>     for i in range(n-1,1,-1):
>         f *= i
>     return 'fac('+str(n)+') done on '+current_process().name
>
> if __name__ == '__main__':
>     main()
>
> This works great, except that nothing can be output until everything
> in the queue is finished. I'd like to write out the result of fac(n)
> for each item in the queue as it happens.
>
> I'm probably approaching the problem all wrong - can anyone set me on
> the right track?
I'm not quite following: if you run this, the results are printed by
the main thread, unordered, as they are put on the results queue.
This works as intended (and the example this is based on works the
same way).
For example:
result put Process-2
result put Process-1
fac(50000) done on Process-1
result put Process-2
fac(50001) done on Process-2
result put Process-1
fac(50003) done on Process-1
result put Process-2
fac(50002) done on Process-2
fac(50004) done on Process-2
You can see this if you expand the range:
result put Process-1
result put Process-2
result put Process-2
fac(50001) done on Process-2
result put Process-1
fac(50000) done on Process-1
fac(50003) done on Process-2
result put Process-2
result put Process-1
fac(50004) done on Process-2
result put Process-2
fac(50006) done on Process-2
result put Process-1
fac(50002) done on Process-1
fac(50005) done on Process-1
fac(50007) done on Process-1
result put Process-2
result put Process-1
fac(50008) done on Process-2
result put Process-2
result put Process-1
fac(50010) done on Process-2
result put Process-2
result put Process-1
fac(50009) done on Process-1
fac(50011) done on Process-1
fac(50013) done on Process-1
result put Process-2
fac(50012) done on Process-2
fac(50014) done on Process-2
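For reference, here is a minimal Python 3 sketch of the same pattern. It assumes the "result put" lines above come from a print added to the worker right after the put; that print is not in the quoted code. The name run_and_print, the ctx parameter, and flush=True are additions for illustration, and the (func, args) task tuples are simplified to plain numbers.

```python
import multiprocessing


def fac(n):
    """Compute n! (discarded, as in the quoted code) and report who did it."""
    f = 1
    for i in range(2, n + 1):
        f *= i
    return "fac(%d) done on %s" % (n, multiprocessing.current_process().name)


def worker(task_queue, done_queue):
    """Process numbers from task_queue until the 'STOP' sentinel arrives."""
    for n in iter(task_queue.get, "STOP"):
        done_queue.put(fac(n))
        # Assumed extra print; it is not part of the quoted code.
        print("result put", multiprocessing.current_process().name, flush=True)


def run_and_print(numbers, n_workers=2, ctx=None):
    """Hypothetical driver: print each result as soon as it is available."""
    numbers = list(numbers)
    ctx = ctx or multiprocessing.get_context()
    task_queue, done_queue = ctx.Queue(), ctx.Queue()
    procs = [ctx.Process(target=worker, args=(task_queue, done_queue))
             for _ in range(n_workers)]
    for p in procs:
        p.start()
    for n in numbers:
        task_queue.put(n)
    results = []
    for _ in numbers:
        result = done_queue.get()  # blocks only until the next result is ready
        print(result, flush=True)
        results.append(result)
    for _ in procs:
        task_queue.put("STOP")     # one sentinel per worker
    for p in procs:
        p.join()
    return results
```

Call it from under an if __name__ == '__main__': guard, for example run_and_print(range(50000, 50005)). The returned list is in completion order, not submission order, which matches the interleaved output shown above. Process names may carry a prefix that depends on the start method.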
One trick I use when I have a results queue to manage is to spawn an
additional process to read off of the results queue and deal with the
results. This is mainly so I can process the results outside of the
main thread, as they appear on the results queue.
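A rough Python 3 sketch of that trick, under some assumptions: the names reader, run_with_reader, and summary_queue are hypothetical, tasks are plain numbers rather than (func, args) tuples, and the summary queue exists only so the caller can check what the reader handled. It is one way to arrange this, not necessarily how it is done in practice.

```python
import multiprocessing


def fac(n):
    """Compute n! (discarded, as in the quoted code) and report who did it."""
    f = 1
    for i in range(2, n + 1):
        f *= i
    return "fac(%d) done on %s" % (n, multiprocessing.current_process().name)


def worker(task_queue, done_queue):
    """Process numbers from task_queue until the 'STOP' sentinel arrives."""
    for n in iter(task_queue.get, "STOP"):
        done_queue.put(fac(n))


def reader(done_queue, summary_queue):
    """Runs in its own process and handles results as they appear."""
    seen = []
    for result in iter(done_queue.get, "STOP"):
        print(result, flush=True)  # "deal with" the result; here, just print it
        seen.append(result)
    summary_queue.put(seen)        # hand everything back for verification


def run_with_reader(numbers, n_workers=2, ctx=None):
    """Hypothetical driver: workers compute, a separate process reads results."""
    numbers = list(numbers)
    ctx = ctx or multiprocessing.get_context()
    task_queue, done_queue, summary_queue = ctx.Queue(), ctx.Queue(), ctx.Queue()
    workers = [ctx.Process(target=worker, args=(task_queue, done_queue))
               for _ in range(n_workers)]
    consumer = ctx.Process(target=reader, args=(done_queue, summary_queue))
    for p in workers + [consumer]:
        p.start()
    for n in numbers:
        task_queue.put(n)
    for _ in workers:
        task_queue.put("STOP")     # one sentinel per worker
    for p in workers:
        p.join()                   # every result is on done_queue after this
    done_queue.put("STOP")         # now tell the reader there is nothing more
    seen = summary_queue.get()     # drain before joining the reader
    consumer.join()
    return seen
```

The main process never touches individual results here; it only feeds tasks and waits. The sentinel for the reader is sent after the workers have been joined, so it cannot overtake a pending result.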
-jesse