How to queue functions

Kushal Kumaran kushal.kumaran+python at gmail.com
Tue Sep 18 04:31:55 EDT 2012


On Tue, Sep 18, 2012 at 10:26 AM, Dhananjay <dhananjay.c.joshi at gmail.com> wrote:
> Dear all,
>
> I am trying to use multiprocessing module.
> I have 5 functions and 2000 input files.
>
> First, I want to make sure that these 5 functions execute one after the
> other. Is there any way that I could queue these 5 functions within the
> same script?
>
>
> Next, there are 2000 input files.
> I can queue them with queue.put() and fetch them one by one with
> queue.get() as follows:
>
> for file in files:
>         if '.dat.gz' in file:
>             q.put(file)
>
> while True:
>         item = q.get()
>         x1 = f1(item)
>         x2 = f2(x1)
>         x3 = f3(x2)
>         x4 = f4(x3)
>         final_output = f5(x4)
>
>
> However, how can I run them on my 8-core machine so that 8 files are
> processed at a time (each going through the set of 5 functions, one
> after the other)?
>

The multiprocessing.Pool class seems to be what you need.
Documentation at
http://docs.python.org/py3k/library/multiprocessing.html#using-a-pool-of-workers

Example:

#!/usr/bin/env python3
import multiprocessing

def file_handler(filename):
    # do processing on filename, return the final_output
    print('working on {}'.format(filename))
    return 'processed-{}'.format(filename)

def main():
    p = multiprocessing.Pool(8)          # pool of 8 worker processes
    files = ['a', 'b', 'c']              # placeholder input list
    result = p.map(file_handler, files)  # blocks until every file is handled
    print(result)

if __name__ == '__main__':
    main()
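
To map this onto your five-step pipeline, wrap the five calls in a single
worker function and hand that to the pool; the pool then keeps up to 8
files in flight at a time. A rough sketch (the f1..f5 stubs and the
*.dat.gz glob are placeholders standing in for your real functions and
file list):

#!/usr/bin/env python3
import glob
import multiprocessing

# Placeholder stages -- replace these with your real f1..f5.
def f1(x): return x
def f2(x): return x
def f3(x): return x
def f4(x): return x
def f5(x): return x

def pipeline(filename):
    # Run the five steps one after the other on a single file.
    x1 = f1(filename)
    x2 = f2(x1)
    x3 = f3(x2)
    x4 = f4(x3)
    return f5(x4)

def main():
    files = glob.glob('*.dat.gz')      # your 2000 input files
    p = multiprocessing.Pool(8)        # 8 worker processes for the 8 cores
    results = p.map(pipeline, files)   # at most 8 files processed at once
    p.close()
    p.join()
    print(results)

if __name__ == '__main__':
    main()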

If you want, you can also implement everything using
multiprocessing.Process and multiprocessing.Queue, but using pools
should be simpler.
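
For completeness, here is a rough sketch of the Process/Queue approach
(the worker function name, the worker count of 8, and the use of None as
a stop sentinel are my own choices, not anything the module prescribes):

#!/usr/bin/env python3
import multiprocessing

def worker(q):
    # Pull filenames off the queue until the None sentinel arrives.
    while True:
        filename = q.get()
        if filename is None:
            break
        print('working on {}'.format(filename))

def main():
    q = multiprocessing.Queue()
    procs = [multiprocessing.Process(target=worker, args=(q,))
             for _ in range(8)]
    for p in procs:
        p.start()
    for filename in ['a', 'b', 'c']:   # stand-ins for your files
        q.put(filename)
    for _ in procs:                    # one sentinel per worker
        q.put(None)
    for p in procs:
        p.join()

if __name__ == '__main__':
    main()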

-- 
regards,
kushal


