multiprocessing, pool, queue length

Ian Kelly ian.g.kelly at gmail.com
Mon Mar 21 15:12:14 EDT 2016


On Mon, Mar 21, 2016 at 4:25 AM, Michael Welle <mwe012008 at gmx.net> wrote:
> Hello,
>
> I use a multiprocessing pool. My producer calls pool.map_async()
> to fill the pool's job queue. It can do that quite fast, while the
> consumer processes need much more time to empty the job queue. Since the
> producer can create a lot of jobs, I thought about asking the pool for
> the amount of jobs it has in its queue and then only produce more jobs
> if the current value is below a threshold. It seems like the pool
> doesn't want to tell me the level of the queue, does it? What is a
> better strategy to solve this problem? Implementing a pool around
> multiprocessing's Process and Queue?

A simple solution would be to have a shared multiprocessing.Value that
tracks how many items are currently queued or in progress. The producer
increments the Value each time it submits an item, and each worker
decrements it when it finishes a job; the producer then only submits
more work while the count is below your threshold.
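
Here is a minimal sketch of that idea, assuming Python 3. The names
(init_worker, handle_job, produce, the threshold of 100, and the
time.sleep stand-ins for real work) are all made up for illustration;
the essential part is sharing the Value with the workers via the pool's
initializer and decrementing it inside each job.

    import multiprocessing
    import time

    def init_worker(shared_counter):
        # Make the shared counter visible inside each worker process.
        global counter
        counter = shared_counter

    def handle_job(item):
        try:
            time.sleep(0.1)           # stand-in for the real work
            return item * item
        finally:
            with counter.get_lock():  # decrement when the job finishes
                counter.value -= 1

    def produce(pool, shared_counter, items, threshold=100):
        for item in items:
            # Back off while too many jobs are still queued or running.
            while shared_counter.value >= threshold:
                time.sleep(0.05)
            with shared_counter.get_lock():
                shared_counter.value += 1
            pool.apply_async(handle_job, (item,))

    if __name__ == '__main__':
        shared_counter = multiprocessing.Value('i', 0)
        with multiprocessing.Pool(4, initializer=init_worker,
                                  initargs=(shared_counter,)) as pool:
            produce(pool, shared_counter, range(1000))
            pool.close()
            pool.join()

Note that the Value has to be passed through initargs (or inherited on
fork); it can't be sent as a normal task argument.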

An alternative solution that doesn't require adding a small amount of
work to every job would be to have the producer submit, at or near the
end of each batch, a sentinel task that does nothing, and then either
wait on its result or check it periodically. Once the sentinel has run,
the queue has drained far enough to add more jobs. A sketch follows.
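
This is only a rough illustration, assuming Python 3; the helper names
(handle_job, sentinel, the batch sizes, and the time.sleep stand-in) are
hypothetical. The point is that the sentinel completing tells you the
input queue has roughly drained to that point, not that every earlier
job has finished.

    import multiprocessing
    import time

    def handle_job(item):
        time.sleep(0.1)   # stand-in for the real work
        return item * item

    def sentinel(_):
        # Does nothing; its completion just signals that the input
        # queue has drained down to (roughly) this point.
        return None

    if __name__ == '__main__':
        batches = [range(i, i + 100) for i in range(0, 1000, 100)]
        with multiprocessing.Pool(4) as pool:
            for batch in batches:
                pool.map_async(handle_job, batch)
                marker = pool.apply_async(sentinel, (None,))
                # Block until the sentinel runs; alternatively poll
                # marker.ready() and do other work in the meantime.
                marker.wait()
            pool.close()
            pool.join()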
