Producer-consumer threading problem

George Sakkis george.sakkis at gmail.com
Tue Jun 10 23:33:44 EDT 2008


I'd like some feedback on a solution to a variant of the producer-
consumer problem. My first few attempts turned out to deadlock
occasionally; this one seems to be deadlock-free so far but I can't
tell if it's provably correct, and if so, whether it can be
simplified.

The generic producer-consumer situation with unlimited buffer capacity
is illustrated at http://docs.python.org/lib/condition-objects.html.
That approach assumes that the producer will keep producing items
indefinitely, otherwise the consumer ends up waiting forever. The
extension to the problem I am considering requires the consumer to be
notified not only when there is a new produced item, but also when
there is not going to be a new item so that it stops waiting. More
specifically, I want a generator (or iterator class) with the
following generic signature:

def iter_consumed(items, produce, consume):
    '''Return an iterator over the consumed items.

    :param items: An iterable of objects to be `produce`()d and
`consume`()d.

    :param produce: A callable `f(item)` that produces a single item;
the return
        value is ignored. What "produce" exactly means is application-
specific.

    :param consume: A callable `f()` that consumes a previously
produced item
        and returns the consumed item. What "consume" exactly means is
        application-specific. The only assumption is that if `produce`
is called
        `N` times, then the next `N` calls to `consume` will
(eventually, though
        not necessarily immediatelly) return, i.e they will not block
indefinitely.
    '''

One straightforward approach would be to serialize the problem: first
produce all `N` items and then call consume() exactly N times.
Although this makes the solution trivial, there are at least two
shortcomings. First, the client may have to wait too long for the
first item to arrive. Second, each call to produce() typically
requires resources for the produced task, so the maximum resource
requirement can be arbitrarily high as `N` increases. Therefore
produce() and consume() should run concurrently, with the invariant
that the calls to consume are no more than the calls to produce. Also,
after `N` calls to produce and consume, neither should be left
waiting.

I pasted my current solution at http://codepad.org/FXF2SWmg. Any
feedback, especially if it has to do with proving or disproving its
correctness, will be appreciated.

George



More information about the Python-list mailing list