Thread vs. generator problem

Tim Peters tim.peters at gmail.com
Fri May 26 19:07:11 EDT 2006


[Paul Rubin]
> ...
> When I try to do it in a separate thread:
>
>     import time, itertools
>     def remote_iterate(iterator, cachesize=5):
>         # run iterator in a separate thread and yield its values
>         q = Queue.Queue(cachesize)
>         def f():
>             print 'thread started'
>             for x in iterator:
>                 q.put(x)
>         threading.Thread(target=f).start()
>         while True:
>             yield q.get()
>
>     g = remote_iterate(itertools.count)

You didn't run this code, right?  itertools.count() was intended; as
written, the itertools.count function itself is passed, not an
iterator.  In any case, as when calling any generator, nothing in the
body of remote_iterate() is executed until the generator-iterator's
next() method is invoked.  Nothing here does that.  So, in particular,
the following is the _only_ line that can execute next:

>     print 'zzz...'

And then this line:

>     time.sleep(3)

And then this:

>     print 'hi'

And then this:

>     for i in range(5):

And the first execution of this line is also the first time any code
in the body of remote_iterate() runs:

>         print g.next()
>
> I'd expect to see 'thread started' immediately, then 'zzz...', then a 3
> second pause, then 'hi', then the numbers 0..4.  Instead, the thread
> doesn't start until the 3 second pause has ended.

That's all as it must be.
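
The rule can be seen in isolation with a minimal sketch. The names gen
and events are hypothetical and not from the original post, and the
sketch uses the Python 3 spelling next(g) where the post uses
g.next().  Calling gen() only creates the generator-iterator; the body
does not start until the first next():

```python
events = []

def gen():
    # This line runs on the first next(), not when gen() is called.
    events.append('body started')
    yield 1

g = gen()                      # creates the generator-iterator only
events.append('after call')
first = next(g)                # now the body runs up to the first yield
events.append('after next')

print(events)
```

The recorded order shows 'after call' landing before 'body started',
which is the same reason the thread in the quoted code cannot start
before the first g.next().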

> When I move the yield statement out of remote_iterate's body and
> instead have it return a generator made in a new internal function,
> it does what I expect:
>
>     import time, itertools
>     def remote_iterate(iterator, cachesize=5):

Note that remote_iterate() is no longer a generator, so its body is
executed as soon as it's called.

>         # run iterator in a separate thread and yield its values
>         q = Queue.Queue(cachesize)
>         def f():
>             print 'thread started'
>             for x in iterator:
>                 q.put(x)
>         threading.Thread(target=f).start()

And so the thread starts when remote_iterate() is called.

>         def g():
>            while True:
>                yield q.get()
>         return g()
>
> Any idea what's up?  Is there some race condition, where the yield
> statement freezes the generator before the new thread has started?

No.

> Or am I just overlooking something obvious?

No, but it's not notably subtle either ;-)
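
For reference, here is a sketch of the second version in Python 3
spelling (queue instead of Queue, next(g) instead of g.next()).  It is
an illustration under stated assumptions, not the code from the
thread: the started event, the producer/consume names, and the daemon
flag are additions, there so the start of the thread can be observed
and so the blocked producer does not keep the process alive at exit.

```python
import itertools
import queue
import threading

def remote_iterate(iterator, cachesize=5):
    # No yield in this body, so it is an ordinary function and
    # everything here runs as soon as remote_iterate() is called.
    q = queue.Queue(cachesize)
    started = threading.Event()    # hypothetical addition, for observation

    def producer():
        started.set()
        for x in iterator:
            q.put(x)               # blocks once cachesize items are waiting

    # daemon=True is an assumption made so the sketch can exit cleanly.
    threading.Thread(target=producer, daemon=True).start()

    def consume():
        # Only this inner function is a generator; its body is the part
        # that waits for the first next().
        while True:
            yield q.get()

    return consume(), started

g, started = remote_iterate(itertools.count())
started.wait(timeout=5)            # thread is running before any next(g)
values = [next(g) for _ in range(5)]
print(values)
```

The design point is only where the yield lives: setup that must happen
at call time stays in the plain outer function, and the lazy part is
confined to the inner generator.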


