Thread vs. generator problem

Paul Rubin
Fri May 26 18:37:18 EDT 2006


As I understand it, generators are supposed to run until they hit a
yield statement:

   import time
   def f():
     print 1
     time.sleep(3)
     for i in range(2,5):
        yield i

   for k in f():
      print k

prints "1" immediately, sleeps for 3 seconds, then prints 2, 3, and 4
without pausing, as expected.  When I try to do it in a separate thread:

    import time, itertools, threading, Queue
    def remote_iterate(iterator, cachesize=5):
        # run iterator in a separate thread and yield its values
        q = Queue.Queue(cachesize)
        def f():
            print 'thread started'
            for x in iterator:
                q.put(x)
        threading.Thread(target=f).start()
        while True:
            yield q.get()

    g = remote_iterate(itertools.count())
    print 'zzz...'
    time.sleep(3)
    print 'hi'
    for i in range(5):
        print g.next()

I'd expect to see 'thread started' immediately, then 'zzz...', then a 3
second pause, then 'hi', then the numbers 0..4.  Instead, the thread
doesn't start until the 3 second pause has ended.
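
A smaller experiment may help isolate the timing question. The sketch
below is written in Python 3 syntax (the code in this post is Python 2),
and the names lazy_start, worker, and events are made up for
illustration. It relies on the documented rule that calling a function
whose body contains yield only creates a generator object, and that the
body runs up to the first yield when next() is first called:

```python
import threading

events = []  # records the order in which things happen

def lazy_start():
    # Hypothetical stand-in for remote_iterate: the thread is started
    # inside a function whose body also contains a yield.
    events.append("body entered")
    started = threading.Event()

    def worker():
        events.append("thread started")
        started.set()

    threading.Thread(target=worker).start()
    started.wait()  # make the ordering deterministic for this demo
    yield 1

g = lazy_start()                      # creates the generator, runs nothing
events.append("generator object created")
first = next(g)                       # the body runs now, up to the yield
events.append("first value received")
print(events)
```

If this matches what happens in remote_iterate, the thread would start
on the first g.next() call rather than at the remote_iterate(...) call,
which is consistent with 'thread started' appearing only after the sleep.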

When I move the yield statement out of remote_iterate's body and
instead have it return a generator made in a new internal function, it
does what I expect:

    import time, itertools, threading, Queue
    def remote_iterate(iterator, cachesize=5):
        # run iterator in a separate thread and yield its values
        q = Queue.Queue(cachesize)
        def f():
            print 'thread started'
            for x in iterator:
                q.put(x)
        threading.Thread(target=f).start()
        def g():
           while True:
               yield q.get()
        return g()
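
For anyone trying this on Python 3, here is a rough port of the version
above, offered as a sketch rather than a definitive implementation. The
started event, the extra return value, and daemon=True are additions
made only so the demo can observe when the thread starts and can exit
cleanly; they are not part of the code in this post:

```python
import itertools
import queue
import threading

def remote_iterate(iterator, cachesize=5):
    # run iterator in a separate thread and yield its values
    q = queue.Queue(cachesize)
    started = threading.Event()  # demo-only: signals that the thread ran

    def f():
        started.set()
        for x in iterator:
            q.put(x)  # blocks once cachesize items are waiting

    # daemon=True (an assumption for this demo) keeps the blocked
    # producer thread from holding the process open at exit.
    threading.Thread(target=f, daemon=True).start()

    def g():
        while True:
            yield q.get()

    # remote_iterate itself contains no yield, so the lines above run
    # at call time; only g() is a generator.
    return g(), started

g, started = remote_iterate(itertools.count())
thread_ran_before_next = started.wait(timeout=5)  # no next() called yet
values = [next(g) for _ in range(5)]
print(thread_ran_before_next, values)
```

Because remote_iterate is now an ordinary function, the thread is
started during the call itself, before any value is requested.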

Any idea what's up?  Is there some race condition, where the yield
statement freezes the generator before the new thread has started?  Or
am I just overlooking something obvious?

Thanks.


