[Web-SIG] Iterators, generators and threads.

Alan Kennedy py-web-sig at xhaus.com
Fri Sep 3 14:07:12 CEST 2004


Dear Sig,

With the focus on iterables in WSGI, I think we may need to put 
something into the WSGI spec about generators and threading.

As I'm sure you're all aware, generators are an excellent mechanism for 
generating content on demand: a perfect fit for memory-efficient WSGI 
"pull" processing and for event-driven servers.

However, generator-iterators are different from other iterables, in that 
they cannot be resumed or iterated simultaneously from multiple threads 
(at least not without external locking).

PEP 255 is specific on the topic: "Restriction:  A generator cannot be 
resumed while it is actively running". This effectively means that a 
generator cannot be used from multiple threads without some form of 
external synchronization/locking.
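
To make the restriction concrete, here is a minimal single-threaded 
sketch, written in current Python 3 syntax (next(gen) rather than 
gen.next()). The names self_resuming and gen are purely illustrative. 
The generator tries to resume itself while its own frame is still 
running, which is the same condition a second thread would hit:

```python
def self_resuming():
    # The body asks for the next value of the very generator that is
    # currently executing, i.e. it resumes a running generator.
    yield next(gen)


gen = self_resuming()
message = None
try:
    next(gen)
except ValueError as exc:
    # The interpreter refuses to re-enter the running frame.
    message = str(exc)

print(message)
```

The inner next(gen) is rejected with a ValueError instead of being 
allowed to re-enter the frame.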

Offhand, I can't think of scenarios where a WSGI server or application 
would *need* to iterate over an iterable across multiple threads. But I 
can certainly think of multiple server architectures where the request 
and its related response will pass through multiple threads before 
completion. Whether it would make sense for such architectures to 
iterate an iterable from multiple threads, I don't know. Is it 
possible some server designer might attempt something like this?

That would probably work as long as the iterable is not a generator. 
But if it is: *boom*, the generator could be resumed simultaneously from 
multiple threads, resulting in a ValueError.
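
As a rough sketch of what "external locking" could look like, a server 
that really wanted to pull from one iterable in several threads might 
wrap it as below. LockedIterator, chunks and worker are hypothetical 
names made up for this example; nothing like them appears in the WSGI 
draft. The lock guarantees that only one thread at a time resumes the 
underlying generator:

```python
import threading


class LockedIterator:
    """Hypothetical wrapper: serialises next() calls with a lock."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._lock = threading.Lock()

    def __iter__(self):
        return self

    def __next__(self):
        # Only one thread may resume the wrapped iterator at a time.
        with self._lock:
            return next(self._it)


def chunks():
    # Stand-in for an application generating response data on demand.
    for i in range(1000):
        yield i


shared = LockedIterator(chunks())
results = []
errors = []


def worker():
    try:
        for item in shared:
            results.append(item)
    except ValueError as exc:
        # Would only happen if two threads resumed the generator at once.
        errors.append(exc)


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Each item is handed to exactly one thread, but the order in which the 
threads receive items is not defined, so this only helps if ordering is 
handled elsewhere.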

Perhaps we need to describe this problem in the PEP? Or are Python 
programmers supposed to be big and old enough to know these things?

I find myself wondering: is this a CPython-specific thing? Does resuming 
a generator from multiple threads have any meaning?

Obviously, calling a standard function/method from different threads 
works because each thread gets an independent stack frame, i.e. local 
variables, etc. So if there is no (unsynchronized) shared state between 
the threads, everything will work fine.

Since a generator is a single resumable stack frame, resuming it 
multiple times simultaneously from multiple threads won't work, from an 
isolation point-of-view.
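
The isolation point can be illustrated with a small sketch (counter, a 
and b are illustrative names). Each call to a generator function 
creates a separate generator-iterator with its own frame, much as each 
plain function call gets its own frame, so giving every thread its own 
generator avoids the problem. It is sharing one generator object that 
shares one frame:

```python
def counter():
    n = 0
    while True:
        n += 1
        yield n


a = counter()
b = counter()

# Advancing one generator does not disturb the other's local state.
first = [next(a), next(a), next(a)]
second = [next(b)]

# Each generator-iterator owns exactly one suspended frame.
separate_frames = a.gi_frame is not b.gi_frame
print(first, second, separate_frames)
```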

Or am I misunderstanding it? Is the restriction somehow related to 
CPython's GIL?

Obviously, resuming general iterators from multiple threads is related. 
PEP 234 makes no statements about threads (apart from one unrelated 
reference to modifying dictionaries while they are being iterated). So I 
take this to mean that iterating iterables from multiple threads is 
acceptable.

Regards,

Alan.

P.S. I hope Phillip is OK. He said yesterday that he was right in 
Frances' path, although obviously that path will have a significant 
margin for error. But Frances is *huge*: see this stunning picture from 
NASA.

http://antwrp.gsfc.nasa.gov/apod/ap040903.html