[Web-SIG] Proposal for asynchronous WSGI variant

Christopher Stawarz cstawarz at csail.mit.edu
Wed May 7 21:00:10 CEST 2008


On May 6, 2008, at 8:51 PM, Ionel Maries Cristian wrote:

> - there is no support for chunked input - that would require having  
> support for readline in the first place,

Why is readline a requirement for chunked input?  Each chunk specifies  
its size, and the application receiving a chunk just keeps calling  
recv() until it's read the specified number of bytes.
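
To illustrate the loop described above, here is a minimal sketch.  The
helper name recv_exactly is hypothetical and not part of the proposal,
and the socket pair only stands in for a client connection.  It assumes
the chunk size has already been parsed, and it omits the step where an
asynchronous application would wait for the socket to become readable
before each recv() call.

```python
import socket


def recv_exactly(sock, size):
    """Call recv() repeatedly until `size` bytes have been collected."""
    parts = []
    remaining = size
    while remaining > 0:
        data = sock.recv(remaining)  # may return fewer bytes than requested
        if not data:  # empty result means the peer closed the connection
            raise EOFError("connection closed before chunk was complete")
        parts.append(data)
        remaining -= len(data)
    return b"".join(parts)


# Stand-in for a client connection: the sender writes in two pieces.
sender, receiver = socket.socketpair()
sender.sendall(b"hel")
sender.sendall(b"lo world")

chunk = recv_exactly(receiver, 5)  # pretend the chunk header announced 5 bytes

sender.close()
receiver.close()
```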

> also, it should be the gateway's business decoding the chunked input.

OK, but if it's the gateway's responsibility, then this isn't an issue  
at all, as decoding of chunked data takes place before the application  
ever sees the request body.

To be clear, I didn't mean to imply that awsgi.input must be the  
actual socket object connected to the client.  It just has to provide  
a recv() method with the semantics of a socket.  The server is free to  
pre-read the entire request, or it can receive data on demand,  
decoding any chunked input before it passes it to the application.
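
As a rough sketch of the pre-read case, a server could hand the
application an object like the one below.  The class name
BufferedInput is made up for illustration; the only thing the proposal
asks for is the recv() method, and the body passed in is assumed to
have been de-chunked by the server already.

```python
class BufferedInput:
    """Hypothetical awsgi.input stand-in backed by a pre-read request body."""

    def __init__(self, body):
        self._body = body
        self._pos = 0

    def recv(self, bufsize):
        # Return at most `bufsize` bytes; an empty result signals end of input,
        # mirroring what socket.recv() does when the peer has closed.
        data = self._body[self._pos:self._pos + bufsize]
        self._pos += len(data)
        return data


body_input = BufferedInput(b"decoded request body")
first = body_input.recv(7)
rest = body_input.recv(1024)
end = body_input.recv(1024)
```

Because the data is already in memory, recv() here never blocks, so the
application code is the same whether the server pre-reads or receives on
demand.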

> - i don't see how removing the write callable will help (i don't see  
> a issue having the server providing a stringio.write as the write  
> callable for synchronous apps)

Manlio explained this well, so I'll refer you to his response.

> - passing nonstring values though middleware will make using/porting  
> existing wsgi middleware hairy (suppose you have a middleware that  
> applies some filter to the appiter - you'll have your code full of  
> isinstance nastiness)

Yes, my proposal would require existing middleware to be modified to  
support AWSGI, which is unfortunate.

> Also, have you looked at the existing gateway implementations with  
> asynchronous support?
> There are a bunch of them:
> http://trac.wiretooth.com/public/wiki/asycwsgi
> http://chiral.j4cbo.com/trac
> http://wiki.secondlife.com/wiki/Eventlet
> my own shot at the problem: http://code.google.com/p/cogen/
> and manlio's mod_wsgi for nginx
> (I may be missing some)

I've seen some of these, but I'll be sure to take a look at the others.

> [*1]In my implementation i do a bunch of tricks to make use of  
> regular wsgi middleware with async apps possible - i have a bunch of  
> working examples using pylons:
>  - the extensions in the environ (like your  
> environ['awsgi.readable']) return a empty string that penetrates  
> most[*2] middleware and set the actual message (like your (token,  
> fd, timeout) tuple on some internal object)
> From this point of view, an async middleware stack is just a set of  
> middleware that supports streaming.

This is an interesting idea that I'd like to explore some more.  I  
really like the fact that it works with existing middleware (or at  
least fully WSGI-compliant middleware, as you point out).

Apart from the write() callable, the biggest issue I see with the WSGI  
spec for asynchronous servers is wsgi.input.  The problem is that this  
is explicitly a file-like object.  This means that input.read(n) reads  
until it finds n bytes or EOF, input.readline() reads until it finds a  
newline or EOF, and input.readlines() and input.__iter__() always read  
to EOF.  Every one of these functions implies multiple I/O operations  
(calls to fread() for a file or recv() for a socket).

This means that if an application calls input.read(8), and only 4  
bytes are available, the first call to recv() returns 4 bytes, and the  
second one blocks.  And now your entire server is blocked until data  
is available on this one socket.  (Of course, the server is free to
pre-read the entire request at its leisure and feed it to the  
application from a buffer, but this may not always be practical or  
desirable, and I don't think asynchronous servers should be forced to  
do so.)

This is why I propose replacing wsgi.input with awsgi.input, which  
exposes a recv() method with socket-like (rather than file-like)  
semantics.  The meaning of input.recv(n) is therefore "read at most n
bytes (possibly fewer), calling the underlying socket recv() at most
once".
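
The difference can be seen with a plain socket.  This is only a small
demonstration using socket.socketpair() as a stand-in for a client
connection, not an implementation of awsgi.input: 8 bytes are requested
while only 4 are available.

```python
import socket

client, server = socket.socketpair()
client.sendall(b"abcd")  # only 4 bytes are available to the reader

# Socket semantics: a single recv() returns whatever has arrived, up to 8 bytes.
data = server.recv(8)

# File-like semantics would differ: server.makefile("rb").read(8) would keep
# calling recv() and wait here until 4 more bytes or EOF arrived.

client.close()
server.close()
```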

So, although your suggestion may eliminate the need to yield
non-string output from the application iterable, I still think there needs
to be a separate specification for asynchronous gateways, since the  
semantics of wsgi.input just aren't compatible with an asynchronous  
model.


Chris

