[Web-SIG] WSGI 2.0

Fri Oct 5 23:13:29 CEST 2007

On 05/10/2007, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 05:34 PM 10/5/2007 +0100, Robin Bryce wrote:
> >Is there a means to support a non blocking read on wsgi.input ?
>
> No.  Some ideas have been proposed, but nobody has shown a practical
> scenario where it is useful.
>
> For it to be useful, you would have to have an asynchronous server
> that is interleaving in its main thread, and therefore requires
> applications to be non-blocking.

It requires asynchronous parts of the wsgi stack to co-operate with
the server in order to deal with requests which end up being processed
(or part processed) by synchronous components.

A requirement to be able to process *some* requests synchronously -
for a particular connection - should not prevent a server from
supporting both async & synchronous models of processing.

>
> However, to run "normal" WSGI applications, such a server has to
> *allow* them to block, so it is going to have to run them in a
> different thread anyway.

Yes.

>
> This is why the whole idea of creating an async *variant* of WSGI is
> moot - an async WSGI protocol is essentially 100% incompatible with
> synchronous WSGI, since any async WSGI components can't use
> synchronous WSGI components, unless they spawn another thread or process.

This does not have to be the case. All synchronous wsgi components
require the presence of wsgi.input which behaves as specified in
pep-333.

No wsgi async *aware* components exist, because pep-333 does not allow it.

async *aware* components, like async servers in general, should be
willing to accept greater complexity in the interface. With some
additional complexity, exposed to WSGI 2.0 async aware components, I
can't see any reason WSGI 2.0 can't allow for both - provided that
async aware components always live at the top of the wsgi stack.

Here is my stab at it:

Let the async server provide

environ['wsgi.async_input']

Some to be agreed non-blocking, iterative, interface to the *content*
of a single request. It is legal for an async aware component to call
environ['wsgi.async_input'].next(), at most, once for each value of
response data it yields. Note that it need not call async_input.next()
every time it is resumed.
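
A minimal sketch of an async aware application under this rule. The
interface is still to be agreed, so everything here is an assumption:
FakeAsyncInput is a hypothetical stand-in for whatever the server
would provide, next() is assumed to return b'' when no data is ready
yet, and to raise StopIteration at the end of the request content.

```python
class FakeAsyncInput:
    """Hypothetical stand-in for environ['wsgi.async_input'].

    Assumed behaviour (not specified anywhere): next() returns a chunk
    of request content, b'' when nothing is ready yet, and raises
    StopIteration once the request content is exhausted.
    """

    def __init__(self, events):
        self._events = iter(events)

    def next(self):
        # Raises StopIteration when the scripted events run out.
        return next(self._events)


def echo_app(environ, start_response):
    """Async aware app: at most one async_input.next() per yielded value."""
    async_input = environ['wsgi.async_input']
    start_response('200 OK', [('Content-Type', 'application/octet-stream')])
    while True:
        try:
            chunk = async_input.next()
        except StopIteration:
            return  # end of this request's content
        # Exactly one yield per read. An empty chunk is still yielded,
        # which hands control back to the server without blocking.
        yield chunk
```

The app never blocks waiting for input: when nothing is ready it
yields an empty value and is resumed later by the server.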

And substitute wsgi.async_input for wsgi.input in my previous message.

environ['wsgi.input_factory']

A callable, which MUST be called by an application which wishes to
switch to synchronous processing for the remainder of the current
request's content. The application must yield the return value of this
factory as the next value it produces. The next time the application
is resumed the environ will contain a pep-333 compatible wsgi.input
environ key. Applications which call this function MUST accommodate
the possibility that they will be resumed in a different thread
from that in which they called wsgi.input_factory.
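
As a rough illustration of that hand-off, here is a toy sketch. It
assumes the factory returns an opaque token that the server
recognises; SwitchToSync and run() are hypothetical names, and the
loop below stays in one thread where a real server would move the
application to a worker thread before resuming it.

```python
import io


class SwitchToSync:
    """Hypothetical opaque token returned by wsgi.input_factory."""


def app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    # Ask to go synchronous for the rest of this request's content,
    # and yield the factory's return value as the next value produced.
    yield environ['wsgi.input_factory']()
    # Resumed, possibly in a different thread: a pep-333 style
    # wsgi.input is now present and may block.
    body = environ['wsgi.input'].read()
    yield body.upper()


def run(application, content):
    """Toy server loop (assumption: single thread, in-memory content)."""
    token = SwitchToSync()
    environ = {'wsgi.input_factory': lambda: token}
    output = []
    for value in application(environ, lambda status, headers: None):
        if value is token:
            # Install the blocking input before resuming the application.
            environ['wsgi.input'] = io.BytesIO(content)
        else:
            output.append(value)
    return output
```

The token itself is never sent to the client; it only tells the
server to provide wsgi.input before the next resumption.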

Let the server define its own interface for thread / process
interaction and expose it through server specific environ keys.

Require, as a MUST, that the server implementation provides a
middleware component which uses that server specific api to support
wsgi.input_factory.

Perhaps *disallow* all but the topmost wsgi application in the stack
from interacting with the server specific threading api.

Perhaps define a wsgi.resume_with_result callable such that it can be
leveraged *only* by async aware wsgi components - it lets async aware
components delegate a callable for execution in a different thread.

With respect to wsgi.input it helps (me at any rate) to remember that
even an async server can not possibly proceed with the next request
until it knows it has read (up to or past) the end of the current
request's content boundary.

WSGI is defined at the per request level, so there is no need for the
async/sync middleware bridge to 'push back' data. The server sees
Connection: close, Content-Length etc., and so can arrange for
wsgi.async_input to respect the boundaries.
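
One way a server might enforce the boundary, sketched for the
Content-Length case only. BoundedAsyncInput and its next() contract
are assumptions, and the in-memory stream stands in for data the
server has already received from the socket.

```python
import io


class BoundedAsyncInput:
    """Hypothetical wsgi.async_input that stops at Content-Length."""

    def __init__(self, stream, content_length, chunk_size=4096):
        self._stream = stream
        self._remaining = content_length
        self._chunk_size = chunk_size

    def next(self):
        if self._remaining <= 0:
            raise StopIteration  # end of this request's content
        # Never read past the boundary of the current request.
        chunk = self._stream.read(min(self._chunk_size, self._remaining))
        if not chunk:
            raise StopIteration  # stream ended before Content-Length
        self._remaining -= len(chunk)
        return chunk


def drain(async_input):
    """Collect every chunk until the boundary is reached."""
    chunks = []
    while True:
        try:
            chunks.append(async_input.next())
        except StopIteration:
            return chunks
```

Because the reader never consumes bytes beyond the declared length,
whatever follows on the connection is left for the next request.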

I believe this would be enough to support an asynchronous
implementation of Comet.

http://en.wikipedia.org/wiki/Comet_%28programming%29 and
http://rphd.sourceforge.net/

This sketch is not completely shot from the hip. I have an async
server implementation (hey who hasn't these days) which I used mainly
as a means to explore *how* a server could possibly interact with an
async aware wsgi stack. See
http://svn.wiretooth.com/svn/open/asycamore/trunk/asycamore/

and in particular in httpconnectioncontext.py
   WSGIServiceContext.start_request
   HTTPServiceContext.continue_reading

It does not implement the above sketch but *could* easily do so.

> The whole thing is an exercise in futility, until/unless there is
> more than one such server and application, at which point they could
> get together and create AWSGI or WSGI-A or something of that sort.
>
>

That's too much chicken/egg for my tastes. All you are really saying is
that the CGI model covers the majority of 'common' use cases. I don't
know of anyone who would disagree with this. But as things stand all
wsgi-ish implementations which aim to support async/sync are consigned
to the dust bin of 'non conformant'. This acts as a strong
disincentive to experiment and innovate.

If, for clear technical reasons, nothing can be done to support mixing
async aware and synchronous applications in WSGI 2.0, then so it goes.

If it can't be done without imposing significant complexity on
applications that are perfectly happy with the highly successful wsgi
1.0 model, then fair enough - WSGI-A is a non starter.

Or are you against introducing features to support async servers and
composition of mixed async/sync stacks on principle?

If a collective decision is made that WSGI will only ever support half
async (blocking read, asynchronous response) then both the pep and the
new spec should state this very clearly indeed.

Best,
Robin