[Web-SIG] WSGI 2.0

Sat Oct 6 16:33:01 CEST 2007

On 05/10/2007, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 10:13 PM 10/5/2007 +0100, Robin Bryce wrote:
> >That's too much chicken/egg for my tastes. All you are really saying is
> >that the CGI model covers the majority of 'common' use cases. I don't
> >know of anyone who would disagree with this. But as things stand all
> >wsgi-ish implementations which aim to support async/sync are consigned
> >to the dust bin of 'non conformant'. This acts as a strong
> >disincentive to experiment and innovate.
> >
> >If, for clear technical reasons, nothing can be done to support mixing
> >async aware and synchronous applications in WSGI 2.0, then so it goes.
> >
> >If it can't be done without imposing significant complexity on
> >applications that are perfectly happy with the highly successful wsgi
> >1.0 model, then fair enough - WSGI-A is a non starter.
> >
> >Or are you against introducing features to support async servers and
> >composition of mixed async/sync stacks on principle?
>
> Not in *principle*, only in practice.  :)  If you read the archives
> of a few years back, I was rather enthusiastic until I realized that
> there really wasn't any way to make it of practical benefit.

I have tried to follow the history of "we want more asynch support in
wsgi" but I don't think I've kept up with you on this.

> See, in order for a server to take advantage of an application's
> "asynchronous" nature, the server has to *know* the application won't
> "block".  That is, the app has to *promise* not to block.  (Because
> without this promise, the server is forced to run the app in a
> separate thread or process, so as not to block the server.)
>
> But in order for the app to make this promise, it can only use
> components that either make the same promise, unless it runs *them*
> in other threads or processes...  which means giving up on easily
> composing applications from multiple WSGI components.
>

Which is why I drew a distinction between async *aware* components and
others and advocated a composition model in which the composer of the
wsgi stack must guarantee that async aware components live at the top.
I.e., a synchronous component cannot sensibly be provided with a means
to drive an async aware component.

This places the burden of the composition problem firmly on the server
and those components written specifically to be async aware, and yet
allows those components to take advantage of synchronous components
from time to time.

> So far, discussion on this matter has hinged on the claim that it's
> *possible* to make such mixed stacks, and I don't disagree.  What
> nobody has shown is that it's 1. practical, and 2. produces some
> actual benefit, compared to the synchronous model now in use.  As a
> practical matter, the vast majority of Python web applications and
> frameworks are synchronous by nature, and those that aren't are
> already tied to a specific async API.
>
> If we were going to try to implement an asynchronous WSGI, what we
> would *really* need to do is discard the app_iter and make write()
> the standard way of sending the body.  This would let us implement a
> CPS (continuation-passing style) API.  We would also have to change
> the input stream so that instead of reading from it, we instead
> passed it functions to be called when input was available, and so
> on.  We would also need a way to tell write() that we were finished
> writing, and some way to manage connection timeouts.
>
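
For contrast with the app_iter model, here is a rough sketch of the kind of write()/callback-driven (CPS) interface described above. Every name in it (on_input, finish, the None end-of-input marker) is hypothetical and invented purely for illustration; nothing here comes from a WSGI specification or an actual proposal.

```python
def cps_upper_app(environ, start_response, on_input, finish):
    """Callback-driven app: nothing is returned; output goes through write()."""
    write = start_response("200 OK", [("Content-Type", "text/plain")])

    def handle_chunk(chunk):
        if chunk is None:         # hypothetical end-of-input marker
            finish()              # tell the server we have finished writing
        else:
            write(chunk.upper())

    on_input(handle_chunk)        # register a callback instead of reading


def run_cps(app, arrivals):
    """Toy server: `arrivals` simulates body chunks arriving over time."""
    output, callbacks, done = [], [], []

    def start_response(status, headers):
        return output.append      # plays the role of the write() callable

    app({}, start_response, callbacks.append, lambda: done.append(True))
    for chunk in list(arrivals) + [None]:
        for callback in callbacks:
            callback(chunk)
    return b"".join(output), bool(done)
```

Even in this tiny case the body-handling logic has to live in a nested function, which is the verbosity being objected to.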

I don't understand why you think this is necessary. I especially don't
like the thought that there is an argument that useful and performant
wsgi-a support is impossible without requiring use of CPS. I *like*
the app_iter model and believe it is perfectly workable for an async
component - provided that:

1. There is a non-blocking variant of wsgi.input, say wsgi.async_input.
2. There is a means for an async aware component to signal the server
that it should process the remainder of the current request in a
synchronous manner.
3. The server and async aware components are allowed to use an
extended set of yield values which provide the co-operative
communication necessary for performant async components.
   3a. A yield that means "don't resume me until there is more data
available on wsgi.async_input"
   3b. A yield that means "I ran out of data reading from
wsgi.async_input but please continue resuming me anyway as I have
useful work to do"

And a yield of the empty string means the same as it does for wsgi 1.0.

3a & 3b allow the component to pass "up" the information that the
server needs to determine that the underlying socket has encountered
EAGAIN on recv. The async aware component *knows* what its last yield
was and so can reliably interpret a resume after 3a as meaning "more
data available". After 3b it does no harm to the performance of the
server if the component speculatively attempts to read from
wsgi.async_input.
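
To make provisos 1 and 3 concrete, here is a minimal sketch of how the extended yield values might look. The sentinel names WAIT_FOR_INPUT and KEEP_RUNNING, the AsyncInput class and the drive() loop are all assumptions made up for this illustration; a real server would park the request on its event loop rather than feed chunks from a list.

```python
# Hypothetical sentinels; WSGI 1.0 defines nothing like these.
WAIT_FOR_INPUT = object()   # 3a: do not resume until async_input has data
KEEP_RUNNING = object()     # 3b: no input right now, but resume me anyway


class AsyncInput:
    """Toy stand-in for the proposed wsgi.async_input (never blocks)."""

    def __init__(self):
        self._chunks = []
        self.closed = False

    def feed(self, data):
        self._chunks.append(data)

    def read(self):
        # Returns None where a real socket would report EAGAIN.
        return self._chunks.pop(0) if self._chunks else None


def echo_app(environ, start_response):
    """Async-aware app: echoes each request-body chunk upper-cased."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    stream = environ["wsgi.async_input"]
    while True:
        chunk = stream.read()
        if chunk is None:
            if stream.closed:
                return
            yield WAIT_FOR_INPUT   # resumed only once more data is available
            continue
        yield chunk.upper()


def drive(app, arrivals):
    """Toy server loop: `arrivals` simulates chunks arriving on the socket."""
    stream = AsyncInput()
    environ = {"wsgi.async_input": stream}
    output = []
    pending = list(arrivals)
    for item in app(environ, lambda status, headers: None):
        if item is WAIT_FOR_INPUT:
            # A real server would wait here until the socket is readable.
            if pending:
                stream.feed(pending.pop(0))
            else:
                stream.closed = True
        elif item is KEEP_RUNNING:
            continue               # resume at once; the app has other work
        else:
            output.append(item)    # ordinary body data, as in wsgi 1.0
    return b"".join(output)
```

With these toy definitions, drive(echo_app, [b"ab", b"cd"]) collects the upper-cased body, and the app remains a plain generator with no nested callbacks.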

Absence of wsgi.input in the environ until the 'switch' takes place
will cause any accidentally included synchronous application to break
if it attempts to perform a blocking read on the input. An async
server should have no problem with synchronous applications that
*don't* use wsgi.input, yes?
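
A small sketch of that failure mode, assuming the async server simply leaves the 'wsgi.input' key out of the environ until a switch happens. The function names and the toy run_on_async_server() helper are hypothetical.

```python
def reads_body(environ, start_response):
    """Ordinary wsgi 1.0 app that performs a blocking read of the body."""
    body = environ["wsgi.input"].read()
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [body]


def ignores_body(environ, start_response):
    """Ordinary wsgi 1.0 app that never touches wsgi.input."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


def run_on_async_server(app):
    """Toy async server call: no 'wsgi.input' key before any switch."""
    environ = {"REQUEST_METHOD": "GET", "wsgi.async_input": object()}
    try:
        return b"".join(app(environ, lambda status, headers: None))
    except KeyError as exc:
        # The blocking read never starts; the app fails immediately instead.
        return "failed fast: missing %s" % exc
```

Here ignores_body runs unchanged, while reads_body breaks on the missing key before it can block the server.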

> Unfortunately, this programming style is verbose and more difficult
> to learn for people versed in less "twisted" ways of programming.  To
> write middleware in this style, you also need to write deeply nested
> functions.  And synchronous servers would need to figure out what to
> do when an application returns without having called start_response()
> yet or figured out how to close the stream.

Agreed. I have always assumed that async aware components would be
incompatible with synchronous servers.

> Anyway, my point here is that I see how we could either cater to
> synchronous apps or async apps in a given API.  But throwing a
> half-baked async API on top of a synchronous one is just making a
> mess and helping no-one.

...

> My gut feel is that it's harder to write middleware for WSGI-A style
> of API, because you have to do at least doubly nested functions if
> you're dealing with the output at all (as this example shows).
>
> And if we mix modes, then we have this sort of messy back-and-forth
> adaptation in between.  And as best I can tell, the proposal for a
> mixed-mode API that you gave would actually make it even *harder*
> than this to write WSGI middleware, as there would be similar
> boundary issues for the input stream.

No, I'm definitely not advocating mixed modes. I'm saying that I want a
means to allow an async aware component to switch the current request
to synchronous processing for the remainder of the request. And I
explicitly _don't_ think it's sensible to attempt to support synchronous
-> asynchronous. The only reason for supporting the switch at all is
to enable async aware components to leverage synchronous components
"from time to time".

Async aware components would be harder to write than synchronous but
synchronous components would remain as they are. And, by avoiding CPS,
asynchronous servers could freely leverage wsgi 1.0 style components
which don't consume wsgi.input.
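
As a sketch of the one-way switch, assume a hypothetical SWITCH_TO_SYNC yield value: once the server sees it, it supplies a blocking wsgi.input and treats the rest of the request synchronously. The names below are invented for illustration, and io.BytesIO stands in for the real request body.

```python
import io

SWITCH_TO_SYNC = object()   # hypothetical "go synchronous from here" yield


def legacy_app(environ, start_response):
    """Unmodified wsgi 1.0 app that does a blocking read."""
    body = environ["wsgi.input"].read()
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"got " + body]


def front(environ, start_response):
    """Async-aware component that hands the rest of the request over."""
    yield SWITCH_TO_SYNC
    # From here on the server has promised blocking semantics.
    for chunk in legacy_app(environ, start_response):
        yield chunk


def serve(app, body):
    """Toy server: honours the switch by installing a blocking wsgi.input."""
    environ = {"wsgi.async_input": object()}
    output = []
    for item in app(environ, lambda status, headers: None):
        if item is SWITCH_TO_SYNC:
            # A real server would move the request to a worker thread here.
            del environ["wsgi.async_input"]
            environ["wsgi.input"] = io.BytesIO(body)
        else:
            output.append(item)
    return b"".join(output)
```

The switch only goes one way: nothing in this sketch lets legacy_app hand control back to an async-aware component.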

Perhaps I should attempt an asyncwsgiref, which by my definition
should be able to host apps in wsgiref but not the converse.

More to say but out of time for today.

Cheers,

Robin