[Web-SIG] WSGI 2.0
Ian Bicking
ianb at colorstudy.com
Fri Mar 30 06:30:58 CEST 2007
Phillip J. Eby wrote:
> At 07:56 PM 3/29/2007 -0500, Ian Bicking wrote:
>> Do we want to discuss WSGI 2.0? I added a wiki page here to list
>> anything anyone wants to discuss for 2.0: http://wsgi.org/wsgi/WSGI_2.0
>>
>> I've listed the things I can remember, and copying here:
>>
>>
>> start_response and write
>> ------------------------
>>
>> We could remove ``start_response`` and the writer that it implies. This
>> would lead to a signature like::
>>
>>     def app(environ):
>>         return '200 OK', [('Content-type', 'text/plain')], ['Hello world']
>>
>> That is, return a three-tuple of (status, headers, app_iter).
>>
>> It's relatively simple to provide adapters to and from this signature to
>> the WSGI 1.0 signature.
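
As a rough illustration of the adapters mentioned above, here is a minimal sketch, not part of any spec. The names wsgi1_from_wsgi2 and wsgi2_from_wsgi1 are made up for this example. The 1.0-to-2.0 direction buffers the whole body, which is a simplifying assumption so that start_response has certainly been called before the tuple is returned. Body chunks are bytes here, as in present-day Python.

```python
def wsgi1_from_wsgi2(app2):
    """Let a WSGI 1.0 server call an app that returns a three-tuple."""
    def app1(environ, start_response):
        status, headers, app_iter = app2(environ)
        start_response(status, headers)
        return app_iter
    return app1


def wsgi2_from_wsgi1(app1):
    """Make a WSGI 1.0 app return (status, headers, app_iter).

    Simplification: the body is fully buffered into a list.
    """
    def app2(environ):
        captured = []   # filled in by start_response
        chunks = []     # write() output and iterated output, in order

        def start_response(status, headers, exc_info=None):
            captured[:] = [status, headers]
            return chunks.append   # stands in for the legacy write()

        app_iter = app1(environ, start_response)
        try:
            for chunk in app_iter:
                chunks.append(chunk)
        finally:
            close = getattr(app_iter, 'close', None)
            if close is not None:
                close()
        status, headers = captured
        return status, headers, chunks
    return app2
```

A streaming version of the second adapter would have to advance the iterator until start_response fires and then hand back the remainder, which is more fiddly than shown here.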
>
> I think we also want to have a value you can yield from the app_iter to
> explicitly request that the buffer be flushed, and that we should reopen
> the discussion about values to be yielded to communicate with async
> servers, indicating that the iterator should be paused pending input or
> some other operation.

(This should probably be opened as a separate item from the signature
change, as I don't think it relates much to that.)

I'd rather not introduce new objects, since we don't have any new
objects yet. None is an obvious object, but it's vague in this context.
To me it feels more like a pause than a flush. Flush really means
*do* something, and None feels like the no-op, which is more like a pause.

I've become interested in using WSGI middleware as an HTTP translating
proxy, so the async opportunities are of more interest to me now. In
part just the app_iter non-thread-affinity change would be helpful, I
think. Dealing with large request bodies is harder, I think, because
those would have to be processed before the WSGI app returned. But
that's less concerning to me.

It seems like if yielding None from an app_iter meant "put me at the
back of the queue" that would be a fairly simple and effective way of
handling async for large (or slow) response bodies. This wouldn't
really work for the Twisted stuff where you keep a response open and
trickle out data based on server-side events (because you can't control
when you get back to the beginning of the queue), but otherwise it seems
pretty good. I suppose full control could be allowed if you could do
something like return an object that could be part of the event loop
somehow. If we had some standard async-wrapping-key of some sort,
perhaps. For example (I say with no real knowledge of Deferred):
    environ['wsgi.async_callback'] = EventMatcher

    # in the app:
    yield environ['wsgi.async_callback'](some_event)

    # in the server:
    for item in app_iter:
        if isinstance(item, EventMatcher):
            # queue up the app_iter, leaving it paused until something
            # matching that event happens
I feel somehow that it could be useful for intermediaries to be able to
filter out this callback, and so a documented key (or keys) would be
good. But I can't quite place why I'd want to do that. Well, except
that any intermediary would have to be able to detect this kind of
object and pass it back up. So maybe instead of filtering it out of the
environ, there needs to be some easy test that can be applied.

What the event object looks like ("some_event"), I have no idea.
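
To make the simpler "yield None means put me at the back of the queue" idea concrete, here is a hedged sketch of what a server loop might do. Nothing here is a proposed API: serve_round_robin and the send(conn_id, chunk) callback are hypothetical names for illustration, and a real server would interleave this with socket I/O.

```python
from collections import deque


def serve_round_robin(app_iters, send):
    """Drain several app_iters; a yielded None requeues that response.

    `send(conn_id, chunk)` is a hypothetical callback that writes one
    chunk to the connection identified by `conn_id`.
    """
    queue = deque((i, iter(it)) for i, it in enumerate(app_iters))
    while queue:
        conn_id, it = queue.popleft()
        for chunk in it:
            if chunk is None:
                # The app asked to pause: go to the back of the queue.
                queue.append((conn_id, it))
                break
            send(conn_id, chunk)
        else:
            # Iterator exhausted: the response is complete.
            close = getattr(it, 'close', None)
            if close is not None:
                close()
```

As noted above, the app has no say in when its turn comes around again, so this does not cover the case of waiting on a specific server-side event.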
> Ideally, this should be done in a way that's easy for middleware to
> handle; a flush signal should be handled by the middleware *and* passed
> up the chain, while any other async signals would be passed directly up
> the chain (unless it's something like "pause for input" and the
> middleware controls the input).
>
> If we do this right, it should be easier to write middleware that works
> correctly with respect to buffering, since the issues of flushing and
> pausing now become explicit rather than implicit. (This should make it
> easier to teach/learn as well.)

In terms of buffering, I can't think of many cases where it would
matter. Either the middleware passes back the response with no changes,
or it needs to consume the entire response body (and probably headers
and maybe status) to do whatever transformation it needs to do.

Things like pauses and async signals would ideally be passed upstream,
but flushes and content would all be consumed by the middleware.
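
A small sketch of that split, assuming the three-tuple signature discussed earlier and assuming None is the pause signal (neither is settled). The middleware name uppercase_middleware is invented for the example; it consumes all content, but hands each None straight up to the server.

```python
def uppercase_middleware(app):
    """Buffer and transform the whole body, but pass pauses upstream."""
    def wrapped(environ):
        status, headers, app_iter = app(environ)

        def body():
            chunks = []
            try:
                for chunk in app_iter:
                    if chunk is None:
                        # Pause signal (assumed): not ours to consume.
                        yield None
                    else:
                        chunks.append(chunk)
            finally:
                close = getattr(app_iter, 'close', None)
                if close is not None:
                    close()
            # Only now is the full body available to transform.
            yield b''.join(chunks).upper()

        return status, headers, body()
    return wrapped
```

Upper-casing leaves the length unchanged, so the headers can be passed through as they are; a transformation that changes the size would also need to rewrite Content-Length.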
>> It's not clear if the app_iter must be used in the same thread as the
>> application. Since the application is blocking, presumably *it* must be
>> run all in one thread. This should be more explicitly documented.
>
> Definitely. I think that we should not require thread affinity between
> the application and the app_iter -- my feeling at this point is that
> actual yielding is an edge case with respect to most WSGI apps. The
> common case WSGI application should be just returning a list or tuple
> with a single string in it, and not doing any complex iteration.
> Allowing the server more flexibility here is probably the better choice.
>
> Indeed, I'm not sure we should require thread affinity across
> invocations of app_iter.next().
It seems unlikely there'd be a need to move it between threads, but then
it doesn't seem like there's much need for the application to have it
all called in one thread either (i.e., if you move threads once, moving
threads again shouldn't be a problem).
--
Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org
| Write code, do good | http://topp.openplans.org/careers