[Web-SIG] WSGI and start_response

Manlio Perillo manlio_perillo at libero.it
Thu Apr 8 22:18:02 CEST 2010


P.J. Eby ha scritto:
> At 08:06 PM 4/8/2010 +0200, Manlio Perillo wrote:
>> What I'm trying to do is:
>>
>> * as in the example I posted, turn Mako render function in a generator.
>>
>>   The reason is that I would like to implement support for Nginx
>>   subrequests.
> 
> By subrequest, do you mean that one request is invoking another, like
> one WSGI application calling multiple other WSGI applications to render
> one page containing contents from more than one?
> 

Yes.

> 
>>   During a subrequest, the generated response body is sent directly to
>>   the client, so it is necessary to be able to flush the Mako buffer
> 
> I don't quite understand this, since I don't know what Mako is, or, if
> it's a template engine, what flushing its buffer would have to do with
> WSGI buffering.
> 

Ah, sorry.

Mako is a template engine.
Suppose I have an HTML template file, and I want to use a sub request.

<html>
  <head>...</head>
  <body>
    <div>${subrequest('/header/')}</div>
    ...
  </body>
</html>


The problem with this code is that, since Mako buffers all generated
content, the resulting response body will contain data in the wrong order.

It will first contain the response body generated by the sub request,
then the content generated from the Mako template (XXX I have not
checked this, but I think it is how it works).

So, when executing a sub request, it is necessary to flush (that is,
send to Nginx, in my case) the content generated from the template so
far, before the sub request is executed.
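A rough sketch of the ordering problem (plain Python; Buffer and
subrequest are made-up stand-ins for the Mako output buffer and the
Nginx subrequest, and "out" stands in for the client connection):

```python
out = []                      # stands in for the client connection


class Buffer:
    # Minimal stand-in for Mako's output buffer (hypothetical names).
    def __init__(self):
        self.data = []

    def write(self, s):
        self.data.append(s)

    def flush(self):
        out.append("".join(self.data))
        self.data = []


def subrequest(buf, uri):
    # The sub request body goes straight to the client, so the
    # template output buffered so far must be flushed first.
    buf.flush()
    out.append("<sub:%s>" % uri)


buf = Buffer()
buf.write("<html><body><div>")
subrequest(buf, "/header/")
buf.write("</div></body></html>")
buf.flush()
# Without the flush() inside subrequest(), the sub request body would
# reach the client before the buffered "<html>..." prefix.
```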

Since Mako does not return a generator (I asked the author, and it
would be too hard to implement), I use a greenlet in order to "turn"
the Mako render function into a generator.
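A sketch of this inversion, with a worker thread and a queue standing
in for the greenlet so the example needs only the stdlib (in the real
middleware a greenlet switch replaces the queue hand-off, and the
render function here is a made-up stand-in for Mako's):

```python
import queue
import threading


def make_generator(render):
    # Turn a callback-based render function into a generator: the
    # worker runs render(), and every flush(data) call becomes one
    # yielded chunk on the consumer side.
    q = queue.Queue()
    done = object()

    def worker():
        try:
            render(q.put)        # render() calls flush(data) -> q.put
        finally:
            q.put(done)

    threading.Thread(target=worker, daemon=True).start()
    while True:
        item = q.get()
        if item is done:
            return
        yield item


def render(flush):
    # Stand-in for Mako's render function, which writes through a
    # callback instead of returning a generator.
    flush("<html>")
    flush("<body>...</body>")
    flush("</html>")


chunks = list(make_generator(render))
```

Note that unlike a greenlet, the thread runs ahead of the consumer
(there is no backpressure); a greenlet switches back to the consumer
on every flush.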

> 
>> > Under
>> > WSGI 1, you can do this by yielding empty strings before calling
>> > start_response.
>>
>> No, in this case this is not what I need to do.
> 
> Well, if that's not when you're needing to suspend the application, then
> I don't see what you're losing in WSGI 2.
> 
> 
>> I need to call start_response, since the greenlet middleware will yield
>> data to the caller before the application returns.
> 
> I still don't understand you.  In WSGI 1, the only way to suspend
> execution (without using greenlets) prior to determining the headers is
> to yield empty strings.
> 

Ah, you are right sorry.
But this is not required for the Mako example (I was focusing on that
example).

> I'm beginning to wonder if maybe what you're saying is that you want to
> be able to write an application function in the form of a generator? 

The greenlet middleware returns a generator in order to work.

> If
> so, be aware that any WSGI 1 app written as:
> 
>      def app(environ, start_response):
>          start_response(status, headers)
>          yield "foo"
>          yield "bar"
> 
> can be written as a WSGI 2 app thus:
> 
>      def app(environ):
>          def respond():
>              yield "foo"
>              yield "bar"
>          return status, headers, respond()
> 

The problem, as I wrote, is that with the greenlet middleware, the
application need not return a generator.

def app(environ):
    tmpl = ...
    body = tmpl.render(...)

    return status, headers, [body]

This is a very simple WSGI application.

But when the greenlet middleware and the Mako buffer-flushing function
are used, some data will be yielded *before* the application returns,
and thus before status and headers are passed to Nginx.
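A sketch of the conflict (again with a thread and queue in place of
the greenlet; "wsgi_flush" is a made-up environ key, not part of any
spec): as soon as the application flushes a chunk before returning,
there is no (status, headers, body) triple to hand to the server yet.

```python
import queue
import threading


def greenlet_style_middleware(app):
    # Hypothetical sketch of the ordering problem for a WSGI 2 style
    # app that returns (status, headers, body).
    def wrapped(environ):
        q = queue.Queue()
        done = object()
        triple = {}
        environ["wsgi_flush"] = q.put   # assumed, non-standard key

        def run():
            try:
                triple["value"] = app(environ)
            finally:
                q.put(done)

        threading.Thread(target=run, daemon=True).start()
        first = q.get()
        if first is done:
            # Easy case: nothing was flushed early, so the normal
            # (status, headers, body) triple is available.
            return triple["value"]
        # Hard case: a body chunk arrived *before* app() returned, so
        # status and headers are not yet known and there is nothing
        # valid to return to the server.
        raise RuntimeError("body data flushed before status/headers")

    return wrapped
```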


> This is also a good time for people to learn that generators are usually
> a *very bad* way to write WSGI apps 

It's the only way to be able to suspend execution, when the WSGI
implementation is embedded in an async web server not written in Python.

The reason is that you cannot (XXX check me) use greenlets from C
code; you would probably have to use something like
http://code.google.com/p/coev/

Greenlets can be used in gevent, for example, because scheduling is
under the control of Python code.
This is not the case with Nginx.

> - yielding is for server push or
> sending blocks of large files, not tiny strings.  

Again, consider the use of sub requests: yielding a "not large" block
is the only choice you have.

Unless, of course, you implement sub request support in pure Python (or
using SSI - Server Side Include).

Another use case is when you have a very large page, and you want to
return some data as soon as possible, to keep the user from aborting
the request if it takes some time to generate.
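For example (a sketch in WSGI 1 style; generate_rows is a made-up
stand-in for the slow part of page generation):

```python
def generate_rows():
    # Hypothetical slow producer of page fragments.
    for i in range(3):
        yield ("<p>row %d</p>" % i).encode("ascii")


def app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/html")])

    def body():
        # The first chunk can be sent immediately, so the user sees
        # something while the rest of the page is still generated.
        yield b"<html><head>...</head><body>"
        for row in generate_rows():
            yield row
        yield b"</body></html>"

    return body()
```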

Also, note that with Nginx (as with Apache, if I'm not wrong), even if
the application yields small strings, the server can still do some
buffering in order to increase performance.

In ngx_http_wsgi_module buffering is optional (and disabled by default).

In the sub request example, it means that if both the main request
response body and sub request response body are small, Nginx can buffer
all the data in memory before sending it to the client (XXX I need to
check this).

> In general, if you're
> yielding more than one block, you're almost certainly doing WSGI wrong. 
> The typical HTML, XML, or JSON output that's 99% of a webapp's requests
> should be transmitted as a single string, rather than as a series of
> snippets.
> 

> IOW, the absence of generator support in WSGI 2 is a feature, not a bug.
> 

What do you mean by absence of generator support?
WSGI 2 applications can still return a generator.

> 
>> In my new attempt I plan to:
>>
>> 1) Implement the simple suspend/resume extension
>> 2) Implement a Python extension module that wraps the Nginx events
>>    system.
>> 3) Implement a pure Python WSGI middleware that, using greenlets, will
>>    enable normal applications to take advantage of Nginx async features.
> 
> I think maybe I'm understanding a little better now -- you want to
> implement the WSGI gateway entirely in C, without using any Python, and
> without using the greenlet API directly.
> 

Right.

> I think I've been unable to understand because I'm thinking in terms of
> a server implemented in Python, or at least that has the WSGI part
> implemented in Python.
> 

Yes.
I had a similar problem trying to explain how ngx_http_wsgi_module works
to another person (and I'm not even good at explaining things!).

> [...]


Thanks   Manlio


More information about the Web-SIG mailing list