[Web-SIG] WSGI and start_response

P.J. Eby pje at telecommunity.com
Thu Apr 8 23:53:10 CEST 2010


At 10:18 PM 4/8/2010 +0200, Manlio Perillo wrote:
>Suppose I have an HTML template file, and I want to use a sub request.
>
>...
>${subrequest('/header/')}
>...
>
>The problem with this code is that, since Mako will buffer all generated
>content, the result response body will contain incorrect data.
>
>It will first contain the response body generated by the sub request,
>then the content generated from the Mako template (XXX I have not
>checked this, but I think it is how it works).

Okay, I'm confused even more now.  It seems to me like what you've 
just described is something that's fundamentally broken, even if 
you're not using WSGI at all.


>So, when executing a sub request, it is necessary to flush (that is,
>send to Nginx, in my case) the content generated from the template
>before the sub request is done.

This only makes sense if you're saying that the subrequest 
*has to* send its output directly to the client, rather than to the 
parent request.  If the subrequest sends its output to the parent 
request (as a sane implementation would), then there is no 
problem.  Likewise, if the subrequest is sent to a buffer that's then 
inserted into the parent invocation.
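A sane implementation along those lines is easy to sketch in plain WSGI 
1 terms.  (This is an illustrative sketch, not anyone's actual API: the 
`run_subrequest` name and the assumption that the subrequest path simply 
replaces PATH_INFO are both mine.)

```python
from io import BytesIO

def run_subrequest(app, environ, path):
    """Run `app` for a subrequest, capturing its body in a buffer
    instead of sending anything straight to the client."""
    sub_environ = dict(environ)       # copy, so the parent is untouched
    sub_environ['PATH_INFO'] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured['status'] = status
        captured['headers'] = headers

    buf = BytesIO()
    for chunk in app(sub_environ, start_response):
        buf.write(chunk)
    return captured['status'], captured['headers'], buf.getvalue()
```

The parent request (the Mako template, say) then splices the returned 
body into its own output at the insertion point; nothing ever bypasses 
the parent on its way to the client.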

Anything else seems utterly insane to me, unless you're basically 
taking a bunch of legacy CGI code using 'print' statements and 
hacking it into something else.  (Which is still insane, just 
differently. ;-) )


>Ah, you are right sorry.
>But this is not required for the Mako example (I was focusing on that
>example).

As far as I can tell, that example is horribly wrong.  ;-)


>But when using the greenlet middleware, and when using the function for
>flushing Mako buffer, some data will be yielded *before* the application
>returns and status and headers are passed to Nginx.

And that's probably because sharing a single output channel between 
the parent and child requests is a bad idea.  ;-)

(Specifically, it's an increase in "temporal coupling", I believe.  I 
know it's some kind of coupling between functions that's considered 
bad, I just don't remember if that's the correct name for it.)


> > This is also a good time for people to learn that generators are usually
> > a *very bad* way to write WSGI apps
>
>It's the only way to be able to suspend execution, when the WSGI
>implementation is embedded in an async web server not written in Python.

It's true that dropping start_response() means you can't yield empty 
strings prior to determining your headers, yes.


> > - yielding is for server push or
> > sending blocks of large files, not tiny strings.
>
>Again, consider the use of sub requests.
>yielding a "not large" block is the only choice you have.

No, it isn't.  You can buffer your output and yield empty strings 
until you're ready to flush.



>Unless, of course, you implement sub request support in pure Python (or
>using SSI - Server Side Include).

I don't see why it has to be "pure", actually.  It's just that the 
subrequest needs to send data to the invoker rather than sending it 
straight to the client.

That's the bit that's crazy in your example -- it's not a scenario 
that WSGI 2 should support, and I'd consider the fact that WSGI 1 
lets you do it to be a bug, not a feature.  ;-)

That being said, I can see that removing start_response() closes a 
loophole that allows async apps to *potentially* exist under WSGI 1 
(as long as you were able to tolerate the resulting crappy API).

However, to fix that crappy API requires greenlets or threads, at 
which point you might as well just use WSGI 2.  In the Nginx case, 
you can either do WSGI 1 in C and then use an adapter to provide WSGI 
2, or you can expose your C API to Python and write a small 
greenlets-using Python wrapper to support suspending.  It would look 
something like:

     def gateway(request_info, app):
         # set up environ from request_info, then...
         run(greenlet(lambda: Finished(app(environ))))

     def run(child):
         while not child.dead:
             data = child.switch()
             if isinstance(data, Finished):
                 send_status(data.status)
                 send_headers(data.headers)
                 send_response(data.response)
             else:
                 perform_appropriate_action_on(data)
                 if data.suspend:
                     # arrange for run(child) to be re-called later, then...
                     return

Suspension now works by switching back to the parent greenlet with 
command objects (like Finished()) to tell the run() loop what to 
do.  The run() loop is not stateful, so when the task is unsuspended, 
you simply call run(child) again.
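The command-object protocol itself doesn't depend on greenlets; here's a 
stdlib-only toy that exercises the same stateless run()/resume pattern 
with a plain generator standing in for the greenlet.  (The Finished and 
Suspend classes and the `sent` channel are my own stand-ins, not part of 
any real server API.)

```python
class Finished:
    """Command: the app is done; deliver status, headers, body."""
    def __init__(self, status, headers, response):
        self.status, self.headers, self.response = status, headers, response

class Suspend:
    """Command: the task is waiting; resume it later."""

sent = []  # stand-in for the server's output channel

def run(child):
    # Stateless driver: loop until the task finishes or asks to suspend.
    for command in child:
        if isinstance(command, Finished):
            sent.append((command.status, command.headers, command.response))
        elif isinstance(command, Suspend):
            # arrange for run(child) to be re-called later, then...
            return child
    return None
```

Because run() keeps no state of its own, unsuspending really is just 
calling run(child) again, exactly as described above.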

A similar structure would exist for send_response() - i.e., it's a 
loop over the response that can break out if it needs to suspend, 
and that arranges for itself to be re-called at the appropriate time.

Voila - you now have asynchronous WSGI 2 support.

Now, whether you actually *want* to do that is a separate question, 
but as (I hope) you can see, you definitely *can* do it, and without 
needing any greenlet-using code to be in C.  From C, you just call 
back into one of the Python top-level loops (run() and 
send_response()), which then does the appropriate task switching.


>Another use case is when you have a very large page, and you want to
>return some data as soon as possible, to keep the user from aborting
>the request if it takes a while.

That's the server push case -- but of course that's not a problem 
even in WSGI 2, since the "response" can still be a generator.


>Also, note that with Nginx (as with Apache, if I'm not wrong), even if
>application yields small strings, the server can still do some buffering
>in order to increase performance.

In which case, it's in violation of the WSGI spec.  The spec requires 
separately-yielded strings to be flushed to OS-level buffering.


>What do you mean by absence of generator support?
>WSGI 2 applications can still return a generator.

Yes - but they can't *be* a generator - previously they could, due to 
the separate start_response callable.
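Concretely, the difference is this (both apps are toy sketches):

```python
# WSGI 1: the application *is* a generator function -- legal only
# because start_response travels separately from the return value.
def app_v1(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    yield b'hello'

# WSGI-2 style: the application must *return* its body iterable,
# since status and headers come back in the same return value.
def app_v2(environ):
    def body():
        yield b'hello'
    return '200 OK', [('Content-Type', 'text/plain')], body()
```

Note that calling app_v1() runs none of its code - start_response() 
isn't invoked until the server iterates the generator, which is 
precisely what makes the WSGI 1 form usable for suspending execution.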




More information about the Web-SIG mailing list