[Web-SIG] WSGI and start_response

Tue Apr 13 12:41:44 CEST 2010

P.J. Eby wrote:
> At 10:18 PM 4/8/2010 +0200, Manlio Perillo wrote:
>> Suppose I have an HTML template file, and I want to use a sub request.
>>
>> ...
>> ${subrequest('/header/')}
>> ...
>>
>> The problem with this code is that, since Mako will buffer all generated
>> content, the result response body will contain incorrect data.
>>
>> It will first contain the response body generated by the sub request,
>> then the content generated from the Mako template (XXX I have not
>> checked this, but I think it is how it works).
> 
> Okay, I'm confused even more now.  It seems to me like what you've just
> described is something that's fundamentally broken, even if you're not
> using WSGI at all.
> 

If you are referring to Mako being turned into a generator, yes, this
implementation is rather obscure.

I wrote it as a proof of concept.
Before this, I wrote a more polite implementation:
http://paste.pocoo.org/show/201324/

> 
>> So, when executing a sub request, it is necessary to flush (that is,
>> send to Nginx, in my case) the content generated from the template
>> before the sub request is done.
> 
> This seems to only make sense if you're saying that the subrequest *has
> to* send its output directly to the client, rather than to the parent
> request.  

Yes, this is how subrequests work in Nginx. And I assume the same is
true for Apache.
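
To make the ordering problem concrete, here is a minimal simulation in
plain Python.  It involves neither Mako nor Nginx: ``subrequest``,
``render_buffered`` and ``render_flushing`` are hypothetical names, and
the client connection is modelled as a simple list.  The only assumption
taken from the discussion is that the subrequest writes straight to the
client.

```python
# Hypothetical simulation: no Mako or Nginx code is involved.

def subrequest(client, uri):
    # As described above, the subrequest output goes directly to the
    # client and is not returned to the caller.
    client.append("<sub %s>" % uri)


def render_buffered(client):
    # The template engine buffers everything and sends it at the end.
    buf = []
    buf.append("before ")
    subrequest(client, "/header/")
    buf.append("after")
    client.append("".join(buf))


def render_flushing(client):
    # Same template, but the buffer is flushed before the subrequest runs.
    buf = []
    buf.append("before ")
    client.append("".join(buf))
    del buf[:]
    subrequest(client, "/header/")
    buf.append("after")
    client.append("".join(buf))


wrong, right = [], []
render_buffered(wrong)
render_flushing(right)
print("".join(wrong))   # subrequest output comes first
print("".join(right))   # output is in template order
```

With buffering, the subrequest output reaches the client before anything
the template generated; flushing first restores the expected order.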

> If the subrequest sends its output to the parent request (as a
> sane implementation would), then there is no problem. 

You are forgetting that Nginx is not an application server.
Why should the subrequest output be returned to the parent?

This would only make it less efficient.

> Likewise, if the
> subrequest is sent to a buffer that's then inserted into the parent
> invocation.
> 
> Anything else seems utterly insane to me, unless you're basically taking
> a bunch of legacy CGI code using 'print' statements and hacking it into
> something else.  (Which is still insane, just differently. ;-) )
> 

We are talking about the subrequest implementation in an efficient web
server written in C, like Nginx or Apache.

> 
>> Ah, you are right sorry.
>> But this is not required for the Mako example (I was focusing on that
>> example).
> 
> As far as I can tell, that example is horribly wrong.  ;-)
> 

I agree ;-)

> 
>> But when using the greenlet middleware, and when using the function for
>> flushing Mako buffer, some data will be yielded *before* the application
>> returns and status and headers are passed to Nginx.
> 
> And that's probably because sharing a single output channel between the
> parent and child requests is a bad idea.  ;-)
> 

No, this is not specific to subrequests.

As an example, here you can find up-to-date greenlet adapters:
http://bitbucket.org/mperillo/txwsgi/src/tip/txwsgi/greenlet.py

The ``write_adapter`` **needs** to yield some data before the WSGI
application returns, because this is how the write callable works.

The exposed ``gsuspend`` function, instead, will cause an empty string
to be yielded to the server, before the WSGI application returns.
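
What the server sees in the ``gsuspend`` case can be sketched without
greenlets.  This is not the txwsgi code: ``app`` and ``serve`` are
hypothetical, and a plain generator stands in for the adapter.  The point
is only that an empty string can reach the server before status and
headers are known.

```python
def app(environ, start_response):
    # Hypothetical application: it is not ready yet, so an empty string
    # is yielded before start_response has been called.
    yield b""
    start_response("200 OK", [("Content-Type", "text/plain")])
    yield b"hello"


def serve(application, environ):
    # Hypothetical toy server loop, not the txwsgi or Nginx implementation.
    state = {}
    sent = []

    def start_response(status, headers, exc_info=None):
        state["status"] = status
        state["headers"] = headers

    suspended = 0
    for chunk in application(environ, start_response):
        if not chunk:
            # Empty string: nothing to send, headers may still be unknown.
            suspended += 1
            continue
        # A non-empty chunk requires that the headers are known by now.
        if "status" not in state:
            raise RuntimeError("body yielded before start_response")
        sent.append(chunk)
    return state["status"], b"".join(sent), suspended


result = serve(app, {})
print(result)
```

A server that assumes ``start_response`` has been called before the first
iteration would fail on such an application.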

> (Specifically, it's an increase in "temporal coupling", I believe.  I
> know it's some kind of coupling between functions that's considered bad,
> I just don't remember if that's the correct name for it.)
> 

Nginx code contains some coupling; I assume this is done because it was
designed with efficiency in mind.

> [...] 
> It's true that dropping start_response() means you can't yield empty
> strings prior to determining your headers, yes.
> 
> 
>> > - yielding is for server push or
>> > sending blocks of large files, not tiny strings.
>>
>> Again, consider the use of sub requests.
>> yielding a "not large" block is the only choice you have.
> 
> No, it isn't.  You can buffer your output and yield empty strings until
> you're ready to flush.
> 

As I wrote, this will not work if you want to use subrequest support
from Nginx.

> 
> 
>> Unless, of course, you implement sub request support in pure Python (or
>> using SSI - Server Side Include).
> 
> I don't see why it has to be "pure", actually.  It's just that the
> subrequest needs to send data to the invoker rather than sending it
> straight to the client.
> 

You may say this, but it is not how subrequests are implemented in Nginx
;-).

> That's the bit that's crazy in your example -- it's not a scenario that
> WSGI 2 should support, and I'd consider the fact that WSGI 1 lets you do
> it to be a bug, not a feature.  ;-)
> 

Are you referring to the bad Mako example, or to the
``greenlet_adapter`` idea?

> That being said, I can see that removing start_response() closes a
> loophole that allows async apps to *potentially* exist under WSGI 1 (as
> long as you were able to tolerate the resulting crappy API).
> 
> However, to fix that crappy API requires greenlets or threads, at which
> point you might as well just use WSGI 2.  In the Nginx case, you can
> either do WSGI 1 in C and then use an adapter to provide WSGI 2, or you
> can expose your C API to Python and write a small greenlets-using Python
> wrapper to support suspending.  

But this is already implemented using the ``greenlet_adapter`` in
txwsgi, and the x-wsgiorg.suspend extension.

And this implementation has the advantage that the greenlet_adapter
works on **every** WSGI implementation that supports the
x-wsgiorg.suspend extension.

> It would look something like:
> 
>     def gateway(request_info, app):
>         # set up environ
>         run(greenlet(lambda: Finished(app(environ))))
> 
>     def run(child):
>         while not child.dead:
>             data = child.switch()
>             if isinstance(data, Finished):
>                 send_status(data.status)
>                 send_headers(data.headers)
>                 send_response(data.response)
>             else:
>                 perform_appropriate_action_on(data)
>                 if data.suspend:
>                     # arrange for run(child) to be re-called later, then...
>                     return
> 

I have to actually implement this to check if it works.

This can be done using my txwsgi implementation.
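
As a rough, greenlet-free approximation of the quoted loop, a generator
can play the role of the child.  Everything here is hypothetical:
``Suspend``, the ``resume_queue`` list acting as a toy event loop, and the
shape of ``Finished`` (status, headers, response) are assumptions made
for illustration, not the txwsgi API.

```python
class Finished(object):
    # Hypothetical wrapper for the final result, named after the sketch.
    def __init__(self, status, headers, response):
        self.status = status
        self.headers = headers
        self.response = response


class Suspend(object):
    # Hypothetical request object; `suspend` mirrors data.suspend.
    suspend = True


def child():
    # Stand-in for the greenlet running the application.
    yield Suspend()  # ask the gateway to suspend us once
    yield Finished("200 OK", [("Content-Type", "text/plain")], [b"done"])


def run(task, log, resume_queue):
    # Drive the child until it finishes or asks to be suspended.
    for data in task:
        if isinstance(data, Finished):
            log.append((data.status, data.headers, b"".join(data.response)))
            return
        if data.suspend:
            # Arrange for run(task) to be called again later, then return.
            resume_queue.append(task)
            return


log, queue = [], []
run(child(), log, queue)
while queue:  # toy event loop: resume suspended tasks
    run(queue.pop(0), log, queue)
print(log)
```

Leaving the ``for`` loop with ``return`` does not close the generator, so
the suspended task can be resumed from the queue later.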

If it can help, I can also implement WSGI 2.0 in txwsgi.  The WSGI 1.0
and WSGI 2.0 stacks will be independent: no adapter will be used (they
will just share most of the code).
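
For comparison, the adapter approach mentioned in the quoted text (a
WSGI 1 server running a WSGI 2 application) could look roughly like
this.  WSGI 2.0 is only a draft idea, so the calling convention used here,
a callable returning ``(status, headers, body)``, is an assumption based
on the ``Finished`` sketch above; ``wsgi2_app`` and ``wsgi2_to_wsgi1`` are
hypothetical names.

```python
def wsgi2_app(environ):
    # Hypothetical WSGI 2 style application: everything is returned at
    # once.  The (status, headers, body) shape is an assumption.
    return "200 OK", [("Content-Type", "text/plain")], [b"hello"]


def wsgi2_to_wsgi1(app2):
    # Wrap a WSGI 2 style callable so that a WSGI 1 server can call it.
    def app1(environ, start_response):
        status, headers, body = app2(environ)
        start_response(status, headers)
        return body
    return app1


calls = []


def start_response(status, headers, exc_info=None):
    # Minimal stand-in for the server-provided callable.
    calls.append((status, headers))


body = wsgi2_to_wsgi1(wsgi2_app)({}, start_response)
print(calls, list(body))
```

With this calling convention the application cannot yield anything before
status and headers are determined, which is the loophole discussed above.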

> [...]

Regards  Manlio