[Web-SIG] PEP 444 Goals

Graham Dumpleton graham.dumpleton at gmail.com
Fri Jan 7 03:55:15 CET 2011


2011/1/7 Alex Grönholm <alex.gronholm at nextday.fi>:
> 07.01.2011 04:09, Graham Dumpleton wrote:
>>
>> 2011/1/7 Graham Dumpleton<graham.dumpleton at gmail.com>:
>>>
>>> 2011/1/7 Alex Grönholm<alex.gronholm at nextday.fi>:
>>>>
>>>> 07.01.2011 01:14, Graham Dumpleton wrote:
>>>>
>>>> One other comment about HTTP/1.1 features.
>>>>
>>>> You will always be battling to have some HTTP/1.1 features work in a
>>>> controllable way. This is because WSGI gateways/adapters aren't often
>>>> directly interfacing with the raw HTTP layer, but with FASTCGI, SCGI,
>>>> AJP, CGI etc. In this sort of situation you are at the mercy of what
>>>> the modules implementing those protocols do, or even are hamstrung by
>>>> how those protocols work.
>>>>
>>>> The classic example is 100-continue processing. This simply cannot
>>>> work end to end across FASTCGI, SCGI, AJP, CGI and other WSGI hosting
>>>> mechanisms where proxying is performed as the protocol being used
>>>> doesn't implement a notion of end to end signalling in respect of
>>>> 100-continue.
>>>>
>>>> I think we need some concrete examples to figure out what is and isn't
>>>> possible with WSGI 1.0.1.
>>>> My motivation for participating in this discussion can be summed up in
>>>> that I want the following two applications to work properly:
>>>>
>>>> - PlasmaDS (Flex Messaging implementation)
>>>> - WebDAV
>>>>
>>>> The PlasmaDS project is the planned Python counterpart to Adobe's
>>>> BlazeDS. Interoperability with the existing implementation requires
>>>> that both the request and response use chunked transfer encoding, to
>>>> achieve bidirectional streaming. I don't really care how this happens,
>>>> I just want to make sure that there is nothing preventing it.
>>>
>>> That can only be done by changing the rules around how wsgi.input is used.
>>> I'll try and find a reference to where I have posted information about
>>> this before, otherwise I'll write something up again about it.
>>
>> BTW, even if the WSGI specification were changed to allow handling of
>> chunked requests, it would not work for FASTCGI, SCGI, AJP, CGI or
>> mod_wsgi daemon mode. Also not likely to work on uWSGI either.
>>
>> This is because all of these work on the expectation that the complete
>> request body can be written across to the separate application process
>> before actually reading the response from the application.
>>
>> In other words, both way streaming is not possible.
>>
>> The only solution which would allow this with Apache is mod_wsgi
>> embedded mode, which in mod_wsgi 3.X already has an optional feature
>> which can be enabled so as to allow you to step out of current bounds
>> of the WSGI specification and use wsgi.input as I will explain, to do
>> this both way streaming.
>>
>> Pure Python HTTP/WSGI servers which are a front facing server could
>> also be modified to handle this if the WSGI specification were changed,
>> but whether those same servers will work if put behind a web proxy will
>> depend on how the front end web proxy works.
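
As a rough illustration of the both way streaming discussed above, here is
a minimal sketch of an application that echoes the request body back block
by block. It assumes the server lets wsgi.input be read until it returns an
empty string instead of limiting reads to CONTENT_LENGTH, which is outside
what the current WSGI specification guarantees. The name echo_stream and
the 8192 byte block size are only illustrative.

```python
import io


def echo_stream(environ, start_response):
    """Echo the request body back one block at a time.

    Assumption: wsgi.input may be read until it returns an empty
    string, regardless of CONTENT_LENGTH. WSGI 1.0 does not promise this.
    """
    stream = environ["wsgi.input"]
    # No Content-Length is set, so the server is left to frame the response.
    start_response("200 OK", [("Content-Type", "application/octet-stream")])

    def body():
        while True:
            block = stream.read(8192)
            if not block:  # an empty read marks the end of the request body
                break
            yield block

    return body()
```

Whether each block actually reaches the client before the rest of the
request has arrived still depends on the hosting mechanism, which is the
point being made above.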
>
> Then I suppose this needs to be standardized in PEP 444, wouldn't you agree?

Huh! Not sure you understand what I am saying. Even if you changed the
WSGI specification to allow for it, the bulk of implementations
wouldn't be able to support it. The WSGI specification has no
influence over distinct protocols such as FASTCGI, SCGI, AJP or CGI or
proxy implementations and so can't be used to force them to be changed.

So, as much as I would like to see WSGI specification changed to allow
it, others may not on the basis that there is no point if few
implementations could support it.

Graham

>> Graham
>>
>>>> The WebDAV spec, on the other hand, says
>>>> (http://www.webdav.org/specs/rfc2518.html#STATUS_102):
>>>>
>>>> The 102 (Processing) status code is an interim response used to inform
>>>> the client that the server has accepted the complete request, but has
>>>> not yet completed it. This status code SHOULD only be sent when the
>>>> server has a reasonable expectation that the request will take
>>>> significant time to complete. As guidance, if a method is taking longer
>>>> than 20 seconds (a reasonable, but arbitrary value) to process the
>>>> server SHOULD return a 102 (Processing) response. The server MUST send
>>>> a final response after the request has been completed.
>>>
>>> That I don't offhand see a way of being able to do as protocols like
>>> SCGI and CGI definitely don't allow interim status. I am suspecting
>>> that FASTCGI and AJP don't allow it either.
>>>
>>> I'll have to even do some digging as to how you would even handle that
>>> in Apache with a normal Apache handler.
>>>
>>> Graham
>>>
>>>> Again, I don't care how this is done as long as it's possible.
>>>>
>>>> The current WSGI specification acknowledges that by saying:
>>>>
>>>> """
>>>> Servers and gateways that implement HTTP 1.1 must provide transparent
>>>> support for HTTP 1.1's "expect/continue" mechanism. This may be done
>>>> in any of several ways:
>>>>
>>>> * Respond to requests containing an Expect: 100-continue request with
>>>> an immediate "100 Continue" response, and proceed normally.
>>>> * Proceed with the request normally, but provide the application with
>>>> a wsgi.input stream that will send the "100 Continue" response if/when
>>>> the application first attempts to read from the input stream. The read
>>>> request must then remain blocked until the client responds.
>>>> * Wait until the client decides that the server does not support
>>>> expect/continue, and sends the request body on its own. (This is
>>>> suboptimal, and is not recommended.)
>>>> """
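
To make the second option in the quoted text concrete, here is a minimal
sketch of an input stream wrapper that triggers the interim response on the
first read. The class name ContinueOnRead and the send_continue callback
are hypothetical; a real gateway would supply its own way of writing
"100 Continue" to the client, and a complete wsgi.input would also need
readline(), readlines() and iteration, which are left out here.

```python
import io


class ContinueOnRead:
    """Wrap a request body stream and signal "100 Continue" on first read.

    `send_continue` is a hypothetical callback provided by the gateway.
    """

    def __init__(self, stream, send_continue):
        self._stream = stream
        self._send_continue = send_continue
        self._sent = False

    def read(self, size=-1):
        # Only the first read triggers the interim response.
        if not self._sent:
            self._sent = True
            self._send_continue()
        return self._stream.read(size)
```

An application that rejects the request from the headers alone never calls
read(), so the callback never fires and the client is not told to send the
body.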
>>>>
>>>> If you are going to try and push for full visibility of HTTP/1.1 and
>>>> an ability to control it at the application level then you will fail
>>>> with 100-continue to start with.
>>>>
>>>> So, although option 2 above would be the most ideal and is giving the
>>>> application control, specifically the ability to send an error
>>>> response based on request headers alone, without reading the request
>>>> body and triggering the 100-continue, it isn't practical to require it,
>>>> as the majority of hosting mechanisms for WSGI wouldn't even be able to
>>>> implement it that way.
>>>>
>>>> The same goes for any other feature, there is no point mandating a
>>>> feature that can only be realistically implemented on a minority of
>>>> implementations. This would be even worse where dependence on such a
>>>> feature would mean that the WSGI application would no longer be
>>>> portable to another WSGI server and destroys the notion that WSGI
>>>> provides a portable interface.
>>>>
>>>> This isn't just restricted to HTTP/1.1 features either, but also
>>>> applies to raw SCRIPT_NAME and PATH_INFO as well. Only WSGI servers
>>>> that are directly hooked into the URL parsing of the base HTTP server
>>>> can provide that information, which basically means that only pure
>>>> Python HTTP/WSGI servers are likely able to provide it without
>>>> guessing, and in that case such servers are usually used with the
>>>> WSGI application mounted at root anyway.
>>>>
>>>> Graham
>>>>
>>>> On 7 January 2011 09:29, Graham Dumpleton<graham.dumpleton at gmail.com>
>>>> wrote:
>>>>
>>>> On 7 January 2011 08:56, Alice Bevan–McGregor<alice at gothcandy.com>
>>>>  wrote:
>>>>
>>>> On 2011-01-06 13:06:36 -0800, James Y Knight said:
>>>>
>>>> On Jan 6, 2011, at 3:52 PM, Alice Bevan–McGregor wrote:
>>>>
>>>> :: Making optional (and thus rarely-implemented) features non-optional.
>>>> E.g. server support for HTTP/1.1 with clarifications for interfacing
>>>> applications to 1.1 servers.  Thus pipelining, chunked encoding, et al.
>>>> as per the HTTP 1.1 RFC.
>>>>
>>>> Requirements on the HTTP compliance of the server don't really have any
>>>> place in the WSGI spec. You should be able to be WSGI compliant even if
>>>> you don't use the HTTP transport at all (e.g. maybe you just send around
>>>> requests via SCGI).
>>>> The original spec got this right: chunking etc are something which is
>>>> not relevant to the wsgi application code -- it is up to the server to
>>>> implement the HTTP transport according to the HTTP spec, if it's
>>>> purporting to be an HTTP server.
>>>>
>>>> Chunking is actually quite relevant to the specification, as WSGI and
>>>> PEP 444 / WSGI 2 (damn, that's getting tedious to keep dual-typing ;)
>>>> allow for chunked bodies regardless of higher-level support for
>>>> chunking.  The body iterator.  Previously you /had/ to define a length,
>>>> with chunked encoding at the server level, you don't.
>>>>
>>>> I agree, however, that not all gateways will be able to implement the
>>>> relevant HTTP/1.1 features.  FastCGI does, SCGI after a quick Google
>>>> search, seems to support it as well. I should re-word it as:
>>>>
>>>> "For those servers capable of HTTP/1.1 features the implementation of
>>>> such features is required."
>>>>
>>>> I would question whether FASTCGI, SCGI or AJP support the concept of
>>>> chunking of responses to the extent that the application can prepare
>>>> the final content including chunks as required by the HTTP
>>>> specification. Further, in Apache at least, the output from a web
>>>> application served via those protocols is still pushed through the
>>>> Apache output filter chain so as to allow the filters to modify the
>>>> response, eg., apply compression using mod_deflate. As a consequence,
>>>> the standard HTTP 'CHUNK' output filter is still a part of the output
>>>> filter stack. This means that were a web application to try and do
>>>> chunking itself, then Apache would rechunk such that the original
>>>> chunking became part of the content, rather than the transfer
>>>> encoding.
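
The rechunking effect described above can be shown with a small sketch. The
chunk_encode helper is only an illustration of HTTP/1.1 chunk framing (hex
length, CRLF, data, CRLF) and is not code taken from Apache or any gateway.

```python
def chunk_encode(data):
    """Frame one block of bytes as a single HTTP/1.1 chunk."""
    return b"%x\r\n%s\r\n" % (len(data), data)


# What an application doing its own chunking would write.
app_output = chunk_encode(b"hello")

# What a server side chunking filter would then put on the wire,
# treating the application's framing as ordinary content.
on_the_wire = chunk_encode(app_output)
```

Here app_output is b"5\r\nhello\r\n" and on_the_wire is
b"a\r\n5\r\nhello\r\n\r\n", so a client decoding the transfer encoding once
is left with the application's chunk header as part of the body.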
>>>>
>>>> So, in order to be able to achieve what I think you want, with a web
>>>> application being able to do chunking itself, you would need to modify
>>>> the implementations of mod_fcgid, mod_fastcgi, mod_scgi, mod_ajp and
>>>> also likely mod_cgi and mod_cgid of Apache.
>>>>
>>>> The only WSGI implementation I know of for Apache where you might even
>>>> be able to do what you want is uWSGI. This is because I believe from
>>>> memory it uses a mode in Apache by default called assbackwards. What
>>>> this allows is for the output from the web application to bypass the
>>>> Apache output filter stack and directly control the raw HTTP output.
>>>> This gives uWSGI a little bit less overhead in Apache, but at the loss
>>>> of the ability to actually use Apache output filters and for Apache to
>>>> fix up response headers in any way. There is a flag in uWSGI which can
>>>> optionally be set to make it use the more traditional mode and not use
>>>> assbackwards.
>>>>
>>>> Thus, I believe you would be fighting against server implementations
>>>> such as Apache and likely also nginx, Cherokee, lighttpd etc, to allow
>>>> chunking to be supported at the level of the web application.
>>>>
>>>> About all you can do is ensure that the WSGI specification doesn't
>>>> include anything in it which would prevent a web application
>>>> harnessing indirectly such a feature as chunking where the web server
>>>> supports it.
>>>>
>>>> As it is, it isn't chunked responses which is even the problem,
>>>> because if an underlying web server supports chunking for responses,
>>>> all you need to do is not set the content length.
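
For the response side, a minimal sketch of that is simply an application
that yields its body in pieces and leaves out Content-Length. The name
stream_lines is only illustrative, and whether the result is actually sent
with chunked transfer encoding is up to the server and the HTTP version the
client is using.

```python
def stream_lines(environ, start_response):
    # No Content-Length header is set, so an HTTP/1.1 server that
    # supports it is free to use chunked transfer encoding.
    start_response("200 OK", [("Content-Type", "text/plain")])
    for n in range(3):
        yield ("line %d\n" % n).encode("ascii")
```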
>>>>
>>>> The problem area with chunking is the request content as the way that
>>>> the WSGI specification is written prevents being able to have chunked
>>>> request content. I have described the issue previously and made
>>>> suggestions about an alternate way that wsgi.input could be used.
>>>>
>>>> Graham
>>>>
>>>> +1
>>>>
>>>>        - Alice.
>>>