[Web-SIG] PEP 444 Goals

Fri Jan 7 05:12:20 CET 2011

07.01.2011 04:55, Graham Dumpleton wrote:
> 2011/1/7 Alex Grönholm<alex.gronholm at nextday.fi>:
>> 07.01.2011 04:09, Graham Dumpleton wrote:
>>> 2011/1/7 Graham Dumpleton<graham.dumpleton at gmail.com>:
>>>> 2011/1/7 Alex Grönholm<alex.gronholm at nextday.fi>:
>>>>> 07.01.2011 01:14, Graham Dumpleton wrote:
>>>>>
>>>>> One other comment about HTTP/1.1 features.
>>>>>
>>>>> You will always be battling to have some HTTP/1.1 features work in a
>>>>> controllable way. This is because WSGI gateways/adapters aren't often
>>>>> directly interfacing with the raw HTTP layer, but with FASTCGI, SCGI,
>>>>> AJP, CGI etc. In this sort of situation you are at the mercy of what
>>>>> the modules implementing those protocols do, or even are hamstrung by
>>>>> how those protocols work.
>>>>>
>>>>> The classic example is 100-continue processing. This simply cannot
>>>>> work end to end across FASTCGI, SCGI, AJP, CGI and other WSGI hosting
>>>>> mechanisms where proxying is performed, as the protocol being used
>>>>> doesn't implement a notion of end to end signalling in respect of
>>>>> 100-continue.
>>>>>
>>>>> I think we need some concrete examples to figure out what is and isn't
>>>>> possible with WSGI 1.0.1.
>>>>> My motivation for participating in this discussion can be summed up in
>>>>> that I want the following two applications to work properly:
>>>>>
>>>>> - PlasmaDS (Flex Messaging implementation)
>>>>> - WebDAV
>>>>>
>>>>> The PlasmaDS project is the planned Python counterpart to Adobe's
>>>>> BlazeDS. Interoperability with the existing implementation requires
>>>>> that both the request and response use chunked transfer encoding, to
>>>>> achieve bidirectional streaming. I don't really care how this happens,
>>>>> I just want to make sure that there is nothing preventing it.
>>>> That can only be done by changing the rules around how wsgi.input is used.
>>>> I'll try and find a reference to where I have posted information about
>>>> this before, otherwise I'll write something up again about it.
>>> BTW, even if WSGI specification were changed to allow handling of
>>> chunked requests, it would not work for FASTCGI, SCGI, AJP, CGI or
>>> mod_wsgi daemon mode. Also not likely to work on uWSGI either.
>>>
>>> This is because all of these work on the expectation that the complete
>>> request body can be written across to the separate application process
>>> before actually reading the response from the application.
>>>
>>> In other words, both way streaming is not possible.
>>>
>>> The only solution which would allow this with Apache is mod_wsgi
>>> embedded mode, which in mod_wsgi 3.X already has an optional feature
>>> which can be enabled so as to allow you to step out of current bounds
>>> of the WSGI specification and use wsgi.input as I will explain, to do
>>> this both way streaming.
>>>
>>> Pure Python HTTP/WSGI servers which are a front facing server could
>>> also be modified to handle this if the WSGI specification were changed,
>>> but whether those same servers will work if put behind a web proxy will
>>> depend on how the front end web proxy works.
>> Then I suppose this needs to be standardized in PEP 444, wouldn't you agree?
> Huh! Not sure you understand what I am saying. Even if you changed the
> WSGI specification to allow for it, the bulk of implementations
> wouldn't be able to support it. The WSGI specification has no
> influence over distinct protocols such as FASTCGI, SCGI, AJP or CGI or
> proxy implementations and so can't be used to force them to be changed.
I believe I understand what you are saying, but I don't want to restrict 
the freedom of the developer just because of some implementations that 
can't support some particular feature. If you need to do streaming, use 
a server that supports it, obviously! If Java can do it, why can't we? I 
would hate having to rely on a non-standard implementation if we have 
the possibility to standardize this in a specification.
> So, as much as I would like to see WSGI specification changed to allow
> it, others may not on the basis that there is no point if few
> implementations could support it.
>
> Graham
>
>>> Graham
>>>
>>>>> The WebDAV spec, on the other hand, says
>>>>> (http://www.webdav.org/specs/rfc2518.html#STATUS_102):
>>>>>
>>>>> The 102 (Processing) status code is an interim response used to inform
>>>>> the client that the server has accepted the complete request, but has
>>>>> not yet completed it. This status code SHOULD only be sent when the
>>>>> server has a reasonable expectation that the request will take
>>>>> significant time to complete. As guidance, if a method is taking longer
>>>>> than 20 seconds (a reasonable, but arbitrary value) to process the
>>>>> server SHOULD return a 102 (Processing) response. The server MUST send
>>>>> a final response after the request has been completed.
>>>> Offhand, I don't see a way of being able to do that, as protocols like
>>>> SCGI and CGI definitely don't allow interim status. I suspect that
>>>> FASTCGI and AJP don't allow it either.
>>>>
>>>> I'll have to even do some digging as to how you would even handle that
>>>> in Apache with a normal Apache handler.
>>>>
>>>> Graham
>>>>
>>>>> Again, I don't care how this is done as long as it's possible.
>>>>>
>>>>> The current WSGI specification acknowledges that by saying:
>>>>>
>>>>> """
>>>>> Servers and gateways that implement HTTP 1.1 must provide transparent
>>>>> support for HTTP 1.1's "expect/continue" mechanism. This may be done
>>>>> in any of several ways:
>>>>>
>>>>> * Respond to requests containing an Expect: 100-continue request with
>>>>> an immediate "100 Continue" response, and proceed normally.
>>>>> * Proceed with the request normally, but provide the application with
>>>>> a wsgi.input stream that will send the "100 Continue" response if/when
>>>>> the application first attempts to read from the input stream. The read
>>>>> request must then remain blocked until the client responds.
>>>>> * Wait until the client decides that the server does not support
>>>>> expect/continue, and sends the request body on its own. (This is
>>>>> suboptimal, and is not recommended.)
>>>>> """
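
To make the second option in the quoted list concrete, here is a minimal sketch of a wsgi.input wrapper that sends the interim response only when the application first reads from it. The class name ContinueOnReadInput and its constructor arguments are hypothetical, and it assumes the server has direct access to a writable client connection, which is exactly what gateways behind FASTCGI, SCGI, AJP or CGI lack.

```python
import io


class ContinueOnReadInput:
    """Hypothetical wsgi.input wrapper that sends "100 Continue" lazily.

    Assumes the server can write straight to the client connection.
    """

    def __init__(self, body_stream, client_writer, expects_continue):
        self._body = body_stream          # file-like object holding the request body
        self._writer = client_writer      # file-like object wrapping the client socket
        self._pending = expects_continue  # True if the request had "Expect: 100-continue"

    def _send_continue_if_needed(self):
        # Emit the interim response at most once, just before the first read.
        if self._pending:
            self._writer.write(b"HTTP/1.1 100 Continue\r\n\r\n")
            self._writer.flush()
            self._pending = False

    def read(self, size=-1):
        self._send_continue_if_needed()
        return self._body.read(size)

    def readline(self, size=-1):
        self._send_continue_if_needed()
        return self._body.readline(size)


# Stand-ins for the client connection and the request body.
wire = io.BytesIO()
stream = ContinueOnReadInput(io.BytesIO(b"payload"), wire, expects_continue=True)
nothing_sent_yet = wire.getvalue() == b""  # the app could still reject on headers alone
data = stream.read()                       # first read triggers "100 Continue"
```

An application that returns an error response without ever calling read() never triggers the interim response, which is the control the second option is meant to give.
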
>>>>>
>>>>> If you are going to try and push for full visibility of HTTP/1.1 and
>>>>> an ability to control it at the application level then you will fail
>>>>> with 100-continue to start with.
>>>>>
>>>>> So, although option 2 above would be the ideal one, since it gives the
>>>>> application control, specifically the ability to send an error
>>>>> response based on request headers alone, without reading the request
>>>>> content and triggering the 100-continue, it isn't practical to require
>>>>> it, as the majority of hosting mechanisms for WSGI wouldn't even be
>>>>> able to implement it that way.
>>>>>
>>>>> The same goes for any other feature: there is no point mandating a
>>>>> feature that can only be realistically implemented on a minority of
>>>>> implementations. This would be even worse where dependence on such a
>>>>> feature would mean that the WSGI application would no longer be
>>>>> portable to another WSGI server, which destroys the notion that WSGI
>>>>> provides a portable interface.
>>>>>
>>>>> This isn't just restricted to HTTP/1.1 features either, but also
>>>>> applies to raw SCRIPT_NAME and PATH_INFO as well. Only WSGI servers
>>>>> that are directly hooked into the URL parsing of the base HTTP server
>>>>> can provide that information, which basically means that only pure
>>>>> Python HTTP/WSGI servers are likely able to provide it without
>>>>> guessing, and in that case such servers are usually used with the
>>>>> WSGI application mounted at the root anyway.
>>>>>
>>>>> Graham
>>>>>
>>>>> On 7 January 2011 09:29, Graham Dumpleton<graham.dumpleton at gmail.com> wrote:
>>>>>
>>>>> On 7 January 2011 08:56, Alice Bevan–McGregor<alice at gothcandy.com> wrote:
>>>>>
>>>>> On 2011-01-06 13:06:36 -0800, James Y Knight said:
>>>>>
>>>>> On Jan 6, 2011, at 3:52 PM, Alice Bevan–McGregor wrote:
>>>>>
>>>>> :: Making optional (and thus rarely-implemented) features non-optional.
>>>>> E.g. server support for HTTP/1.1 with clarifications for interfacing
>>>>> applications to 1.1 servers.  Thus pipelining, chunked encoding, et al.
>>>>> as per the HTTP 1.1 RFC.
>>>>>
>>>>> Requirements on the HTTP compliance of the server don't really have any
>>>>> place in the WSGI spec. You should be able to be WSGI compliant even if
>>>>> you don't use the HTTP transport at all (e.g. maybe you just send
>>>>> around requests via SCGI).
>>>>> The original spec got this right: chunking etc are something which is
>>>>> not relevant to the wsgi application code -- it is up to the server to
>>>>> implement the HTTP transport according to the HTTP spec, if it's
>>>>> purporting to be an HTTP server.
>>>>>
>>>>> Chunking is actually quite relevant to the specification, as WSGI and
>>>>> PEP 444 / WSGI 2 (damn, that's getting tedious to keep dual-typing ;)
>>>>> allow for chunked bodies regardless of higher-level support for
>>>>> chunking.  The body iterator.  Previously you /had/ to define a length,
>>>>> with chunked encoding at the server level, you don't.
>>>>>
>>>>> I agree, however, that not all gateways will be able to implement the
>>>>> relevant HTTP/1.1 features.  FastCGI does, SCGI after a quick Google
>>>>> search, seems to support it as well. I should re-word it as:
>>>>>
>>>>> "For those servers capable of HTTP/1.1 features the implementation of
>>>>> such features is required."
>>>>>
>>>>> I would question whether FASTCGI, SCGI or AJP support the concept of
>>>>> chunking of responses to the extent that the application can prepare
>>>>> the final content including chunks as required by the HTTP
>>>>> specification. Further, in Apache at least, the output from a web
>>>>> application served via those protocols is still pushed through the
>>>>> Apache output filter chain so as to allow the filters to modify the
>>>>> response, eg., apply compression using mod_deflate. As a consequence,
>>>>> the standard HTTP 'CHUNK' output filter is still a part of the output
>>>>> filter stack. This means that were a web application to try and do
>>>>> chunking itself, then Apache would rechunk such that the original
>>>>> chunking became part of the content, rather than the transfer
>>>>> encoding.
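
The rechunking effect described above can be illustrated with a small sketch. The helpers chunk_encode and chunk_decode are hypothetical and only model HTTP/1.1 chunk framing (no trailers or chunk extensions); they are not Apache's filter code. The point is that when the server chunks output that the application has already chunked, a client that undoes the transfer encoding once is left with the application's framing as content.

```python
def chunk_encode(pieces):
    """Frame each non-empty piece as an HTTP/1.1 chunk and add the terminator."""
    out = b""
    for piece in pieces:
        if piece:
            out += b"%x\r\n" % len(piece) + piece + b"\r\n"
    return out + b"0\r\n\r\n"


def chunk_decode(data):
    """Undo one layer of chunk framing (no trailers, no chunk extensions)."""
    body = b""
    pos = 0
    while True:
        eol = data.index(b"\r\n", pos)
        size = int(data[pos:eol], 16)
        if size == 0:
            return body
        start = eol + 2
        body += data[start:start + size]
        pos = start + size + 2  # skip the chunk data and its trailing CRLF


# The application chunks its own output...
app_output = chunk_encode([b"hello", b"world"])
# ...and the server's output filter chunks it again before it goes on the wire.
on_the_wire = chunk_encode([app_output])
# The client removes the transfer encoding once and sees the app's framing as body.
client_body = chunk_decode(on_the_wire)
```

Here client_body equals app_output rather than the intended content, so the size lines and CRLFs the application wrote end up in the document the client receives.
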
>>>>>
>>>>> So, in order to be able to achieve what I think you want, with a web
>>>>> application being able to do chunking itself, you would need to modify
>>>>> the implementations of mod_fcgid, mod_fastcgi, mod_scgi, mod_ajp and
>>>>> also likely mod_cgi and mod_cgid of Apache.
>>>>>
>>>>> The only WSGI implementation I know of for Apache where you might even
>>>>> be able to do what you want is uWSGI. This is because I believe from
>>>>> memory it uses a mode in Apache by default called assbackwards. What
>>>>> this allows is for the output from the web application to bypass the
>>>>> Apache output filter stack and directly control the raw HTTP output.
>>>>> This gives uWSGI a little bit less overhead in Apache, but at the loss
>>>>> of the ability to actually use Apache output filters and for Apache to
>>>>> fix up response headers in any way. There is a flag in uWSGI which can
>>>>> optionally be set to make it use the more traditional mode and not use
>>>>> assbackwards.
>>>>>
>>>>> Thus, I believe you would be fighting against server implementations
>>>>> such as Apache and likely also nginx, Cherokee, lighttpd etc, to allow
>>>>> chunking to be supported at the level of the web application.
>>>>>
>>>>> About all you can do is ensure that the WSGI specification doesn't
>>>>> include anything in it which would prevent a web application
>>>>> harnessing indirectly such a feature as chunking where the web server
>>>>> supports it.
>>>>>
>>>>> As it is, it isn't even chunked responses that are the problem,
>>>>> because if an underlying web server supports chunking for responses,
>>>>> all you need to do is not set the content length.
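
As a minimal sketch of that point, the WSGI application below simply omits Content-Length and yields its body piece by piece. The name streaming_app is made up for illustration, and whether the response actually goes out chunked is left entirely to the server and the client's HTTP version.

```python
def streaming_app(environ, start_response):
    """Yield the body in pieces without declaring its length up front."""
    # No Content-Length header here: a server that supports chunked
    # responses is then free to apply the transfer encoding itself.
    start_response("200 OK", [("Content-Type", "text/plain")])
    for number in range(3):
        yield ("part %d\n" % number).encode("ascii")
```

The application never produces chunk framing itself; it only hands over an iterable of byte strings.
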
>>>>>
>>>>> The problem area with chunking is the request content, as the way that
>>>>> the WSGI specification is written prevents being able to have chunked
>>>>> request content. I have described the issue previously and made
>>>>> suggestions about an alternate way that wsgi.input could be used.
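
To show where the difficulty lies, here is a hedged sketch of how an application might read a request body. With CONTENT_LENGTH present it reads exactly that many bytes. Without it, as with a chunked request, the fallback branch assumes the server makes wsgi.input return an empty string at the end of the de-chunked body. That end-of-stream behaviour is an assumption that goes beyond the current specification, and it is not necessarily the alternative referred to above. The function name read_request_body is hypothetical.

```python
import io


def read_request_body(environ, block_size=8192):
    """Read the whole request body from wsgi.input."""
    stream = environ["wsgi.input"]
    length = environ.get("CONTENT_LENGTH")
    if length:
        # Within the current rules: read exactly the declared number of bytes.
        return stream.read(int(length))
    # ASSUMPTION: the server signals the end of a chunked request body by
    # returning an empty string, which the current spec does not promise.
    parts = []
    while True:
        block = stream.read(block_size)
        if not block:
            break
        parts.append(block)
    return b"".join(parts)


# A chunked request has no CONTENT_LENGTH; BytesIO stands in for a server
# that has already removed the chunk framing and reports end of stream.
environ = {"wsgi.input": io.BytesIO(b"streamed request data")}
body = read_request_body(environ, block_size=4)
```

On a server that does not signal end of stream this way, the fallback loop could block or read nothing, which is why it cannot be relied on portably today.
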
>>>>>
>>>>> Graham
>>>>>
>>>>> +1
>>>>>
>>>>>         - Alice.