[Web-SIG] WSGI & transfer-encodings

Phillip J. Eby pje at telecommunity.com
Thu Sep 16 20:30:28 CEST 2004


At 01:57 PM 9/16/04 -0400, James Y Knight wrote:
>It is unclear to me from the WSGI spec what parts of HTTP a WSGI 
>application is responsible for handling, and what the host server or 
>middleware has to expect from the app.

The general section for such issues is:

     http://www.python.org/peps/pep-0333.html#other-http-features

The advice is that in general, a WSGI server should consider itself an HTTP 
proxy server, and should consider the application an HTTP origin server.

However, this doesn't fully cover the two issues you've brought up, so 
thanks for bringing them to my attention!


>1) Does the server need to decode incoming chunked encoding? The CGI spec 
>essentially forbids incoming requests with chunked (and thus all others as 
>well) transfer-encoding, as the CONTENT_LENGTH header is required to be 
>present when there is incoming content. Does WSGI do the same thing?
>
>I would suggest the answer should be that WSGI does *not* require 
>CONTENT_LENGTH to be present when there is incoming data.

Hm.  An interesting conundrum.  Do any Python servers or applications exist 
today that *work* when there's no content-length?

Personally, I'm thinking that WSGI should follow CGI here, and decode 
incoming transfer encodings.  If this means HTTP/1.1 servers have to dump 
the incoming data to a file first, so be it.
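To illustrate, here's a rough sketch (modern Python, names hypothetical, not tested against any real server) of how a server might decode a chunked request body up front, so it can hand the app a plain stream plus a CONTENT_LENGTH:

```python
import io

def decode_chunked(stream):
    """Decode an HTTP/1.1 chunked-encoded body from a file-like
    object and return the raw bytes.  Trailer headers, if any,
    are read and discarded.  (A real server would spool large
    bodies to a temp file instead of a BytesIO.)"""
    body = io.BytesIO()
    while True:
        # Each chunk starts with its size in hex, optionally
        # followed by ";name=value" chunk extensions.
        size_line = stream.readline().split(b";")[0].strip()
        size = int(size_line, 16)
        if size == 0:
            break
        body.write(stream.read(size))
        stream.readline()  # consume the CRLF after the chunk data
    # Consume any trailer headers up to the terminating blank line.
    while True:
        line = stream.readline()
        if line in (b"\r\n", b"\n", b""):
            break
    return body.getvalue()

data = decode_chunked(io.BytesIO(b"5\r\nhello\r\n0\r\n\r\n"))
# data == b"hello"; the server can now set CONTENT_LENGTH = str(len(data))
```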



>The only way to tell if there's incoming data is therefore to attempt to 
>read() the input stream. read() will either immediately return an EOF 
>condition (returning '') or else read the data. Also, it seems that read() 
>with no args isn't allowed? Perhaps it should be.

A no-argument read would be problematic in some environments -- CGI for 
example.
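In other words, the portable pattern (just a sketch, assuming CONTENT_LENGTH is always available as discussed above) is to read exactly the advertised number of bytes and never rely on EOF or a no-argument read():

```python
import io

def read_body(environ):
    """Portable body read for a WSGI/CGI-style environ: never call
    read() with no argument; read at most CONTENT_LENGTH bytes
    (zero if the header is absent or empty)."""
    try:
        length = int(environ.get("CONTENT_LENGTH") or 0)
    except ValueError:
        length = 0
    return environ["wsgi.input"].read(length)

environ = {"CONTENT_LENGTH": "5", "wsgi.input": io.BytesIO(b"hello, extra")}
read_body(environ)  # b"hello" -- anything past CONTENT_LENGTH is left untouched
```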


>2) The server is responsible for connection-oriented headers, and the spec 
>states it may override the client's headers in this case. I would take 
>this to mean I should just ignore the client provided Connection and 
>Transfer-Encoding headers and supply those myself according to HTTP spec.
>
>But what about transfer-encoding? The spec says the server is allowed to 
>add a chunked encoding. But,
>- Is an application allowed to yield data that has already been encoded 
>into chunked form?
>- What if it does so and you're talking to a HTTP 1.0 client? Should the 
>server decode the chunking? Or should it just let the application produce 
>bogus output?
>- May the application provide data with a gzip transfer-encoding?
>- What if the server already handles all connection-oriented behavior 
>transparently and doesn't even pass on the Connection, Keep-Alive, TE, 
>Trailers, Transfer-Encoding, Upgrade headers to the client? Is that okay?

The answer to all these questions, according to the current spec, is yes, 
absolutely.  (Per the "server=proxy server, application=origin server" model).


>- Wouldn't providing pre-encoded data screw up middleware that is 
>expecting to do something useful with the data going through it?

Yes, it would.  There are at least two ways to handle it, though:

1. Don't use middleware that's not smart enough to handle your app's output

2. Have the server or middleware munge HTTP_ACCEPT_ENCODING or other 
parameters on the way in to the application, so that the application (if 
written correctly) won't send data the server or middleware can't handle.
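Option 2 is almost trivial to implement; a sketch (hypothetical names, untested) of such a munging wrapper:

```python
def force_identity(app):
    """Hypothetical middleware sketch: advertise to the wrapped
    application that only the identity coding is acceptable, so a
    well-behaved app won't hand back gzipped (or otherwise encoded)
    data that this layer can't interpret."""
    def wrapper(environ, start_response):
        environ["HTTP_ACCEPT_ENCODING"] = "identity;q=1.0, *;q=0"
        return app(environ, start_response)
    return wrapper
```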



>I would suggest that the correct answer is: the application should 
>have nothing to do with any connection-oriented behavior. It should not 
>send a Connection or Transfer-Encoding header and should not expect to 
>receive the Connection, Keep-Alive, TE, Trailers, Transfer-Encoding, or 
>Upgrade headers, although it is optional for the server to strip them. The 
>application should not apply a transfer-encoding to its output and the 
>server should not give it a transfer-encoded input.

I like most of this, *except* that I'd like to leave open the option of an 
application providing transfer-encoding on its output.  I'd rather have 
servers and middleware set HTTP_ACCEPT_ENCODING to "identity;q=1.0, *;q=0" 
(or an empty string, or delete the entry), if they interpret content, and 
have applications be required to respect this.  Specifically, an 
application can only apply a content-encoding if it matches a non-zero 
quality in HTTP_ACCEPT_ENCODING.
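The check an application would have to make is straightforward; a sketch of the proposed rule (my reading of it, anyway -- untested, and deliberately ignoring the finer points of RFC 2616 Accept-Encoding matching):

```python
def acceptable(environ, coding):
    """Return True only if `coding` carries a non-zero quality in
    HTTP_ACCEPT_ENCODING.  An absent or empty header is treated as
    allowing identity only, per the proposal above."""
    header = environ.get("HTTP_ACCEPT_ENCODING", "")
    if not header.strip():
        return coding == "identity"
    qualities = {}
    for field in header.split(","):
        parts = [p.strip() for p in field.split(";")]
        q = 1.0
        for p in parts[1:]:
            if p.startswith("q="):
                q = float(p[2:])
        qualities[parts[0].lower()] = q
    if coding in qualities:
        return qualities[coding] > 0
    return qualities.get("*", 0) > 0
```

With the "identity;q=1.0, *;q=0" value suggested above, this permits identity and rejects everything else, which is exactly the effect a content-interpreting server or middleware wants.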


