[Web-SIG] WSGI & transfer-encodings

Thu Sep 16 19:57:16 CEST 2004

It is unclear to me from the WSGI spec what parts of HTTP a WSGI 
application is responsible for handling, and what the host server or 
middleware has to expect from the app. Sorry if this has been discussed 
previously, but it doesn't appear in the PEP.

1) Does the server need to decode incoming chunked encoding? The CGI 
spec essentially forbids incoming requests with a chunked 
transfer-encoding (and thus any other transfer-encoding as well), 
because it requires the CONTENT_LENGTH variable to be present whenever 
there is incoming content. Does WSGI do the same thing?

I would suggest the answer should be that WSGI does *not* require 
CONTENT_LENGTH to be present when there is incoming data. This requires 
at least the modification of:

> The server is not required to read past the client's specified 
> Content-Length, and is allowed to simulate an end-of-file condition if 
> the application attempts to read past that point. The application 
> should not attempt to read more data than is specified by the 
> CONTENT_LENGTH variable.

This would have to state something like: "The server must simulate an 
end-of-file condition if the application attempts to read more data 
than is specified by the Content-Length or the incoming 
Transfer-Encoding."
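
To illustrate the proposed wording, here is a minimal sketch of how a 
server might wrap the raw connection so that the application sees 
end-of-file once the declared body has been consumed. The class name 
LimitedInput is hypothetical and not part of the WSGI spec. The sketch 
assumes a byte-oriented stream (read() returning b"" at EOF, as in 
present-day Python) and covers only the Content-Length case, not 
chunked decoding.

```python
import io


class LimitedInput:
    """Hypothetical wrapper: simulate EOF after `limit` bytes."""

    def __init__(self, stream, limit):
        self.stream = stream
        self.remaining = limit

    def read(self, size=-1):
        # Past the declared body: report EOF instead of touching the
        # underlying connection, which may hold the next request.
        if self.remaining <= 0:
            return b""
        # Clamp unbounded or oversized reads to what is left of the body.
        if size is None or size < 0 or size > self.remaining:
            size = self.remaining
        data = self.stream.read(size)
        self.remaining -= len(data)
        return data


# The raw stream holds an 11-byte body followed by unrelated bytes.
raw = io.BytesIO(b"hello worldNEXT REQUEST")
body = LimitedInput(raw, 11)
first = body.read(5)
rest = body.read()
after = body.read(10)
```

With this wrapper, a read() with no arguments becomes safe as well, 
since it is bounded by the remaining body length.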

With that change, the only way to tell whether there is incoming data 
is to attempt to read() the input stream. read() will either 
immediately return an EOF condition (returning '') or else return the 
data. Also, it seems that read() with no args isn't allowed? Perhaps it 
should be.
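
A sketch of what reading the request body might look like for an 
application under this proposal, without consulting CONTENT_LENGTH. The 
helper name read_body and the block size are assumptions for 
illustration. It relies on the server signalling EOF with an empty 
result from read(size), and uses bytes (b"") where the text above says 
''.

```python
import io


def read_body(environ, block_size=8192):
    """Hypothetical helper: read wsgi.input until the server reports EOF."""
    stream = environ["wsgi.input"]
    chunks = []
    while True:
        block = stream.read(block_size)
        if not block:
            # Empty result means EOF: either there was no body at all,
            # or the whole body has now been consumed.
            break
        chunks.append(block)
    return b"".join(chunks)


# Stand-in environ dicts; a real server would supply wsgi.input.
with_body = read_body({"wsgi.input": io.BytesIO(b"x" * 20000)})
without_body = read_body({"wsgi.input": io.BytesIO(b"")})
```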

2) The server is responsible for connection-oriented headers, and the 
spec states it may override the client's headers in this case. I would 
take this to mean I should just ignore the client-provided Connection 
and Transfer-Encoding headers and supply those myself according to the 
HTTP spec.

But what about transfer-encoding on the application's output? The spec 
says the server is allowed to add a chunked encoding, but:
- Is an application allowed to yield data that has already been encoded 
into chunked form?
- What if it does so and you're talking to an HTTP/1.0 client? Should 
the server decode the chunking? Or should it just let the application 
produce bogus output?
- May the application provide data with a gzip transfer-encoding?
- What if the server already handles all connection-oriented behavior 
transparently and doesn't even pass on the Connection, Keep-Alive, TE, 
Trailers, Transfer-Encoding, Upgrade headers to the client? Is that 
okay?
- Wouldn't providing pre-encoded data screw up middleware that is 
expecting to do something useful with the data going through it?

I would suggest that the correct answer is: the application should 
have nothing to do with any connection-oriented behavior. It should not 
send a Connection or Transfer-Encoding header and should not expect to 
receive the Connection, Keep-Alive, TE, Trailers, Transfer-Encoding, or 
Upgrade headers, although it is optional for the server to strip them. 
The application should not apply a transfer-encoding to its output and 
the server should not give it a transfer-encoded input.
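
As a rough sketch of that division of labor, a server or middleware 
could filter the six headers named above in both directions. The names 
strip_hop_by_hop and HOP_BY_HOP are hypothetical and nothing here is 
mandated by the WSGI spec. It assumes request headers appear in environ 
under the usual CGI-style HTTP_* keys.

```python
# The connection-oriented headers listed in the paragraph above.
HOP_BY_HOP = ("connection", "keep-alive", "te", "trailers",
              "transfer-encoding", "upgrade")


def strip_hop_by_hop(app):
    """Hypothetical middleware: hide connection-level headers from `app`
    and drop any that `app` tries to send."""

    def wrapper(environ, start_response):
        # Work on a copy so the caller's environ is left untouched.
        environ = dict(environ)
        for name in HOP_BY_HOP:
            environ.pop("HTTP_" + name.upper().replace("-", "_"), None)

        def filtered_start_response(status, headers, exc_info=None):
            # Header names are case-insensitive, so compare in lowercase.
            kept = [(k, v) for k, v in headers
                    if k.lower() not in HOP_BY_HOP]
            return start_response(status, kept, exc_info)

        return app(environ, filtered_start_response)

    return wrapper
```

This only removes the headers; it does not undo an encoding the 
application has already applied to its body, which is one more reason 
for the application not to apply one in the first place.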

James