[Web-SIG] HTTP 1.1 Expect/Continue handling

Tue Jan 29 08:53:02 CET 2008

On Jan 29, 2008, at 1:36 AM, Brian Smith wrote:

> 1. The WSGI gateway must send the response headers immediately when  
> the application yields its first non-empty string.
>
> 2. When there is a "100-continue" token in the request "Expect:"  
> header, the WSGI gateway is allowed to delay sending the "100  
> Continue" response until the application reads from  
> environ["wsgi.input"].
>
> Consequently, if there is a 100-continue expectation, then a WSGI  
> application must not read from wsgi.input after yielding its first  
> non-empty string.
>
> For example, the following application results in undefined  
> (probably erroneous) behavior:
>
> 	def application(environ, start_response):
> 		start_response("400 Bad Request", [])
> 		yield "400 Bad Request"
> 		environ["wsgi.input"].read(1)

Agreed, this is ambiguous in the WSGI specs. However, there is a  
mitigating factor:

The above example should not cause misbehavior when talking to well- 
designed clients. Clients are basically required to always send the  
request body, whether or not a 100-continue arrives, unless the  
connection gets closed, in order to work with older and misdesigned  
servers. They may delay a bit, to see if the server will close the  
connection, but otherwise ought to start sending the request body in  
any case.

However, this omission in the WSGI spec does allow for violation of  
the HTTP RFC:
> Upon receiving a request which includes an Expect request-header  
> field with the “100-continue” expectation, an origin server MUST  
> either respond with 100 (Continue) status and continue to read from  
> the input stream, or respond with a final status code. The origin  
> server MUST NOT wait for the request body before sending the 100  
> (Continue) response. If it responds with a final status code, it MAY  
> close the transport connection or it MAY continue to read and  
> discard the rest of the request. It MUST NOT perform the requested  
> method if it returns a final status code.

If you changed your example to start_response("200 OK", []), that  
would violate the "MUST NOT perform the requested method" clause.

I see three ways to resolve this:

a) One is to clarify this as a requirement upon the WSGI gateway.  
Something like the following:
"If the client requests Expect: 100-continue, and the application  
yields data before reading from the input, and the response code is a  
success (2xx) code, then the gateway MUST send a 100 Continue  
response before writing any other response headers, in order to comply  
with RFC 2616 §8.2.3 and to allow the WSGI application to read from  
the input stream later on in request processing".

This should handle most real-world cases. Sending 100 only when the  
response code is 2xx is admittedly a bit fragile, and it won't help  
your dummy app above (and perhaps some real app does want the input  
data even for an error response). On the other hand, you really  
*don't* want to force the transmission of a 100 Continue when the  
server is sending e.g. a "400 Bad Request" response and likely won't  
ever read the input data.

b) Alternatively, the WSGI gateway could raise an exception when you  
attempt to respond with a success code without having read the input.  
This also satisfies RFC 2616's prohibition against a successful  
execution of the request without a 100 Continue response, but seems to  
me more likely to break things than help them, so I'd say (a) is  
strictly better.

c) Another option is to clarify this as a requirement for a WSGI  
application: "An application must not read from wsgi.input after  
yielding its first non-empty string unless it has already read from  
wsgi.input before having yielded its first non-empty string.  
(environ["wsgi.input"].read(0) may be used to indicate the desire to  
read the input in the future and satisfy this requirement, without  
actually reading any data.)"
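
To make (c) concrete, here is a rough sketch of an application that follows the proposed rule, together with a toy gateway-side input wrapper so the ordering can be observed. The names LazyContinueInput and run_demo are hypothetical and not part of WSGI or any real server; the wrapper only illustrates the "delay the 100 Continue until the first read" behavior discussed above. It is written for Python 3, so the body chunks are bytes.

```python
import io


class LazyContinueInput:
    """Hypothetical gateway-side wrapper around the request body stream."""

    def __init__(self, stream, send_continue):
        self._stream = stream
        self._send_continue = send_continue  # callable that emits "100 Continue"
        self.continued = False

    def read(self, size=-1):
        # Any read, even read(0), is treated as interest in the body,
        # so the interim response goes out before data is requested.
        if not self.continued:
            self.continued = True
            self._send_continue()
        return self._stream.read(size)


def application(environ, start_response):
    body = environ["wsgi.input"]
    body.read(0)  # declare the intent to read later; consumes no data
    start_response("200 OK", [("Content-Type", "text/plain")])
    yield b"working\n"
    # Reading after the first yield is now safe under rule (c).
    data = body.read(int(environ.get("CONTENT_LENGTH") or 0))
    yield b"got %d bytes\n" % len(data)


def run_demo():
    """Drive the application once and record what the 'gateway' saw."""
    events = []
    wire = LazyContinueInput(io.BytesIO(b"hello"),
                             lambda: events.append("100 Continue"))
    environ = {
        "wsgi.input": wire,
        "CONTENT_LENGTH": "5",
        "HTTP_EXPECT": "100-continue",
    }

    def start_response(status, headers, exc_info=None):
        events.append(status)

    chunks = list(application(environ, start_response))
    return events, chunks
```

Calling run_demo() shows the interim response being recorded before the final status, which is the ordering the read(0) trick is meant to guarantee.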

The way I see it, (a) is not a change in the spec, but just a  
clarification. The combination of the current spec and the HTTP RFC  
implies that you should do that already, in order to not violate RFC 2616  
(although it's quite likely nobody actually is, not having realized  
the requirement). (c) on the other hand, is truly a change in the  
spec, but is a bit theoretically cleaner.
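
For what it's worth, the gateway-side check in (a) is small. A minimal sketch, assuming the gateway tracks whether wsgi.input has been read yet and has already sent the 100 Continue if so; the function names wants_continue and must_send_continue_first are made up for illustration and do not come from any existing gateway:

```python
def wants_continue(environ):
    """True if the request's Expect header carries the 100-continue token."""
    expect = environ.get("HTTP_EXPECT", "")
    # Expectation tokens are compared case-insensitively.
    return any(tok.strip().lower() == "100-continue"
               for tok in expect.split(","))


def must_send_continue_first(environ, status, input_was_read):
    """Decide whether a 100 Continue must precede the final headers.

    Called when the application yields its first non-empty string.
    """
    if not wants_continue(environ):
        return False
    if input_was_read:
        # Assumption: the gateway sent 100 Continue on the first read.
        return False
    code = int(status.split(None, 1)[0])
    return 200 <= code < 300  # only success responses trigger it
```

With this rule, a "200 OK" yielded before any read gets a 100 Continue sent ahead of it, while a "400 Bad Request" goes out directly.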

> Should the application be able to detect whether there is a "100- 
> continue" token in the Expect header of the request?

No.

> Or, is the WSGI gateway allowed/required to hide the token?

Allowed.

> Another consequence is that an application cannot explicitly respond  
> with a "100 Continue" itself, like this:
>
> 	def application(environ, start_response):
> 		start_response("100 Continue", [])
> 		yield ""
> 		start_response("200 OK", [])
> 		yield "OK"
>
> The reason is that start_response cannot be called twice except  
> when an exception is detected, and also the "100 Continue" would not  
> be sent until right before the "200 OK" was sent anyway.

That's not really a consequence of the above discussion, but, yes,  
that's true.

James