[Web-SIG] CherryPy WSGI server and wsgi.input.read() with no argument.

James Y Knight foom at fuhm.net
Fri Mar 30 00:52:41 CEST 2007


On Mar 29, 2007, at 6:09 PM, Graham Dumpleton wrote:
> On 30/03/07, Robert Brewer <fumanchu at amor.org> wrote:
>
>> We chose to not simulate the EOF, requiring app authors do that for
>> themselves

CherryPy's developers are correct: they are following the WSGI spec.
It is your app that is broken.

> As I believe I have pointed out on the Python web-sig list before, the
> statement:
>
> "The application should not attempt to read more data than is
> specified by the CONTENT_LENGTH variable."
>
> is actually a bit bogus.

This requirement comes from CGI. CGI scripts cannot support unknown  
data lengths (yes, this means no chunked transfer). CONTENT_LENGTH is  
required to be provided if there is data, and the server is not  
required to provide an EOF after reading CONTENT_LENGTH bytes. WSGI  
inherits the same restrictions.

I do agree with you that this was a mistake. WSGI should require WSGI
servers/gateways to provide an EOF for read(), always, and should make
a break from CGI and declare that CONTENT_LENGTH=0 means no data and  
CONTENT_LENGTH empty/missing means undefined length. This is  
something which ought to be fixed for the next revision of WSGI. This  
makes it a tiny bit harder to write a CGI gateway, of course, but  
it's worth it in my opinion, for the reasons you describe.
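
To make the proposed server-side behavior concrete, here is a minimal
sketch of a wrapper a server or gateway could place around the raw input
stream so that read() reports EOF once CONTENT_LENGTH bytes have been
consumed. The class name LimitedInput is hypothetical, the sketch assumes
Python 3 byte strings, and it covers only read(), not readline() or
iteration.

```python
import io


class LimitedInput:
    """Hypothetical wrapper: simulate EOF after `length` bytes."""

    def __init__(self, stream, length):
        self._stream = stream
        self._remaining = length

    def read(self, size=-1):
        # Treat a missing or negative size as "everything that is left",
        # and never ask the underlying stream for more than that.
        if size is None or size < 0 or size > self._remaining:
            size = self._remaining
        if size == 0:
            return b""  # simulated EOF
        data = self._stream.read(size)
        self._remaining -= len(data)
        return data


# Usage: the socket may hold more bytes (e.g. a pipelined request),
# but the app only ever sees the declared body.
body = LimitedInput(io.BytesIO(b"abcdefNEXT REQUEST"), 6)
first = body.read()
second = body.read()
```

With a wrapper like this in place, an app calling read() with no argument
would get the body and then an empty string, instead of blocking on the
socket.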

HOWEVER, given that the current WSGI spec does not specify that, apps  
*cannot* depend upon that behavior. If your app does an unbounded
read(), it's wrong. And, by reference to the CGI spec, if a server omits
CONTENT_LENGTH, and there is data, it is wrong. The server ought to  
return a 411 Length Required if you attempt to access a WSGI app and  
provide chunked data.
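
For the app side, a bounded read under the current spec might look like
the following sketch. The helper name read_request_body is hypothetical,
and the code assumes Python 3 byte strings; a missing or empty
CONTENT_LENGTH is treated as "no data", as the CGI rules imply.

```python
import io


def read_request_body(environ):
    """Read at most CONTENT_LENGTH bytes from wsgi.input, never more."""
    try:
        length = int(environ.get("CONTENT_LENGTH") or 0)
    except ValueError:
        length = 0
    if length <= 0:
        return b""

    stream = environ["wsgi.input"]
    chunks = []
    remaining = length
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:
            break  # stream ended early, e.g. client disconnected
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


# Usage with a fake environ; the extra bytes are never touched.
environ = {"CONTENT_LENGTH": "5", "wsgi.input": io.BytesIO(b"helloEXTRA")}
body = read_request_body(environ)
```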

And, indeed, server code I wrote is wrong in just this way: it can  
omit CONTENT_LENGTH when given chunked data on input. Spec-compliant  
WSGI apps would then assume there's no input data, which would
cause data loss. Luckily nobody ever passes chunked data on input. :)
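
One way to avoid that silent data loss is to refuse such requests before
they reach the app. The following is only a sketch: the middleware name
require_content_length is hypothetical, and it assumes the server exposes
the request's Transfer-Encoding header as HTTP_TRANSFER_ENCODING in the
environ, which is the usual mapping for request headers but is worth
checking for a given server.

```python
def require_content_length(app):
    """Hypothetical middleware: answer 411 to chunked requests that
    arrive without a CONTENT_LENGTH."""

    def middleware(environ, start_response):
        encoding = environ.get("HTTP_TRANSFER_ENCODING", "").lower()
        if "chunked" in encoding and not environ.get("CONTENT_LENGTH"):
            body = b"Length Required\n"
            start_response(
                "411 Length Required",
                [
                    ("Content-Type", "text/plain"),
                    ("Content-Length", str(len(body))),
                ],
            )
            return [body]
        # Everything else passes through untouched.
        return app(environ, start_response)

    return middleware
```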

James

PS: what about the readline(size) problem? Are we just going to  
continue indefinitely pretending that it's okay that the spec forbids  
using readline(size) and that cgi.FieldStorage calls it? Perhaps a  
WSGI 1.1 fixing these issues would be a good idea? 

