[Web-SIG] Chunked Transfer encoding on request content.

Graham Dumpleton grahamd at dscpl.com.au
Mon Mar 5 05:50:43 CET 2007


Robert Brewer wrote ..
> Graham Dumpleton wrote:
> > In CherryPy, when it sees that the Transfer-Encoding
> > is set to 'chunked' while parsing the HTTP headers,
> > it will at that point, even before it has called
> > start_response for the WSGI application, read in all
> > content from the body of the request.
> > 
> > CherryPy reads in the content like this for two reasons.
> > The first is so that it can then determine the overall
> > length of the content that was available and set the
> > CONTENT_LENGTH value in the WSGI environ.
> 
> Right; IIRC the rfile just hangs if you try to read
> past Content-Length. Perhaps that can be fixed inside
> socket.makefile somewhere?
> 
> > The second reason is so that it can read in any
> > additional HTTP header fields that may occur in
> > the trailer after the last data chunk and also
> > incorporate them into the WSGI environ.
> 
> Yeah; I didn't see any other way to get Trailers into
> the environ. Perhaps that can be added to WSGI 2.0?

Don't know how you could cater for trailers in WSGI 2.0 without coming up with
some totally new scheme of passing such additional information to the WSGI
application.

First idea I can think of at present is that, if chunked transfer encoding is
in use, the WSGI server sets 'wsgi.trailers' to an empty dictionary which it
keeps a reference to and only populates when it actually encounters the
trailers. I.e., it is only guaranteed to be populated once read() finally
returns an empty string. Any middleware would be obligated to pass the
reference through and not copy the dictionary, so that changes made later back
at the WSGI server layer would be visible to the application.
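
A rough sketch of how that first idea might look on the server side, written
for Python 3 with bytes. The 'ChunkedInput' class and 'make_environ' helper are
made-up names for illustration, 'wsgi.trailers' is only the key proposed above
and not part of any WSGI specification, and there is no error handling for
malformed chunk sizes or truncated input.

```python
import io


class ChunkedInput:
    """Hypothetical 'wsgi.input' replacement that decodes a chunked body."""

    def __init__(self, rfile, trailers):
        self.rfile = rfile        # raw stream positioned just after the request headers
        self.trailers = trailers  # the very same dict object stored in environ
        self.pending = b""        # decoded bytes not yet handed to the caller
        self.finished = False

    def _next_chunk(self):
        # Size line is hex, optionally followed by ";extension" parameters.
        size_line = self.rfile.readline()
        size = int(size_line.split(b";", 1)[0].strip(), 16)
        if size == 0:
            self._read_trailers()
            self.finished = True
            return
        self.pending += self.rfile.read(size)
        self.rfile.readline()     # discard the CRLF that ends the chunk data

    def _read_trailers(self):
        # Trailer lines look like ordinary headers and end with a blank line.
        while True:
            line = self.rfile.readline()
            if line in (b"\r\n", b"\n", b""):
                break
            name, _, value = line.decode("latin-1").partition(":")
            key = "HTTP_" + name.strip().upper().replace("-", "_")
            self.trailers[key] = value.strip()

    def read(self, size=-1):
        # Decode chunks until enough data is buffered or the body has ended.
        while not self.finished and (size < 0 or len(self.pending) < size):
            self._next_chunk()
        if size < 0:
            size = len(self.pending)
        data, self.pending = self.pending[:size], self.pending[size:]
        return data


def make_environ(rfile):
    # The server keeps 'trailers' alive and fills it in place later on.
    trailers = {}
    return {"wsgi.trailers": trailers, "wsgi.input": ChunkedInput(rfile, trailers)}


raw = io.BytesIO(b"5\r\nhello\r\n6\r\n world\r\n0\r\nX-Checksum: abc123\r\n\r\n")
environ = make_environ(raw)
first = environ["wsgi.input"].read(5)
trailers_after_first = dict(environ["wsgi.trailers"])  # snapshot, still empty
rest = environ["wsgi.input"].read()
```

Note that middleware doing a shallow 'dict(environ)' would still share the
inner trailers dictionary; it is only copying the 'wsgi.trailers' value itself
that would cut the application off from later updates.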

Second idea I can think of is a new member function in 'wsgi.input' called
'trailers()' which could be used to access them. Alternatively, 'wsgi.trailers'
could also be a function. Either way, it could return None when the trailers
are not yet known and a dictionary when they are.
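
From the application side, the second idea might be used roughly as below.
This is only a sketch: 'StubChunkedInput' is a made-up stand-in for whatever
the server would supply (it does no real chunk decoding), and the 'trailers()'
method is the proposal above, not an existing WSGI interface.

```python
import io


class StubChunkedInput(io.BytesIO):
    """Stand-in for a server-provided input stream with a trailers() method."""

    def __init__(self, body, trailers):
        super().__init__(body)
        self._length = len(body)
        self._final = trailers    # what the server would parse after the last chunk
        self._seen_end = False

    def read(self, size=-1):
        data = super().read(size)
        if self.tell() >= self._length:
            self._seen_end = True  # whole body consumed, trailers now known
        return data

    def trailers(self):
        # None while the trailers are not yet known, a dictionary once they are.
        return dict(self._final) if self._seen_end else None


def application(environ, start_response):
    stream = environ["wsgi.input"]
    body = b""
    while True:
        block = stream.read(4)
        if not block:             # empty result marks the end of the content
            break
        body += block
    trailers = stream.trailers() or {}
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [body, b"|", trailers.get("HTTP_X_CHECKSUM", "").encode("latin-1")]


statuses = []
stream = StubChunkedInput(b"hello world", {"HTTP_X_CHECKSUM": "abc123"})
before = stream.trailers()
result = application({"wsgi.input": stream},
                     lambda status, headers: statuses.append(status))
```

The application only asks for the trailers after read() has returned an empty
string, since that is the only point at which they would be guaranteed to be
available.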

One problem with this is that in Apache, when the trailers are encountered, the
lower level HTTP filter simply merges them on top of the existing input headers.
You don't want to pass the full set of input headers again, so the WSGI adapter
for Apache would need to remember what headers it put in environ to begin with
and only put in trailers those that had changed and thus were actually in the
trailer.
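
The bookkeeping for that could be as simple as the sketch below. It assumes
the adapter can snapshot the headers into a plain dictionary before the body
is read and again afterwards, with consistent key casing; 'extract_trailers'
and the sample header values are made up for illustration and are not an
Apache API.

```python
def extract_trailers(sent_headers, merged_headers):
    """Return the entries of merged_headers that are new or changed
    relative to the headers originally passed in environ."""
    return {
        name: value
        for name, value in merged_headers.items()
        if sent_headers.get(name) != value
    }


# Snapshot taken when environ was first built.
sent = {"Host": "example.com", "Transfer-Encoding": "chunked"}
# Snapshot taken after the body was consumed and the trailers merged in.
merged = {"Host": "example.com", "Transfer-Encoding": "chunked",
          "X-Checksum": "abc123"}
trailers = extract_trailers(sent, merged)
```

One limitation of comparing snapshots this way is that a trailer which repeats
an existing header with an identical value can't be detected.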

Anyway, it looks for the time being that if I am going to support streaming of
chunked data, I will have to state as a limitation that trailers aren't
available, as WSGI doesn't provide a way of getting them.

BTW, I looked around at the various packages trying to provide a WSGI server
and I can't find one besides the CherryPy WSGI server that even attempts to
support chunked encoding on input. That makes it hard to use what other people
did as a guide. :-(

Graham

