[Web-SIG] WSGI and long response header values

Phillip J. Eby pje at telecommunity.com
Fri Sep 8 23:55:02 CEST 2006


At 02:02 PM 9/8/2006 -0700, Robert Brewer wrote:

>PEP 333 says:
>
>"Each header_value must not include any control characters, including 
>carriage returns or linefeeds, either embedded or at the end. (These 
>requirements are to minimize the complexity of any parsing that must be 
>performed by servers, gateways, and intermediate response processors that 
>need to inspect or modify response headers.)" [1]
>
>That's understandable, but HTTP headers are defined as (mostly) *TEXT, and 
>"words of *TEXT MAY contain characters from character sets other than 
>ISO-8859-1 only when encoded according to the rules of RFC 2047." [2] And 
>RFC 2047 specifies that "an 'encoded-word' may not be more than 75 
>characters long...If it is desirable to encode more text than will fit in 
>an 'encoded-word' of 75 characters, multiple 'encoded-word's (separated by 
>CRLF SPACE) may be used." [3] This satisfies HTTP header folding rules, as 
>well: "Header fields can be extended over multiple lines by preceding each 
>extra line with at least one SP or HT." [1, again]
>
>So in my reading of HTTP, some code somewhere should introduce newlines in 
>longish, encoded response header values. I see three options:
>
>  1. Keep things as they are and disallow response header values if they 
> contain words over 75 chars that are outside the ISO-8859-1 character set
>  2. Allow newline characters in WSGI response headers
>  3. Require/strongly suggest WSGI servers to do the encoding and folding 
> before sending the value over HTTP.
>
>Any other solutions? I'd like to see 2 or 3 adopted (unless something 
>better comes along), so CherryPy can continue to support as much of the 
>HTTP spec as possible.

#3 sounds most attractive, although I must confess I don't see how it could 
be made to work as stated, since the strings have to be encoded already by 
the time the server sees them.  Unless, that is, you're saying that 
applications should encode them in chunks of up to 75 characters, separated 
by spaces, and that servers should then fold the result.  That would 
certainly seem like the least intrusive way to deal with it: a slight 
clarification to the spec, rather than a real *change* to the spec (and 
hence a new version of it), as #2 would require.
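
To make that division of labor concrete, here is a rough sketch of what I 
have in mind.  It is illustrative only: the function names `encode_words` 
and `fold_header`, the choice of UTF-8 with base64 ("B") encoding, and the 
78-column fold limit are all assumptions of mine, not anything PEP 333 or 
the RFCs prescribe.  The application emits encoded-words of at most 75 
characters separated by single spaces (so no CR or LF ever appears in the 
WSGI header value), and the server folds at those spaces on the way out.

```python
import base64

MAX_ENCODED_WORD = 75  # RFC 2047 limit on the length of one encoded-word


def encode_words(text, charset="utf-8"):
    """Application side (hypothetical helper): return RFC 2047 encoded-words
    joined by single spaces, with no CR or LF in the result."""
    prefix, suffix = "=?%s?b?" % charset, "?="
    # base64 maps 3 bytes to 4 characters; work out the largest payload
    # that still keeps the whole encoded-word within 75 characters.
    room = MAX_ENCODED_WORD - len(prefix) - len(suffix)
    max_bytes = (room // 4) * 3
    words, chunk = [], b""
    for ch in text:
        raw = ch.encode(charset)
        # Start a new encoded-word rather than split a multibyte character.
        if chunk and len(chunk) + len(raw) > max_bytes:
            words.append(prefix + base64.b64encode(chunk).decode("ascii") + suffix)
            chunk = b""
        chunk += raw
    if chunk:
        words.append(prefix + base64.b64encode(chunk).decode("ascii") + suffix)
    return " ".join(words)


def fold_header(name, value, limit=78):
    """Server side (hypothetical helper): fold a header at its spaces,
    starting each continuation line with CRLF followed by one SP."""
    first = name + ":"
    lines, current = [], first
    for word in value.split(" "):
        # Never fold directly after the header name; only between words.
        if current != first and len(current) + 1 + len(word) > limit:
            lines.append(current)
            current = " " + word
        else:
            current += " " + word
    lines.append(current)
    return "\r\n".join(lines)
```

An application would pass `encode_words(title)` as the header value to 
`start_response()`, and the server would call something like 
`fold_header("X-Title", value)` just before writing the header to the 
socket.  Since RFC 2047 says whitespace between adjacent encoded-words is 
ignored when displaying them, the client should see the original text 
intact.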


