[Web-SIG] Move to bless Graham's WSGI 1.1 as official spec
Manlio Perillo
manlio_perillo at libero.it
Fri Dec 4 10:46:16 CET 2009
And Clover wrote:
> Manlio Perillo wrote:
>
>> Words of *TEXT MAY contain characters from character sets other than
>> ISO-8859-1 [22] only when encoded according to the rules of RFC 2047
>
> Yeah, this is, unfortunately, a lie. The rules of RFC 2047 apply only to
> RFC*822-family 'atoms' and not elsewhere; indeed, RFC2047 itself
> specifically denies that an encoded-word can go in a quoted-string.
>
> RFC2047 encoded-words are not on-topic in an HTTP header(*); this has
> been confirmed by newer development work on HTTPbis by Reschke et al.
> (http://tools.ietf.org/wg/httpbis/).
>
Thanks.
HTTPbis seems to fix all these problems:
"Historically, HTTP has allowed field content with text in the ISO-
8859-1 [ISO-8859-1] character encoding and supported other character
sets only through use of [RFC2047] encoding. In practice, most HTTP
header field values use only a subset of the US-ASCII character
encoding [USASCII]. Newly defined header fields SHOULD limit their
field values to US-ASCII characters. Recipients SHOULD treat other
(obs-text) octets in field content as opaque data."
This is the new rule for `quoted-string`:
    quoted-string  = DQUOTE *( qdtext / quoted-pair ) DQUOTE
    qdtext         = OWS / %x21 / %x23-5B / %x5D-7E / obs-text
                     ; OWS / <VCHAR except DQUOTE and "\"> / obs-text
    obs-text       = %x80-FF
    quoted-pair    = "\" ( WSP / VCHAR / obs-text )
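To make the grammar above concrete, here is a rough Python sketch of a byte-level matcher for `quoted-string`. It is only an illustration: the names `QUOTED_STRING` and `unquote` are made up for this example, and OWS is simplified to SP / HTAB (obs-fold is ignored). The obs-text octets are passed through untouched, as opaque data.

```python
import re

# qdtext: SP / HTAB (simplified OWS), %x21, %x23-5B, %x5D-7E, obs-text (%x80-FF)
QDTEXT = rb'[\t \x21\x23-\x5B\x5D-\x7E\x80-\xFF]'
# quoted-pair: backslash followed by WSP / VCHAR / obs-text
QUOTED_PAIR = rb'\\[\t \x21-\x7E\x80-\xFF]'
QUOTED_STRING = re.compile(rb'"((?:' + QDTEXT + rb'|' + QUOTED_PAIR + rb')*)"')


def unquote(value):
    """Return the unescaped content of a quoted-string, or None if invalid.

    Hypothetical helper: works on bytes and does not try to guess
    the character encoding of obs-text octets.
    """
    match = QUOTED_STRING.fullmatch(value)
    if match is None:
        return None
    # Drop the backslash of each quoted-pair, keep the escaped octet.
    return re.sub(rb'\\(.)', rb'\1', match.group(1), flags=re.DOTALL)
```

For example, `unquote(b'"caf\xe9"')` accepts the 0xE9 octet without interpreting it, while a value containing a bare control character is rejected.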
> The "correct" way of escaping header parameters in an RFC*822-family
> protocol would be RFC2231's complex encoding scheme, but HTTP is
> explicitly not an 822-family protocol despite sharing many of the same
> constructs. See
> http://tools.ietf.org/html/draft-reschke-rfc2231-in-http-06 for a
> strategy for how 2231 should interact with HTTP, but note that for now
> RFC2231-in-HTTP simply does not exist in any deployed tools.
>
It seems reasonable.
> So for now there is basically nothing useful WSGI can do other than
> provide direct, byte-oriented (even if wrapped in 8859-1 unicode
> strings) access to headers.
>
Yes, this is what I think.
I have some doubts about wrapping the headers in 8859-1 unicode strings,
but luckily there is surrogateescape.
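A small sketch of the two options, assuming Python 3.1 or later (where the `surrogateescape` error handler from PEP 383 is available). The header value and the function names are invented for the example. Both approaches round-trip the original bytes; the difference is what the intermediate string looks like.

```python
def latin1_view(raw):
    # Every octet maps to the code point with the same number,
    # so this never fails, but UTF-8 data shows up as mojibake.
    return raw.decode('iso-8859-1')


def escaped_view(raw):
    # Valid UTF-8 is decoded normally; undecodable octets become
    # lone surrogates (U+DC80..U+DCFF) and can be restored on encode.
    return raw.decode('utf-8', 'surrogateescape')


# Hypothetical header value with an ISO-8859-1 encoded e-acute (0xE9),
# which is not valid UTF-8 in this position.
raw = b'attachment; filename="caf\xe9.txt"'

text_latin1 = latin1_view(raw)
text_escaped = escaped_view(raw)

# Both views give the original octets back.
assert text_latin1.encode('iso-8859-1') == raw
assert text_escaped.encode('utf-8', 'surrogateescape') == raw
```

With surrogateescape the application can tell which octets were not decodable, whereas with the 8859-1 wrapping it has to re-encode and decode again to get at the real text.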
Regards
Manlio