[Web-SIG] WSGI, Python 3 and Unicode
Alan Kennedy
pywebsig at xhaus.com
Fri Dec 7 12:24:03 CET 2007
[Alan]
>> The restriction to iso-8859-1 is really a distraction; iso-8859-1 is
>> used simply as an identity encoding that also enforces that all
>> "bytes" in the string have a value from 0x00 to 0xff, so that they are
>> suitable for byte-oriented IO. So, in output terms at least, WSGI *is*
>> a byte-oriented protocol. The problem is that python-the-language
>> didn't have support for bytes at the time WSGI was designed.
[Thomas]
> If you're talking about the "output stream", then yes, it's all about
> bytes (or should be).
Indeed, I was only talking about output, specifically the response body.
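
To illustrate the "identity encoding" point, here is a minimal sketch in
Python 3 terms. The helper name body_to_bytes is hypothetical and not part
of WSGI; it only shows that iso-8859-1 maps code points 0x00 to 0xff onto
the same byte values and rejects anything larger.

```python
# iso-8859-1 maps each code point 0x00..0xff to the byte with the same value,
# so decoding and re-encoding any byte sequence is lossless.
all_bytes = bytes(range(256))
as_text = all_bytes.decode("iso-8859-1")
assert [ord(ch) for ch in as_text] == list(range(256))
assert as_text.encode("iso-8859-1") == all_bytes


def body_to_bytes(text):
    """Hypothetical helper: turn a 'bytes carried in a str' body into bytes.

    Raises UnicodeEncodeError if any character is above 0xff, which is
    exactly the check the iso-8859-1 restriction is meant to enforce.
    """
    return text.encode("iso-8859-1")
```

A string that contains, say, U+20AC fails here, so it has to be encoded
to bytes by the application (for example as utf-8) before it is output.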
> But at the status and headers level, HTTP/1.1 is
> fundamentally ISO-8859-1-encoded.
Agreed.
That is why the WSGI spec also states:
"""
Note also that strings passed to start_response() as a status or as
response headers must follow RFC 2616 with respect to encoding. That
is, they must either be ISO-8859-1 characters, or use RFC 2047 MIME
encoding.
"""
So in order to use non-ISO-8859-1 characters in response status
strings or headers, you must use RFC 2047.
As confirmed by the links you posted, this is an HTTP restriction, not
a WSGI restriction.
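
As a rough sketch of what that means in practice, the standard library's
email.header module can produce RFC 2047 encoded words. The helper name
header_value and the choice of utf-8 as the charset are my assumptions,
not something the WSGI spec prescribes.

```python
from email.header import Header, decode_header


def header_value(text):
    """Hypothetical helper: make a header value that fits the RFC 2616 rule.

    Text that fits in iso-8859-1 is passed through unchanged; anything
    else is wrapped in an RFC 2047 encoded word (charset assumed: utf-8).
    """
    try:
        text.encode("iso-8859-1")
        return text
    except UnicodeEncodeError:
        # Header.encode() may fold long values across lines; this sketch
        # only considers short values that fit in one encoded word.
        return Header(text, charset="utf-8").encode()


encoded = header_value("\u20ac uro")
raw, charset = decode_header(encoded)[0]
assert raw.decode(charset) == "\u20ac uro"
```

Whether a given client actually decodes RFC 2047 words in HTTP headers is
a separate question; the sketch only shows how to produce them.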
Regards,
Alan.