[Web-SIG] WSGI, Python 3 and Unicode

James Bennett ubernostrum at gmail.com
Fri Dec 7 03:23:23 CET 2007


On Dec 6, 2007 6:15 PM, Phillip J. Eby <pje at telecommunity.com> wrote:
> WSGI already copes, actually.  Note that Jython and IronPython have
> this issue today, and see:
>
> http://www.python.org/dev/peps/pep-0333/#unicode-issues

I'm glad you brought that up, because it's been bugging me lately.

That section is somewhat ambiguous as-is, because in one sentence
applications are permitted to return strings encoded in a charset
other than ISO-8859-1, but in another they are unequivocally forbidden
to do so (with the "must not" in bold, even). And that's problematic
not only because of the ambiguity, but because the increasing
popularity of "AJAX" and web-based APIs is making it much more common
for WSGI applications to generate responses of types which do not
default to ISO-8859-1 -- e.g., XML and JSON, both of which default to
UTF-8.

Depending on how draconian one wishes to be when reading the relevant
section of WSGI, it's possible to conclude that XML and JSON must
always be transcoded/escaped to ISO-8859-1 -- with all the headaches
that entails -- before being passed to a WSGI-compliant piece of
software.
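
To make the "escaped to ISO-8859-1" case concrete, here is a minimal sketch for JSON. The payload is made up for illustration; it relies only on the standard json module, whose ensure_ascii option emits \uXXXX escapes so the serialized text stays within ASCII (and therefore within ISO-8859-1).

```python
import json

# Hypothetical payload, chosen to include characters outside ISO-8859-1.
payload = {"name": "Zoë", "city": "東京"}

# ensure_ascii=True (the default) escapes every non-ASCII character as
# \uXXXX, so the text is pure ASCII and also encodes as ISO-8859-1.
escaped = json.dumps(payload, ensure_ascii=True)
body_latin1 = escaped.encode("iso-8859-1")

# Without escaping, the same document cannot be encoded as ISO-8859-1
# at all, because the CJK characters have no representation there.
raw = json.dumps(payload, ensure_ascii=False)
try:
    raw.encode("iso-8859-1")
    raw_fits_latin1 = True
except UnicodeEncodeError:
    raw_fits_latin1 = False

# The natural encoding for the unescaped document is UTF-8.
body_utf8 = raw.encode("utf-8")
```

Both bodies decode back to the same data; the difference is that the escaped form fits the strictest reading of the spec, at the cost of a larger and less readable response. XML has an analogous workaround in numeric character references, but that is not shown here.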

And the slightly less strict reading of the spec -- that such
gymnastics are required only when the string type of the Python
implementation is Unicode-based -- will grow increasingly troublesome
once Py3K enters production use.
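
As a rough illustration of what those gymnastics look like on a Unicode-based string type, the sketch below smuggles UTF-8 output through a string whose code points all fall in the ISO-8859-1 range. The helper names to_wire_str and from_wire_str are hypothetical, not anything defined by the spec, and this is only one possible way to satisfy the restriction.

```python
def to_wire_str(text, charset="utf-8"):
    """Encode text to bytes, then map each byte to the code point of
    the same value, so the result holds only U+0000 through U+00FF."""
    return text.encode(charset).decode("iso-8859-1")


def from_wire_str(wire, charset="utf-8"):
    """Reverse the mapping: recover the bytes, then decode them."""
    return wire.encode("iso-8859-1").decode(charset)


# Hypothetical XML response containing characters outside ISO-8859-1.
original = '<?xml version="1.0" encoding="UTF-8"?><city>東京</city>'

wire = to_wire_str(original)

# Every code point now fits the restricted range, but the string no
# longer reads as the text the application meant to send.
highest_code_point = max(ord(ch) for ch in wire)
round_tripped = from_wire_str(wire)
```

The round trip is lossless, but the application has to remember that the string it hands over is really a sequence of bytes in disguise, which is exactly the sort of bookkeeping that gets more awkward once every string is Unicode.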

So as long as we're talking about this, could the proscriptions with
respect to encoding perhaps be revisited and (hopefully)
clarified/revised?

-- 
"Bureaucrat Conrad, you are technically correct -- the best kind of correct."

