[Web-SIG] Python 3.0 and WSGI 1.0.

Graham Dumpleton graham.dumpleton at gmail.com
Thu Apr 2 01:42:23 CEST 2009


2009/4/2 Robert Brewer <fumanchu at aminus.org>:
> Graham Dumpleton wrote:
>> 2009/4/2 Robert Brewer <fumanchu at aminus.org>:
>> > Alan Kennedy wrote:
>> >> Hi Graham,
>> >>
>> >> I think yours is a good solution to the problem.
>> >>
>> >> [Graham]
>> >> > In other words, leave all the existing CGI variables to come through
>> >> > as latin-1 decode
>> >>
>> >> As latin-1 or rfc-2047 decoded, to unicode.
>> >>
>> >> > and do anything new in 'wsgi' variable namespace,
>> >>
>> >> So the server provides
>> >>
>> >> "wsgi.server_decoded_SCRIPT_NAME" == u"whatever"
>> >> "wsgi.server_decoded_PATH_INFO" == u"whatever"
>> >> "wsgi.server_decode_charset" == u"utf-8"
>> >
>> > I think everyone at the sprint today acquiesced to having
>> > SCRIPT_NAME/PATH_INFO/QUERY_STRING be set in the environ as unicode. The
>> > server can decide (probably subject to configuration). I've implemented
>> > this in the python3 branch of CherryPy and it seems to work brilliantly.
>> > Assuming the server *is* configurable, deployers should be able to
>> > choose Latin-1 if they need to recover the original bytes, without
>> > having to support a separate set of encoded-byte entries.
>>
>> Seems to me that you can't have it be configurable and it must always
>> be latin-1 interpretation. The problem is where you are composing
>> multiple WSGI applications. If they each have different expectations
>> or requirements as to how it is handled, aren't you going to have a
>> problem? Or am I missing something in the way you are explaining it?
>
> I would not expect multiple middlewares to want to decode the same URI
> differently.

I was not thinking about multiple middlewares, but about multiple
distinct WSGI applications (end consumers, not middleware) composed
together by something like Paste cascade, Pylons configuration or even
something like a Routes-based dispatcher.
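
As a rough sketch of the kind of composition I mean (Python 3; the
names recover, make_app and dispatcher, and the mount prefixes, are
made up for illustration, and this is not how Paste or Routes actually
implement it), assume the server always supplies PATH_INFO as a latin-1
decode. Each end application can then recover the original bytes and
apply whatever charset it expects:

```python
def recover(value, charset):
    """Re-decode a latin-1 decoded environ value using the app's own charset."""
    # latin-1 maps every byte to one code point, so this round trip is lossless.
    return value.encode("latin-1").decode(charset)


def make_app(charset):
    """Build a hypothetical end application that expects URLs in `charset`."""
    def app(environ, start_response):
        path = recover(environ.get("PATH_INFO", ""), charset)
        start_response("200 OK",
                       [("Content-Type", "text/plain; charset=utf-8")])
        return [path.encode("utf-8")]
    return app


def dispatcher(mounts):
    """Compose distinct applications by URL prefix: mounts is [(prefix, app)]."""
    def composite(environ, start_response):
        path = environ.get("PATH_INFO", "")
        for prefix, app in mounts:
            if path == prefix or path.startswith(prefix + "/"):
                sub = dict(environ)  # do not mutate the caller's environ
                sub["SCRIPT_NAME"] = environ.get("SCRIPT_NAME", "") + prefix
                sub["PATH_INFO"] = path[len(prefix):]
                return app(sub, start_response)
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"not found"]
    return composite
```

Because the server side never varies, the two applications can disagree
about the charset without the dispatcher needing to know about it.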

In the case of something like cascade, the applications aren't
necessarily on different URLs. For the latter they would be. Even so, I
am just making sure that having different URLs with different encodings
isn't going to be an issue for the middleware doing the mapping. So
long as code/config files are always UTF-8 encoded, and so capable of
representing any possible decoding of a URL, then it is probably okay.
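
To illustrate the mapping concern (again only a sketch; to_native and
prefix_matches are hypothetical helpers, and the latin-1 form of
PATH_INFO is the assumption under discussion, not settled behaviour):
a prefix written as real text in a UTF-8 config file will not compare
equal to the latin-1 decoded path unless it is first converted to the
same form using the charset that URL is expected to be in.

```python
def to_native(text, charset="utf-8"):
    """Convert real unicode text into the latin-1 'bytes as unicode' form."""
    return text.encode(charset).decode("latin-1")


def prefix_matches(path_info, prefix, charset="utf-8"):
    """Check a configured prefix against a latin-1 decoded PATH_INFO."""
    native = to_native(prefix, charset)
    return path_info == native or path_info.startswith(native + "/")


# Path as the server would supply it for a UTF-8 encoded request URL.
utf8_path = "/caf\u00e9/menu".encode("utf-8").decode("latin-1")

# Naive comparison against the configured text fails ...
naive = utf8_path.startswith("/caf\u00e9")
# ... while comparing in the same form succeeds.
matched = prefix_matches(utf8_path, "/caf\u00e9", "utf-8")
```

The charset has to be stated per mapping entry, which is the part that
needs care when different URLs use different encodings.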

Graham

