[Web-SIG] Python 3.0 and WSGI 1.0.

Bill Janssen janssen at parc.com
Wed Apr 1 19:59:56 CEST 2009


Guido van Rossum <guido at python.org> wrote:

> On Wed, Apr 1, 2009 at 5:18 AM, Robert Brewer <fumanchu at aminus.org> wrote:
> > Good timing. We had been thinking to make everything strings except for
> > SCRIPT_NAME, PATH_INFO, and QUERY_STRING, since these few are pulled
> > from the Request-URI, which may be in any encoding. It was thought that
> > the app would be best-qualified to decode those three.
> 
> Argh. The *meaning* of these fields is clearly text.

I wouldn't read too much into those names -- they were chosen when the
CGI spec was just gestating, long before the usage patterns solidified,
and don't necessarily reflect the usage of the data bound to them.  I
believe this work was done before the formal IETF definition of a URL,
for instance.

I think the controlling reference here is RFC 3875.

It's not at all clear to me what the SCRIPT_NAME is.  Is it a pathname
built from the local file system's filenames (which, as recent
discussions seem to indicate, may or may not correspond to human-notional
strings), or is it a URI path?  I'm OK with calling it text, with the
proviso that there may be cases where it's not.

I've never actually seen a CGI call with PATH_INFO set; I think it's
obsolete usage (but pretty clearly a string).  RFC 3875 says, "Similarly,
treatment of non US-ASCII characters in the path is system-defined."
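
To illustrate why that treatment ends up system-defined, here is a
minimal Python 3 sketch.  The path "/caf%C3%A9" is a made-up example, and
the two candidate encodings are assumptions for illustration only; nothing
in the request says which one the client intended.

```python
from urllib.parse import unquote_to_bytes

# Hypothetical percent-encoded path as it might appear in a Request-URI.
raw_path = "/caf%C3%A9"

# Undoing the percent-encoding yields bytes, not text.
path_bytes = unquote_to_bytes(raw_path)

# The same bytes give different text depending on the encoding assumed.
as_utf8 = path_bytes.decode("utf-8")
as_latin1 = path_bytes.decode("latin-1")

print(repr(path_bytes))
print(as_utf8)
print(as_latin1)
```

Both decodings succeed, so the bytes alone cannot tell a server or an
application which reading is the right one.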

QUERY_STRING -- should always be an ASCII string.  It may indeed carry
non-Unicode strings or purely binary data, but when it is passed to the
CGI script it is still encoded exactly as it was in the URI.
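
A rough Python 3 sketch of that point, assuming the usual percent-encoding
of URIs.  The payload bytes and the parameter name "blob" are invented for
the example; they are not taken from any real request.

```python
from urllib.parse import quote_from_bytes, unquote_to_bytes

# Hypothetical binary payload; these bytes are not valid UTF-8.
payload = bytes([0xFF, 0xFE, 0x00, 0x41])

# What a client would place in the URI: percent-encoded, pure ASCII.
query_string = "blob=" + quote_from_bytes(payload)

# Encoding to ASCII succeeds, so the string handed to the script is ASCII
# even though the data it carries is not text at all.
ascii_form = query_string.encode("ascii")

# Only the application knows what the value means, so it does the decoding.
name, _, value = query_string.partition("=")
decoded = unquote_to_bytes(value)

print(query_string)
print(decoded == payload)
```

The script receives the ASCII form and has to decide for itself whether
the decoded bytes are text in some encoding or opaque binary data.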

Bill

