[Web-SIG] Proposal to remove SCRIPT_NAME/PATH_INFO

Wed Sep 23 08:42:26 CEST 2009

Hi,

Ian Bicking wrote:
> I propose we switch primarily to "native" strings: str on both Python 2 and
> 3.
I'm starting to think that this is the best idea.

> I then propose that we eliminate SCRIPT_NAME and PATH_INFO.  Instead we
> have:
IMO they should stick around for compatibility with older applications
and be latin1 encoded on Python 3.  But their use should be discouraged.
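
To make the latin1 idea concrete, here is a minimal sketch for Python 3. It assumes the server hands over PATH_INFO as a native string decoded from the wire bytes as latin1, and that the application knows the real charset (UTF-8 here). The helper name recover_path is hypothetical, not part of any WSGI proposal.

```python
def recover_path(environ, charset='utf-8'):
    """Re-decode a latin1-decoded PATH_INFO with the charset the app expects."""
    # latin1 maps every byte to one code point, so this restores the raw bytes.
    raw = environ.get('PATH_INFO', '').encode('latin1')
    return raw.decode(charset, 'replace')


# Simulate what a server might put into the environ on Python 3 (assumption).
wire_bytes = '/caf\u00e9'.encode('utf-8')
environ = {'PATH_INFO': wire_bytes.decode('latin1')}
path = recover_path(environ)
```

The round trip is lossless because latin1 covers all 256 byte values, which is why it is attractive as a transport encoding for native strings.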

> Again, it would be better to do;
> 
> parse_cookie(urllib.unquote(environ['HTTP_COOKIE']).decode('utf8'))
That will only work in Python 2; in Python 3, unquote (now
urllib.parse.unquote) already yields unicode strings and assumes a UTF-8
quoted string.
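
A small sketch of the Python 3 behavior, using urllib.parse.unquote from the standard library. The cookie value is made up for illustration.

```python
from urllib.parse import unquote

quoted = 'name=caf%C3%A9'          # illustrative cookie value (assumption)
value = unquote(quoted)            # already a str, percent-escapes read as UTF-8

# Python 3 str has no .decode(), so the Python 2 style chain would fail here.
has_decode = hasattr(value, 'decode')
```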

> Other variables like environ['wsgi.url_scheme'], environ['CONTENT_TYPE'],
> etc, will be native strings.  A Python 3 hello world app will then look like:
> 
> def hello_world(environ):
>     return ('200 OK', [('Content-type', 'text/html; charset=utf8')],
>             ['Hello World!'.encode('utf8')])
>
> start_response and changes to wsgi.input are incidental to what I'm
> proposing here (except that wsgi.input will be bytes); we can decide about
> them separately.
If we go about dropping start_response, can we move the app iter to the
beginning?  That would be consistent with the signature of common
response objects, making it possible to do this:

    response = Response(*hello_world(environ))
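
A rough sketch of how that could fit together, assuming a response class whose constructor takes the body iterable first. The Response class below is hypothetical and only stands in for "common response objects"; it is not taken from any particular framework.

```python
class Response:
    """Hypothetical response object with the app iter as first argument."""

    def __init__(self, app_iter=(), status='200 OK', headers=()):
        self.app_iter = app_iter
        self.status = status
        self.headers = list(headers)


def hello_world(environ):
    # Return value reordered to (app_iter, status, headers).
    body = ['Hello World!'.encode('utf8')]
    return body, '200 OK', [('Content-type', 'text/html; charset=utf8')]


response = Response(*hello_world({}))
```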

In general I think doing too many changes at once is harmful so I'm
happy to stick with start_response for another iteration of WSGI.

> Well, the biggie: is it right to use native strings for the environ values,
> and response status/headers?  Specifically, tricks like the latin1
> transcoding won't work in Python 2, but will in Python 3.  Is this weird?
> Or just something you have to think about when using the two Python
> versions?
The WSGI PEP should standardize a way for the application to figure out
the environment it runs in.  And I think that should *not* be
checking sys.version_info but rather comparing string features.
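
One possible reading of "comparing string features", as a sketch only: test whether the native str type is the bytes type instead of looking at the interpreter version. The names NATIVE_STRINGS_ARE_BYTES and to_native are assumptions for illustration, not anything the PEP defines.

```python
# True where str is a byte string (Python 2.6+), False on Python 3.
NATIVE_STRINGS_ARE_BYTES = str is bytes


def to_native(data, charset='latin1'):
    """Convert bytes or text to the interpreter's native str type."""
    if isinstance(data, str):
        return data
    if NATIVE_STRINGS_ARE_BYTES:
        return data.encode(charset)   # Python 2: unicode -> str
    return data.decode(charset)       # Python 3: bytes -> str
```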

> What happens if you give unicode text in the response headers that cannot be
> encoded as Latin1?
Undefined behavior; the example server should raise an assertion error.
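
A minimal sketch of such a check, assuming headers arrive as a list of (name, value) native-string pairs on Python 3. The function name check_headers is hypothetical.

```python
def check_headers(headers):
    """Raise AssertionError if any header part cannot be encoded as latin1."""
    for name, value in headers:
        for part in (name, value):
            try:
                part.encode('latin1')
            except UnicodeEncodeError:
                raise AssertionError(
                    'header not encodable as latin1: %r' % (part,))
    return headers
```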

> Should some things specifically be ASCII?  E.g., status.
No, HTTP specifies the status as TEXT, and TEXT is specified as any
8-bit sequence of data except US-ASCII control characters, but
including CR, LF, space and tabs.
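
As a sketch of that definition, the check below accepts any octet that is not a US-ASCII control character, plus CR, LF and tab. It follows the description in this message only and does not enforce the line-folding rules of the HTTP grammar; the name is_http_text is an assumption.

```python
# Control characters that TEXT still allows (space, 0x20, is not a control).
ALLOWED_CONTROLS = {0x0D, 0x0A, 0x09}  # CR, LF, tab


def is_http_text(data):
    """True if every byte is a non-control octet or one of CR, LF, tab."""
    return all(
        (byte >= 0x20 and byte != 0x7F) or byte in ALLOWED_CONTROLS
        for byte in data
    )
```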

> Should some things be unicode on Python 2?
I don't think so.

> Is there a common case here that would be inefficient?
Don't think so.

Regards,
Armin