[Web-SIG] Converting REQUEST_URI to wsgi.script_name/wsgi.path_info

Ian Bicking ianb at colorstudy.com
Mon Sep 28 19:36:32 CEST 2009


Thanks for the test case; it's fixed in tip now.  If anything goes wrong, the
function should return (quote(script_name), quote(path_info)); no combination
of request_uri/script_name/path_info should cause an exception (except
through bugs).  As you say, there's no promise that those values are in any
way related, and when they are unrelated it is appropriate to fix things up
at the WSGI stage (not necessarily in the WSGI adapter itself).
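
A minimal sketch of that fallback contract, in Python 3.  The names
safe_request_uri_to_path and convert are hypothetical; convert stands in for
whatever function performs the real conversion.  This is not the code in
request_uri.py, only an illustration of the intended behaviour when the
conversion cannot make sense of its inputs.

```python
from urllib.parse import quote


def safe_request_uri_to_path(request_uri, script_name, path_info, convert):
    """Return (raw_script_name, raw_path_info) without ever raising.

    `convert` is a hypothetical callable taking
    (request_uri, script_name, path_info) and returning a 2-tuple.
    """
    try:
        return convert(request_uri, script_name, path_info)
    except Exception:
        # Deliberately broad: any failure falls back to simply
        # re-quoting the CGI-style values.
        return quote(script_name), quote(path_info)


def failing_convert(request_uri, script_name, path_info):
    # Assumed stand-in that fails the way the reported traceback did.
    raise IndexError("list index out of range")


print(safe_request_uri_to_path(
    '/a%2fb/c/d', '/wsgi20.wsgi', '/a/b/c/d', failing_convert))
```

With the inputs from the report, the fallback yields
('/wsgi20.wsgi', '/a/b/c/d'), which a later wrapper is still free to adjust.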


On Mon, Sep 28, 2009 at 2:34 AM, Graham Dumpleton <graham.dumpleton at gmail.com> wrote:

> 2009/9/28 Ian Bicking <ianb at colorstudy.com>:
> > I tried implementing some code to convert REQUEST_URI (the raw request URL)
> > and CGI-style SCRIPT_NAME/PATH_INFO into a raw script_name/path_info.
> >   http://bitbucket.org/ianb/wsgi-peps/src/tip/request_uri.py (python 2)
> >   http://bitbucket.org/ianb/wsgi-peps/src/tip/request_uri3.py (python 3)
> > Admittedly the tests are not very complete, I just wasn't feeling creative
> > about test cases.  In terms of performance this avoids being entirely brute
> > force, but feels kind of complex.  I'm betting there's an entirely different
> > approach which is faster.  But whatever.
>
> Got an error:
>
>  mod_wsgi (pid=4301): Exception occurred processing WSGI script '/Users/grahamd/Testing/tests/wsgi20.wsgi'.
>  Traceback (most recent call last):
>   File "/Users/grahamd/Testing/tests/wsgi20.wsgi", line 80, in application
>     environ['PATH_INFO'])
>   File "/Users/grahamd/Testing/tests/wsgi20.wsgi", line 64, in request_uri_to_path
>     remove_segments = remove_segments - 1 - qscript_name_parts[-1].lower().count('%2f')
>  IndexError: list index out of range
>
> This was an extreme corner case where Apache mod_rewrite was being
> used to do stuff:
>
> RewriteEngine On
> RewriteCond %{REQUEST_FILENAME} !-f
> RewriteRule ^(.*)$ /wsgi20.wsgi/$1 [QSA,PT,L]
>
> and Apache was configured to allow encoded slashes. The input would have been:
>
> REQUEST_URI: '/a%2fb/c/d'
> SCRIPT_NAME: '/wsgi20.wsgi'
> PATH_INFO: '/a/b/c/d'
>
> That style of rewrite rule is quite often used with Apache, although
> allowing encoded slashes isn't.
>
> That SCRIPT_NAME needs to be adjusted is a known consideration with
> this rewrite rule. Usually you would use a wrapper around the WSGI
> application which does:
>
> def _application(environ, start_response):
>    # The original application.
>    ...
>
> import posixpath
>
> def application(environ, start_response):
>    # Wrapper to set SCRIPT_NAME to actual mount point.
>    environ['SCRIPT_NAME'] = posixpath.dirname(environ['SCRIPT_NAME'])
>    if environ['SCRIPT_NAME'] == '/':
>        environ['SCRIPT_NAME'] = ''
>    return _application(environ, start_response)
>
> If that algorithm is used in the WSGI adapter, however, the wrapper would
> never get the opportunity to do that, as the adapter would already have
> failed before the wrapper got called.
>
> Graham
>
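
For the inputs in the report above, the SCRIPT_NAME adjustment made by the
quoted wrapper works out as shown here.  This is a small sketch in Python 3;
the helper name mount_point is hypothetical and simply isolates the lines of
the wrapper that rewrite SCRIPT_NAME.

```python
import posixpath


def mount_point(script_name):
    """Strip the last path segment (the script file) from SCRIPT_NAME.

    A result of '/' is normalised to '' so that the application
    appears mounted at the site root.
    """
    parent = posixpath.dirname(script_name)
    return '' if parent == '/' else parent


print(repr(mount_point('/wsgi20.wsgi')))
print(repr(mount_point('/app/wsgi20.wsgi')))
```

The first call prints '' (root mount) and the second prints '/app'; the
second path is an assumed example of the same rule used under a sub-directory.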



-- 
Ian Bicking  |  http://blog.ianbicking.org  |
http://topplabs.org/civichacker