[Web-SIG] URL quoting in WSGI (or the lack thereof)
Ian Bicking
ianb at colorstudy.com
Mon Jan 21 02:30:20 CET 2008
Luis Bruno wrote:
> Hello y'all, delurking,
>
> I'm using a /-delimited path, %-encoding each literal '/' appearing in
> the path segments. I was not amused to see egg:Paste#http urldecoding
> the whole PATH_INFO.
Unfortunately this is in the WSGI spec, so it's not Paste#http so much
as WSGI that demands this.
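
To make the loss concrete, here is a minimal sketch using the standard
urllib.parse module (Python 3 naming). The variable names are only for
illustration, and it assumes the server decodes the whole path before the
application ever sees it:

```python
from urllib.parse import unquote

# The path as it appears on the request line, with a literal '/' encoded as %2F.
raw_path = "/catalog/NEC/Computers/Laptops/LN500%2F9DW/"

# What the application receives in PATH_INFO: the whole path decoded at once.
path_info = unquote(raw_path)

# Splitting the decoded path cannot tell the encoded '/' from a real delimiter.
decoded_segments = path_info.strip("/").split("/")

# Splitting first and decoding each segment afterwards keeps the literal '/'.
raw_segments = [unquote(s) for s in raw_path.strip("/").split("/")]
```

The decoded path splits into six segments, while the quoted one splits into
five with "LN500/9DW" intact as the last.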
I think in the CGI implementations this is kind of handled by
REQUEST_URI containing the quoted value. But relating REQUEST_URI with
SCRIPT_NAME/PATH_INFO is awkward and having the information in duplicate
places can lead to errors and unclear situations if they don't match up
properly.
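
A rough sketch of what that relating might look like, and why it is awkward.
The helper name raw_path_info is hypothetical, REQUEST_URI is not a WSGI key
(a server may not set it at all), and the sketch assumes ASCII-only paths:

```python
from urllib.parse import unquote

def raw_path_info(environ):
    """Return the still-quoted PATH_INFO, or None if it cannot be recovered."""
    request_uri = environ.get("REQUEST_URI")  # assumption: server provides it
    if request_uri is None:
        return None
    raw_path = request_uri.split("?", 1)[0]  # drop the query string
    script_name = environ.get("SCRIPT_NAME", "")
    path_info = environ.get("PATH_INFO", "")
    # The two views must agree once decoded; if not, something rewrote one of them.
    if unquote(raw_path) != script_name + path_info:
        return None
    # Find the split point whose two halves decode to SCRIPT_NAME and PATH_INFO.
    for i in range(len(raw_path) + 1):
        if unquote(raw_path[:i]) == script_name and unquote(raw_path[i:]) == path_info:
            return raw_path[i:]
    return None
```

When the keys don't match up the helper gives up and returns None, which is
exactly the unclear situation described above.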
> Ben Bangert wrote:
>> This recently became an issue, when a user noticed that the %2B URL
>> encoding for a + sign, had turned into a space when it hit their app.
> A swift monkey-patch to
> paste.httpserver.py:WSGIHandlerMixin.wsgi_setup() later, and
> ORIGINAL_PATH_INFO is part of the WSGI spec in my world. The following
> URL now Does The Right Thing:
>
> http://127.0.0.1:5000/catalog/NEC/Computers/Laptops/LN500%2F9DW/
It would be the Right Thing, except for not being WSGI. I made note of
this issue on the WSGI 2.0 ideas page, but I don't think anyone
(including myself) has proposed any good resolution. Diverging from CGI
and leaving PATH_INFO/SCRIPT_NAME quoted would work. But it's liable to
lead to bugs as it's a fairly subtle thing and for most applications the
semantics won't change and people won't realize their code is broken for
some corner case. I suppose we could remove SCRIPT_NAME and PATH_INFO
entirely and replace them with new keys.
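
A small sketch of the kind of corner case I mean, assuming PATH_INFO were
silently switched to the quoted form. The last_segment helper and the example
paths are made up for illustration:

```python
from urllib.parse import unquote

def last_segment(path_info):
    # Typical application code written against today's decoded PATH_INFO.
    return path_info.rstrip("/").rsplit("/", 1)[-1]

plain = "/catalog/NEC/Laptops"
spaced_raw = "/catalog/NEC/Gaming%20Laptops"

# For most paths the quoted and decoded forms are identical, so nothing breaks.
same = last_segment(plain) == last_segment(unquote(plain))

# Only a path containing an escape shows the difference.
differs = last_segment(spaced_raw) != last_segment(unquote(spaced_raw))
```

Code like this would keep passing its tests until someone requests a path
with an escaped character in it.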
Ian