[issue14567] http.server query string handling is incorrect and inefficient

Thu Oct 1 23:24:37 EDT 2015

Martin Panter added the comment:

I think the decision on how to parse the “path” attribute has to be left up to each request handler implementation, rather than being done blindly in BaseHTTPRequestHandler.parse_request(). The reason is that several forms of HTTP request target do not have a path at all:

OPTIONS *  # Asterisk form
OPTIONS http://example.net  # Absolute form with neither path nor query
CONNECT example.net:443  # Authority form
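
To make the distinction concrete, here is a rough sketch of how a handler might tell these forms apart before deciding whether there is a path to parse. The function name classify_target() and its return labels are hypothetical, not part of http.server, and the checks are deliberately simplistic:

```python
from urllib.parse import urlsplit

def classify_target(method, target):
    """Hypothetical helper: guess which form a request target takes."""
    if target == "*":
        return "asterisk"        # e.g. OPTIONS *
    if method == "CONNECT":
        return "authority"       # e.g. CONNECT example.net:443
    if target.startswith("/"):
        return "origin"          # ordinary path, optionally with a query
    if urlsplit(target).scheme:
        return "absolute"        # e.g. http://example.net
    return "unknown"

print(classify_target("OPTIONS", "*"))
print(classify_target("OPTIONS", "http://example.net"))
print(classify_target("CONNECT", "example.net:443"))
print(repr(urlsplit("http://example.net").path))  # empty: no path to parse
```

CONNECT is checked by method rather than by inspecting the target, because “example.net:443” is ambiguous to a generic URL splitter.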

I agree that the current situation is far from ideal, and a function or method to parse a URL path would be very useful. The existing urllib.parse functions urlparse() and urlsplit() can already help with splitting off the query.
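
For example, a handler that knows it has an origin-form target could split it roughly like this. This is only a sketch: split_path_query() is a hypothetical name, and it assumes the target starts with a single “/”:

```python
from urllib.parse import urlsplit, parse_qs

def split_path_query(target):
    """Hypothetical helper: split an origin-form target into (path, query).

    Neither part is percent-decoded here; that is left to the caller.
    """
    parts = urlsplit(target)
    return parts.path, parts.query

path, query = split_path_query("/cgi-bin/test.py?name=a%26b&x=1")
print(path)             # still percent-encoded
print(query)            # still percent-encoded
print(parse_qs(query))  # parse_qs() splits on "&" first, then decodes each value
```

Leaving the path undecoded at this stage lets the handler decide when, and how many times, decoding happens.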

Currently the CGI server percent-decodes the path twice. “GET /cgi-bin/%2574est.py” ends up as “test.py”, when a single decoding pass should yield “%74est.py”. This is probably a side-effect of fixing Issue 14566 and related bugs.
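
The effect can be reproduced with urllib.parse.unquote() alone. This snippet only illustrates the decoding arithmetic; it is not the CGI server's actual code path:

```python
from urllib.parse import unquote

raw = "/cgi-bin/%2574est.py"

once = unquote(raw)    # "%25" becomes "%", leaving "%74" in the result
twice = unquote(once)  # a second pass then turns "%74" into "t"

print(once)
print(twice)
```

A request for a script literally named “%74est.py” would therefore be routed to “test.py” if the path is decoded twice.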

Also, see Issue 5714 about making this function a public API.

----------
nosy: +martin.panter
stage:  -> needs patch
versions: +Python 3.4, Python 3.5, Python 3.6 -Python 3.2, Python 3.3

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue14567>
_______________________________________