[issue14567] http/server.py query string handling incorrect, inefficient

Glenn Linderman report at bugs.python.org
Fri Apr 13 10:58:25 CEST 2012


Glenn Linderman <v+python at g.nevcal.com> added the comment:

I finally understand the purpose of the checks in translate_path.
Basically, translate_path concatenates the URL path to the current directory (because this server treats that directory as the root for Web service).  Along the way it also does normalization (redundantly compared to _url_collapse_path, but on a different code path and, sadly, using a different algorithm that gets different results), plus some OS-specific checks.

For the OS-specific checks, it does a couple of splits on each path component to see whether it contains a drive letter or a "character other than / that is used as a directory separator", or whether it is "." or ".." (in their OS-specific forms).
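To make the per-component checks concrete, here is a minimal sketch of them as a standalone predicate. The name component_is_safe and the pathmod parameter are hypothetical, not anything in http/server.py; pathmod only exists so the same code can be exercised with both POSIX and Windows rules on any machine.

```python
import ntpath
import os
import posixpath


def component_is_safe(word, pathmod=os.path):
    """Return True if word is a plain file name under pathmod's rules.

    Hypothetical helper: it mirrors the splitdrive/split/curdir/pardir
    checks described above, but is not code from http/server.py.
    """
    drive, rest = pathmod.splitdrive(word)   # e.g. "C:" on Windows
    head, tail = pathmod.split(rest)         # non-empty head means an OS separator
    if drive or head:
        return False
    # Reject empty names and the OS spellings of "." and "..".
    return tail not in ("", pathmod.curdir, pathmod.pardir)


# The same component can be fine on one OS and unsafe on another.
print(component_is_safe("a\\b", posixpath))   # backslash is an ordinary character
print(component_is_safe("a\\b", ntpath))      # backslash is a separator
```

The point of the sketch is only that the answer depends on the OS path rules, which is why these checks cannot be folded into the URL-level normalization.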

It doesn't check for os-specific illegal filename characters (but of course they will not match existing files on the OS, so that eventually would cause a 404).

Such checks are probably best done only on path components that are actually traversed.  The only problem is that run_cgi passes increasingly large portions of the path to translate_path, so every component gets re-checked on each call and the net effect is an O(n-squared) performance characteristic in the number of path components.  Happily, most actual paths do not get too long, but it is still inefficient.
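As a rough illustration of the cost, the following sketch just counts component checks. Both function names are made up for this example, and it assumes the worst case where every growing prefix of the path gets re-checked from the start.

```python
def checks_with_growing_prefixes(url_path):
    """Count component checks when each longer prefix is re-checked in full."""
    words = [w for w in url_path.split("/") if w]
    # A prefix of k components costs k checks: 1 + 2 + ... + n = n(n+1)/2.
    return sum(range(1, len(words) + 1))


def checks_once_per_component(url_path):
    """Count component checks when each component is examined exactly once."""
    return len([w for w in url_path.split("/") if w])


path = "/cgi-bin/a/b/c/script.py"
print(checks_with_growing_prefixes(path), checks_once_per_component(path))
```

For the five-component path above that is 15 checks against 5, which is harmless in practice but grows quadratically with path depth.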

Factoring out the checks into a function that could be called by translate_path or run_cgi might be appropriate, and then run_cgi could call the new function piece by piece instead of calling translate_path at all.  It would also be good to make translate_path produce an error if "drive" or "head" are ever non-empty.
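A minimal sketch of that factoring might look like the following. The names check_component and iter_translated_prefixes are hypothetical, as is the pathmod parameter (there only so the example can be run with POSIX or Windows rules); this is not a patch against http/server.py. It assumes "." and ".." keep being skipped silently, and it raises ValueError when "drive" or "head" is non-empty.

```python
import ntpath
import os
import posixpath


def check_component(word, pathmod=os.path):
    """Return word, or raise ValueError if it has a drive or an OS separator."""
    drive, rest = pathmod.splitdrive(word)
    head, tail = pathmod.split(rest)
    if drive or head:
        raise ValueError("unsafe path component: %r" % (word,))
    return tail


def iter_translated_prefixes(root, url_path, pathmod=os.path):
    """Yield root/a, root/a/b, ... checking each component exactly once."""
    current = root
    for word in url_path.split("/"):
        # Skip empty components and the OS spellings of "." and "..".
        if not word or word in (pathmod.curdir, pathmod.pardir):
            continue
        current = pathmod.join(current, check_component(word, pathmod))
        yield current


for candidate in iter_translated_prefixes("/srv/www", "/cgi-bin/sub/script.py",
                                          posixpath):
    print(candidate)
```

A caller shaped like run_cgi could stop iterating at the first candidate that is not a directory, so no component is checked more than once and components past the script are never checked at all.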

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue14567>
_______________________________________

