[Web-SIG] WSGI & CGI spec

Jason Kirtland jason at virtuous.com
Wed Dec 20 00:36:23 CET 2006


Ian wrote:
> Reading the CGI spec I'm noticing some requirements it makes that
> aren't done as much in WSGI.
> [...]
> It's also unclear if the WSGI server is expected to normalize the
> path,  specifically things like /foo/../bar -- Apache does do
> this, wsgiref does not.

The spec could definitely use more clarity on the disposition of 
the original request URI.  There's a CGI compatibility requirement 
to supply SCRIPT_NAME and PATH_INFO in the initial environ: 
concepts inseparably linked to selection and execution of an 
executable (RFC 3875, 3.1 - 3.3).  But a WSGI server isn't shaped 
like CGI: all requests funnel through a single callable with no 
inspection of the HTTP Request-URI.  The PEP delegates the request 
routing (i.e. URI analysis) phase to middleware. [1]

To my reading, PEP 333 implies that a server should plop the 
Request-URI into PATH_INFO, and it should store it there 
unmolested.  Any proactive meddling in the URI space is at odds with 
the guideline that a server "play dumb" and act as a transparent 
gateway server.

Allowing the server to normalize into PATH_INFO as a matter of 
course destroys the original Request-URI and makes a whole class of 
raw URI-consuming middleware impossible: no mod_security, no 
filtering of naive spambots, no accurate request logging (!), no 
proxies, etc.  And I see little to gain: URI mapping middleware 
(dynamic or static content serving) isn't going to trust any web 
input anyhow and will implement its own, probably very paranoid, 
normalization.  And really, what URI-sensitive WSGI applications 
are running bare without a mapper?
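
As a rough illustration of the point above (a sketch only, not 
anything from the PEP): the names reject_dot_segments, hello_app and 
call are hypothetical, and the filter assumes the server hands over 
PATH_INFO exactly as requested.  If the server had already collapsed 
/foo/../bar to /bar, this filter would never see the suspicious 
request.

```python
def reject_dot_segments(app):
    """Hypothetical filter middleware: inspect the raw path before any mapper."""
    def middleware(environ, start_response):
        raw_path = environ.get("PATH_INFO", "")
        # Refuse any request whose path contains a literal ".." segment.
        if ".." in raw_path.split("/"):
            start_response("400 Bad Request", [("Content-Type", "text/plain")])
            return [b"suspicious path\n"]
        return app(environ, start_response)
    return middleware


def hello_app(environ, start_response):
    """Trivial WSGI application used only for the demonstration."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello\n"]


def call(app, path):
    """Invoke a WSGI app with a minimal environ; return (status, body)."""
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status

    environ = {"REQUEST_METHOD": "GET", "PATH_INFO": path}
    body = b"".join(app(environ, start_response))
    return captured["status"], body


filtered = reject_dot_segments(hello_app)
raw_result = call(filtered, "/foo/../bar")   # path as the client sent it
normalized_result = call(filtered, "/bar")   # same request after server-side normalization
```

With the raw path the filter answers 400; with the pre-normalized 
path the request passes straight through and the middleware has 
nothing left to log or reject.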

> (Is posixpath.normpath good enough to do that?)

Note that posixpath.normpath is not Apache-compatible.  Apache 
leaves empty path segments (//) untouched, while normpath prunes 
them.
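
To make the difference concrete, here is a minimal sketch.  The 
helper normalize_keep_empty is hypothetical; it resolves "." and 
".." but keeps empty segments, which is how Apache behaves according 
to the description above.  It is not Apache's actual algorithm.

```python
import posixpath


def normalize_keep_empty(path):
    """Resolve '.' and '..' segments but leave empty segments ('//') alone."""
    segments = path.split("/")
    out = []
    for seg in segments:
        if seg == ".":
            continue
        if seg == "..":
            # Never pop the leading empty string that stands for the root.
            if len(out) > 1:
                out.pop()
            continue
        out.append(seg)
    # A trailing '.' or '..' refers to a directory, so keep a trailing slash.
    if segments[-1] in (".", ".."):
        out.append("")
    return "/".join(out)


# normpath collapses the interior empty segment.
pruned = posixpath.normpath("/foo//bar")        # "/foo/bar"

# The sketch keeps it, while still resolving dot segments.
kept = normalize_keep_empty("/foo//bar")        # "/foo//bar"
resolved = normalize_keep_empty("/foo/../bar")  # "/bar"
```

One further wrinkle: normpath preserves exactly two leading slashes 
(a POSIX rule), so posixpath.normpath("//foo") stays "//foo" even 
though interior empty segments are dropped.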

-Jason

[1] Fronted servers that are invoked from actual CGI, mod_python, 
behind some proxies, etc. play a middleware role and may have 
implementation-specific access to "current" initial values for 
SCRIPT_NAME/PATH_INFO.


