[Web-SIG] WSGI key conventions
Ian Bicking
ianb at colorstudy.com
Thu May 26 05:04:34 CEST 2005
There are a couple of conventions I'd like to suggest, related to URLs.
The first is needed because some adapters (like mod_scgi) do not set
PATH_INFO. I presume this is because, while you typically set the
handler for a specific Location, the handler only knows that it handles
the complete URL. So when you have:
    <Location /app>
        SetHandler scgi-handler
    </Location>
The SCGI handler doesn't know that it's only in /app, and that it should
set SCRIPT_NAME to /app. Though, I believe mod_webkit *does* do this,
so maybe it's possible to fix this in mod_scgi. (However, mod_webkit
gets really funky when you use it in <Location />, which maybe is
related.) FastCGI gets this right too, I think. HTTP proxies don't,
though: you can't add variables to a proxied HTTP request, only headers,
and a header would show up as something like HTTP_SCRIPT_NAME...?
Anyway, while this can be configured on the Python side, it would be
best to keep this together in the Apache configuration (or at least
there should be the option). A simple way is to set an environment
variable, like:
    <Location /app>
        SetEnv WSGI_SCRIPT_NAME "/app"
        SetHandler scgi-handler
    </Location>
(I haven't checked the Apache reference, so excuse any misspellings here)
It would be nice if this were a convention: if SCRIPT_NAME isn't
explicitly configured on the Python side, WSGI applications should pick
up WSGI_SCRIPT_NAME (particularly when they are connected to something
that is known not to resolve SCRIPT_NAME and PATH_INFO properly, and
PATH_INFO is missing from the environment dictionary).
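
As a rough sketch of how the Python side might honor this, here is a
small piece of middleware. The name fix_script_name is made up for
illustration, and it assumes (which I haven't verified for any
particular adapter) that the adapter leaves the complete URL path in
SCRIPT_NAME when it doesn't set PATH_INFO:

```python
def fix_script_name(app, key='WSGI_SCRIPT_NAME'):
    """Hypothetical middleware: split the full path using WSGI_SCRIPT_NAME."""
    def replacement(environ, start_response):
        prefix = environ.get(key)
        # Only step in when the adapter did not provide PATH_INFO itself
        if prefix and 'PATH_INFO' not in environ:
            # Assumption: the adapter put the complete URL path in SCRIPT_NAME
            full_path = environ.get('SCRIPT_NAME', '')
            prefix = prefix.rstrip('/')
            # Match whole path segments only, so /app does not match /apple
            if full_path == prefix or full_path.startswith(prefix + '/'):
                environ['SCRIPT_NAME'] = prefix
                environ['PATH_INFO'] = full_path[len(prefix):]
        return app(environ, start_response)
    return replacement
```

If PATH_INFO is already present, or the path doesn't start with the
configured prefix, the environment is passed through untouched.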
One other convention that I'd like is for extra variables that are
parsed out of the request. For instance, let's say you have domain names
like username.application.com (using a wildcard *.application.com DNS
entry). The WSGI application that parses this might look like:
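    def parse_username_middleware(app):
        def replacement(environ, start_response):
            # The first label of the host name is the username
            host = environ['HTTP_HOST']
            # setdefault keeps variables already stored by other middleware
            environ.setdefault('url.vars', {})['username'] = host.split('.')[0]
            return app(environ, start_response)
        return replacement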
The idea of making 'url.vars' a convention is that different frameworks
could give access to this specific set of variables (which could be
derived from the path, hostname, or other data in the environment).
Some might simply put it in a big pool of variables (GET, POST, Cookie),
while others might give access to it separately. But anyway, I've come
upon this several times with URL-parsing middleware, and I keep just
stuffing it in random locations, so a convention would be nice.
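
To illustrate how pieces that don't know about each other could share
the one key, here is a sketch of a second, path-based parser and an
application that reads whatever has accumulated. Both names
(parse_section_middleware, show_vars_app) and the 'section' variable are
invented for the example, not part of any proposal:

```python
def parse_section_middleware(app):
    """Hypothetical: move the first path segment into url.vars['section']."""
    def replacement(environ, start_response):
        path = environ.get('PATH_INFO', '')
        segment, sep, rest = path.lstrip('/').partition('/')
        if segment:
            # Add to the shared dictionary without clobbering other entries
            environ.setdefault('url.vars', {})['section'] = segment
            # Shift the consumed segment from PATH_INFO onto SCRIPT_NAME
            environ['SCRIPT_NAME'] = environ.get('SCRIPT_NAME', '') + '/' + segment
            environ['PATH_INFO'] = sep + rest
        return app(environ, start_response)
    return replacement

def show_vars_app(environ, start_response):
    # A framework could expose this dictionary however it likes
    url_vars = environ.get('url.vars', {})
    body = repr(sorted(url_vars.items())).encode('utf-8')
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [body]
```

Because every parser goes through setdefault, a hostname parser and a
path parser can be stacked in either order and the application still
finds all the variables in one place.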
--
Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org