[Web-SIG] about WSGI adoption

Graham Dumpleton graham.dumpleton at gmail.com
Sun Nov 18 23:56:01 CET 2007


On 19/11/2007, Manlio Perillo <manlio_perillo at libero.it> wrote:
> Titus Brown wrote:
> > On Sun, Nov 18, 2007 at 09:03:23PM +0100, Manlio Perillo wrote:
> > -> Titus Brown wrote:
> > -> > ->
> > -> > -> However, I still consider it remarkable that there is no "trac.wsgi" script.
> > -> > -> Can this be caused by the lack of a standardized deployment of WSGI
> > -> > -> applications?
> > -> >
> > -> > What would a trac.wsgi script contain?
> > ->
> > -> import trac.web.main
> > ->
> > -> application = trac.web.main.dispatch_request
> >
> > So this is something that can be 'execfile'd, I guess...
> >
>
> No.
> It provides an application callable that the WSGI gateway/server can
> execute.
>
> > -> > WSGI is a programming interface,
> > -> > not a script interface like CGI.
> > ->
> > -> Right, but a WSGI server/gateway just needs a simple script to execute
> > -> the WSGI application.
> >
> > That might be useful for some WSGI deployment techniques and less useful
> > for others.  For example, if you're using an SCGI-based WSGI server, you
> > need a command-line executable;
>
> This is not fully correct.
> The sample script I have posted can be used by a SCGI-based WSGI server too.
>
> I think that the "deployment" must be done by the WSGI gateway/server
> and not by the application.
>
> That is, the "application" should only expose the callable object, and
> should not "start a server", opening logging and configuration files, or
> stacking middlewares.

This would require the WSGI adapter layer to include a means of
loading the script file (as a Python module) the first time it is
required. The only thing that really does it that way at present is
mod_wsgi.
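
As a rough illustration of what "loading the script file when required
the first time" could involve, here is a minimal sketch in Python 3. It
is not how mod_wsgi or any other adapter is actually implemented; the
names load_application, _cache and _wsgi_script_ are hypothetical, and
reloading on a changed modification time is just one assumed policy.

```python
import os
import threading
import types

_cache = {}                      # script path -> (mtime, module)
_cache_lock = threading.Lock()   # guards _cache in threaded servers


def load_application(path, name="application"):
    """Load the script at `path` on first use and return its WSGI callable.

    The script is executed again only if its modification time changed.
    """
    mtime = os.path.getmtime(path)
    with _cache_lock:
        cached = _cache.get(path)
        if cached is None or cached[0] != mtime:
            # Run the file in a fresh module namespace; the file does not
            # need a .py extension or to be on sys.path.
            module = types.ModuleType("_wsgi_script_")
            module.__file__ = path
            with open(path, "rb") as handle:
                code = compile(handle.read(), path, "exec")
            exec(code, module.__dict__)
            cached = (mtime, module)
            _cache[path] = cached
    return getattr(cached[1], name)
```

The adapter would call load_application() for each request and then
invoke the returned callable with environ and start_response, so the
script file itself contains nothing but the application.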

Current CGI-WSGI adapters expect the WSGI application entry point to
effectively be in the same file as the main for the CGI script, i.e.:

  #!/usr/bin/python

  def application(environ, start_response):
    status = '200 OK'
    output = 'Hello World!\n'

    response_headers = [('Content-Type', 'text/plain'),
                        ('Content-Length', str(len(output)))]
    start_response(status, response_headers)

    return [output]

  if __name__ == '__main__':
    from paste.script.cgi_server import run_with_cgi
    run_with_cgi(application)

This doesn't mean that you couldn't develop a CGI-WSGI adapter which
separated the two parts, but it is not really easy to make it
completely transparent. This is because you still have to at least
create the CGI file which refers to the application script file in a
different location.

The Action directive in Apache can be used to make it a bit more
transparent by mapping a .wsgi or .py extension to a single CGI
script, with that script looking at the filename target of the request
to work out the actual application file to load. A similar thing could
be done for FASTCGI and SCGI with Apache. The problem with this though
is that using the Action directive in Apache in this way loses the
correct value for SCRIPT_NAME, as far as I remember. There is also no
equivalent in other servers such as lighttpd and nginx.
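
A minimal sketch of such a single dispatching CGI script, in Python 3,
might look like the following. It assumes that the handler script
receives the file path of the originally requested document in
PATH_TRANSLATED, as described for the Action directive in the Apache
mod_actions documentation. The function names find_target and
load_callable are hypothetical, and nothing here attempts to repair
SCRIPT_NAME.

```python
import os
from wsgiref.handlers import CGIHandler


def find_target(environ):
    """Return the path of the file the request was originally aimed at."""
    target = environ.get("PATH_TRANSLATED")
    if not target or not os.path.isfile(target):
        raise LookupError("no application script for this request")
    return target


def load_callable(path, name="application"):
    """Execute the script file and return the named WSGI callable."""
    namespace = {"__name__": "_wsgi_script_", "__file__": path}
    with open(path, "rb") as handle:
        exec(compile(handle.read(), path, "exec"), namespace)
    return namespace[name]


def main():
    # Serve one CGI request using whichever script the URL mapped to.
    application = load_callable(find_target(os.environ))
    CGIHandler().run(application)

# The CGI script named in the Action directive would end by calling main().
```

Every .wsgi file then only has to define application; the one CGI
script above is the only place that knows about CGI.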

Anyway, I hope this at least half illustrates that it isn't
necessarily that simple to come up with one concept of a single WSGI
application script file which knows nothing about the means by which
it is launched. mod_wsgi has made this as seamless as possible, but
with other hosting mechanisms such as CGI, FASTCGI and SCGI, where
the WSGI adapter isn't actually embedded within the web server itself
but is within the process launched, it is much harder to make it
transparent to the point where one could just throw a whole lot of
WSGI application scripts in a directory and have it work.

In Python based web servers it gets more complicated again as in that
case it is the Python web server that is providing the top level URL
mapping to a WSGI application entry point, whereas in Apache, Apache
can do that automatically at least down to the initial entry point
before things go into Python code. One could technically write a
Python based web server whose top level URL to application mapping was
file system based like Apache is, but most probably wouldn't see the
point of it.

> > for mod_python, you probably need an
> > importable module with a function; for CGI, you need a CGI script; etc.
> > So I think you're talking about something that is very specific to your
> > own deployment technique.  This is out of the scope of the WSGI
> > proposal, for good reasons -- there are many ways of configuring and
> > deploying WSGI apps and I don't know that we've settled on only one way.
> >
>
> Right.
> But in the WSGI spec there is a proposal to standardize a deployment method.
>
> As an example, WSGI says nothing about what happens when an application
> module is imported (and the Python application process is created).

And it can't easily do so as the differences in hosting technology
make it hard to come up with one system which would work for
everything. For some ideas put up previously, see thread about Web
Site Process bus in:

  http://mail.python.org/pipermail/web-sig/2007-June/thread.html

Some of the things that make it difficult are multi-process web
servers, plus web servers that only load applications on demand and
not at the start when the processes are started up. Some hosting
technologies, as far as I remember, allow a logical application to be
stopped and started within the context of the same process, whereas
others don't. So, whereas atexit may be a reasonable way of doing
shutdown actions for some hosting technologies, it isn't for others.

> It can be useful if the gateway can execute an
>
>     init_application(environment)
>
> function, where environment contains the same objects as the request
> environment, excluding the HTTP headers and the input object, and with a
> separate errors object.

The closest you can probably get to portable application
initialisation is for the application itself to track whether it has
been called before and do something special if it hasn't. Even this is
tricky because of multithreading issues.
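
To make the multithreading concern concrete, here is a minimal sketch
in Python 3 of an application that tracks whether it has been called
before, using a lock so that two simultaneous first requests do not
both run the initialisation. The names initialise and init_calls are
hypothetical and exist only for illustration. Note that this is
per-process: under a multi-process server each process would run the
initialisation once for itself.

```python
import threading

_init_lock = threading.Lock()
_initialised = False
init_calls = []   # records each initialisation, for illustration only


def initialise(environ):
    """Hypothetical one-off setup performed on the first request."""
    init_calls.append(environ.get("SERVER_NAME"))


def application(environ, start_response):
    global _initialised
    # Check the flag, then check it again under the lock so that only
    # one thread ever performs the initialisation.
    if not _initialised:
        with _init_lock:
            if not _initialised:
                initialise(environ)
                _initialised = True

    body = b"Hello World!\n"
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]
```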

> Logging is another thing that should be clarified.
> How should an application do logging?
>
> As an example for a WSGI gateway embedded in an existing server (like
> Apache and Nginx) it can be useful and convenient to keep logging in an
> unique log file.
> And if the server logging system uses "log levels", this should be
> usable by the WSGI application.

There is always the Python 'logging' module. Where things get
interesting is how to configure the logging. In Pylons, provided you
use 'paster', it will note that the .ini file mentions 'loggers' and
so will push the config automatically to the 'logging' module. Run a
Pylons application under mod_wsgi though and this doesn't happen, so
Pylons logging doesn't work. Thus you need to make the magic Pylons
call manually to get it to push the config to the 'logging' module.

Use of the server's log levels is almost impossible. If using CGI,
your only logging mechanism is sys.stderr and that gets logged as ERR
in Apache. Same for mod_wsgi, and similar for SCGI and FASTCGI I
think. In mod_python, sys.stderr is broken in that output isn't
automatically flushed. Yes, the WSGI specification says that error
output needs to be flushed to ensure it is displayed, but that usually
isn't done.
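
One hedged way of working within those limits, sketched in Python 3, is
to configure the 'logging' module by hand so that it writes to
sys.stderr and puts the level name into the message text, since the
server will record every line at its own error level anyway. The logger
name "myapp", the message format and the function configure_logging are
assumptions made for this example. logging.StreamHandler flushes the
stream after each record, which also covers the flushing problem.

```python
import logging
import sys


def configure_logging(stream=None, level=logging.INFO):
    """Send records from the (hypothetical) 'myapp' logger to stderr."""
    logger = logging.getLogger("myapp")
    handler = logging.StreamHandler(stream if stream is not None
                                    else sys.stderr)
    # Carry the level in the text, as the server's own level is lost.
    handler.setFormatter(
        logging.Formatter("%(levelname)s %(name)s: %(message)s"))
    logger.handlers[:] = [handler]   # replace any earlier configuration
    logger.setLevel(level)
    logger.propagate = False         # avoid duplicate output via root
    return logger
```

Calling configure_logging() once at start-up, for example from the
first-request initialisation, and then using
logging.getLogger("myapp") elsewhere keeps the application code
independent of how it is hosted.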

> The same is valid for application configuration.

And you will probably never get everyone to agree on that.

The whole thing with WSGI was that it defined as little as possible so
that it left enough room for people to experiment with how to handle
all the other issues. I doubt you will ever see a single solution.
Instead, you will see different ways come together into a number of
different frameworks (or no frameworks). Overall, that probably isn't
a bad thing.

Graham

