[Web-SIG] multi-threaded or multi-process wsgi apps

Wed Nov 28 23:33:41 CET 2007

On 29/11/2007, Chris Withers <chris at simplistix.co.uk> wrote:
> Tres Seaver wrote:
> >
> > Note first that we use mod_wsgi's "daemon"-mode exclusively,
>
> Forgive me for being uninformed, but what are the other options?

The other mode is embedded mode. Embedded mode is like using
mod_python. Daemon mode is like using mod_fastcgi. mod_wsgi gives you
the flexibility of choosing which you want to use in one package. You
can, if appropriate, even use a combination of both modes. For
example, run Django in embedded mode for best performance, but
delegate a Trac instance to daemon mode so it is kept out of the
Apache child processes; there are various reasons with Trac why you
might want to do that.
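
To see which mode a given application ended up in, a small WSGI
application along these lines can help. It is only a sketch: it relies
on mod_wsgi passing a 'mod_wsgi.process_group' key in the WSGI environ
(empty for embedded mode, the process group name for daemon mode), and
the handling of a missing key is my own assumption for when the script
is run under some other server.

```python
def application(environ, start_response):
    # Under mod_wsgi this key holds the daemon process group name, or an
    # empty string in embedded mode. Other servers are assumed not to set it.
    process_group = environ.get('mod_wsgi.process_group')

    if process_group is None:
        mode = 'not running under mod_wsgi'
    elif process_group == '':
        mode = 'embedded mode'
    else:
        mode = 'daemon mode (process group %r)' % process_group

    body = mode.encode('utf-8')
    start_response('200 OK', [
        ('Content-Type', 'text/plain'),
        ('Content-Length', str(len(body))),
    ])
    return [body]
```

Mapping the same script at two URLs, one delegated to a daemon process
group and one not, should show the two different answers.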

> > which
> > implies creating one or more dedicated subprocesses for each "process
> > group" defined in the Apache config.
>
> Does each sub process get its own python interpreter?

Each process can if necessary have multiple Python sub interpreters,
and is not limited to just one. This would be used where you need to
run multiple applications in the same process but with sub
interpreters being used as a means of separating them so they don't
interfere with each other.

Take Django, for instance: you can't run two instances of that inside a
pure Python WSGI server process because of the way it uses a global to
indicate what its configuration is. I agree that this isn't in the
spirit of WSGI, but that is how things are. The Django folks are
looking at trying to remove the limitation. In the meantime, you
either have to use distinct processes or, with mod_wsgi or
mod_python, run the different Django instances in separate Python sub
interpreters within the one process.
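
The kind of clash I mean can be shown with a toy example. This is not
Django's actual code; `_settings`, `configure()` and `make_app()` are
hypothetical names standing in for any framework that keeps its
configuration in a module-level global.

```python
# Hypothetical stand-in for a framework that stores its configuration in a
# module-level global. There is one copy of this per (sub) interpreter.
_settings = None


def configure(site_name):
    """Set the single, interpreter-wide configuration."""
    global _settings
    _settings = {'site': site_name}


def make_app():
    """Build a WSGI app that reads the global configuration per request."""
    def app(environ, start_response):
        body = _settings['site'].encode('utf-8')
        start_response('200 OK', [('Content-Type', 'text/plain')])
        return [body]
    return app


configure('site_a')
app_a = make_app()
configure('site_b')   # overwrites the global that app_a also reads
app_b = make_app()


def call(app):
    """Invoke a WSGI app with an empty environ and return its body bytes."""
    return app({}, lambda status, headers: None)[0]
```

Here both `call(app_a)` and `call(app_b)` give back the 'site_b'
configuration, since there is only the one global. With each instance
in its own process or sub interpreter, each gets its own copy of the
module and so its own global.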

> (ie: does it have to reload all its config and open up its own database
> connections again?)

Being separate processes, they would obviously need to do that.
This is generally no different to how people often run multiple
instances of a standalone Python web based application and then use a
proxy/load balancer to distribute requests across the processes.

> > cache.  A second issue for multi-process configurations is doing all the
> > product initialization dance (for a Zope2 app) or processing ZCML (for
> > either Zope2 or Zope3).  The "first hit slow" problem is intrinsic to
> > any lazy + scalable system.
>
> Is there really no way that the "slow" work can be shared?

As above, this is usually no different to the case where someone
creates multiple distinct Python web application instances and
proxies/load balances across them. Most complex Python web
applications don't take kindly to doing complex stuff in a parent
process and then forking off worker processes. This is because a lot
of stuff like database connections can't necessarily be inherited
across a fork without causing problems or requiring some complicated
coding to make it work. Thus it is generally better for each process
to create its own connections etc. Reading the actual configuration
files is generally quite a minor overhead in the greater scheme of
things.
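
One common way of having each process create its own connection is to
open it lazily and remember which process opened it. The following is
a rough sketch only: sqlite3 is used purely as a stand-in for a real
database driver, and `get_connection()` is a made-up helper name, not
part of any framework.

```python
import os
import sqlite3

_conn = None        # connection owned by this process, opened on first use
_conn_pid = None    # pid of the process that opened _conn


def get_connection():
    """Return a connection created by the current process."""
    global _conn, _conn_pid
    pid = os.getpid()
    if _conn is None or _conn_pid != pid:
        # Either first use, or we are in a forked child that inherited the
        # parent's connection object: open a fresh one instead of sharing it.
        _conn = sqlite3.connect(':memory:')
        _conn_pid = pid
    return _conn
```

Repeated calls within one process get the same connection back, and
nothing is opened until the first request that actually needs it.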

Even if for a particular system one could gain something by doing it
in a parent process and then forking, this isn't practical in Apache
as the parent process typically runs as root and you wouldn't want
user code being run as root. User code running in the Apache parent
would also cause a range of other problems.

Graham