[Web-SIG] multi-threaded or multi-process wsgi apps

Mon Nov 26 23:06:23 CET 2007

Chris Withers wrote:
> Right, I'm curious as to how wsgi applications end up being
> multi-threaded or multi-process and if they are, how they share
> resources such as databases and configuration.
> 
> There's a couple of reasons I'm asking...
> 
> The first was something Chris McDonough said about one of the issues
> they're having with the repoze project: when using something like
> mod_wsgi, it's the first person to hit each thread that takes the hit
> of loading the configuration and opening up the zodb. Opening the ZODB,
> in particular, can take a lot of time. How should repoze be structured
> such that all the threads load their config and open their databases
> when apache is restarted rather than when each thread is first hit?

If I were coding it, repoze would use a database connection pool that is
populated at (sub)process startup. The main thread is the only one
"loading config". That avoids any waits during the HTTP request, so your
req/sec rate will go way up. It also allows the process to fail fast in
the event of unreachable databases, so such errors during deployment
will be found sooner and will be easier to debug if they occur outside
of an HTTP request.

It's like a stage production: you don't ask your actors to buy props and
build the set during the show--instead, you buy/build all that and
script/debug/automate the hell out of it before you have an audience.
All long-running servers are a lot like that; do everything you can
before the first request to make absolutely sure nothing slows or stops
you during showtime.
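
A minimal sketch of the startup-populated pool described above, not
repoze's actual code. ConnectionPool, open_database and the pool size of
4 are hypothetical names and values chosen for illustration;
open_database stands in for whatever expensive open (such as the ZODB)
the app really performs.

```python
import queue


class ConnectionPool:
    """Fixed-size pool that is filled eagerly, before any request arrives."""

    def __init__(self, connect, size):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            # An exception from connect() propagates here, at process
            # startup, instead of inside somebody's HTTP request.
            self._idle.put(connect())

    def acquire(self, timeout=None):
        # Blocks until a connection is free (or raises queue.Empty on timeout).
        return self._idle.get(timeout=timeout)

    def release(self, conn):
        self._idle.put(conn)


def open_database():
    # Hypothetical stand-in for a slow database open.
    return object()


# Module level: runs once, in the main thread, when the process imports the app.
pool = ConnectionPool(open_database, size=4)


def application(environ, start_response):
    # Request threads only borrow connections; they never open one.
    conn = pool.acquire()
    try:
        body = b"ok"  # a real app would use conn here
    finally:
        pool.release(conn)
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]
```

Because the pool is built at import time, an unreachable database stops
the process from starting at all, which is the fail-fast behaviour
described above.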

> The second is a problem I see an app I'm working on heading towards.
> The app has web-alterable configuration, so in a multi-threaded and
> particularly multi-process environment, I need some way to get the other
> threads or processes to re-read their configuration when it has
> changed.

In a multithreaded environment, I recommend apps read config only at
process startup, parse the entries and use them to modify live objects,
and then throw away the config. Then, if you need to make changes to
settings while live, you just modify the live objects in the same way
the config parsing step did (and then modify the config file only if
desired). That avoids having to re-read the whole config file for each
potential change. In a multiprocess environment, you can notify other
processes with any of various forms of IPC or shared state mechanisms.
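
A rough sketch of the "parse once, then modify live objects" approach for
the multithreaded case, under some assumptions of mine: the Settings
class, the parse_config helper and the "key = value" file format are
hypothetical and only illustrate the idea. The point is that startup
parsing and a later web-driven change go through the same apply() call.

```python
import threading


class Settings:
    """Live object that request threads read from; guarded by a lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._values = {}

    def apply(self, entries):
        # Single entry point for changes, used both at startup and at runtime.
        with self._lock:
            self._values.update(entries)

    def get(self, key, default=None):
        with self._lock:
            return self._values.get(key, default)


def parse_config(text):
    # Assumed format: one "key = value" per line, "#" starts a comment line.
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        entries[key.strip()] = value.strip()
    return entries


settings = Settings()

# Process startup: read the config, apply it, then forget the parsed text.
settings.apply(parse_config("page_size = 20\ntheme = plain\n"))

# Later, a web form changes one setting: same code path, no re-read of the file.
settings.apply({"theme": "dark"})
```

In the multiprocess case, each process would run the same apply() step
after being told about the change over whichever IPC channel is in use.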

Robert Brewer
fumanchu at aminus.org