best way to serve wsgi with multiple processes

Robin Becker robin at reportlab.com
Wed Feb 11 07:10:47 EST 2009


Robin wrote:
> Hi,
> 
> I am building some computational web services using soaplib. This
> creates a WSGI application.
> 
> However, since some of these services are computationally intensive,
> and may be long running, I was looking for a way to use multiple
> processes. I thought about using multiprocessing.Process manually in
> the service, but I was a bit worried about how that might interact
> with a threaded server (I was hoping the thread serving that request
> could just wait until the child is finished). Also it would be good to
> keep the services as simple as possible so it's easier for people to
> write them.
> 
> I have at the moment the following WSGI structure:
> TransLogger(URLMap(URLParser(soaplib objects)))
> although presumably, due to the beauty of WSGI, this shouldn't matter.
> 
> As I've found with all web-related Python stuff, I'm overwhelmed by
> the choice and number of alternatives. I've so far been using cherrypy
> and ajp-wsgi for my testing, but am aware of Spawning, twisted etc.
> What would be the simplest [quickest to set up, with the fewest server
> details required - ideally with a simple example] and most reliable
> [this will eventually be 'in production' as part of a large scientific
> project] way to host this sort of WSGI with a process-per-request
> style?
> 
>......
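
On the multiprocessing.Process idea first: having the serving thread just block
on the child is workable. A minimal sketch of that, assuming the service boils
down to a plain function call (expensive_work, run_computation and
handle_request are made-up names for the example, nothing to do with soaplib):

    import multiprocessing

    def expensive_work(payload):
        # stand-in for the real computation
        return sum(x * x for x in payload)

    def run_computation(conn, payload):
        # runs in the child process; ship the result back over a pipe
        conn.send(expensive_work(payload))
        conn.close()

    def handle_request(payload):
        # called from the thread serving the request; blocks until the child is done
        parent_conn, child_conn = multiprocessing.Pipe()
        p = multiprocessing.Process(target=run_computation,
                                    args=(child_conn, payload))
        p.start()
        result = parent_conn.recv()  # waits here for the child's answer
        p.join()
        return result

    if __name__ == '__main__':
        print handle_request(range(10))

That should coexist happily enough with a threaded server, though each request
then pays the fork cost.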

We've used forked fastcgi (flup) with success, as that decouples the wsgi process 
(in our case django) from the main server (in our case apache). Our reason for 
doing that was to let the backend run on a modern python without having to 
upgrade the server (which would be required with, say, mod_python, where the two 
are tied together). The wsgi process also runs as an ordinary user, which eases 
some administrative tasks.
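
For concreteness, the flup side of that is only a few lines. A sketch, with the
app, host and port invented for the example (apache then talks to that address
via mod_fastcgi's FastCgiExternalServer or similar):

    from flup.server.fcgi_fork import WSGIServer

    def app(environ, start_response):
        # trivial wsgi app standing in for the real django/soaplib one
        start_response('200 OK', [('Content-Type', 'text/plain')])
        return ['hello from a forked worker\n']

    # fcgi_fork handles each request in a freshly forked child,
    # decoupled from the apache process and its python version
    WSGIServer(app, bindAddress=('127.0.0.1', 8000)).run()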

A disadvantage of our scheme is that long running requests can cause problems, 
eg timeouts. In practice, since there are no guarantees about how long an http 
connection will hold up anyway (because of proxies etc), we decided to work 
around the problem rather than fight it: long running jobs go into a task queue 
on the server, the initial response returns a job handle, and the client 
reconnects periodically to query status and fetch results.
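
In outline the queueing side looks something like the toy below (in-process only
for brevity; in a forked setup the job store has to live somewhere shared, eg a
database, since the workers are separate processes - all names here are invented
for the example):

    import threading, time, uuid

    jobs = {}  # job id -> result, None while still running; toy shared store

    def expensive_work(payload):
        time.sleep(5)  # pretend this takes a long while
        return sum(payload)

    def start_job(payload):
        # enqueue the work and hand back an id immediately, well inside any timeout
        job_id = uuid.uuid4().hex
        jobs[job_id] = None
        def work():
            jobs[job_id] = expensive_work(payload)
        threading.Thread(target=work).start()
        return job_id

    def poll_job(job_id):
        # the client reconnects with the id and asks: done yet?
        result = jobs.get(job_id)
        if result is None:
            return 'running', None
        return 'done', result

    if __name__ == '__main__':
        jid = start_job(range(10))
        print poll_job(jid)   # most likely ('running', None)
        time.sleep(6)
        print poll_job(jid)   # ('done', 45)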
-- 
Robin Becker



