Python does not play well with others

Sun Feb 4 19:25:28 EST 2007

On Feb 5, 9:45 am, John Nagle <n... at animats.com> wrote:
> Graham Dumpleton wrote:
> > On Feb 4, 1:05 pm, Paul Rubin <http://phr...@NOSPAM.invalid> wrote:
>
> >>"Paul Boddie" <p... at boddie.org.uk> writes:
>
> >>>Probably the biggest inhibitor, as far as I can see, has been the
> >>>server technology chosen. Many hosting providers have historically
> >>>offered no better than CGI for Python, whilst PHP runs within Apache
> >>>itself, and it has previously been stated that mod_python has been
> >>>undesirable with regard to isolating processes from each other.
> >>>Consequently, a number of Python people seem to have held out for
> >>>other "high performance" solutions, which various companies now offer.
>
> >>Your point that shared hosting with Python isn't so easy because of
> >>insufficient isolation between apps is valid.  Maybe Python 3.0 can do
> >>something about that and it seems like a valid thing to consider while
> >>fleshing out the 3.0 design.
>
> > To clarify some points about mod_python, since these posts do not
> > properly explain the reality of the situation and I feel people are
> > getting the wrong impression.
>
> > First off, when using mod_python it is possible to have it create
> > multiple sub interpreters within each Apache child process.
>
>      Realistically, mod_python is a dead end for large servers,
> because Python isn't really multi-threaded.  The Global Python
> Lock means that a multi-core CPU won't help performance.

That is not true if the 'prefork' MPM is used for Apache, which is how
most people seem to run it. Each Apache child process then only runs
one request at a time, so there isn't normally going to be any
contention on the GIL at all. The only case where there would be
contention in this arrangement is if the request handlers within
Apache had spawned off distinct threads themselves to do stuff. Even
then, in this arrangement the main request handler is usually not
doing anything and is just waiting for the created thread to finish
what it was doing. Thus if only one thread was spawned to do some work,
or to perform a blocking operation so that the main thread can time
out, then again there isn't really any contention as only one thread
is actually doing anything. If you are trying to embed very intensive
operations with threads within Apache, then I would suggest it is not
the best design anyway. Such things would be much better farmed off to
a long running backend process using XML-RPC or some other
interprocess communication mechanism.
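
As a rough sketch of farming work off to a backend over XML-RPC, using
the xmlrpc modules from the current Python 3 standard library. The
names count_primes, start_backend and handle_request are made up for
illustration, and the backend is started in a thread only so that the
example runs on its own. In a real setup it would be a separate long
running process, and the request handler would live inside mod_python.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def count_primes(limit):
    """Stand-in for intensive work: count the primes below limit."""
    return sum(
        1 for n in range(2, limit)
        if all(n % d for d in range(2, int(n ** 0.5) + 1))
    )


def start_backend(host="127.0.0.1", port=0):
    """Start the backend server. Port 0 asks the OS for any free port."""
    server = SimpleXMLRPCServer((host, port), logRequests=False)
    server.register_function(count_primes)
    # A thread keeps this sketch self-contained; a real backend would be
    # its own long running process, started separately from Apache.
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def handle_request(backend_url, limit):
    """What a request handler would do: hand the job over, wait for the reply."""
    with ServerProxy(backend_url) as backend:
        return backend.count_primes(limit)


if __name__ == "__main__":
    server = start_backend()
    host, port = server.server_address
    print(handle_request(f"http://{host}:{port}", 30))
    server.shutdown()
```

The request handler here only blocks on a socket while it waits, so
the heavy computation never competes for the GIL inside the Apache
child process.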

If one is using the 'worker' MPM then yes, there will be some
contention if multiple requests are being handled by mod_python at the
same time within the same Apache child process. The downside of this
is lessened, however, by the fact that there are still multiple Apache
child processes and Apache will spread requests across all of them,
so the number of requests running concurrently within any one process
is lower.

The basic problem of GIL contention here is no different to a single
Python backend process which is handling everything behind Apache. In
some respects the Apache approach actually works better as there are
multiple processes spreading the load. Your comment on the GIL is
therefore partly unjustified for Apache and mod_python. Your statement
in some respects still stands for Python itself when run as a single
process, but you linked it to mod_python and Apache, and Apache
lessens the impact through its architecture of using multiple child
processes.

Finally we have the 'winnt' MPM. Because this is all in the one
process, you will have GIL contention in a much more substantial
manner. However, I'd suggest that most wouldn't choose Apache on
Windows as a major deployment platform.
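
To put rough numbers on the three MPM cases above, here is a small
back-of-the-envelope model. It is an idealisation and not a
measurement: the function name and parameters are invented for this
sketch, it assumes Apache spreads requests perfectly evenly across
child processes, and it assumes handlers spawn no threads of their own.

```python
import math


def gil_contenders_per_process(mpm, concurrent_python_requests,
                               child_processes=1):
    """Estimate how many request threads share one process's GIL.

    Idealised: assumes an even spread of requests across child
    processes and no extra threads created by the handlers.
    """
    if concurrent_python_requests == 0:
        return 0
    if mpm == "prefork":
        # One request per child process at a time, so one thread per GIL.
        return 1
    if mpm == "worker":
        # Several threaded child processes divide the load between them.
        return math.ceil(concurrent_python_requests / child_processes)
    if mpm == "winnt":
        # A single process holds every request thread.
        return concurrent_python_requests
    raise ValueError(f"unknown MPM: {mpm!r}")
```

With 40 concurrent mod_python requests, this model gives 1 contender
per process for 'prefork', 10 for 'worker' with 4 child processes, and
40 for 'winnt'.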

>      FastCGI, though, can get all the CPUs going.  It takes more
> memory, though, since each instance has a full copy of Python
> and all the libraries in use.

How is that any different to Apache child processes? Each Apache child
process has a full copy of Python and the libraries in use. Each
Apache child process can be making use of different CPUs. Further,
when the 'worker' MPM is being used, static file requests plus other
requests against PHP, mod_perl etc can be running on separate CPUs
within the same child process while mod_python is also running. Thus
you haven't lost all forms of parallelism that may be possible. It is
only within the mod_python world that there will be some GIL
contention, and only with the 'worker' and 'winnt' MPMs, not
'prefork'. It isn't going to lock out non Python stuff from making use
of additional CPUs.

>      (FastCGI is a straightforward transaction processing engine.
> Each transaction program is launched in a separate process, and,
> once done with one transaction, can be used to do another one
> without reloading.  When things are slow, the extra transaction processes
> are told to exit; when load picks up, more of them are forked.
> Security is comparable to CGI.)

Apache will also kill off excess child processes when it deems they
are no longer required, or create new ones as demand dictates.

So, I am still not sure where the big issue is. The architecture of
Apache limits the impact of GIL contention in ways that Python alone
doesn't.

Graham