Best processor (i386) for Python performance?

David Bolen db3l at fitlinxx.com
Thu Aug 26 13:58:15 EDT 2004


Dave Brueck <dave at pythonapocrypha.com> writes:

> Grant Edwards wrote:
(...)
> I/O is the most common reason, so adding another processor to an I/O
> bound program can give you a good performance boost (in our lab I've
> seen easily 75% improvement over a single proc box for a program that
> was very I/O bound, but I haven't measured it to see if it's closer to
> 75% or to 100% improvement).

I don't doubt the performance gains, but I'd argue that if you are
seeing that sort of improvement, then you clearly don't have an I/O
bound program at all, but a compute bound one.  By definition, an I/O
bound program's performance is gated by the I/O operations, and not
CPU usage, so adding more CPU shouldn't really change anything.  After
all, if your program is "very I/O bound" it means it is waiting on I/O
virtually all of the time (and thus not executing any Python code
using the CPU), so where would adding CPU time gain anything?
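
To put rough numbers on that argument, here is a minimal sketch, assuming an Amdahl-style model in which the time spent waiting on I/O does not shrink with more CPUs and the CPU share divides perfectly across them. The function names (speedup, max_io_fraction) are my own illustration, not something from the measurements quoted above.

```python
# Hypothetical helpers for a back-of-the-envelope estimate.
# Assumption: run time = I/O wait (fixed) + CPU work (splits evenly over CPUs).

def speedup(io_fraction, cpus):
    """Estimated speedup when only the CPU share of run time shrinks."""
    cpu_fraction = 1.0 - io_fraction
    return 1.0 / (io_fraction + cpu_fraction / cpus)


def max_io_fraction(observed_speedup, cpus):
    """Largest I/O share consistent with an observed speedup on `cpus` CPUs."""
    # Solve 1/s = f + (1 - f)/n for f, giving f = (n/s - 1) / (n - 1).
    return (cpus / observed_speedup - 1.0) / (cpus - 1.0)


# A program waiting on I/O 90% of the time gains little from a second CPU.
print(round(speedup(0.9, 2), 3))          # 1.053

# A 75% improvement on two CPUs leaves room for about 14% I/O wait at most.
print(round(max_io_fraction(1.75, 2), 3)) # 0.143
```

Under those assumptions, a 75% gain from a second processor implies the program was spending most of its time on the CPU, not waiting on I/O.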

I do think it can be tricky to determine just which case an application
falls into (and many oscillate between I/O and CPU bound modes), and
indeed a purely CPU bound Python application (if in Python code and
not a well-behaved extension module) isn't going to be helped at all
by extra CPUs.

But to see benefit from additional CPUs for a Python application, I
believe you're really looking for a multithreaded application that is
technically compute bound - certainly on an instant to instant basis if
not overall - but which performs a lot of (or at least regular) I/O
operations (or as you note, other extension calls which release the
GIL).  The good news is that I believe many applications do fall into
this category, even if from the outside they might be considered I/O
bound, if only because it doesn't take much Python code, executed to
process the I/O responses, to create a CPU bottleneck at a given
instant.
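
As a rough illustration of that kind of application, here is a minimal sketch in which each thread alternates between a blocking wait and a stretch of pure Python processing. It assumes time.sleep as a stand-in for real I/O (it releases the GIL while waiting), and the names handle_request and run are hypothetical. It only reports wall-clock time; how much extra CPUs help will depend on the interpreter and on the ratio of waiting to processing.

```python
import threading
import time


def handle_request(io_seconds, work_items):
    """One simulated request: wait on 'I/O', then process the response."""
    time.sleep(io_seconds)        # stand-in for blocking I/O; GIL is released
    total = 0
    for i in range(work_items):   # pure Python processing; GIL is held
        total += i * i
    return total


def run(n_threads, requests_per_thread, io_seconds=0.01, work_items=20000):
    """Run the simulated requests on n_threads threads; return results and elapsed time."""
    results = []
    lock = threading.Lock()

    def worker():
        for _ in range(requests_per_thread):
            value = handle_request(io_seconds, work_items)
            with lock:
                results.append(value)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, time.perf_counter() - start


if __name__ == "__main__":
    for n in (1, 2, 4):
        # Same total number of requests, spread over more threads.
        _, elapsed = run(n, 8 // n)
        print(n, "thread(s):", round(elapsed, 3), "seconds")
```

Raising work_items relative to io_seconds moves the program from the I/O bound end toward the compute bound end, which is one way to see where the processing step starts to dominate.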

(...)
> But then again very few of the projects I work on end up having CPU as
> the most scarce resource so the machines that do have multiple CPUs
> are that way because they are running oodles of other processes as
> well.

This is an excellent point, since even if the only advantage of the
extra CPUs were to free up more of a single CPU for a Python
application, you'd still see a net gain for that application when
running in its real world environment.

-- David


