high performance/threaded applications in Python - your experiences?

Josiah Carlson josiah.carlson at sbcglobal.net
Sun Jun 24 17:23:02 EDT 2007


Ivan Voras wrote:
> Jay Loden wrote:
> 
>> I was hoping for some experiences that some of you on the list may have
>> had in dealing with Python in a high performance and/or threaded
>> environment. In essence, I'm wondering how big of a deal the GIL can be
>> in a real-world scenario where you need to take advantage of multiple
>> processor machines, thread pools, etc. How much does it get in the way
>> (or not), and how difficult have you found it to architect applications
>> for high performance? I have read a number of articles and opinions on
>> whether or not the GIL is a good thing, and how it affects threaded
>> performance on multiple processor machines, but what I haven't seen is
>> experiences of people who have actually done it and reported back "it
>> was a nightmare" or "it's no big deal" ;)
> 
> The theory: If your threads mostly do IO, you can get decent CPU usage
> even with Python. If the threads are CPU-bound (e.g. you do a lot of
> computational work), you'll effectively only make use of one processor.
> 
> In practice, I've noticed that Python applications don't scale very well
> across CPUs even if they're doing mostly IO. I blame cache thrashing or
> a similar effect caused by too many global synchronization events. I
> didn't measure, but the speedup may even be negative with a large-ish
> number of CPUs (>=4).
> 
> OTOH, if you can get by with using forking instead of threads (given
> enough effort) you can achieve very good scaling.

Also, see the 'processing' package in the Python Cheese Shop (PyPI).  It
allows you to use processes rather than threads with most of the same
abstractions.  I hear it recently acquired the ability to pass file
handles between processes on the same machine :)
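
To make the threads-versus-processes point concrete, here is a rough sketch.
It uses the standard library's multiprocessing module, which (as far as I
know) is the descendant of 'processing' that went into Python 2.6, rather
than the 'processing' package itself. The function names (count_primes,
run_in_threads, run_in_processes) are made up for illustration, and the
explicit "fork" start method assumes a Unix-like system. Timings will vary by
machine, so none are quoted here.

```python
import multiprocessing
import threading
import time


def count_primes(limit):
    """CPU-bound toy workload: count primes below `limit` by trial division."""
    return sum(
        1
        for n in range(2, limit)
        if all(n % d for d in range(2, int(n ** 0.5) + 1))
    )


def run_in_threads(limits):
    """One thread per task; on CPython the GIL serializes the bytecode."""
    results = [None] * len(limits)

    def worker(index, limit):
        results[index] = count_primes(limit)

    threads = [
        threading.Thread(target=worker, args=(i, limit))
        for i, limit in enumerate(limits)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def run_in_processes(limits):
    """One worker process per task; each has its own interpreter and GIL."""
    # Assumption: a Unix-like OS where the "fork" start method is available.
    ctx = multiprocessing.get_context("fork")
    with ctx.Pool(processes=len(limits)) as pool:
        return pool.map(count_primes, limits)


if __name__ == "__main__":
    limits = [20000] * 4
    for runner in (run_in_threads, run_in_processes):
        start = time.perf_counter()
        runner(limits)
        print(runner.__name__, "took", time.perf_counter() - start, "seconds")
```

Both helpers return the same results; only the wall-clock time should differ
on a multi-core box. Arguments and return values cross the process boundary
by pickling, so this pays off mainly when the per-task work is large compared
to the data being passed around.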

  - Josiah


