[melbourne-pug] stackless python question

Chris Miles miles.chris at gmail.com
Wed Apr 1 02:06:50 CEST 2009


On 01/04/2009, at 10:27 AM, Tennessee Leeuwenburg wrote:

> For some reason which I genuinely don't understand, lots of people  
> hate this thing called the global interpreter lock, which basically
> means that only one thread can own that lock at any one time. This  
> is needed in order to guarantee "stuff" in Python's core, mostly  
> reference counts for garbage collection. However, when you've got  
> only one CPU, it's self-obvious that you can only do one thing at a  
> time. So it just doesn't matter that there is a global interpreter  
> lock. The argument against threading without the GIL is that you  
> need to spend more time maintaining your threads, such that you  
> don't get any advantages from it. So all your Python stuff runs in a  
> single operating-system thread, and your "python threads" get time- 
> share of Python's CPU share, rather than having multiple OS-level  
> Python threads running. But I still don't see what having multiple  
> OS-level threads buys you.

You're right in that the GIL is used to protect global interpreter
state, such as reference counts, which is otherwise not thread-safe.
As its name suggests, the GIL is a single, global lock, so any thread
that needs to update interpreter state (e.g. incrementing or
decrementing a reference count) must acquire the GIL before
continuing.
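
As an aside, the reference counts the GIL protects are visible from
Python itself via the stdlib sys module; a tiny illustration (the
variable names are just made up):

import sys

x = []
print(sys.getrefcount(x))   # typically 2: the name 'x' plus getrefcount's own argument
y = x
print(sys.getrefcount(x))   # one higher, now that 'y' also refers to the list

Every one of those increments and decrements happens under the GIL,
which is why two threads can't update them at the same time.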

Python makes it easy to use native OS threads, and splitting a
CPU-intensive application's calculations across multiple threads is
also easy. However, on a multi-core machine performance will not
automatically scale linearly, because two busy Python threads will
almost always be fighting over the GIL. Instead of a near 2x speedup
you typically gain little over 1x.
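
You can see this for yourself with a rough sketch like the one below
(not from the original discussion; the function name and loop count
are arbitrary). On CPython the threaded version usually takes about
as long as the serial one, because only one thread can execute
bytecode at a time:

import threading
import time

def busy_loop(n):
    # pure-Python CPU-bound work; the thread holds the GIL the whole time
    total = 0
    i = 0
    while i < n:
        total += i * i
        i += 1
    return total

N = 5000000

start = time.time()
busy_loop(N)
busy_loop(N)
print("serial:    %.2fs" % (time.time() - start))

start = time.time()
threads = [threading.Thread(target=busy_loop, args=(N,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("2 threads: %.2fs" % (time.time() - start))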

For this reason, many people would prefer to remove the GIL and
replace it with many finer-grained locks, so that contention between
busy threads is reduced. The problem is that single-threaded
performance may suffer from constantly having to acquire and release
so many more locks. Another problem is the added complexity in the
Python core.
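
Those would be locks inside the interpreter rather than anything you
touch from Python, but as a rough feel for what per-operation locking
costs, here is a toy comparison (names and counts made up) using a
plain threading.Lock:

import threading
import time

lock = threading.Lock()
N = 1000000

start = time.time()
counter = 0
for _ in range(N):
    counter += 1
print("no lock:   %.2fs" % (time.time() - start))

start = time.time()
counter = 0
for _ in range(N):
    lock.acquire()
    counter += 1
    lock.release()
print("with lock: %.2fs" % (time.time() - start))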

Guido has been against replacing the GIL with fine-grained locks and
instead recommends multi-process techniques. These are much simpler
because each process runs its own Python interpreter, so there is no
GIL contention. Sharing data between processes becomes the challenge
with this model, though.
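
The stdlib multiprocessing module (new in 2.6) makes this reasonably
painless for the embarrassingly parallel case; a minimal sketch
(function name and inputs are just illustrative):

from multiprocessing import Pool

def busy_loop(n):
    total = 0
    i = 0
    while i < n:
        total += i * i
        i += 1
    return total

if __name__ == '__main__':
    pool = Pool(processes=2)
    # each call runs in a separate process, each with its own GIL,
    # so the two loops really can use two cores
    results = pool.map(busy_loop, [5000000, 5000000])
    print(results)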

Another technique is to write the CPU-intensive code as a Python C
extension, such that the busy loop makes no calls into the Python
interpreter; it can then release the GIL and happily run flat out in
its own thread (hopefully on another CPU core) alongside any other
Python threads.
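
Several stdlib extension modules already do this; zlib, for example,
releases the GIL (as far as I know) while it compresses a large
buffer, so a rough sketch like the following should show the threaded
case overlapping on a multi-core box (the buffer size is arbitrary):

import threading
import time
import zlib

data = b'melbourne-pug ' * 2000000   # a few tens of MB of compressible data

start = time.time()
zlib.compress(data)
zlib.compress(data)
print("serial:    %.2fs" % (time.time() - start))

start = time.time()
threads = [threading.Thread(target=zlib.compress, args=(data,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("2 threads: %.2fs" % (time.time() - start))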

Finally, with regard to Stackless Python, the aim there is to provide
a framework for "better" development of applications that require
concurrency. "Better" in this sense is more about the concurrency
"interface" (microthreads, coroutines, etc. vs. threads) as well as
readability and maintainability.
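
For flavour, a tiny tasklet/channel sketch (this needs the Stackless
interpreter, and I'm writing it from memory, so treat the details as
approximate):

import stackless

def producer(channel):
    for i in range(3):
        channel.send(i)           # blocks until a receiver is ready

def consumer(channel):
    for _ in range(3):
        print(channel.receive())  # blocks until a sender is ready

ch = stackless.channel()
stackless.tasklet(producer)(ch)
stackless.tasklet(consumer)(ch)
stackless.run()                   # run the scheduler until all tasklets finish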

Stackless does not take advantage of multiple CPU cores, so the
performance gains for CPU-bound applications are limited. See
http://stackoverflow.com/questions/377254/stackless-python-and-multicores

I was interested to find recently that Stackless Python has been
ported to the Sony PSP: http://code.google.com/p/pspstacklesspython/
If only I had some spare time to have a play. If anyone can get a demo
or game up and running on a PSP to demonstrate at the pub meetup, I'll
happily buy them a beer :-)

Cheers,
Chris


