[Python-Dev] Making python C-API thread safe (try 2)

Shane Hathaway shane at zope.com
Thu Sep 11 14:50:02 EDT 2003


[Moved to python-list at python.org, where this thread belongs]

Harri Pesonen wrote:
> But my basic message is this: Python needs to be made thread safe.
> Making the individual interpreters thread safe is trivial, benefits
> many people, and is a necessary first step; making threads within an
> interpreter thread safe is possible as well, at least if you leave
> something to the developer, as you should, and as you do in every
> other programming language.

Lately, I've been considering an alternative to this line of thinking. 
I've been wondering whether threads are truly the right direction to 
pursue.  This would be heresy in the Java world, but maybe Pythonistas 
are more open to this thought.

The concept of a thread combines two ideas: multiple concurrent 
processes and shared memory.  Supporting multiple simultaneous processes is 
relatively simple and has proven value.  Shared memory, on the other 
hand, results in a great number of complications.  Some of the 
complications have remained difficult problems for a long time: 
preventing deadlocks, knowing exactly what needs to be locked, finding 
race conditions, etc.  I don't believe we should force the burden of 
thread safety on every software engineer.  Engineers have better things 
to do.

At the same time, shared memory is quite valuable when you're ready to 
take on the burden of thread safety.  Therefore, I'm looking for a good 
way to split a process into multiple processes and share only certain 
parts of a program with other processes.  I'd like some form of 
*explicit* sharing with a Pythonic API.

Imagine the following Python module:


import pseudothreads   # a hypothetical module

# Copy the list into shared memory; everything else stays private.
data = pseudothreads.shared([])

def data_collection_thread():
    s = get_some_data()   # get_some_data() is a placeholder
    data.append(s)

for n in range(4):
    pseudothreads.start_new_thread(data_collection_thread)


In this made-up example, nothing is shared between threads except for 
the "data" global.  The shared() function copies the list to shared 
memory and returns a wrapper around the list that prevents access by 
multiple threads simultaneously.  start_new_thread() is a thin wrapper 
around os.fork().  Each pseudothread has its own global interpreter lock.
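For concreteness, here is a minimal sketch of how those semantics could be 
approximated with the standard library's multiprocessing machinery on a 
POSIX (fork-capable) platform.  The pseudothreads names are the ones 
imagined above; backing shared() with a Manager proxy and 
start_new_thread() with a Process is just one possible stand-in, not a 
proposed implementation:

```python
import multiprocessing

# A manager process owns the shared objects; each proxy operation is
# marshalled to it one at a time, which serializes access.
_manager = multiprocessing.Manager()

def shared(obj):
    # Copy a list into memory owned by the manager process and return a
    # synchronized proxy -- a stand-in for pseudothreads.shared().
    return _manager.list(obj)

def start_new_thread(func, *args):
    # A thin wrapper around child-process creation (a fork() on POSIX),
    # standing in for pseudothreads.start_new_thread().
    p = multiprocessing.Process(target=func, args=args)
    p.start()
    return p

def data_collection_thread(data):
    data.append("some data")   # stand-in for get_some_data()

if __name__ == "__main__":
    data = shared([])
    workers = [start_new_thread(data_collection_thread, data)
               for n in range(4)]
    for w in workers:
        w.join()
    print(list(data))   # four copies of "some data"
```

Note that the shared list here is passed to each worker explicitly, which 
matches the spirit of the proposal: nothing is shared unless the program 
says so.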

I wonder whether others would consider such a thing valuable, or even 
feasible. :-)

Shane





