[Python-Dev] Python Interpreter Thread Safety?

"Martin v. Löwis" martin at v.loewis.de
Sat Jan 29 01:44:09 CET 2005


Evan Jones wrote:
> Sorry, I really meant *parallel* execution of Python code: Multiple 
> threads simultaneously executing a Python program, potentially on 
> different CPUs.

There cannot be parallel threads on a single CPU - for threads to
be truly parallel, you *must* have two CPUs, at a minimum.

Python threads can run truly in parallel, as long as one of them
invokes Py_BEGIN_ALLOW_THREADS (i.e. releases the interpreter lock).
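
As a rough illustration at the Python level (a sketch, not a benchmark):
time.sleep() is one of the calls that releases the interpreter lock
while it blocks, so several sleeping threads overlap instead of running
one after another. The names blocking_call and run_overlapped are made
up for this example, and the thread count and sleep time are arbitrary.

```python
import threading
import time


def blocking_call(seconds):
    # time.sleep() releases the interpreter lock while it waits,
    # so other threads are free to run in the meantime.
    time.sleep(seconds)


def run_overlapped(n_threads=4, seconds=0.25):
    """Start n_threads sleepers and return the total wall-clock time."""
    threads = [
        threading.Thread(target=blocking_call, args=(seconds,))
        for _ in range(n_threads)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


elapsed = run_overlapped()
# If the sleeps were serialized this would take about 1 second;
# because the lock is released, it should be close to 0.25 seconds.
print("elapsed: %.2fs" % elapsed)
```

Threads that execute only Python bytecode never release the lock in this
way, so they would not overlap like this.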

> What I was trying to ask with my last email was what are the trouble 
> areas? There are probably many that I am unaware of, due to my 
> unfamiliarity with the Python internals.

I think nobody really remembers - ask Google for "Python free
threading". Greg Stein did the patch, and the main problem was
apparently that performance became unacceptable, primarily because
of dictionary locking.

> Right, but as said in a previous post, I'm not convinced that the 
> current implementation is completely correct anyway.

Why do you think so? (I see in your previous post that you claim
it is not completely correct, but I don't see any proof).

> Wouldn't it be up to the programmer to ensure that accesses to shared 
> objects, like containers, are serialized? 

In a truly parallel Python, two arbitrary threads could access the
same container, and it would still work. If some containers cannot
be used simultaneously in multiple threads, this would be asking for
disaster.

> For example, with Java's 
> collections, there are both synchronized and unsynchronized versions.

I don't think this approach can apply to Python. Python users are
used to completely thread-safe containers, and lots of programs
would break if containers suddenly started raising exceptions.
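
To make concrete what existing programs rely on, here is a minimal
sketch in which several threads append to one plain list without any
explicit locking. It assumes CPython, where a single list.append() call
is not interrupted by another thread; the names worker and fill_shared
and the counts are invented for the example.

```python
import threading


def worker(shared, tag, count):
    # No lock is taken here: each append() is relied upon to be safe.
    for i in range(count):
        shared.append((tag, i))


def fill_shared(n_threads=4, count=10000):
    """Append from n_threads threads into one unsynchronized list."""
    shared = []
    threads = [
        threading.Thread(target=worker, args=(shared, tag, count))
        for tag in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared


items = fill_shared()
# Every append from every thread is present exactly once.
print(len(items), len(set(items)))
```

Note that this only covers individual container operations. A compound
step such as "check, then append" still needs a lock of its own.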

Furthermore, the question is what kind of failure you'd expect
if an unsynchronized dictionary is used from multiple threads.
Apparently, Java guarantees something (e.g. that the interpreter
won't crash) but even this guarantee would be difficult to
make.

For example, for lists, the C API allows direct access to the pointers
in the list. If the elements of the list could change in-between, an
object in the list might go away after you got the pointer, but before
you had a chance to INCREF it. This would cause a crash shortly
afterwards. Even if that were changed to always return a new reference,
lots of code would break, as it would create large memory leaks
(code would then need to decref the list items, but currently
doesn't - nor is it currently necessary).

Regards,
Martin

