returning a value from a thread

David Bolen db3l at fitlinxx.com
Fri Jul 16 11:22:37 EDT 2004


Antoon Pardon <apardon at forel.vub.ac.be> writes:

> Op 2004-07-14, Jeff Shannon schreef <jeff at ccvcorp.com>:
> > This may work if the worker thread will perform a relatively short task 
> > and then die *before* you access the result.  But lists and dictionaries 
> > are not thread-safe -- if they are potentially accessed by multiple 
> > threads concurrently, then the behavior will be unpredictable.
> 
> I thought the GIL was supposed to take care of that.

It takes care of basic integrity of the interpreter, but depending on
your definition of "thread-safe" (which is a fairly ambiguous term)
you will likely need additional synchronization controls above and
beyond the GIL.  There have been a number of discussions relating to
this in the past.

> > (Think 
> > of a case where thread B starts to update a dictionary, inserts a key 
> > but is interrupted before it can attach a value to that key, and then 
> > while thread B is interrupted thread A looks at the dictionary and finds 
> > the key it's looking for, but with no valid reference as its value... 
> > [Disclaimer: I don't know the inner workings of dictionaries well enough 
> > to know if this exact situation is possible, but I do know that dicts 
> > are not threadsafe, so something *similar* is possible...])
> 
> My understanding was that the GIL is there to guarantee that python
> statements are atomic. Now if your statements here are correct that
> is not the case. So what is the GIL supposed to do?

The GIL does not make Python statements atomic: a single statement can
compile to several bytecodes, and a thread switch can occur between
any two of them.  That said, I do believe the GIL will protect against
the specific example cited, at least for the built-in dict type (no
guarantees for Python-level subclasses).  That is, the act of
inserting a key/value pair into a built-in dictionary is atomic with
respect to the Python bytecode interpreter, since it occurs within
the C core under control of the GIL.

To the extent that you only care about the physical integrity of a
dictionary (e.g., the sort of internal state mismatch discussed
above), a dictionary can be considered thread-safe.  There's no way
(from Python code) to create a key in a built-in dictionary without
some sort of value, nor for that operation to be interrupted (at the
Python bytecode level) once begun, at least for keys such as strings
and integers whose hashing and comparison are themselves done in C.
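
A minimal sketch of that guarantee, assuming CPython and written in
modern Python 3 syntax (the names `shared` and `writer` are made up
for illustration): several threads store into one built-in dict with
no lock at all, and every key ends up present with a complete value.

```python
import threading

shared = {}  # plain built-in dict, deliberately left unprotected

def writer(thread_id, count):
    # Each store is a single dict operation carried out in the C core,
    # so a key never becomes visible without its value.
    for i in range(count):
        shared[(thread_id, i)] = i * 2

threads = [threading.Thread(target=writer, args=(t, 1000)) for t in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Check that every key written is present and maps to the value stored.
complete = all(shared[(t, i)] == i * 2 for t in range(4) for i in range(1000))
```

This only demonstrates the integrity of individual stores; it says
nothing about sequences of operations that must happen together.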

Likewise, a list is thread-safe to the point that there is no way to
create an "inconsistent" list from Python code - it may or may not
have the precise values you expect depending on sequence of execution,
but it'll have or not have the values, nothing in between.

The risk is in thinking that the above makes any general use of a
mutable container or other state object within a multi-threaded
application thread-safe.  Beyond that point you need to consider
thread-safety at the appropriate level of granularity, which is
typically not a single object or container but the interaction of
multiple state elements within the thread object that need to remain
consistent, or even multiple elements of one container that need to
be kept in sync.
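
To make the granularity point concrete, here is a rough sketch (the
`Inventory` class and its attributes are hypothetical, and the syntax
is modern Python 3) in which two pieces of state must agree, so both
the update and the read are wrapped in one `threading.Lock`:

```python
import threading

class Inventory:
    """Hypothetical state holder: `total` must always equal sum(`items`)."""

    def __init__(self):
        self._lock = threading.Lock()
        self.items = []
        self.total = 0

    def add(self, qty):
        # The append and the total update are separate steps (and `+=` is
        # itself a read-modify-write); the lock makes them one update.
        with self._lock:
            self.items.append(qty)
            self.total += qty

    def snapshot(self):
        # Readers take the same lock to get a consistent view of both.
        with self._lock:
            return list(self.items), self.total

inv = Inventory()

def worker():
    for qty in range(1000):
        inv.add(qty)

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

items, total = inv.snapshot()
```

The list and the integer are each internally sound without the lock;
what the lock protects is the relationship between them.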

In other cases, thread objects may be using a mutable container object
(perhaps a Python class that works just like a dictionary) that itself
has imposed additional state information above and beyond the built-in
object it emulates or subclasses.  In such cases the above guarantee
no longer holds since there is Python code handling state that can be
interrupted and result in an inconsistency.
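
As an illustration of that case, the sketch below uses a made-up dict
subclass, `IndexedDict`, that keeps a reverse index alongside the
mapping.  It is written in modern Python 3 syntax and ignores
overwrites and deletion for brevity.  Because `__setitem__` is now
Python code, it needs its own lock to keep the two structures in step.

```python
import threading

class IndexedDict(dict):
    """Hypothetical dict subclass that also keeps a value -> key index."""

    def __init__(self):
        super().__init__()
        self._lock = threading.Lock()
        self.by_value = {}

    def __setitem__(self, key, value):
        # Two separate stores written in Python: without the lock, another
        # thread could run between them and find a key with no index entry.
        with self._lock:
            super().__setitem__(key, value)
            self.by_value[value] = key

    def key_for(self, value):
        # Lookups through the index take the same lock.
        with self._lock:
            return self.by_value.get(value)

d = IndexedDict()

def fill(start):
    for i in range(start, start + 500):
        d[i] = "v%d" % i

threads = [threading.Thread(target=fill, args=(s,)) for s in (0, 500, 1000)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Every stored value should map back to the key it was stored under.
consistent = all(d.key_for(v) == k for k, v in d.items())
```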

The way I tend to think of it is that Python's job (from the
perspective of supporting a multi-threaded application) is to ensure
that its native data types remain internally consistent, in terms of
providing their proper functionality, in the presence of multiple
threads, but nothing more.  Anything above that level is the
application's responsibility, and in almost all cases means you need
your own synchronization control to manage any state information.
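
For the original question of returning a value from a thread, one
ready-made form of such synchronization is the standard library's
`queue.Queue` (named `Queue` in Python 2).  The sketch below is only
one possible approach, in modern Python 3 syntax; the `worker`
function and the squaring task are invented for the example.

```python
import queue
import threading

def worker(n, results):
    # Hypothetical task: compute a value and hand it back via the queue.
    results.put((n, n * n))

results = queue.Queue()  # does its own locking internally
threads = [threading.Thread(target=worker, args=(n, results)) for n in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# All workers have finished, so the queue holds exactly five results.
squares = dict(results.get() for _ in range(5))
```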

Lastly, this discussion has, I believe, been CPython-specific because
of the GIL.  For my part, I have always assumed the prior point (basic
internal consistency of built-in objects) holds in any Python
implementation, including Jython, but I can't recall seeing that
stated anywhere, so there may be some additional risk in Jython or
other implementations.

-- David


