Why not easy Thread synchronization?

Alex Martelli aleax at aleax.it
Wed May 14 05:42:48 EDT 2003


Iwan van der Kleyn wrote:

> It's very easy writing simple threaded programs in Python. In fact, it's
> part of my "evangelize demos" :-)
> However, thread synchroni(s/z)ation seems to be a bit more fussy and
> complex than in, for example, Java.
> 
> Why not a "synchronization" reserved word like in Java which takes care
> of "scary" bits under the hood?
> 
> For example:
> 
> sync def log(s):    # in which the hypothetical keyword 'sync'
>                     # makes the function 'log' thread-safe
>     sys.stout.write(s)
>     sys.stout.flush()
> 
> Seems to be a bit more fitting to Python than mucking about with locks
> and mutexes.

The most productive pythonic approach is generally to use Queue instances
for all inter-thread communication.  A 'sync' keyword would make it harder
for people to see that.

In general, it's best if only one thread ever interacts with any given
'external resource', such as sys.stdout (which I assume is what you mean
by 'sys.stout' in your example).  Putting locks around the interactions
of many threads with that resource (whether explicitly, or implicitly by
the proposed keyword 'sync') is normally not quite as good and smooth
as designating a specific thread to be "the sole owner and handler of
sys.stdout" and having that thread peel requests from a Queue and act
on them in a serialized (and thus implicitly synchronized;-) fashion.
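
Concretely, such an "owner thread" might look something like the
following sketch (names like output_queue and stdout_owner are just
for illustration, not anything standard):

import sys
import threading
import Queue

output_queue = Queue.Queue()

def stdout_owner():
    # the ONLY code in the whole program that ever touches sys.stdout
    while True:
        s = output_queue.get()
        if s is None:              # sentinel: shut the owner thread down
            break
        sys.stdout.write(s)
        sys.stdout.flush()

threading.Thread(target=stdout_owner).start()

def log(s):
    # safe to call from any thread: it only enqueues the string and
    # never touches sys.stdout directly
    output_queue.put(s)

log('hello from any thread\n')
output_queue.put(None)             # eventually, to let the owner exit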

Consider what lock your proposed function log should use.  A single
global one shared by ALL functions declared to be 'sync'?  That's hardly
a recipe for sensible multi-threaded programming -- a "big huge global
lock" is a sad kludge one works to get AWAY from as soon as possible.

A specific lock connected to function log only?  Then log would easily
interfere with other functions that also deal with sys.stdout and may
happen to be called in other threads.  No, in this case what you'd want
would be a lock tied to sys.stdout itself -- but then what about a function
that needs, e.g., to guarantee the same output appears without interleaving
on both sys.stdout and sys.stderr?  Would that need to acquire TWO locks,
and how would we then avoid the risk of deadlocks?
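
For comparison, here is roughly what a per-function-lock 'sync' might
amount to if spelled as an ordinary wrapper (the name 'synchronized' is
hypothetical, not an existing Python construct), together with the very
problem just described:

import sys
import threading

def synchronized(func):
    lock = threading.Lock()        # one lock PER wrapped function
    def wrapper(*args, **kwds):
        lock.acquire()
        try:
            return func(*args, **kwds)
        finally:
            lock.release()
    return wrapper

def log(s):
    sys.stdout.write(s)
    sys.stdout.flush()
log = synchronized(log)

def warn(s):
    sys.stdout.write('WARNING: ' + s)
    sys.stdout.flush()
warn = synchronized(warn)

# log and warn hold DIFFERENT locks, so two threads calling them can
# still interleave their writes to sys.stdout -- the lock really wants
# to belong to the resource, not to the function.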

With the "dedicated thread owning external resource with Queue for
communication with other threads" you simply need to make the same
thread responsible for sys.stdout AND sys.stderr if the two ever need
to be serialized together -- it's as simple as that.  Incidentally,
as 'requests' to put on the Queue it may be perfectly feasible to
use tuples of (callable, argstuple) or (callables, argstuple, kwdict),
so the service threads don't need much code -- they just peel those
tuples off their request Queue and do the needed calling -- the only
design issues left is whether a request needs a response [which the
requesting thread will eventually wait for] or not, and that in turn
can be handled by optionally placing another Queue instance in the
request tuple itself.
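
In code, that request format, and the optional response Queue, might
look something like this sketch (service_thread and request_queue are
again just illustrative names):

import sys
import threading
import Queue

request_queue = Queue.Queue()

def service_thread():
    while True:
        request = request_queue.get()
        if request is None:                # sentinel: stop serving
            break
        func, args, kwds, reply_queue = request
        result = func(*args, **kwds)
        if reply_queue is not None:        # the caller wants a response
            reply_queue.put(result)

threading.Thread(target=service_thread).start()

# a fire-and-forget request, with no response expected:
request_queue.put((sys.stdout.write, ('hello\n',), {}, None))

# a request whose response the requesting thread waits for:
reply = Queue.Queue()
request_queue.put((pow, (2, 10), {}, reply))
print reply.get()                          # prints 1024

request_queue.put(None)                    # shut the service thread down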

I've tried expressing this concisely but convincingly in the
"Threaded Program Architecture" section of the Nutshell chapter
on Threads and Processes -- but the scarcity of available space
may not have allowed me to do justice to the subject.  I believe
you can find very similar and probably more detailed suggestions
in the works by Aahz (to whom, as well as to Tim Peters, I owe a
substantial debt of gratitude for reviewing the Threads chapter
and providing substantial input about it!).


Alex




