Threading

Wed Nov 14 20:42:50 EST 2001

Thomas Jensen <thomasNO at SPAM.obscure.dk> writes:

> That's very interesting, I'm usually quite paranoid about wrapping such 
> stuff in locks.

When in doubt, that's the right approach.  You might be able to eke a
little more performance out of hand tuning some "safe" scenarios, but
I don't think the potential downsides are worth it in general.

> Is there any "rule-of-thumb" (or doc) regarding which operations are 
> atomic, for example is the following safe:

An operation is atomic with respect to Python interpreter threads if
it executes entirely while holding the GIL (global interpreter lock).
Generally speaking, this means it must (a) be written in C, either as
part of the interpreter or an extension module, and (b) not release
the GIL to let other threads run during a long-running or blocking
activity, such as I/O.

This typically includes most builtin operations on native types that
are implemented in C.  However, I don't know of any general rule to
recognize this without actually looking at a specific implementation,
and suspect that any approach tried would be very implementation
specific.

Plus, even if you examined a particular type to determine how it
behaved, if your code ever got used with a replacement class
implemented in Python itself, the rules would change again.

Generally, it's just safer to lock more conservatively.
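
To make the conservative approach concrete, here is a minimal sketch
using the standard threading module.  The names counter, counter_lock
and add_many are hypothetical, chosen only for illustration.  The
read-modify-write on the counter is not something I'd rely on being
atomic by itself, so every thread takes the same lock around it:

```python
import threading

counter = 0                      # hypothetical shared state
counter_lock = threading.Lock()  # one lock guarding every access to it

def add_many(n):
    """Increment the shared counter n times, one locked step at a time."""
    global counter
    for _ in range(n):
        with counter_lock:       # held for the whole read-modify-write
            counter += 1

# Four threads all updating the same counter.
threads = [threading.Thread(target=add_many, args=(10000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)
```

With the lock in place the final count no longer depends on how the
threads happen to be scheduled.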

> thread 1:
> mylist.reverse()
> 
> thread 2:
> for item in mylist:
>  ...
> 
> (silly example perhaps)

This doesn't look safe if the goal of thread 2 is to iterate through
every item in the list consistently.  If mylist is a
builtin list type object (and at least in the CPython implementation)
then you are guaranteed that once it begins, the reverse() method call
in thread 1 will run to completion before any other Python thread
continues execution.  What you aren't guaranteed is how that reverse
call's timing will occur with respect to thread 2.

So if thread 1 managed to transfer control to the reverse() method
prior to thread 2 hitting that for loop, you'd get a consistent
(reverse) list.

But if thread 1 didn't make it into the call until thread 2 was
partially through the loop, then the list would be reversed, and it
would be equivalent to having Python code that mutated the list within
the for loop, in which case behavior is non-deterministic (at least in
general, since it depends on the type of mutation).

Since the mutation in this case keeps the list the same length and
just reverses the contents in place, I expect you'd have entries
processed in the for loop up until the thread context switch, and then
you'd get the entries in the remaining positions, but taken from the
now-reversed list.
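
That interleaving can be imitated in a single thread, which makes it
repeatable.  This is only a sketch: the call to reverse() inside the
loop stands in for thread 1 being scheduled after thread 2 has handled
two items, and the switch point (position 1) is an arbitrary choice.

```python
mylist = [1, 2, 3, 4, 5]
seen = []

for position, item in enumerate(mylist):
    seen.append(item)
    if position == 1:
        # Stand-in for thread 1 getting to run at this point.
        mylist.reverse()

print(seen)  # [1, 2, 3, 2, 1]
```

The loop still runs five times, but 1 and 2 are processed twice and 4
and 5 are never seen, because the list iterator only remembers its
position and not the contents it started with.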

I'd definitely protect something like this - you've clearly got higher
level processing (the iteration over the list) that needs the list to
remain intact during the work.  Either that, or protect the act of
making a copy of the list, and then iterate over that copy.  The
copying is one of those potential tuning points - I'd have to double
check, but I bet that an operation like "newlist = mylist[:]" would be
atomic, so you could probably get away with not protecting that.  At
least until mylist happened to end up a user class with its own
__getslice__.
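
A minimal sketch of the copy-then-iterate option, written the
conservative way (the copy itself is taken under the lock rather than
relying on the slice being atomic).  The names mylist_lock, reverser,
snapshot and consumer are hypothetical, and the work done per item is
just a placeholder:

```python
import threading

mylist = [1, 2, 3, 4, 5]
mylist_lock = threading.Lock()   # guards every access to mylist

def reverser():
    """Thread 1: mutate the shared list while holding the lock."""
    with mylist_lock:
        mylist.reverse()

def snapshot():
    """Return a private copy of the list, taken while holding the lock."""
    with mylist_lock:
        return mylist[:]

def consumer(results):
    """Thread 2: iterate over a copy that thread 1 can no longer touch."""
    for item in snapshot():
        results.append(item)     # placeholder for the real per-item work

results = []
t1 = threading.Thread(target=reverser)
t2 = threading.Thread(target=consumer, args=(results,))
t1.start()
t2.start()
t1.join()
t2.join()

print(results)
```

Depending on which thread gets the lock first, thread 2 sees either
the original order or the fully reversed one, but never a mixture of
the two.  The lock is only held for the copy, not for the whole loop,
which matters if the per-item work is slow.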

--
-- David
-- 
/-----------------------------------------------------------------------\
 \               David Bolen            \   E-mail: db3l at fitlinxx.com  /
  |             FitLinxx, Inc.            \  Phone: (203) 708-5192    |
 /  860 Canal Street, Stamford, CT  06902   \  Fax: (203) 316-5150     \
\-----------------------------------------------------------------------/