Thread scheduling

Peter Hansen peter at engcorp.com
Sat Feb 26 21:11:28 EST 2005


Jack Orenstein wrote:
> Peter Hansen wrote:
>  > You've got two shared global variables, "done" and "counter".
>  > Each of these is modified in a manner that is not thread-safe.
>  > I don't know if "counter" is causing trouble, but it seems
>  > likely that "done" is.
> 
> I understand that. 

>  > Basically, the statement "done += 1" is equivalent to the
>  > statement "done = done + 1" which, in Python or most other
>  > languages, is not thread-safe.
> 
> Understood. I was counting on this being unlikely for my test
> case. I realize this isn't something to rely on in real software.

Hmm... okay.  I may have been distracted by the fact that your
termination condition is based on "done" incrementing properly,
and that it was possible this wouldn't happen because of the race
condition.  So, if I understand you now, you're saying that the
reason "done" doesn't increment is actually because one of the
threads is never finishing properly, for some reason not related
to the code itself. (?)
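
For reference, a minimal sketch of how the "done" bookkeeping could be
made safe, so that this race can be ruled out as a cause. It is not the
code from the original test (which is not reproduced in this message);
the names done_lock, worker and run_workers are assumptions made for
illustration. It guards the increment with a threading.Lock and waits
for the threads with join() rather than polling the counter.

```python
import threading

done = 0                      # shared counter, as in the thread under discussion
done_lock = threading.Lock()  # assumed name; guards every update of "done"


def worker():
    """Stand-in for a real thread body; only the final bookkeeping is shown."""
    global done
    # "done += 1" is a read, an add and a write. Holding the lock makes
    # the three steps behave as one with respect to other threads.
    with done_lock:
        done += 1


def run_workers(n_threads=2):
    """Start n_threads workers, wait for all of them, return the final count."""
    global done
    done = 0
    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()   # blocks until that thread has really finished
    return done


if __name__ == "__main__":
    print(run_workers())
```

With the increment protected this way, a hang can no longer be blamed on
a lost update to "done", which leaves the scheduling question on its own.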

> The point of the
> test is not to maintain counter -- it's to show that sometimes even
> after one thread completes, the other thread never is scheduled
> again. This seems wrong. Try running the code, and let me know if you
> see this behavior.

On my machines (one Py2.4 on WinXP, one Py2.3.4 on RH9.0) I don't
see this behaviour, across about fifty runs on each.

> And the two increments of done (one by each thread) are still
> almost certainly not going to coincide and cause a problem. Also, if
> you look at the output from the code on a hang, you will see that
> 'thread X: leaving' only prints once. This has nothing to do with what
> happens with the done variable.

Okay, I believe you.  As I said, I hadn't taken the time to read
through everything at first, jumping on an "obvious" bug related
to the "done" variable not meeting your termination conditions.
I can see that something else is likely to be causing this.

One thing you might try is experimenting with sys.setcheckinterval(),
just to see what effect it might have, if any.
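
A rough sketch of that kind of experiment follows. One assumption needs
stating loudly: sys.setcheckinterval() (a count of bytecode
instructions) belongs to the Python 2 releases discussed here and was
removed in Python 3.9, so the sketch uses its Python 3.2+ counterpart,
sys.setswitchinterval() (a time in seconds). The function name
run_two_threads and the busy-loop worker are invented for illustration
and are not the original test.

```python
import sys
import threading


def run_two_threads(switch_interval, iterations=50000):
    """Run two busy threads under the given switch interval.

    Returns the sorted names of the threads that reached their end.
    """
    previous = sys.getswitchinterval()
    sys.setswitchinterval(switch_interval)   # seconds, Python 3.2+
    finished = []
    finished_lock = threading.Lock()

    def worker(name):
        total = 0
        for i in range(iterations):          # plain CPU-bound busy work
            total += i
        with finished_lock:
            finished.append(name)            # record "thread X: leaving"

    try:
        threads = [
            threading.Thread(target=worker, args=("thread %d" % n,))
            for n in range(2)
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
    finally:
        sys.setswitchinterval(previous)      # restore the old setting
    return finished and sorted(finished)


if __name__ == "__main__":
    for interval in (0.0001, 0.005, 0.05):
        print(interval, run_two_threads(interval))
```

If both thread names come back for every interval tried, the switch
interval is probably not what keeps the second thread from running.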

It's also possible there were some threading bugs in Py2.2 under
Linux.  Maybe you could repeat the test with a more recent
version and see if you get different behaviour.  (Not that that
proves anything conclusively, but at least it might be a good
solution for your immediate problem.)

-Peter


