[Python-bugs-list] [ python-Bugs-433625 ] bug in PyThread_release_lock()

noreply@sourceforge.net noreply@sourceforge.net
Tue, 19 Jun 2001 15:06:04 -0700


Bugs item #433625, was updated on 2001-06-15 19:17
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=105470&aid=433625&group_id=5470

Category: Threads
Group: Platform-specific
Status: Open
Resolution: Invalid
Priority: 5
Submitted By: Shih-Hao Liu (shihao)
Assigned to: Tim Peters (tim_one)
Summary: bug in PyThread_release_lock()

Initial Comment:
Mutex should be hold when calling
pthread_cond_signal().  This function should look like:


PyThread_release_lock(PyThread_type_lock lock)
{
        pthread_lock *thelock = (pthread_lock *)lock;
        int status, error = 0;

        dprintf(("PyThread_release_lock(%p) called\n",
lock));

        status = pthread_mutex_lock( &thelock->mut );
        CHECK_STATUS("pthread_mutex_lock[3]");

        thelock->locked = 0;
        /* ***** call pthread_cond_signal before unlock
mutex */
        status = pthread_cond_signal(
&thelock->lock_released );
        CHECK_STATUS("pthread_cond_signal");

        status = pthread_mutex_unlock( &thelock->mut );
        CHECK_STATUS("pthread_mutex_unlock[3]");

        /* wake up someone (anyone, if any) waiting on
the lock */
}


----------------------------------------------------------------------

>Comment By: Tim Peters (tim_one)
Date: 2001-06-19 15:06

Message:
Logged In: YES 
user_id=31435

> The problem will be arised if there is a thread 3
> called PyThread_acquire_lock 

You never mentioned thread 3 again, so I don't know what it 
has to do with this.

> after thread 1 set thelock->locked to 0 and before
> thread 2 calling pthread_mutex_lock. 

I understand both of those.  Are you assuming that, e.g., 
thread 3's PyThread_acquire_lock completes in whole during 
this gap?  I don't know what else you could mean, so let's 
assume that.

> "while (thelock->locked)" will success for thread 2

Sure.

> and it will call pthread_cond_wait

Yup.

> and might collide with thread 1's pthread_cond_signal.

What does "collide" mean to you?  All the pthread_cond_xxx 
functions must be implemented as if atomic, so there's no 
meaningful sense (to me) in which they can collide -- 
unless they're implemented incorrectly.

Assuming they are implemented correctly, it again doesn't 
matter that thread 2 misses thread 1's signal, because 
thread *3* exploited the information thread 1 was going to 
signal, by acquiring the lock.  It's actually good that 
thread 2 isn't bothered with it:  there's no real info in 
the signal anymore (at best, if thread 2 got it, it would 
wake up and go "oops! it's still locked; I'll wait again").

All that matters now is whether thread 2 gets a chance to 
see thread *3*'s signal, at the time thread 3 releases the 
lock.  And it will, because thread 3 can't release the lock 
without acquiring the mutex first, and thread 2 holds the 
mutex at all times except when in its pthread_cond_wait 
call (so thread 3 can't release the lock except when thread 
2 is in pthread_cond_wait).

Note that I'm not at all concerned about "fairness" here, 
only about races.


----------------------------------------------------------------------

Comment By: Shih-Hao Liu (shihao)
Date: 2001-06-19 14:41

Message:
Logged In: YES 
user_id=246388

Oops.
The problem will be arised if there is a thread 3 called
PyThread_acquire_lock after thread 1 set thelock->locked to
0 and before thread 2 calling pthread_mutex_lock. "while
(thelock->locked)" will success for thread 2 and it will
call pthread_cond_wait and might collide with thread 1's
pthread_cond_signal.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2001-06-19 13:52

Message:
Logged In: YES 
user_id=31435

It appears you're concerned that the signal will "get lost" 
in this scenario.  I agree that it may, but it doesn't 
matter:  thread 2's "while (thelock->locked)" test fails 
because thread 1 already set thelock->locked to 0, so 
thread 2 doesn't execute its pthread_cond_wait, so it 
doesn't matter that nobody is *waiting* to see thread 1's 
pthread_cond_signal.  IOW, the condition thread 1 will try 
to signal has *already* been detected by thread 2, and the 
signal is useless (because redundant) information.  Indeed, 
it's a beauty of the condition protocol that signals can be 
sloppy.

The linuxthread man page is worded strangely here, but note 
that it does not say the mutex *must* be locked.  They 
can't, either, because POSIX doesn't require it; while 
POSIX stds aren't available freely online, some derived 
specs are, and are much clearer about this; e.g., see the 
Single Unix Specification, here:

<http://www.opengroup.org/onlinepubs/7908799/xsh/pthread_con
d_signal.html>

I'll tell you why I don't *want* to change this:  in 
Python's use of the global interpreter lock, it's almost 
always the case that someone is waiting on the lock.  By 
releasing the mutex before signaling, this gives a waiter a 
chance to run immediately upon calling 
pthread_cond_signal.  Else, because pthread_cond_wait 
(which the waiters are executing) has to lock the mutex, if 
the signaler is holding the mutex during the signal, 
pthread_cond_signal can't finish the job -- it was to go 
back to the signaler and let it unlock the mutex first 
before a waiter can proceed.  This makes the region of 
exclusion longer than it needs to be.

So if there's not an actual race problem here (& I still 
don't see one), I don't want to change this.

----------------------------------------------------------------------

Comment By: Shih-Hao Liu (shihao)
Date: 2001-06-17 23:01

Message:
Logged In: YES 
user_id=246388

I closed it because I thought the thelock->locked variable 
will ensure that the PyThread_release_lock will help to 
protect the condition variable and I was wrong.  The 
linuxthread man page on pthread_cond_signal:

A  condition  variable  must  always  be associated with a
mutex, to avoid the race condition where a thread prepares
to wait on a condition variable and another thread signals
the condition just before the first thread actually  waits
on it.

which means you can't call pthread_cond_signal & 
pthread_cond_wait on the same condition variable at the 
same time.  And using a mutex to protect them is a good 
idea.  Here is how thing might go wrong with current 
implementation:

    thread 1                            thread 2        
                                                        
                            |int PyThread_acquire_lock _
                            |/** assume lock was acquired
                            |  by thread 1, hence locked=0 
                            |  & success would be 0  **/
                            |{                          
                            |  ...                      
                            |  status = pthread_mutex_lo
                            |  CHECK_STATUS("pthread_mut
                            |  success = thelock->locked
                            |  if (success) thelock->loc
                            |  status = pthread_mutex_un
                            |  /** thread 2 suspended **/
void PyThread_release_lock _|                           
{                           |                           
  ...                       |                           
  status = pthread_mutex_loc|                           
  CHECK_STATUS("pthread_mute|                           
                            |                           
  thelock->locked = 0;      |                           
                            |                           
  status = pthread_mutex_unl|                           
/** thread 1 suspend **/    |                           
                            |  CHECK_STATUS("pthread_mut
                            |                           
                            |  if ( !success && waitflag
                            |    /* continue trying unti
                            |                           
                            |    /* mut must be locked b
                            |     * protocol */         
                            |    status = pthread_mutex_
                            |    CHECK_STATUS("pthread_m
                            |    while ( thelock->locked
                            |      status = pthread_cond
                            |/** thread 2 suspended while
                            |    updating shared data **
  CHECK_STATUS("pthread_mute|                           
                            |                           
  /* wake up someone (anyone|                           
  status = pthread_cond_sign|                           
/** thread 1 update shared  |                           
  data and corrupt it. **/  |                           

Not sure what the effect would be.  It's wouldn't be nice 
anyway.





----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2001-06-17 16:02

Message:
Logged In: YES 
user_id=31435

Closing this again, as it appears the original submitter 
deleted it.  shihao, if you want to pursure this, open it 
again.


----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2001-06-17 15:01

Message:
Logged In: YES 
user_id=31435

Ack, did I delete this?!  I sure didn't intend to -- didn't 
even intend to close it.  Reopened pending more info.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-06-17 14:51

Message:
Logged In: YES 
user_id=6380

Set status to closed -- no need to delete it.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2001-06-17 14:16

Message:
Logged In: YES 
user_id=31435

Why?  It's allowed to signal the condition whether or not 
the mutex is held.  Since changing this can have visible 
effects on thread scheduling, I'm reluctant to change it 
without a good reason.


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=105470&aid=433625&group_id=5470