Lock acquisition by the same thread - deadlock protection

Wed Mar 11 17:37:36 EDT 2020

> On 11 Mar 2020, at 14:24, Yonatan <yon.goldschmidt at gmail.com> wrote:
> 
> That code I'm talking about didn't require a reentrant lock - the
> algorithm really wasn't reentrant.
> 
> Let me clarify my point: I'm wondering why the non-reentrant lock
> doesn't raise an exception immediately in this
> erroneous situation.
> I thought it could be altered, or at least we could add an option to
> let a `threading.Lock` behave like a pthread
> mutex in mode `PTHREAD_MUTEX_ERRORCHECK`: Disallow double locking by
> same thread, disallow unlocking
> by another thread.
> However, after searching a bit more, I found a reference mentioning
> current behavior in a docstring defined
> in `_threadmodule.c`: "A lock is not owned by the thread that locked
> it; another thread may unlock it.".
> It was added in 75e9fc31d3a18068, a commit from 1998...
> 
> Since it's a well-documented behavior I guess it's here to stay. At
> least the "unlock by another thread" part.
> But I question the double locking.

Reading the CPython code, I find that the mutex type is NORMAL, and
man pthread_mutex_lock documents that such a mutex will deadlock if the same
thread locks it a second time. That seems to be what you are seeing.
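
A minimal sketch of that behaviour at the Python level (not taken from the
CPython sources). It uses a timeout on the second acquire so the demo returns
instead of hanging; a plain `lock.acquire()` in its place would block forever.
The `threading.RLock` half is included only for contrast.

```python
import threading

lock = threading.Lock()
lock.acquire()

# A second blocking acquire() from this same thread would never return,
# so use a timeout to observe the behaviour safely.
got_it = lock.acquire(timeout=0.1)
print(got_it)  # False: the lock is still held, by this very thread

lock.release()

# RLock tracks its owning thread and permits nested acquisition.
rlock = threading.RLock()
rlock.acquire()
nested = rlock.acquire(timeout=0.1)
print(nested)  # True

# Each acquire needs a matching release.
rlock.release()
rlock.release()
```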

The code I found was in ./Python/thread_pthread.h

Sounds to me like the library has a bug in its locking. Getting an exception
would help track down the bug faster, but it still sounds like a bug.

It used to be the case that NORMAL locks performed better than
ERRORCHECK or RECURSIVE locks. No idea if this is still true or whether it matters
for CPython. Maybe changing from NORMAL to ERRORCHECK would be a benefit.
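
In the meantime, ERRORCHECK-like semantics can be approximated in pure Python.
The following is a rough sketch under stated assumptions, not an existing API:
`ErrorCheckLock` is a hypothetical wrapper around `threading.Lock` that records
the owning thread and raises `RuntimeError` on a double lock by the same thread
or an unlock by a different thread. Being pure Python, it leaves a small window
between taking the inner lock and recording the owner in which a finalizer
could still re-enter undetected.

```python
import threading


class ErrorCheckLock:
    """Hypothetical wrapper; not part of the threading module."""

    def __init__(self):
        self._lock = threading.Lock()
        self._owner = None  # ident of the thread holding the lock, or None

    def acquire(self, blocking=True, timeout=-1):
        # Only the current thread can have stored its own ident here, so this
        # unsynchronized read cannot report a false "already held".
        if self._owner == threading.get_ident():
            raise RuntimeError("lock already held by this thread")
        acquired = self._lock.acquire(blocking, timeout)
        if acquired:
            # Not atomic with the acquire above: re-entry in this gap
            # would still deadlock.
            self._owner = threading.get_ident()
        return acquired

    def release(self):
        if self._owner != threading.get_ident():
            raise RuntimeError("lock not held by this thread")
        self._owner = None
        self._lock.release()

    __enter__ = acquire

    def __exit__(self, *exc_info):
        self.release()


# Usage: the double lock now fails loudly instead of hanging.
lock = ErrorCheckLock()
with lock:
    try:
        lock.acquire()
    except RuntimeError as exc:
        print(exc)  # lock already held by this thread
```

Unlike a plain `threading.Lock`, this wrapper gives up the documented ability
for another thread to release the lock, so it is not a drop-in replacement.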

Barry

> 
> 
> On Tue, Mar 10, 2020 at 5:07 PM Barry Scott <barry at barrys-emacs.org> wrote:
>> 
>> 
>> 
>>> On 9 Mar 2020, at 22:53, Yonatan Goldschmidt <yon.goldschmidt at gmail.com> wrote:
>>> 
>>> I recently debugged a program hang, eventually finding out it's a deadlock of a single thread,
>>> resulting from my usage of 2 libraries. One of them - call it library A - is reentrant & runs code in
>>> GC finalizers, while the other - library B - is not reentrant at all.
>>> Library B held one of its `threading.Lock` locks, and during this period, GC was invoked, running
>>> finalizers of library A which call back into library B, now attempting to take the lock again,
>>> locking the thread forever.
>>> 
>>> Considering how relatively common this scenario might be (Python, by design, can preempt any user code
>>> to run some other user code, due to GC finalizers), I was surprised Python code is not protected
>>> from this simple type of deadlock. It would make sense that while `threading.RLock` allows for recursive
>>> locking, `threading.Lock` would prevent it - raising an exception if you attempt it.
>>> 
>>> I might be missing something, but why isn't that the case? Why does taking a `threading.Lock` twice from
>>> the same thread just hang, instead of raising a friendly exception?
>>> I ran a quick search in bpo but found nothing about this topic. I also tried to
>>> search this mailing list but couldn't find how to, so I grepped a few random archives
>>> but found nothing about it.
>>> 
>>> Would be happy if anyone could shed some light on it...
>> 
>> threading.Lock is not reentrant and its implementation does not allow detection of the problem
>> from what I recall.
>> 
>> In this case the code might want to use the threading.RLock that is reentrant.
>> Of course there may be other issues in the code that prevent the finalizers working if it holds the lock.
>> 
>> Barry
>> 
>> 
>> 
>