This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: PyThreadState_SetAsyncExc segfault
Type: Stage:
Components: Interpreter Core Versions: Python 2.4
process
Status: closed Resolution:
Dependencies: Superseder:
Assigned To: gvanrossum Nosy List: aleax, arigo, gvanrossum, jvr, rhettinger, tim.peters
Priority: high Keywords:

Created on 2004-11-19 01:48 by tim.peters, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
async-exc-lock.diff arigo, 2005-02-02 11:19 HEAD_LOCK/HEAD_UNLOCK
asyncexctest.py jvr, 2005-02-07 19:17 depends on ctypes
Messages (16)
msg23192 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2004-11-19 01:48
PyThreadState_SetAsyncExc() crawls over the list of 
tstates without holding a mutex.  Python implementation 
code has tried to get away with this kind of thing before 
(& more than once <wink>:  and segfaults are always 
eventually reported.

A common cause is that PyThreadState_DeleteCurrent() 
is typically called *without* the GIL held, so any thread 
can try to delete its own tstate *while* 
PyThreadState_SetAsyncExc() is trying to muck with 
that tstate.  That the GIL is held by 
PyThreadState_SetAsyncExc's caller is no protection.  
Instead the code has to serialize against tstate chain 
mutations by acquiring head_mutex for the duration.

Of course this is rare and can be virtually impossible to 
reproduce, but that makes the bug worse, not "better".



msg23193 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2004-11-19 15:32
Logged In: YES 
user_id=31435

Changed Category to Threads.
msg23194 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2005-02-02 11:19
Logged In: YES 
user_id=4771

I guess that the patch you have in mind is just to protect the loop with HEAD_LOCK/HEAD_UNLOCK.  I cannot test this patch (attached), though, as I don't have any code that uses this function.  I can only tell that it also looks like a reasonable thing to do to me.
msg23195 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2005-02-03 16:41
Logged In: YES 
user_id=31435

Yup, that patch is all I had in mind.  It's a shame that nobody 
cares enough about this function to test it (which isn't a dig 
at you, Armin -- I don't care enough either <0.5 wink>).

BTW, I have no idea why this function warns about return 
values greater than 1.  It's possible that the original author 
realized the tstate chain could mutate while the loop was 
running, and that's where "greater than 1" came from.  If so, 
locking head_mutex should make that impossible ... OK, some 
quality time with CVS shows that Guido checked this function 
in, so assigning the report to him.
msg23196 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2005-02-07 00:44
Logged In: YES 
user_id=6380

But it was Just's idea, so assigning to him. Q for Just:
would applying the patch cause any problems in your code
that's using this feature?

(Q. for Tim: hoes does the head_mutex in this file relate to
the GIL? The ZAP() call would seem to require that the GIL
is already head when this fn. is called, because it could
invoke arbitrary Python __del__ code. Is there any other
call that could acquire these locks in the opposite order,
causing deadlocks?)
msg23197 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2005-02-07 01:51
Logged In: YES 
user_id=31435

It's already documented that PyThreadState_SetAsyncExc() 
must be called with the GIL held.  head_mutex is only used to 
protect traversal/mutation of the threadstate chain, so is 
only held internally by functions that create, or destroy, 
thread states or interpreter states.  It's certainly possible to 
get a deadlock then after the patch, not due to the GIL, but 
due to head_mutex alone.  I doubt that any __del__ method 
would create/destroy thread/interpreter states on its own, 
the bigger danger is that once a __del__ method is invoked, 
other threads can run too, and they may do anything 
(including calling PyThreadState_SetAsyncExc() again!  that 
seem the most likely way to deadlock).

It would be good to have a clue about "I have no idea why 
this function warns about return values greater than 1".  If in 
fact it should be impossible to have a count greater than 1 
given that concurrent mutations to the tstate chain are 
locked out after the patch, then the patch could be rewritten 
to release head_mutex when it found p->thread_id == id the 
first time, before the ZAP.  After the ZAP, the function would 
just return:

    if (p->thread_id == id) {
        PyObject *temp = p->async_exc;
        Py_XINCREF(exc);
        p->async_exc = exc;
        HEAD_UNLOCK();
        Py_XDECREF(temp);
        return 1;
    }

Then deadlock couldn't occur.

I should note that PyInterpreterState_Clear() is prone to the 
same kind of deadlocks on head_mutex (it calls 
PyThreadState_Clear() with head_mutex held, and the latter 
does a lot of ZAPping).
msg23198 - (view) Author: Just van Rossum (jvr) * (Python triager) Date: 2005-02-07 09:04
Logged In: YES 
user_id=92689

I'm not able to judge the patch. I'm only the one who had the feature 
request (together with Alex Martelli), but I know next to nothing about 
the implementation and all the subtle details. If you need a volunteer to 
apply a patch, sure, I'll gladly help, but the discussion is way over my 
head here.
msg23199 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2005-02-07 11:27
Logged In: YES 
user_id=4771

I didn't worry too much about ZAP deadlocks precisely because this 
problem was already there elsewhere.  I guess that if this is a problem, it 
should be discussed and fixed in another bug report, after we apply the 
obvious two-liner patch of the current tracker. 
msg23200 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2005-02-07 15:41
Logged In: YES 
user_id=31435

Just, this function isn't called anywhere in the Python 
distribution, so Armin and Guido and I have no certain way to 
know that it doesn't break the feature completely.

The presumption here is that you do have code that uses this 
feature.  If so, we're just asking that you verify your code 
still works after applying the patch.  OTOH, if you don't have 
any code that uses this function, then there's no reason for 
you to get sucked into this.
msg23201 - (view) Author: Just van Rossum (jvr) * (Python triager) Date: 2005-02-07 19:17
Logged In: YES 
user_id=92689

To be honest: I don't yet have any code that relies on the feature, so 
there's nothing to break. Don't know about Alex, I'll ask him.

I did a quick test (through ctypes), and the function works well with and 
without patch. I've attached the test script; it's mostly due to Peter 
Hansen: http://groups-beta.google.com/group/comp.lang.python/msg/
d310502f7c7133a9
msg23202 - (view) Author: Alex Martelli (aleax) * (Python committer) Date: 2005-02-07 20:11
Logged In: YES 
user_id=60314

I've checked the current codebase of the client for whom I originally 
wrote code using PyThreadState_SetAsyncExc, and that code is not 
currently used in production.  It did that call while holding the GIL, FWIW.
msg23203 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2005-02-07 20:24
Logged In: YES 
user_id=31435

So we're peeing away time on a Mystery Function with no 
known actual uses in the entire world.  Cool <wink>.
msg23204 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2005-02-08 02:08
Logged In: YES 
user_id=6380

OK, to prevent further waste of time I've checked in the
proposed fix.

I have no idea what the count>1 comment was referring to any
more; I'm sure I wrote it though.
msg23205 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2005-02-08 02:19
Logged In: YES 
user_id=80475

Should the head lock's be backported to Py2.4?
msg23206 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2005-02-08 02:27
Logged In: YES 
user_id=31435

Backport:  sure, why not.  Just don't ask anyone to test it 
<wink>.
msg23207 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2006-08-10 22:55
Logged In: YES 
user_id=31435

For Python 2.5 (rev 51195), I changed the code as suggested
in my 2005-02-06 comment; changed the docs to say the return
value is always 0 or 1; and added a (ctypes-based) test case
to test_threading.py.
History
Date User Action Args
2022-04-11 14:56:08adminsetgithub: 41192
2004-11-19 01:48:41tim.peterscreate