[issue18808] Thread.join returns before PyThreadState is destroyed

Antoine Pitrou report at bugs.python.org
Sat Aug 31 12:13:48 CEST 2013


Antoine Pitrou added the comment:

> I don't understand the need for all the indirections in the second
> patch.  Like, why use a weakref?  It's not like we have to worry about
> an immortal tstate keeping a measly little lock object alive forever,
> right?  Seems to me that if the tstate object held the new lock
> directly, the tstate destructor could release the lock directly (and
> so also skip the new tstate->on_delete and tstate->on_delete_data
> indirections too).

The problem is if the tstate holds the last reference to the lock. It
will destroy the lock at the end, but the lock could have weakrefs
attached to it with arbitrary callbacks. Those callbacks will run
without a current thread state and may crash. OTOH, this can't happen
with a private weakref.

You may suggest we only keep the Py_thread_lock in the tstate, rather
than the enclosing PyObject. But it has lifetime problems of its owns
(the Py_thread_lock doesn't have a refcount, and it is shared with a
PyObject who thinks it owns it and will destroy it on dealloc... adding
the necessary logic to circumvent this would make the final solution
more complicated than the weakref one).

> Then again, I'd settle for Py_EndInterpreter simply sleeping for a
> second and trying again when it finds "another thread" hanging around
> (effectively moving Tamas's sleep() into Py_EndInterpreter, but only
> sleeping if needed).  Yes, that's theoretically insecure.

Well, that sounds less predictable. Depending on machine load,
Py_EndInterpreter might succeed or crash with a fatal error. Users may
not like this :-)

> But if we're worried about wildly improbable timing problems, then the
> patched code can appear not to honor a non-None Thread.join()
> `timeout` argument too.  That is, the first call to the new `pred()`
> can theoretically consume any amount of time, due to its
> self._tstate_lock.acquire().

Ah, well, good point. It's weird it didn't get caught by the unit
tests... I'll take a look.

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue18808>
_______________________________________


More information about the Python-bugs-list mailing list