[Python-Dev] PEP 554 v3 (new interpreters module)

Wed Oct 4 01:36:37 EDT 2017

On 3 October 2017 at 11:31, Eric Snow <ericsnowcurrently at gmail.com> wrote:
> There shouldn't be a need to synchronize on INCREF.  If both
> interpreters have at least 1 reference then either one adding a
> reference shouldn't be a problem.  If only one interpreter has a
> reference then the other won't be adding any references.  If neither
> has a reference then neither is going to add any references.  Perhaps
> I've missed something.  Under what circumstances would INCREF happen
> while the refcount is 0?

The problem relates to the fact that there aren't any memory barriers
around CPython's INCREF operations (they're implemented as an ordinary
C post-increment operation), so you can get the following scenario:

* thread on CPU A has the sole reference (ob_refcnt=1)
* thread on CPU B acquires a new reference, but hasn't pushed the
updated ob_refcnt value back to the shared memory cache yet
* original thread on CPU A drops its reference, *thinks* the refcnt is
now zero, and deletes the object
* bad things now happen on CPU B as the thread running there tries to
use a deleted object :)
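
The interleaving above can be replayed step by step in a toy model. This
is only a sketch: the names SharedObject and CpuView are hypothetical, and
real CPUs and CPython do not store refcounts in Python objects like this.
It assumes each CPU works on a private copy of ob_refcnt that is never
written back, as in the scenario described.

```python
# Toy model of the scenario described above. "SharedObject" and "CpuView"
# are hypothetical names used for illustration only.


class SharedObject:
    """An object whose refcount lives in 'shared memory'."""

    def __init__(self):
        self.ob_refcnt = 1        # one reference, held by the thread on CPU A
        self.deallocated = False


class CpuView:
    """One CPU's private (possibly stale) copy of an object's refcount."""

    def __init__(self, obj):
        self.obj = obj
        self.cached_refcnt = obj.ob_refcnt   # initial load from shared memory

    def incref(self):
        # Ordinary increment of the local copy; nothing forces a write-back.
        self.cached_refcnt += 1

    def decref(self):
        # Ordinary decrement of the local copy; free the object at zero.
        self.cached_refcnt -= 1
        if self.cached_refcnt == 0:
            self.obj.deallocated = True


def unsynchronised_scenario():
    obj = SharedObject()
    cpu_a = CpuView(obj)   # sees ob_refcnt == 1 (the sole reference)
    cpu_b = CpuView(obj)   # also sees ob_refcnt == 1

    cpu_b.incref()         # B now believes the count is 2 (not written back)
    cpu_a.decref()         # A believes the count is 0 and frees the object

    # B still thinks it holds a valid reference to an object A has freed.
    return obj.deallocated, cpu_b.cached_refcnt
```

Calling unsynchronised_scenario() returns a pair saying that the object
was deallocated while CPU B's private count was still 2.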

The GIL currently protects us from this, as switching CPUs requires
switching threads, which means the original thread has to release the
GIL (flushing all of its state changes to the shared cache), and the
new thread has to acquire it (hence refreshing its local cache from
the shared one).
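
As a rough illustration of that protection, the sketch below uses a
threading.Lock as a hypothetical stand-in for the GIL. Every refcount
update happens while holding the lock, so each thread sees the previous
holder's writes before making its own. The names gil, Obj, churn and run
are invented for this example. Since it runs under the real GIL anyway,
it only illustrates the locking discipline, not CPython's internals.

```python
import threading

gil = threading.Lock()   # hypothetical stand-in for the real GIL


class Obj:
    def __init__(self):
        self.ob_refcnt = 1   # one long-lived reference held by the creator


def churn(obj, iterations):
    """Repeatedly take and drop a reference, always under the lock."""
    for _ in range(iterations):
        with gil:                 # acquire: observe the latest ob_refcnt
            obj.ob_refcnt += 1
        # release: the update is visible to the next lock holder
        with gil:
            obj.ob_refcnt -= 1


def run(num_threads=4, iterations=10_000):
    obj = Obj()
    threads = [
        threading.Thread(target=churn, args=(obj, iterations))
        for _ in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Every increment was paired with a decrement, and none was lost.
    return obj.ob_refcnt
```

Because each thread increments before it decrements and the creator's
reference is never dropped, the count never reaches zero while the
threads are running, and run() ends with the original count of 1.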

The need to switch all incref/decref operations over to using atomic
thread-safe primitives when removing the GIL is one of the main
reasons that attempting to remove the GIL *within* an interpreter is
expensive (and why Larry et al. are having to explore completely
different ref count management strategies for the GILectomy).

By contrast, if you rely on a new memoryview variant to mediate all
data sharing between interpreters, then you can make sure that *it* is
using synchronisation primitives as needed to ensure the required
cache coherency across different CPUs, without any negative impacts on
regular single interpreter code (which can still rely on the cache
coherency guarantees provided by the GIL).
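
A minimal sketch of that idea follows, assuming a hypothetical
SharedBufferView class. It is not the PEP 554 API and not an existing
memoryview variant. Only the mediating object carries a lock around its
own holder count, while ordinary objects elsewhere keep their plain,
unsynchronised refcounts.

```python
import threading


class SharedBufferView:
    """Hypothetical mediator for sharing one buffer between interpreters.

    Only this object pays for synchronisation: its holder count is
    guarded by its own lock rather than by the GIL.
    """

    def __init__(self, data):
        self._view = memoryview(data)
        self._lock = threading.Lock()
        self._holders = 1          # the creator holds the first reference

    def acquire(self):
        """Register another holder and hand back the underlying view."""
        with self._lock:
            if self._holders == 0:
                raise ValueError("buffer has already been released")
            self._holders += 1
        return self._view

    def release(self):
        """Drop one holder; return True if this freed the buffer."""
        with self._lock:
            if self._holders == 0:
                raise ValueError("buffer has already been released")
            self._holders -= 1
            if self._holders == 0:
                self._view.release()   # last holder gone: free the export
                return True
        return False
```

For example, after shared = SharedBufferView(bytearray(b"spam")), a
second holder calls shared.acquire() to get the view, and the buffer is
only released once both holders have called shared.release().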

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia