[Python-Dev] Playing games with reference counts (was Re: PyWeakref_GetObject() borrows its reference from... whom?)

Larry Hastings larry at hastings.org
Thu Oct 13 07:41:14 EDT 2016


On 10/10/2016 10:38 PM, Chris Angelico wrote:
> On Tue, Oct 11, 2016 at 8:14 AM, Larry Hastings <larry at hastings.org> wrote:
>> These hacks where we play games with the
>> reference count are mostly removed in my branch.
> That's exactly what I would have said, because I was assuming that
> refcounts would be accurate. I'm not sure what you mean by "play games
> with",

By "playing games with reference counts", I mean code that purposely 
doesn't follow the rules of reference counting.  Sadly, there are 
special cases that apparently *are* special enough to break the rules.  
Which made implementing "buffered reference counting" that much harder.

I currently know of two examples of this in CPython.  In both instances, 
an object has a reference to another object, but *deliberately* does not 
increase the reference count of the object, in order to prevent keeping 
the other object alive.  The implementation relies on the GIL to 
preserve correctness; without a GIL, it was much harder to ensure this 
code was correct.  (And I'm still not 100% sure I've done it.  More thinking 
needed.)

Those two examples are:

 1. PyWeakReference objects.  The wr_object pointer--the "reference"
    held by the weak reference object--points to an object, but does not
    increment the reference count.  Worse yet, as already observed,
    PyWeakref_GetObject() and PyWeakref_GET_OBJECT() don't increment the
    reference count, an inconvenient API decision from my perspective.
 2. "Interned mortal" strings.  When a string is both interned *and*
    mortal, it's stored in the static "interned" dict in
    unicodeobject.c--as both key and value--and then it's DECREF'd
    twice so those two references don't count.  When the string is
    destroyed, unicode_dealloc resurrects the string, reinstating those
    two references, then removes it from the "interned" dict, then
    destroys the string as normal.
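
A rough sketch of both cases, for readers who want something to poke at. The first function uses the real `weakref` and `sys` modules and relies on CPython's immediate, refcount-driven destruction. The second half is only a toy model of the interned-mortal bookkeeping described in item 2: `ToyStr`, `toy_interned`, `toy_intern`, `toy_decref` and `toy_dealloc` are hypothetical names, the counts are plain integers rather than real reference counts, and none of it is the actual unicodeobject.c code.

```python
import sys
import weakref


class Node:
    """Ordinary class; its instances can be weakly referenced."""


def weakref_adds_no_reference():
    """Case 1: creating a weak reference leaves the refcount unchanged."""
    obj = Node()
    before = sys.getrefcount(obj)
    ref = weakref.ref(obj)        # points at obj without owning a reference
    after = sys.getrefcount(obj)
    del obj                       # drop the only strong reference
    return before == after, ref() is None


# Case 2 as a toy model (assumption: a simplification, not CPython's code).
class ToyStr:
    def __init__(self, text):
        self.text = text
        self.refcnt = 1           # the creator's reference
        self.interned = False
        self.destroyed = False


toy_interned = {}                 # stand-in for the static "interned" dict


def toy_intern(s):
    toy_interned[s.text] = s
    s.refcnt += 2                 # the table holds it as key and as value...
    s.refcnt -= 2                 # ...and both are DECREF'd so they don't count
    s.interned = True


def toy_decref(s):
    s.refcnt -= 1
    if s.refcnt == 0:
        toy_dealloc(s)


def toy_dealloc(s):
    if s.interned:
        s.refcnt = 2              # resurrect: reinstate the two table references
        del toy_interned[s.text]  # remove the string from the table...
        s.refcnt -= 2             # ...which releases those two references
        s.interned = False
    s.destroyed = True            # then destroy the string as normal
```

In the toy model the table entry never keeps the string alive: once the creator's single reference is dropped, the count reaches zero even though the table still points at the object, which is exactly the situation the GIL was papering over.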

To support these, I've implemented what is effectively a secondary, 
atomic-only reference count.  It seems to work.  (And yes that means all 
objects are now 8 bytes bigger.  Let me worry about memory consumption 
later, m'kay?)


Resurrecting objects also gave me a headache in the Gilectomy with this 
buffered reference counting scheme, but I think I have that figured out 
too.  When you resurrect an object, it's generally because you're going 
to expose it to other subsystems that may incr / decr / otherwise 
inspect the reference count.  Which means that code may buffer reference 
count changes.  Which means you can't immediately destroy the object 
anymore.  So: when you resurrect, you set the new reference count, you 
also set a flag saying "I've already been resurrected", you pass it in 
to that other code, you then drop your references with Py_DECREF, and 
you exit.  Your dealloc function will get called again later; you then 
see you've already done that first resurrection, and you destroy as 
normal.  Curiously enough, the typeobject actually needs to do this 
twice: once for tp_finalize, once for tp_del.  (Assuming I didn't 
completely misunderstand what the code was doing.)
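
The resurrect-once protocol described above can be modelled in a few lines of Python. This is a minimal sketch under stated assumptions, not the Gilectomy code: `ToyObject`, `pending`, `buffered_incref`, `buffered_decref`, `commit` and `dealloc_with_resurrection` are all hypothetical names, the "buffer" is a single queue drained by hand, and the tp_finalize / tp_del double pass is not modelled.

```python
from collections import deque


class ToyObject:
    def __init__(self, name, finalizer=None):
        self.name = name
        self.refcnt = 1            # the creator's reference
        self.resurrected = False   # "I've already been resurrected"
        self.destroyed = False
        self.finalizer = finalizer


pending = deque()                  # buffered (object, delta) refcount changes


def buffered_incref(obj):
    pending.append((obj, +1))


def buffered_decref(obj):
    pending.append((obj, -1))


def commit():
    """Apply buffered changes in order; dealloc when a count reaches zero."""
    while pending:
        obj, delta = pending.popleft()
        obj.refcnt += delta
        if obj.refcnt == 0:
            dealloc_with_resurrection(obj)


def dealloc_with_resurrection(obj):
    if obj.finalizer is not None and not obj.resurrected:
        obj.refcnt = 1             # resurrect: set the new reference count
        obj.resurrected = True     # remember that this has been done once
        obj.finalizer(obj)         # other code may buffer incr/decr here
        buffered_decref(obj)       # drop our own reference, then exit
        return
    obj.destroyed = True           # later visit: destroy as normal
```

The point of the flag is that the first trip through dealloc cannot destroy anything, because changes buffered by the finalizer have not been applied yet. If the finalizer stashes the object somewhere and buffers an incref, the object survives until that extra reference is dropped, and the finalizer is not run a second time.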


My struggles continue,


//arry/
