Puzzling behaviour of Py_IncRef

Tony Flury tony.flury at btinternet.com
Wed Jan 26 03:03:36 EST 2022


On 26/01/2022 01:29, MRAB wrote:
> On 2022-01-25 23:50, Tony Flury via Python-list wrote:
>>
>> On 25/01/2022 22:28, Barry wrote:
>>>
>>>> On 25 Jan 2022, at 14:50, Tony Flury via 
>>>> Python-list<python-list at python.org>  wrote:
>>>>
>>>> 
>>>>> On 20/01/2022 23:12, Chris Angelico wrote:
>>>>>> On Fri, 21 Jan 2022 at 10:10, Greg 
>>>>>> Ewing<greg.ewing at canterbury.ac.nz>  wrote:
>>>>>> On 20/01/22 12:09 am, Chris Angelico wrote:
>>>>>>> At this point, the refcount has indeed been increased.
>>>>>>>
>>>>>>>>            return self;
>>>>>>>>       }
>>>>>>> And then you say "my return value is this object".
>>>>>>>
>>>>>>> So you're incrementing the refcount, then returning it without
>>>>>>> incrementing the refcount. Your code is actually equivalent to 
>>>>>>> "return
>>>>>>> self".
>>>>>> Chris, you're not making any sense. This is C code, so there's no
>>>>>> way that "return x" can change the reference count of x.
>>>>> Yeah, I wasn't clear there. It was equivalent to *the Python code*
>>>>> "return self". My apologies.
>>>>>
>>>>>>   > The normal thing to do is to add a reference to whatever you're
>>>>>>   > returning. For instance, Py_RETURN_NONE will incref None and 
>>>>>> then
>>>>>>   > return it.
>>>>>>   >
>>>>>>
>>>>>> The OP understands that this is not a normal thing to do. He's
>>>>>> trying to deliberately leak a reference for the purpose of 
>>>>>> diagnosing
>>>>>> a problem.
>>>>>>
>>>>>> It would be interesting to see what the actual refcount is after
>>>>>> calling this function.
>>>> After calling this without a double increment in the function the 
>>>> ref count is still only 1 - which means that the 'return self' 
>>>> effectively does a double decrement. My original message includes 
>>>> the Python code which calls this 'leaky' function and you can see 
>>>> that despite the 'leaky POC' doing an increment, the ref count drops 
>>>> back to one after the return.
>>>>
>>>> You are right that this is not a normal thing to do; I am trying to 
>>>> understand the behaviour so my library does the correct thing in 
>>>> all cases - for example - imagine you have two nodes in a tree :
>>>>
>>>> A --- > B
>>>>
>>>> And your Python code has a named reference to A, and B also 
>>>> maintains a reference to A as its parent.
>>>>
>>>> In this case I would expect A to have a reference count of 2 
>>>> (reported as 3 by sys.getrefcount()): one for the named 
>>>> reference in the Python code and one for the link from B back to 
>>>> A. I would also expect B to have a reference count here of 1 (just 
>>>> the reference from A, assuming nothing else referenced B).
>>>>
>>>> My original code was incrementing the ref counts of A and B and 
>>>> then returning A. Within the Python test code A had a refcount of 1 
>>>> (and not the expected 2), but the refcount of B was correct as 
>>>> far as I could tell.
>>>>
>>>>
>>>>> Yes, and that's why I was saying it would need a *second* incref.
>>>>>
>>>>> ChrisA
>>>> Thank you to all of you for trying to help - I accept that the only 
>>>> way to make the code work is to do a 2nd increment.
>>>>
>>>> I don't understand why doing a 'return self' would result in a 
>>>> double decrement - that seems utterly bizarre behaviour - it 
>>>> obviously works, but why?
>>> The return self in C will not change the ref count.
>>>
>>> I would suggest setting a break point in your code and stepping out 
>>> of the function and seeing what Python’s code does to the ref count.
>>>
>>> Barry
>>
>> Barry,
>>
>> something odd is going on because the Python code isn't doing anything
>> that would cause the reference count to go from 3 inside the C function
>> to 1 once the method call is complete.
>>
>> As far as I know the only things that impact the reference counts are :
>>
>>    * Increments due to assigning a new name or adding it to a container.
>>    * Increment due to passing the object to a function (since that binds
>>      a new name)
>>    * Decrements due to deletion of a name
>>    * Decrement due to going out of scope
>>    * Decrement due to being removed from a container.
>>
>> None of those things are happening in the python code.
>>
>> As posted in the original message - immediately before the call to the C
>> function/method sys.getrefcount reports the count to be 2 (meaning it is
>> actually a 1).
>>
>> Inside the C function the ref count is incremented and the Py_REFCNT
>> macro reports the count as 3 inside the C function as expected (1 for
>> the name in the Python code, 1 for the argument as passed to the C
>> function, and 1 for the increment), so outside the function one would
>> expect the ref count to now be 2 (since the reference caused by calling
>> the function is then reversed).
>>
>> However - Immediately outside the C function and back in the Python code
>> sys.getrefcount reports the count to be 2 again - meaning it is now
>> really 1. So that means that the refcount has been decremented twice
>> in-between the return of the C function and the execution of the
>> immediate next python statement. I understand one of those decrements -
>> the parameter's ref count is incremented on the way in so the same
>> object is decremented on the way out (so that calls don't leak
>> references) but I don't understand where the second decrement is coming
>> from.
>>
>> Again there is nothing in the Python code that would cause that
>> decrement - the decrement behavior is in the Python runtime.
>>
> The function returns a result, an object.
>
> The calling code is discarding the result, so it's being DECREFed.
>
> For example:
>
>     def foo():
>         return Node()
>
> returns a new node, so its refcount is 1.
>
> Calling 'foo' as a statement:
>
>     foo()
>
> discards the result; the result is DECREFed back to 0 and garbage 
> collected.
>
> If you wanted your C function to return None, you'd have:
>
>     Py_INCREF(Py_None);
>     return Py_None;
>
> or, more succinctly:
>
>     Py_RETURN_NONE;
>
> But you're returning the object itself, and you're INCREFing it first, 
> which is what you need to do anyway.
>
> The 'extra' DECREF is coming from the result (i.e. self) being discarded.
>
> If it wasn't DECREFed, a function could create a new object (refcount 
> == 1) and return it, and if the function was being called as a 
> statement, it would be discarded with the refcount still == 1, leading 
> to a memory leak.
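
The discard behaviour described above can be observed from pure Python. This is only a minimal sketch, assuming CPython (where an object is freed as soon as its last reference goes away); the `freed` counter and the `__del__` hook are hypothetical additions for illustration and are not part of the code under discussion.

```python
class Node:
    freed = 0  # hypothetical counter: how many Node instances have been destroyed

    def __del__(self):
        Node.freed += 1


def foo():
    # The new Node comes back to the caller carrying exactly one reference.
    return Node()


foo()                              # result discarded: interpreter drops that reference
freed_after_discard = Node.freed

kept = foo()                       # result bound to a name: the reference is kept
freed_while_kept = Node.freed

del kept                           # removing the name releases the last reference
freed_after_del = Node.freed

print(freed_after_discard, freed_while_kept, freed_after_del)
```

In both calls the function hands over one reference; the only difference is whether the caller releases it immediately or stores it under a name first.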

So according to that I should increment twice if and only if the calling 
code is using the result - which you can't tell in the C code - which is 
very odd behaviour.

There is clearly something very deep here that I am simply not 
understanding.
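
For what it is worth, here is a rough way to check the accounting without compiling an extension. It is a sketch under assumptions: it relies on CPython and on `ctypes.pythonapi` to call the real `Py_IncRef` and `Py_DecRef`, and the `Node` class with its `single` and `double` methods is made up for the experiment. A Python-level `return self` already creates the one reference that belongs to the return value, so `single` stands in for a C function that does one increment and returns self, and `double` stands in for one that does two.

```python
import ctypes
import sys

# Declare the signatures so ctypes passes the object itself, not a converted value.
ctypes.pythonapi.Py_IncRef.argtypes = [ctypes.py_object]
ctypes.pythonapi.Py_IncRef.restype = None
ctypes.pythonapi.Py_DecRef.argtypes = [ctypes.py_object]
ctypes.pythonapi.Py_DecRef.restype = None


class Node:
    def single(self):
        # "return self" creates the reference owned by the return value.
        return self

    def double(self):
        # One extra reference on top of the returned one: a deliberate leak.
        ctypes.pythonapi.Py_IncRef(self)
        return self


node = Node()
base = sys.getrefcount(node)

node.single()                      # result discarded by the caller
after_single = sys.getrefcount(node)

node.double()                      # result discarded, but the extra incref remains
after_double = sys.getrefcount(node)

ctypes.pythonapi.Py_DecRef(node)   # undo the deliberate leak
after_cleanup = sys.getrefcount(node)

print(after_single - base, after_double - base, after_cleanup - base)
```

If this sketch is right, the first increment is never optional: it pays for the reference that the return value carries, whether or not the caller keeps the result. Only an additional reference stored elsewhere (such as the link from B back to A) needs a further increment.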

-- 
Anthony Flury
email : anthony.flury at btinternet.com


