[Doc-SIG] Essay on Reference Counts (long), second try

Mon, 09 Jul 2001 18:42:14 -0400

I have always been confused by reference counts. I put this together to 
try to clear up my confusion.

You are welcome to use any part of this document in writing Python 
documentation.

Ed Jones

EE is "Extending and Embedding the Python Interpreter".
API is "Python/C API Reference Manual".
Paragraphs starting with expressions like "{EE 1.10}" are very similar
to paragraphs in the Python documentation.

==========================================================================
==========================================================================

Metasummary

Some of the Python source code documentation can be usefully duplicated
in the API documentation.

I find the "own", "borrow", "steal" metaphor confusing. I prefer to
restrict the use of "reference" to the meaning "pointer to an
PyObject". If Py_INCREF() has been called for a reference, the
reference is "protected".  From the Python coder's point of view, a
reference is an object. Therefore it is reasonable to use "object"
interchangeably with "reference".

1. Summary

1. Every Python object contains a reference counter which is
incremented by Py_INCREF() and decremented by Py_DECREF(). If the
counter becomes zero, Python might delete the object.

2. For every call to Py_INCREF(), there should eventually be a call to
Py_DECREF(). Call Py_INCREF() for objects that you want to keep
around for a while. Call Py_DECREF() when you are done with them. A
pointer to an object that has been INCREFed is said to be
"protected".

3. Most functions INCREF any object that should continue to exist after
the function exits. This includes objects returned via arguments or
a return statement. In these cases the calling function usually has
responsibility for calling Py_DECREF(). The calling function can, in
turn, pass responsibility for the DECREF to _its_ caller.

4. Some functions violate rule 3 by not INCREFing objects they return.
The standard example is PyTuple_GetItem().

5. It is not necessary to increment an object's reference count for
every local variable that contains a pointer to an object. But don't
leave objects unprotected if there is any chance that the object
will be DECREFed elsewhere.

6. Most functions assume that each argument passed to them is a
protected object. Therefore the code for the function does not
INCREF the argument.

7. There are exactly two important functions that are exceptions to
rule 6: PyTuple_SetItem() and PyList_SetItem(). These functions take
over responsibility of the item passed to them -- even if they fail!

2. Background

1.1 Memory Problems with C/C++

{EE 1.10} In languages like C or C++, the programmer is responsible for
dynamic allocation and deallocation of memory on the heap. In C, this
is done using the functions malloc() and free(). In C++, the operators
new and delete are used with essentially the same meaning; they are
actually implemented using malloc() and free(), so we'll restrict the
following discussion to the latter.

{EE 1.10} Every block of memory allocated with malloc() should
eventually be returned to the pool of available memory by exactly one
call to free(). It is important to call free() at the right time. If a
block's address is forgotten but free() is not called for it, the
memory it occupies cannot be reused until the program terminates. This
is called a memory leak. On the other hand, if a program calls free()
for a block and then continues to use the block, it creates a conflict
with re-use of the block through another malloc() call. This is called
using freed memory. It has the same bad consequences as referencing
uninitialized data -- core dumps, wrong results, mysterious crashes.

{EE 1.10} Common causes of memory leaks are unusual paths through the
code.  For instance, a function may allocate a block of memory, do some
calculation, and then free the block again. Now a change in the
requirements for the function may add a test to the calculation that
detects an error condition and can return prematurely from the
function. It's easy to forget to free the allocated memory block when
taking this premature exit, especially when it is added later to the
code. Such leaks, once introduced, often go undetected for a long time:
the error exit is taken only in a small fraction of all calls, and most
modern machines have plenty of virtual memory, so the leak only becomes
apparent in a long-running process that uses the leaking function
frequently. Therefore, it's important to prevent leaks from happening
by having a coding convention or strategy that minimizes this kind of
errors.

{EE 1.10} Since Python makes heavy use of malloc() and free(), it needs
a strategy to avoid memory leaks as well as the use of freed memory.
The chosen method is called reference counting. The principle is
simple: every object contains a counter, which is incremented when a
pointer to the object is stored somewhere, and which is decremented
when the pointer is deleted.  When the counter reaches zero, the last
pointer to the object has been deleted and the object is freed.

1.2 Python Objects

{object.h} Python objects are structures allocated on the heap. They
have type "PyObject". They are accessed through pointers of type
"PyObject*".  Special rules apply to the use of objects to ensure they
are properly garbage-collected. Objects are never allocated statically
or on the stack; they must be accessed through special macros and
functions only.  (Type objects are exceptions to the first rule; the
standard types are represented by statically initialized type objects.)

{object.h} An object has a 'reference count' that is increased or
decreased when a pointer to the object is copied or deleted; when the
reference count reaches zero there are no references to the object left
and it can be removed from the heap.

Every Python object has a reference count and a type (and possibly
more).  Here is the relevant code from "object.h", one of the Python
header files.  "PyObject" is a C struct. Each instance of "PyObject"
contains the variable "ob_refcnt" where the "reference count" is kept
and "ob_type" where the type is kept.

#define PyObject_HEAD \
int ob_refcnt; \
struct _typeobject *ob_type;

typedef struct _object {
PyObject_HEAD
} PyObject;

{object.h} The "type" of an object determines what it represents and
what kind of data it contains. An object's type is fixed when it is
created.  Types themselves are represented as objects; an object
contains a pointer to the corresponding type object.  The type itself
has a type pointer pointing to the object representing the type 'type',
which contains a pointer to itself!).

{object.h} Objects do not float around in memory; once allocated an
object keeps the same size and address. Objects that must hold
variable-size data can contain pointers to variable-size parts of the
object. Not all objects of the same type have the same size; but the
size cannot change after allocation. (These restrictions are made so a
reference to an object can be simply a pointer -- moving an object
would require updating all the pointers, and changing an object's size
would require moving it if there was another object right next to it.)

{object.h} Objects are always accessed through pointers of the type
'PyObject *'. The type 'PyObject' is the structure shown above that
only contains the reference count and the type pointer. The actual
memory allocated for an object contains other data that can only be
accessed after casting the pointer to a pointer to a longer structure
type. This longer type must start with the reference count and type
fields; the macro PyObject_HEAD above should be used for this (to
accommodate for future changes). The implementation of a particular
object type can cast the object pointer to the proper type and back.

{object.h} A standard interface also exists for objects that contain an
array of items whose size is determined when the object is allocated.
It looks like

#define PyObject_VAR_HEAD \
PyObject_HEAD \
int ob_size; /* Number of items in variable part */

typedef struct _object {
PyObject_HEAD
} PyObject;

If the reference count for some object becomes zero, Python will
usually reclaim the memory the object uses, making the object
disappear. If an object is used in a piece of code, the code should
make sure that the reference count cannot drop to zero, usually by
adding one to it.

If the reference count never becomes zero, the memory is never
reclaimed. If it was not intended for the object to be permanent, there
is a memory leak.

3. Py_INCREF() and Py_DECREF()

{object.h} The macros Py_INCREF(op) and Py_DECREF(op) are used to
increment or decrement reference counts. Py_DECREF() also calls the
object's deallocator function; for objects that don't contain
references to other objects or heap memory this can be the standard
function free(). Both macros can be used wherever a void expression is
allowed. The argument shouldn't be a NIL pointer.

{object.h} We assume that the reference count field can never overflow;
this can be proven when the size of the field is the same as the
pointer size but even with a 16-bit reference count field it is pretty
unlikely so we ignore the possibility.

To prevent memory leaks, corresponding to each call to Py_INCREF(),
there must be a call to Py_DECREF(): for each call to Py_INCREF(),
there is a "responsibility" to call Py_DECREF(). This responsibility
can be passed around between functions. The last user of the reference
must call Py_DECREF().

4. When to Use Py_INCREF() and Py_DECREF()

4.1 Objects Returned from Functions

Most Python objects in C code are created by calls to functions in the
"Python/C API". These functions have prototypes that look like:

PyObject* Py_Something(arguments);

These functions usually (but not always!) call Py_INCREF() before
returning (a pointer to) the new object. Generally, the function that
called PySomething has the responsibility to call Py_DECREF(). If, in
turn, the function returns the object to its caller, it passes on the
responsibility for the reference.  Some things are best understood by
example pseudo-code:

void MyCode(arguments) {
PyObject* pyo;
...
pyo = Py_Something(args);

MyCode has responsibility for the reference passed to it by
Py_Something.  When MyCode is done with "pyo", it must call:

Py_DECREF(pyo);

On the other hand, if MyCode returns "pyo", there must not be a call to
Py_DECREF().

PyObject* MyCode(arguments) {
PyObject* pyo;
...
pyo = Py_Something(args);
...
return pyo;
}

In this situation, MyCode has "passed on the responsibility" for
DECREFing the reference.

Note: if a function is to return None, the C code should look like:

Py_INCREF(Py_None);
return Py_None;

Remember to INCREF Py_None!

So far, only the most common case has been discussed, where
"Py_Something" creates a reference and passes responsibility for it to
the caller. In the Python documentation, this is called a "new
reference". For example the documentation says:

PyObject* PyList_New(int len)
Return value: New reference.
Returns a new list of length len on success, or NULL on failure.

The documentation uses the word "reference" is two closely related
ways: a pointer to a PyObject and the responsibility to DECREF the
object. I will try to use "reference" in the first sense only. When a
reference has been INCREFed, I prefer to say that the reference is
"protected". I will often get sloppy and say "object" when I mean
"reference". (After all, what a C programmer sees as a "PyObject*" the
Python programmer sees as an "object".)

But sometimes the Python source code DOES NOT CALL Py_DECREF():

PyObject *
PyTuple_GetItem(register PyObject *op, register int i)
{
if (!PyTuple_Check(op)) {
PyErr_BadInternalCall();
return NULL;
}
if (i < 0 || i >= ((PyTupleObject *)op) -> ob_size) {
PyErr_SetString(PyExc_IndexError,
"tuple index out of range");
return NULL;
}
return ((PyTupleObject *)op) -> ob_item[i];
}

In the documentation, this is referred to as "borrowing" a reference:

PyObject* PyTuple_GetItem(PyObject *p, int pos)
Return value: Borrowed reference.
Returns the object at position pos in the tuple pointed to by
p.  If pos is out of bounds, returns NULL and sets an
IndexError exception.

I prefer to say that the the reference (in the sense I use it) is left
"unprotected".

Functions returning unprotected referencess (borrowing a reference)
are: PyTuple_GetItem(), PyList_GetItem(), PyList_GET_ITEM(),
PyList_SET_ITEM(), PyDict_GetItem(), PyDict_GetItemString(),
PyErr_Occurred(), PyFile_Name(), PyImport_GetModuleDict(),
PyModule_GetDict(), PyImport_AddModule(), PyObject_Init(),
Py_InitModule(), Py_InitModule3(), Py_InitModule4(), and
PySequence_Fast_GET_ITEM().

{EE 10.1.2} The function PyImport_AddModule() does not INCREF the
reference it returns even though it may actually create the object the
reference refers to: this is possible because the object is INCREFed
when it is stored in sys.modules.

See also PyArg_ParseTuple() in Extending and Embedding, 1.7. This
function sometimes returns PyObjects back to the caller through its
arguments. An example from sysmodule.c is:

static PyObject *
sys_getrefcount(PyObject *self, PyObject *args)
{
PyObject *arg;
if (!PyArg_ParseTuple(args, "O:getrefcount", &arg))
return NULL;
return PyInt_FromLong(arg->ob_refcnt);
}

"arg" is an unprotected object. It should not be DECREFed before
leaving the function (because it was never INCREFed!).

{API 1.2.1.1} Here is an example of how you could write a function that
computes the sum of the items in a list of integers, here using
PySequence_GetItem() and later using PyList_GetItem().

long sum_sequence(PyObject *sequence)
{
int i, n;
long total = 0;
PyObject *item;
n = PySequence_Length(sequence);
if (n < 0)
return -1; /* Has no length. */
/* Caller should use PyErr_Occurred() if a -1 is returned. */
for (i = 0; i < n; i++) {
/* PySequence_GetItem INCREFs item. */
item = PySequence_GetItem(sequence, i);
if (item == NULL)
return -1; /* Not a sequence, or other failure */
if (PyInt_Check(item))
total += PyInt_AsLong(item);
Py_DECREF(item);
}
return total;
}

4.1.1 When to be sloppy with unINCREFed objects

{API 1.2.1} It is not necessary to increment an object's reference
count for every local variable that contains a pointer to an object. In
theory, the object's reference count should be increased by one when
the variable is made to point to it and decreased by one when the
variable goes out of scope. However, these two cancel each other out,
so at the end the reference count hasn't changed. The only real reason
to use the reference count is to prevent the object from being
deallocated as long as our variable is pointing to it. If we know that
there is at least one other reference to the object that lives at least
as long as our variable, there is no need to increment the reference
count temporarily.

{API 1.2.1} An important situation where this arises is for objects
that are passed as arguments to C functions in an extension module that
are called from Python; the call mechanism guarantees to hold a
reference to every argument for the duration of the call.

Here is the "sum_list" example again, this time using PyList_GetItem().

long sum_list(PyObject *list)
{
int i, n;
long total = 0;
PyObject *item;

n = PyList_Size(list);
if (n < 0)
return -1; /* Not a list */
/* Caller should use PyErr_Occurred() if a -1 is returned. */
for (i = 0; i < n; i++) {
/* PyList_GetItem does not INCREF "item".
"item" is unprotected. */
item = PyList_GetItem(list, i); /* Can't fail */
if (PyInt_Check(item))
total += PyInt_AsLong(item);
}
return total;
}

4.1.2 Thin Ice: When not to be sloppy with INCREF

Don't leave an object unprotected if there is any chance that it
will be DECREFed.

{API 1.2.1} Subtle bugs can occur when when a reference is unprotected.
A common example is to extract an object from a list and hold on to it
for a while without incrementing its reference count. Some other
operation might conceivably remove the object from the list,
decrementing its reference count and possible deallocating it. The real
danger is that innocent-looking operations may invoke arbitrary Python
code which could do this; there is a code path which allows control to
flow back to the Python user from a Py_DECREF(), so almost any
operation is potentially dangerous. For example:

bug(PyObject *list) {
PyObject *item = PyList_GetItem(list, 0);

PyList_SetItem(list, 1, PyInt_FromLong(0L));
PyObject_Print(item, stdout, 0); /* BUG! */
}

{EE 1.10.3} The PyObject "item" is gotten using PyList_GetItem and left
unprotected. The code then replaces list[1] with the value 0, and
finally prints "item". Looks harmless, right? But it's not!

{EE 1.10.3} Let's follow the control flow into PyList_SetItem(). The
list has protected references to all its items, so when item 1 is
replaced, it has to dispose of the original item 1 by DECREFing it. Now
let's suppose the original item 1 was an instance of a user-defined
class, and let's further suppose that the class defined a __del__()
method. If this class instance has a reference count of 1, disposing of
it will call its __del__() method.

{EE 1.10.3} Since it is written in Python, the __del__() method can
execute arbitrary Python code. Could it perhaps do something to
invalidate the reference to item in bug()? You bet! Assuming that the
list passed into bug() is accessible to the __del__() method, it could
execute a statement to the effect of "del list[0]", and assuming this
was the last reference to that object, it would free the memory
associated with it, thereby invalidating item.

{EE 1.10.3} The solution, once you know the source of the problem, is
easy: temporarily increment the reference count. The correct version of
the function reads:

no_bug(PyObject *list) {
PyObject *item = PyList_GetItem(list, 0);
Py_INCREF(item); /* Protect item. */

PyList_SetItem(list, 1, PyInt_FromLong(0L));
PyObject_Print(item, stdout, 0);
Py_DECREF(item);
}

{EE 1.10.3} This is a true story. An older version of Python contained
variants of this bug and someone spent a considerable amount of time in
a C debugger to figure out why his __del__() methods would fail...

{EE 1.10.3} The second case of problems with unprotected objects is a
variant involving threads. Normally, multiple threads in the Python
interpreter can't get in each other's way, because there is a global
lock protecting Python's entire object space. However, it is possible
to temporarily release this lock using the macro
Py_BEGIN_ALLOW_THREADS, and to re-acquire it using
Py_END_ALLOW_THREADS. This is common around blocking I/O calls, to let
other threads use the CPU while waiting for the I/O to complete.
Obviously, the following function has the same problem as the previous
one:

bug(PyObject *list) {
PyObject *item = PyList_GetItem(list, 0);
Py_BEGIN_ALLOW_THREADS
...some blocking I/O call...
Py_END_ALLOW_THREADS
PyObject_Print(item, stdout, 0); /* BUG! */
}

4.2 Objects Passed to Functions

So far, we have looked at references to objects returned from
functions.  Now consider what happens when an object is passed to a
function. To fix the ideas consider:

int Caller(void)
{
PyObject* pyo;
Function(pyo);

Most functions assume that the arguments passed to them are already
protected. Therefore Py_INCREF() is not called inside "Function" unless
"Function" wants the argument to continue to exist after "Caller"
exits". In the documentation, "Function" is said to "borrow" a
reference:

{EE 10.1.2} When you pass an object reference into another
function, in general, the function borrows the reference from you
-- if it needs to store it, it will use Py_INCREF() to become an
independent owner.

"PyDict_SetItem()" can serve as an example of normal behavior. Putting
something in a dictionary is "storing" it. Therefore "PyDict_SetItem()"
INCREFs both the key and the value.

{EE 10.1.2} There are exactly two important functions that do not
behave in this normal way: PyTuple_SetItem() and PyList_SetItem().
These functions take over responsibility of the item passed to them --
even if they fail!  The Python documentation uses the phrase "steal a
reference" to mean "takes over responsibility".

Here is what PyTuple_SetItem(atuple, i, item) does: If "atuple[i]"
currently contains a PyObject, that PyObject is DECREFed. Then
"atuple[i]" is set to "item". "item" is *not* INCREFed. If
PyTuple_SetItem() fails to insert "item", it *decrements* the reference
count for "item". Similarly, PyTuple_GetItem() does not increment the
returned item's reference count.  Metaphorically, PyTuple_SetItem()
grabs responsibility for a reference to "item" from you. If "item" is
unprotected, PyTuple_SetItem() might DECREF it anyway which can crash
Python.

Other exceptions are PyList_SET_ITEM(), PyModule_AddObject(), and
PyTuple_SET_ITEM().

Look at this piece of code:

PyObject *t;
PyObject *x;

x = PyInt_FromLong(1L);

At this point x has a reference count of one. When you are done with
it, you normally would call Py_DECREF(x). But if if PyTuple_SetItem is
called:

PyTuple_SetItem(t, 0, x);

you must not call Py_DECREF(). PyTuple_SetItem() will call it for you:
when the tuple is DECREFed, the items will be also.

{API 1.2.1.1} PyTuple_SetItem(), et. al, were designed to take over
responsibility for a reference because of a common idiom for populating
a tuple or list with newly created objects; for example, the code to
create the tuple (1, 2, "three") could look like this (forgetting about
error handling for the moment). It is better coding practice to use the
less confusing PySequence family of functions as below.

PyObject *t;

t = PyTuple_New(3);
PyTuple_SetItem(t, 0, PyInt_FromLong(1L));
PyTuple_SetItem(t, 1, PyInt_FromLong(2L));
PyTuple_SetItem(t, 2, PyString_FromString("three"));

{API 1.2.1.1} Incidentally, PyTuple_SetItem() is the only way to set
tuple items; PySequence_SetItem() and PyObject_SetItem() refuse to do
this since tuples are an immutable data type. You should only use
PyTuple_SetItem() for tuples that you are creating yourself.

{API 1.2.1.1} Equivalent code for populating a list can be written
using PyList_New() and PyList_SetItem(). Such code can also use
PySequence_SetItem(); this illustrates the difference between the two
(the extra Py_DECREF() calls):

PyObject *l, *x;

l = PyList_New(3);
x = PyInt_FromLong(1L);
PySequence_SetItem(l, 0, x); Py_DECREF(x);
x = PyInt_FromLong(2L);
PySequence_SetItem(l, 1, x); Py_DECREF(x);
x = PyString_FromString("three");
PySequence_SetItem(l, 2, x); Py_DECREF(x);

{API 1.2.1.1} You might find it strange that the 'better coding
practice' takes more code. However, you are unlikely to often use these
ways of creating and populating a tuple or list. There's a generic
function, Py_BuildValue(), that can create most common objects from C
values, directed by a format string. For example, the above two blocks
of code could be replaced by the following (which also takes care of
the error checking):

PyObject *t, *l;

t = Py_BuildValue("(iis)", 1, 2, "three");
l = Py_BuildValue("[iis]", 1, 2, "three");

{API 1.2.1.1} It is much more common to use PyObject_SetItem() and
friends with protected objects (ie, the reference count was incremented
before passing the item to you), a typical example being arguments that
were passed in to the function you are writing. In that case, their
behaviour regarding reference counts has a simpler appearance, since
you don't have to do anything at all with reference counts. For
example, this function sets all items of a list (actually, any mutable
sequence) to a given item:

int set_all(PyObject *target, PyObject *item)
{
int i, n;

n = PyObject_Length(target);
if (n < 0)
return -1;
for (i = 0; i < n; i++) {
if (PyObject_SetItem(target, i, item) < 0)
return -1;
}
return 0;
}

5. Two Examples

Example 1.

This is a pretty standard example of C code using the Python API.

PyObject*
MyFunction(void)
{
PyObject* temporary_list=NULL;
PyObject* return_this=NULL;

temporary_list = PyList_New(1);          /* Note 1 */
if (temporary_list == NULL)
return NULL;

return_this = PyList_New(1);             /* Note 1 */
if (return_this == NULL)
Py_DECREF(temporary_list);           /* Note 2 */
return NULL;
}

Py_DECREF(temporary_list);               /* Note 2 */
return return_this;
}

Note 1: The object returned by PyList_New has a reference count of 1.
Note 2: Since "temporary_list" should disappear when MyFunction exits,
it must be DECREFed before any return from the function. If a return
can be reached both before or after "temporary_list" is created, then
initialize "temporary_list" to NULL and use "Py_XDECREF()".

Example 2.

This is the same as Example 1 except PyTuple_GetItem() is used.

PyObject*
MyFunction(void)
{
PyObject* temporary=NULL;
PyObject* return_this=NULL;
PyObject* tup;
PyObject* num;
int err;

tup = PyTuple_New(2);
if (tup == NULL)
return NULL;

err = PyTuple_SetItem(tup, 0, PyInt_FromLong(222L));  /* Note 1 */
if (err) {
Py_DECREF(tup);
return NULL;
}
err = PyTuple_SetItem(tup, 1, PyInt_FromLong(333L));  /* Note 1 */
if (err) {
Py_DECREF(tup);
return NULL;
}

temporary = PyTuple_Getitem(tup, 0);          /* Note 2 */
if (temporary == NULL) {
Py_DECREF(tup);
return NULL;
}

return_this = PyTuple_Getitem(tup, 1);        /* Note 3 */
if (return_this == NULL) {
Py_DECREF(tup);
/* Note 3 */
return NULL;
}

/* Note 3 */
Py_DECREF(tup);
return return_this;
}

Note 1: If "PyTuple_SetItem" fails or if the tuple it created is
DECREFed to 0, then the object returned by "PyInt_FromLong" is
DECREFed.
Note 2: "PyTuple_Getitem" does not increment the reference count for
the object it returns.
Note 3: You have no responsibility for DECFREFing "temporary".