[Cython] Hooking tp_clear()

Stefan Behnel stefan_ml at behnel.de
Fri Sep 7 00:35:07 EDT 2018


Jeroen Demeyer wrote on 06.09.2018 at 22:54:
> Cython's __dealloc__ special method is meant to deal with cleaning up
> instances of cdef classes. However, this hooks tp_dealloc() and does not
> have meaningful access to Python attributes, since those might have been
> cleared by tp_clear().
> 
> I have a concrete use case where I want something like __dealloc__ but
> *before* Python attributes are cleared. So this really belongs in tp_clear().
> 
> Using a PyObject* attribute in the cdef class with manual reference
> counting is not a solution since this attribute could genuinely occur in a
> reference cycle.
> 
> So I would suggest to support a __clear__ special method, which would then
> be called both by tp_clear() and tp_dealloc(). It's important to note that
> this should be idempotent: it will be called at least once before Python
> attributes are cleared but it may also be called later.

Maybe you actually want "tp_finalize"?

https://www.python.org/dev/peps/pep-0442/

Cython moves "__del__" methods there in Py3.4+.
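
To illustrate what PEP 442 finalization buys you, here is a rough sketch in
plain Python rather than Cython, assuming CPython 3.4 or later. The names
"Node", "partner" and "log" are made up for the example. It only shows that
the finalizer of an object caught in a reference cycle still sees its Python
attributes, since finalizers run before the collector clears anything.

```python
import gc

log = []  # records what each finalizer could still see


class Node:
    """Hypothetical object that takes part in a reference cycle."""

    def __init__(self, name):
        self.name = name
        self.partner = None

    def __del__(self):
        # With PEP 442, this runs before the collector breaks the cycle,
        # so self.partner is still a fully usable object here.
        partner_name = self.partner.name if self.partner is not None else None
        log.append((self.name, partner_name))


def make_cycle():
    a = Node("a")
    b = Node("b")
    a.partner = b
    b.partner = a
    # a and b go out of scope here but keep each other alive.


make_cycle()
gc.collect()  # the cyclic GC finalizes both nodes, then clears them

# The order in which the two finalizers run is not guaranteed.
print(sorted(log))
```

Both finalizers observe their partner's name, which is the behaviour the
proposed __clear__ hook was meant to provide.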


> PS: I never really understood the technical difference between tp_clear()
> and tp_dealloc(). It seems to me that these serve a very similar purpose:
> why can't the garbage collector just call tp_dealloc()?

The problem is reference cycles, in which there definitely is a live
reference to the object *somewhere* else. Thus, the GC cannot simply
deallocate the object; it must try to delete the references instead. This
is what "tp_clear" is used for: it clears all references that an object
inside of a reference cycle has towards other objects (or at least those
that can participate in that cycle). This will (hopefully) trigger a
cascade of deallocations along the cycle. If that isn't enough, and there
is still a cycle, then the clearing needs to be repeated until all
references to the last object in the cycle are cleared.

AFAIR, tp_clear() is *only* called by the cyclic garbage collector and not
during normal refcounting deallocation. The GC process is: tp_traverse() to
detect cycles, tp_clear() to break them. tp_dealloc() is then only called
indirectly by the normal refcounting cleanup, not directly by the GC.
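
The split between the two cleanup paths can be observed from plain Python.
The following is only a sketch, assuming CPython's refcounting behaviour.
"Box" and "other" are hypothetical names, and weak references are used
merely to check whether an object has been freed.

```python
import gc
import weakref


class Box:
    """Hypothetical container; instances are tracked by the cyclic GC."""


gc.disable()  # keep the automatic collector out of the way for the demo

# Case 1: no cycle. Dropping the last reference frees the object at once,
# purely through refcounting; the cyclic GC is not involved.
plain = Box()
plain_ref = weakref.ref(plain)
del plain
plain_freed_by_refcount = plain_ref() is None

# Case 2: a two-object cycle. Each object holds a live reference to the
# other, so refcounting alone can never free them.
a, b = Box(), Box()
a.other, b.other = b, a
a_ref = weakref.ref(a)
del a, b
cycle_alive_before_gc = a_ref() is not None

# The collector detects the unreachable cycle and clears its references;
# the resulting refcount drops then deallocate the objects.
unreachable_found = gc.collect()
cycle_freed_after_gc = a_ref() is None

gc.enable()

print(plain_freed_by_refcount, cycle_alive_before_gc, cycle_freed_after_gc)
```

The exact number returned by gc.collect() depends on the Python version,
but it counts at least the two Box instances here.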

Stefan

