[Python-Dev] Bizarre new test failure

Guido van Rossum guido@python.org
Sat, 08 Jun 2002 22:10:44 -0400


Offline, Jeremy, Neil, Tim & I figured out what was really the case
here.  I believe the last thing we reported here was that we'd found a
sample program that required several gc.collect() calls to clean out
all its garbage, which surprised Neil and Tim.  Since they know the
collector inside out, if it surprises them, there's probably a bug.
Here's the analysis.

A new-style instance increfs its class (its ob_type), to make sure
that the class doesn't go away as long as the instance exists.  But
the tp_traverse handler for new-style instances didn't visit the
ob_type reference, so the collector thinks there are "outside"
references to the class, and doesn't collect it.  When the last
instance goes away, the refcnt to the class finally matches what the
collector sees, and it can collect the class (once all other
references to it are gone of course).

I was tempted to say "this ain't so bad, let's keep it that way."  But
Jeremy and Tim came up with counterexamples involving a cycle between
a class and its instance, where the cycle would never be broken.  Tim
found that this program grows without bounds, though ever slower,
since it spends more and more time in the 2nd generation collector,
where all the uncollected objects eventually end up:

  while 1:
      class A(object): pass
      A.a = A()

I tried the simplest possible fix, which was to visit self->ob_type in
the new-style instance tp_traverse handler (subtype_traverse() in
typeobject.c).  But this caused assertions to fail all over the place.
It turns out that when the collector decides to break a cycle like
this, it calls the tp_clear handler for each object in the cycle, and
then the subsequent deletion of the instance references the type in
ways that have been made invalid by the clearing of the type.  So this
was a dead end.

So here's a patch that does do the right thing.  It adds a new field
to type objects, tp_dependents.  This is essentially a second
reference count, counting the instances and direct subclasses.  As
long as tp_dependents is nonzero, this means there are still instances
or subclasses, and then the type's tp_clear handler doesn't do
anything.

A consequence of the patch is that there will always be examples that
take more than one collection to clean their garbage -- but eventually
all garbage will be cleared out.  (I suppose a worst-case example
would be a very long chain of subclasses, which would be cleared out
once class per collection.)  Consequently, I'm patching test_gc.py to
get rid of garbage left behind by previous tests in a loop.

A downside of the patch is that it adds a new field to the type object
structure; I believe this prevents it from being a backport candidate
to 2.2.2.  For a half blissful hour I believed that it would be
possible to do something much simpler by doubling the regular refcnt;
then I realized that the double refcnt would mean the collector would
never break the cycle.  Still, I am wishing for a solution that avoids
adding a new field.

Please comment on this patch!

Index: Include/object.h
===================================================================
RCS file: /cvsroot/python/python/dist/src/Include/object.h,v
retrieving revision 2.101
diff -c -r2.101 object.h
*** Include/object.h	12 Apr 2002 01:57:06 -0000	2.101
--- Include/object.h	8 Jun 2002 13:09:14 -0000
***************
*** 292,297 ****
--- 292,298 ----
  	PyObject *tp_cache;
  	PyObject *tp_subclasses;
  	PyObject *tp_weaklist;
+ 	int tp_dependents;
  
  #ifdef COUNT_ALLOCS
  	/* these must be last and never explicitly initialized */
Index: Objects/typeobject.c
===================================================================
RCS file: /cvsroot/python/python/dist/src/Objects/typeobject.c,v
retrieving revision 2.148
diff -c -r2.148 typeobject.c
*** Objects/typeobject.c	4 Jun 2002 19:52:53 -0000	2.148
--- Objects/typeobject.c	8 Jun 2002 13:09:15 -0000
***************
*** 218,225 ****
  
  	memset(obj, '\0', size);
  
! 	if (type->tp_flags & Py_TPFLAGS_HEAPTYPE)
  		Py_INCREF(type);
  
  	if (type->tp_itemsize == 0)
  		PyObject_INIT(obj, type);
--- 218,227 ----
  
  	memset(obj, '\0', size);
  
! 	if (type->tp_flags & Py_TPFLAGS_HEAPTYPE) {
  		Py_INCREF(type);
+ 		type->tp_dependents++;
+ 	}
  
  	if (type->tp_itemsize == 0)
  		PyObject_INIT(obj, type);
***************
*** 290,295 ****
--- 292,303 ----
  		}
  	}
  
+ 	if (type->tp_flags & Py_TPFLAGS_HEAPTYPE) {
+ 		int err = visit((PyObject *)type, arg);
+ 		if (err)
+ 			return err;
+ 	}
+ 
  	if (basetraverse)
  		return basetraverse(self, visit, arg);
  	return 0;
***************
*** 464,469 ****
--- 472,478 ----
  	/* Can't reference self beyond this point */
  	if (type->tp_flags & Py_TPFLAGS_HEAPTYPE) {
  		Py_DECREF(type);
+ 		type->tp_dependents--;
  	}
  }
  
***************
*** 1170,1175 ****
--- 1179,1192 ----
  	Py_INCREF(base);
  	type->tp_base = base;
  
+ 	/* Incref the bases' tp_dependents count */
+ 	for (i = 0; i < nbases; i++) {
+ 		PyTypeObject *b;
+ 		b = (PyTypeObject *)PyTuple_GET_ITEM(bases, i);
+ 		if (PyType_Check(b) && (b->tp_flags & Py_TPFLAGS_HEAPTYPE))
+ 			b->tp_dependents++;
+ 	}
+ 
  	/* Initialize tp_dict from passed-in dict */
  	type->tp_dict = dict = PyDict_Copy(dict);
  	if (dict == NULL) {
***************
*** 1431,1441 ****
--- 1448,1475 ----
  static void
  type_dealloc(PyTypeObject *type)
  {
+ 	PyObject *bases;
  	etype *et;
  
  	/* Assert this is a heap-allocated type object */
  	assert(type->tp_flags & Py_TPFLAGS_HEAPTYPE);
  	_PyObject_GC_UNTRACK(type);
+ 
+ 	/* Decref the bases' tp_dependents count */
+ 	bases = type->tp_bases;
+ 	if (bases) {
+ 		int i, nbases;
+ 		assert(PyTuple_Check(bases));
+ 		nbases = PyTuple_GET_SIZE(bases);
+ 		for (i = 0; i < nbases; i++) {
+ 			PyTypeObject *b;
+ 			b = (PyTypeObject *)PyTuple_GET_ITEM(bases, i);
+ 			if (PyType_Check(b) &&
+ 			    (b->tp_flags & Py_TPFLAGS_HEAPTYPE))
+ 				b->tp_dependents--;
+ 		}
+ 	}
+ 
  	PyObject_ClearWeakRefs((PyObject *)type);
  	et = (etype *)type;
  	Py_XDECREF(type->tp_base);
***************
*** 1495,1502 ****
  	etype *et;
  	int err;
  
! 	if (!(type->tp_flags & Py_TPFLAGS_HEAPTYPE))
! 		return 0;
  
  	et = (etype *)type;
  
--- 1529,1535 ----
  	etype *et;
  	int err;
  
! 	assert(type->tp_flags & Py_TPFLAGS_HEAPTYPE);
  
  	et = (etype *)type;
  
***************
*** 1524,1533 ****
  type_clear(PyTypeObject *type)
  {
  	etype *et;
! 	PyObject *tmp;
  
! 	if (!(type->tp_flags & Py_TPFLAGS_HEAPTYPE))
! 		return 0;
  
  	et = (etype *)type;
  
--- 1557,1583 ----
  type_clear(PyTypeObject *type)
  {
  	etype *et;
! 	PyObject *tmp, *bases;
  
! 	assert(type->tp_flags & Py_TPFLAGS_HEAPTYPE);
! 
! 	if (type->tp_dependents)
! 		return 0; /* Not yet, there are still instances */
! 
! 	/* Decref the bases' tp_dependents count */
! 	bases = type->tp_bases;
! 	if (bases) {
! 		int i, nbases;
! 		assert(PyTuple_Check(bases));
! 		nbases = PyTuple_GET_SIZE(bases);
! 		for (i = 0; i < nbases; i++) {
! 			PyTypeObject *b;
! 			b = (PyTypeObject *)PyTuple_GET_ITEM(bases, i);
! 			if (PyType_Check(b) &&
! 			    (b->tp_flags & Py_TPFLAGS_HEAPTYPE))
! 				b->tp_dependents--;
! 		}
! 	}
  
  	et = (etype *)type;
  
***************
*** 1754,1763 ****
--- 1804,1815 ----
  	}
  	if (new->tp_flags & Py_TPFLAGS_HEAPTYPE) {
  		Py_INCREF(new);
+ 		new->tp_dependents++;
  	}
  	self->ob_type = new;
  	if (old->tp_flags & Py_TPFLAGS_HEAPTYPE) {
  		Py_DECREF(old);
+ 		old->tp_dependents--;
  	}
  	return 0;
  }
Index: Lib/test/test_gc.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/Lib/test/test_gc.py,v
retrieving revision 1.14
diff -c -r1.14 test_gc.py
*** Lib/test/test_gc.py	28 Mar 2002 21:22:25 -0000	1.14
--- Lib/test/test_gc.py	8 Jun 2002 13:09:15 -0000
***************
*** 220,228 ****
  def test():
      if verbose:
          print "disabling automatic collection"
      enabled = gc.isenabled()
      gc.disable()
!     verify(not gc.isenabled() )
      debug = gc.get_debug()
      gc.set_debug(debug & ~gc.DEBUG_LEAK) # this test is supposed to leak
  
--- 220,229 ----
  def test():
      if verbose:
          print "disabling automatic collection"
+     while gc.collect(): pass # collect garbage from previous tests
      enabled = gc.isenabled()
      gc.disable()
!     verify(not gc.isenabled())
      debug = gc.get_debug()
      gc.set_debug(debug & ~gc.DEBUG_LEAK) # this test is supposed to leak
  


--Guido van Rossum (home page: http://www.python.org/~guido/)