[Patches] Fix for memory leak in _PyResize_Tuple

Jeremy Hylton jeremy@alum.mit.edu
Sun, 16 Apr 2000 16:53:43 -0400 (EDT)


>>>>> "CGW" == Charles G Waldman <cgw@alum.mit.edu> writes:

  CGW> The basic problem is that the 3-tuples are being added to the
  CGW> cache but never picked up again, since _PyTuple_Resize doesn't
  CGW> make use of the free_tuples cache.  If you are resizing a
  CGW> 5-tuple to a 3-tuple and there is already a 3-tuple in
  CGW> free_tuples[3], instead of using this tuple, _PyTuple_Resize
  CGW> will realloc the 5-tuple to a 3-tuple.  It would more efficient
  CGW> to use the existing 3-tuple and cache the 5-tuple.

  CGW> By making _PyTuple_Resize aware of the free_tuples (just as
  CGW> PyTuple_New), we not only save a few calls to realloc, but also
  CGW> prevent this misbehavior whereby tuples are being added to the
  CGW> free_tuples list but never properly "recycled".

Question about this patch: Is it always more efficient to re-use a
tuple (and update all the ob_item pointers)?  Or are there cases where
it would be more efficient to realloc and avoid the cost of updating
the pointers?  

There's another problem with the current strategy for reusing tuples
that we might fix at the same time.  If a program creates a large
number of tuples, later releases them all, and never creates any more
tuples of the same size, that memory will live forever on the free
list.  If I have a program that creates a million 3-tuples at one
blow, that memory will never get freed.

It should be possible to fix this problem by keeping track of the size
of each freelist and never letting it exceed some maximum size.  If it
actually ends up being more efficient to realloc than copy, then this
patch would limit the amount of loss caused by oddball cases like
calling PyObject_Sequence on an instance that overstates its size.

Jeremy

Index: Objects/tupleobject.c
===================================================================
RCS file: /projects/cvsroot/python/dist/src/Objects/tupleobject.c,v
retrieving revision 2.30
diff -c -r2.30 tupleobject.c
*** tupleobject.c	2000/03/13 16:01:29	2.30
--- tupleobject.c	2000/04/16 20:51:36
***************
*** 42,47 ****
--- 42,49 ----
     tuple () of which at most one instance will be allocated.
  */
  static PyTupleObject *free_tuples[MAXSAVESIZE];
+ static int free_length[MAXSAVESIZE];
+ #define MAX_FREE_LENGTH 100
  #endif
  #ifdef COUNT_ALLOCS
  int fast_tuple_allocs;
***************
*** 71,76 ****
--- 73,79 ----
  	    (op = free_tuples[size]) != NULL)
  	{
  		free_tuples[size] = (PyTupleObject *) op->ob_item[0];
+ 		free_length[size]--;
  #ifdef COUNT_ALLOCS
  		fast_tuple_allocs++;
  #endif
***************
*** 178,186 ****
  		while (--i >= 0)
  			Py_XDECREF(op->ob_item[i]);
  #if MAXSAVESIZE > 0
! 		if (op->ob_size < MAXSAVESIZE) {
  			op->ob_item[0] = (PyObject *) free_tuples[op->ob_size];
  			free_tuples[op->ob_size] = op;
  			goto done; /* return */
  		}
  #endif
--- 181,195 ----
  		while (--i >= 0)
  			Py_XDECREF(op->ob_item[i]);
  #if MAXSAVESIZE > 0
! 		if ((op->ob_size < MAXSAVESIZE)
! 		    && (free_tuples[op->ob_size] != NULL
! 			&& free_length[op->ob_size] < MAX_FREE_LENGTH)) {
  			op->ob_item[0] = (PyObject *) free_tuples[op->ob_size];
  			free_tuples[op->ob_size] = op;
+ 			if (free_tuples[op->ob_size] == NULL)
+ 			    free_length[op->ob_size] = 1;
+ 			else
+ 			    free_length[op->ob_size]++;
  			goto done; /* return */
  		}
  #endif