[Python-Dev] suggestion for smarter garbage collection in function of size (gc.set_collect_mem_growth(2))

Andrea Arcangeli andrea at suse.de
Thu Dec 29 03:36:58 CET 2005


On Wed, Dec 28, 2005 at 07:14:32PM -0700, Neil Schemenauer wrote:
> [This message has also been posted.]
> Martin v. Löwis <martin at v.loewis.de> wrote:
> > One challenge is that PyObject_GC_Del doesn't know how large the memory
> > block is that is being released. So it is difficult to find out how
> > much memory is being released in the collection.
> 
> Another idea would be to add accounting to the PyMem_* interfaces.
> It could be that most memory is used by objects that are not tracked
> by the GC (e.g. strings).  I guess you still have the same problem
> in that PyMem_Free may not know how large the memory block is.

In ram_size (per my pseudocode) we have to account for everything that
can possibly be released by the gc through an invocation of a deep
gc.collect(). So if strings can't be freed by the gc (as a side effect
of releasing other objects), then we don't necessarily need to account
for them in the algorithm; otherwise we have to. I guess some strings
can be freed by the gc too, so it sounds like PyMem_ may be a more
correct hooking point.
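
To make the accounting concrete, here is a minimal Python sketch of a
growth-triggered collection policy. It is only a model under stated
assumptions, not the pseudocode from the earlier mail and not CPython
code: ram_size and the growth factor of 2 come from this thread, while
the class name, on_alloc/on_free hooks, baseline and floor are
hypothetical names invented for illustration.

```python
import gc


class GrowthTriggeredCollector:
    """Hypothetical model: collect when accounted memory grows by a factor."""

    def __init__(self, growth=2.0, floor=1024):
        self.growth = growth    # 2.0 means "collect when ram_size doubles"
        self.floor = floor      # assumed minimum baseline, avoids thrashing on tiny heaps
        self.ram_size = 0       # bytes currently accounted as releasable by the gc
        self.baseline = floor   # ram_size measured right after the last collection
        self.collections = 0    # how many times the policy triggered a collection

    def on_alloc(self, nbytes):
        # Called by the (hypothetical) allocation hook, e.g. at the PyMem_ level.
        self.ram_size += nbytes
        if self.ram_size > self.growth * self.baseline:
            self.collect()

    def on_free(self, nbytes):
        # Only possible if the free path knows the size of the block.
        self.ram_size -= nbytes

    def collect(self):
        gc.collect()  # deep collection
        self.collections += 1
        # Whatever survived becomes the new reference point for growth.
        self.baseline = max(self.ram_size, self.floor)
```

Note that on_free() is exactly the part that cannot be written today
without knowing how large the released block was.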

We definitely must know the size of the free operation. The sad thing is
that glibc knows it.

	size_t free_size(void *ptr); /* free ptr and return the size of the freed object */

An API like the above would be able to answer our question at very
little cost, but it requires changing glibc, and we'd need to make sure
it's really the more efficient way of doing it before considering it,
because I have some doubts at the moment (otherwise, why doesn't
something like the above already exist in glibc?). OTOH I guess not
many apps do their own garbage collection, and the ones that do may be
using their own allocators instead of the glibc one. This reminds me of
the pymalloc thing I've heard about over time. It should be able to
provide a pymalloc_free_size kind of thing returning the size of the
object freed; we could start with that, assuming it's more efficient
than doing the accounting in the upper layer.
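
As an illustration of the free_size() idea, here is a toy allocator
model in Python that remembers the size of every block so that freeing
can report it. This is a sketch only: it is not glibc or pymalloc code,
and the class, the integer handles and the side table are hypothetical
stand-ins for whatever bookkeeping a real allocator already keeps. (For
what it's worth, glibc does offer malloc_usable_size(), but that reports
the usable size of a live block, which may be larger than what was
requested, and it does not free anything.)

```python
class SizeTrackingAllocator:
    """Toy allocator model: records each block's size so free can return it."""

    def __init__(self):
        self._sizes = {}    # handle -> size in bytes (stand-in for allocator metadata)
        self._next = 1      # next handle to hand out (stand-in for a pointer)
        self.ram_size = 0   # total bytes currently allocated

    def malloc(self, nbytes):
        handle = self._next
        self._next += 1
        self._sizes[handle] = nbytes
        self.ram_size += nbytes
        return handle

    def free_size(self, handle):
        """Free the block and return the size of the freed block."""
        nbytes = self._sizes.pop(handle)  # raises KeyError on a double free
        self.ram_size -= nbytes
        return nbytes
```

With something like this in the allocator, the upper layer gets the
released size for free from each call, e.g.
ram_size -= allocator.free_size(p), instead of keeping its own table.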

PS. Your mail client looks broken in the way it handles CC ;)

