[Python-Dev] Py_Finalize does not release all memory, not even closely

Tim Peters tim.peters at gmail.com
Sun Apr 16 01:00:27 CEST 2006


[Martin]
> Running Py_Initialize/Py_Finalize once leaves 2150 objects behind (on
> Linux). The second run adds 180 additional objects; each subsequent
> run appears to add 156 more.

One thing I notice is that they're all > 0 :-)  I believe that, at one
time, the second and subsequent numbers were 0, but maybe that's just
old age pining for the days when kids listened to good music.

Because new-style classes create cycles that Py_Finalize() doesn't
clean up, it may make analysis easier to stick a PyGC_Collect() call
(or two!  repeat until it returns 0) inside the loop now.
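
A Python-level sketch of the "repeat until it returns 0" idea, with
gc.collect() standing in for PyGC_Collect() in the embedding loop.  The
names collect_until_clean, max_rounds, Node and demo are made up for
illustration; nothing here is code from the core.

```python
import gc


def collect_until_clean(max_rounds=10):
    """Run gc.collect() until a pass reports no unreachable objects.

    Returns the total number of unreachable objects found.  max_rounds
    is an assumed safety cap so the loop always terminates.
    """
    total = 0
    for _ in range(max_rounds):
        found = gc.collect()
        if found == 0:
            break
        total += found
    return total


class Node:
    """Hypothetical object used only to build a reference cycle."""


def make_cycle():
    # Two objects that refer to each other; refcounting alone
    # cannot reclaim them once the local names go away.
    a, b = Node(), Node()
    a.other, b.other = b, a


def demo():
    was_enabled = gc.isenabled()
    gc.disable()              # keep automatic collections out of the way
    try:
        gc.collect()          # start from a clean slate
        make_cycle()
        return collect_until_clean()
    finally:
        if was_enabled:
            gc.enable()


if __name__ == "__main__":
    print("unreachable objects found:", demo())
```

The exact count depends on the interpreter version, so it is best read
as "more than zero" rather than compared against a fixed number.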

...

>> Not unless the module has a finalization function called by
>> Py_Finalize() that frees such things (like PyString_Fini and
>> PyInt_Fini).

> How should the module install such a function?

There is no way at present, short of editing the source for
Py_Finalize and recompiling.  Presumably this is something that should
be addressed in the module initialization/finalization PEP, right?  I
suppose people may want a way for Python modules to provide
finalization functions too.

>> I'm not clear on whether, e.g., init_socket() may get called more than
>> once if socket-slinging code appears in a Py_Initialize() ...
>> Py_Finalize().

> Module initialization functions are called each time. Py_Finalize
> "forgets" which modules had been loaded, and reloads them all.

OK, so things like the previous example (socketmodule.c's unconditional

	socket_gaierror = PyErr_NewException(...);

in init_socket()) are guaranteed to leak "the old" socket_gaierror
object across multiple socket module initializations.

... [later msg] ...

> With COUNT_ALLOCS, I get the following results: Ignoring the two initial
> rounds of init/fini, each subsequent init/fini pair puts this number
> of objects into garbage:
>
> builtin_function_or_method 9
> cell 1
> code 12
> dict 23
> function 12
> getset_descriptor 9
> instancemethod 7
> int 9
> list 6
> member_descriptor 23
> method_descriptor 2
> staticmethod 1
> str 86
> tuple 78
> type 14
> weakref 38
> wrapper_descriptor 30

FYI, from debugging Zope C-code leaks: staring at leftover strings,
dict keys, and tuple contents often helps identify the sources.  So
does staring at leftover type objects.
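
For poking at leftovers from Python rather than from a COUNT_ALLOCS
build, something like the following may help.  It is only a rough
analogue: gc.get_objects() lists objects tracked by the cycle collector,
so strings and ints never show up in it.  The helper names and the Leaky
class are assumptions for illustration.

```python
import gc
from collections import Counter


def type_counts():
    """Tally gc-tracked objects by type name."""
    return Counter(type(o).__name__ for o in gc.get_objects())


def growth(before, after):
    """Return the types whose count went up, mapped to the increase."""
    return {name: after[name] - before[name]
            for name in after if after[name] > before[name]}


class Leaky:
    """Hypothetical stand-in for whatever is being left behind."""


kept = []                      # holds references, simulating a leak
gc.collect()
before = type_counts()
kept.extend(Leaky() for _ in range(5))
after = type_counts()
delta = growth(before, after)

if __name__ == "__main__":
    for name, extra in sorted(delta.items()):
        print(name, extra)
```

Other type names can appear in the delta too (the snapshots themselves
are objects), so the interesting entries are the ones that keep growing
from round to round.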

> This totals to 360, which is for some reason higher than the numbers
> I get when counting the objects on the global list of objects.
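
A quick check of the arithmetic, with the per-type numbers copied from
the table above (the dict name per_round is just a label chosen here):

```python
# Objects left in garbage per init/fini round, as quoted above.
per_round = {
    "builtin_function_or_method": 9,
    "cell": 1,
    "code": 12,
    "dict": 23,
    "function": 12,
    "getset_descriptor": 9,
    "instancemethod": 7,
    "int": 9,
    "list": 6,
    "member_descriptor": 23,
    "method_descriptor": 2,
    "staticmethod": 1,
    "str": 86,
    "tuple": 78,
    "type": 14,
    "weakref": 38,
    "wrapper_descriptor": 30,
}

total = sum(per_round.values())

if __name__ == "__main__":
    print(total)  # 360
```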

How much higher?

Last time I looked at this stuff (2.3b1, I think), the "all" in "the
global list of all objects" wasn't true, and I made many changes to
the core at the time to make it "more true".  For example, all static
C singleton objects were missing from that list (from Py_None and
Py_True to all builtin static type objects).

As SpecialBuilds.txt says, it became true in 2.3 that "a static type
object T does appear in this list if at least one object of type T has
been created" when COUNT_ALLOCS is defined.  I expect that for _most_
static type objects, just doing initialize/finalize will not create
objects of that type, so they'll be missing from the global list of
objects.

It used to be a good check to sum ob_refcnt over all objects in the
global list, and compare that to _Py_RefTotal.  The difference was a
good clue about how many objects were missing from the global list,
and became a better clue after I added code to inc_count() to call
_Py_AddToAllObjects() whenever an incoming type object has tp_allocs
== 0.  Because of the latter, every type object with a non-zero
tp_allocs count is in the list of all objects now (but only when
COUNT_ALLOCS is defined).

It's possible that other static singletons (like Py_None) have been
defined since then without growing code to add them to the global
list, although by construction _PyBuiltin_Init() does add all those
available from __builtin__.

> Is it not right to obtain the number of live objects by computing
> tp->tp_allocs-tp->tp_frees?

That's the theory ;-)  This code is never tested, though, and I bet it's
rarely used.  Every time I've looked at it (most recently for 2.3b1,
about 3 years ago), it was easy to find bugs.  They creep in over
time.  Common historical weak points are "fancy" object-creation code,
and the possibility of resurrection in a destructor.  The special
builds need special stuff at such times, and exactly what's needed
isn't obvious.
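
A Python-level analogy for the resurrection problem, not the C code in
question: counters bumped in __init__ and __del__ play the role of
tp_allocs and tp_frees, and a destructor that stashes self somewhere
reachable makes "allocs - frees" undercount.  All names here are
hypothetical, and the sketch assumes CPython, where dropping the last
reference runs __del__ immediately.

```python
allocs = 0
frees = 0
graveyard = []   # hypothetical: anything reachable that can hold objects


class Counted:
    def __init__(self):
        global allocs
        allocs += 1

    def __del__(self):
        global frees
        frees += 1               # naive: assumes the object dies here
        graveyard.append(self)   # resurrection: it is reachable again


def live_by_counters():
    """The 'theory': live objects = allocations - frees."""
    return allocs - frees


obj = Counted()
del obj          # last reference gone, so __del__ runs on CPython

if __name__ == "__main__":
    print("counters say:", live_by_counters())
    print("actually alive:", len(graveyard))
```

The counters claim nothing is alive while one object plainly is, which
is the kind of mismatch a special build has to compensate for in its
own bookkeeping.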

