[ python-Bugs-1565525 ] gc allowing tracebacks to eat up memory
SourceForge.net
noreply at sourceforge.net
Wed Sep 27 09:49:54 CEST 2006
Bugs item #1565525, was opened at 2006-09-26 08:58
Message generated for change (Comment added) made by loewis
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1565525&group_id=5470
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Python Interpreter Core
Group: Python 2.5
Status: Open
Resolution: None
Priority: 5
Submitted By: Greg Hazel (ghazel)
Assigned to: Nobody/Anonymous (nobody)
Summary: gc allowing tracebacks to eat up memory
Initial Comment:
Attached is a file which demonstrates an oddity about
traceback objects and the gc.
The observed behaviour is that when the tuple from
sys.exc_info() is stored on an object which is inside
the local scope, the object, and thus the exc_info tuple,
is not collected even when both leave scope.
If you run the test with "s.e = sys.exc_info()"
commented out, the observed memory footprint of the
process quickly approaches and sits at 5,677,056
bytes. Totally reasonable.
If you uncomment that line, the memory footprint
climbs to 283,316,224 bytes quite rapidly. That's a
difference of nearly two orders of magnitude!
If you uncomment the "gc.collect()" line, the process
still hits 148,910,080 bytes.
This was observed in production code, where exc_info
tuples are saved for re-raising later to get the stack-
appending behaviour tracebacks and 'raise' perform.
The example includes a large array to simulate
application state. I assume this is bad behaviour
occurring because the traceback object holds frames,
and those frames hold a reference to the local
objects, thus the exc_info tuple itself, thus causing
a circular reference which involves the entire stack.
Either the gc needs to be improved to prevent this
from growing so wildly, or the traceback objects need
to (optionally?) hold frames which do not have
references or have weak references instead.
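
The attached test file is not reproduced in this message, so the
following is only a minimal sketch of the cycle described above, written
for Python 3 rather than the Python 2.5 under discussion. The names
Holder, fail_and_store and run_once are hypothetical stand-ins, not
taken from the attachment. It checks, with the cyclic collector
disabled, that the object outlives its scope until gc.collect() runs:

```python
import gc
import sys
import weakref


class Holder:
    """Hypothetical stand-in for the object that stores the exc_info tuple."""


def fail_and_store(holder):
    try:
        raise ValueError("simulated failure")
    except ValueError:
        # The traceback references this frame, and this frame's local
        # `holder` references the tuple, which closes the loop:
        # holder -> tuple -> traceback -> frame -> holder
        holder.e = sys.exc_info()


def run_once():
    holder = Holder()
    fail_and_store(holder)
    # Only a weak reference escapes, so reference counting alone would
    # free `holder` here if it were not part of a cycle.
    return weakref.ref(holder)


gc.disable()  # rule out automatic collections while we observe
ref = run_once()
alive_before = ref() is not None  # still alive: kept by the cycle
gc.collect()                      # the cyclic collector reclaims it
alive_after = ref() is not None
gc.enable()
```

Under these assumptions the object is reclaimed only by the explicit
collection, which matches the report that the cycle is collectable but
is not freed as soon as the names leave scope.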
----------------------------------------------------------------------
>Comment By: Martin v. Löwis (loewis)
Date: 2006-09-27 09:49
Message:
Logged In: YES
user_id=21627
I'm still having problems figuring out what the bug is that
you are reporting. Ok, in this case, it consumes a lot of
memory. Why is that a bug?
----------------------------------------------------------------------
Comment By: Greg Hazel (ghazel)
Date: 2006-09-27 05:20
Message:
Logged In: YES
user_id=731668
I have read the exc_info suggestions before, but they have
never made any difference. Neither change you suggest
modifies the memory footprint behaviour in any way.
Weakrefs might be slow; I offered them as an alternative to
just removing the references entirely. I understand this
might cause problems with existing code, but the current
situation causes a problem which is more difficult to work
around. Code that needs locals and globals can explicitly
store a reference to them - it is impossible to dig into
the traceback object and remove those references.
The use-case of storing the exc_info is fairly simple, for
example:
Two threads. One queues a task for the other to complete.
That task fails and raises an exception. The exc_info is
caught, passed back to the first thread, and re-raised
from there. The goal is to get the whole execution
stack, which it does quite nicely, except that it has this
terrible memory side effect.
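
The use case above might look roughly like the following sketch. It is
an illustration in Python 3, not the production code, which is not
shown in this thread; task, worker, run_task_in_worker and main are
hypothetical names. In Python 2.5 the re-raise would have used the
three-argument raise statement; here the equivalent is
with_traceback():

```python
import queue
import sys
import threading
import traceback


def task():
    raise RuntimeError("task failed")


def worker(results):
    # Second thread: run the task and hand any failure back as exc_info.
    try:
        task()
    except Exception:
        results.put(sys.exc_info())


def run_task_in_worker():
    # First thread: queue the task, wait, then re-raise the failure here.
    results = queue.Queue()
    thread = threading.Thread(target=worker, args=(results,))
    thread.start()
    thread.join()
    exc_type, exc_value, exc_tb = results.get()
    # Re-raising with the saved traceback keeps the worker-side frames
    # and adds the frames of the re-raising thread in front of them.
    raise exc_value.with_traceback(exc_tb)


def main():
    try:
        run_task_in_worker()
    except RuntimeError as exc:
        # Function names in the combined traceback, outermost first.
        return [entry.name for entry in traceback.extract_tb(exc.__traceback__)]


names = main()
```

The combined traceback ends with the worker-side frames (worker, then
task), which is the "whole execution stack" effect described above.
The saved traceback is also what keeps those frames, and everything
their locals reference, alive until it is dropped.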
----------------------------------------------------------------------
Comment By: Tim Peters (tim_one)
Date: 2006-09-26 12:04
Message:
Logged In: YES
user_id=31435
Your memory bloat is mostly due to the
    d = range(100000)
line. Python has no problem collecting the cyclic trash,
but you're creating 100000 * 100 = 10 million integer
objects hanging off trash cycles before invoking
gc.collect(), and those integers require at least 10 million
* 12 ~= 120MB all by themselves. Worse, memory allocated to
"short" integers is both immortal and unbounded: it can be
reused for /other/ integer objects, but it never goes away.
Note that memory usage in your program remains low and
steady if you force gc.collect() after every call to bar().
Then you only create 100K integers, instead of 10M, before
the trash gets cleaned up.
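
A rough way to see this effect, assuming a simplified bar() rather than
the one in the attached file, is to count how many trash cycles are
still alive with and without a collection after every call. This is a
Python 3 sketch; Holder and peak_live_cycles are hypothetical names,
and the list d is far smaller than the one in the report:

```python
import gc
import sys
import weakref


class Holder:
    """Hypothetical stand-in for the object that stores exc_info."""


def bar(holder):
    d = list(range(1000))  # small stand-in for the large "application state"
    try:
        raise ValueError("simulated failure")
    except ValueError:
        # This frame, including `d`, is now kept alive by the traceback.
        holder.e = sys.exc_info()


def peak_live_cycles(iterations, collect_each_time):
    refs = []
    peak = 0
    for _ in range(iterations):
        holder = Holder()
        bar(holder)
        refs.append(weakref.ref(holder))
        del holder  # only the cycle keeps the object alive now
        if collect_each_time:
            gc.collect()
        live = sum(1 for r in refs if r() is not None)
        peak = max(peak, live)
    return peak


gc.disable()  # only explicit collections run during the measurement
without_collect = peak_live_cycles(20, collect_each_time=False)
with_collect = peak_live_cycles(20, collect_each_time=True)
gc.enable()
gc.collect()
```

With a collection after each call no trash cycle survives to the next
iteration, whereas without it every iteration's frame and its list stay
alive until some later collection.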
There is no simple-minded way to "repair" this, BTW. For
example, /of course/ a frame has to reference all its
locals, and moving to weak references for those instead
would be insanely inefficient (among other, and deeper,
problems).
Note that the library reference manual warns against storing
the result of exc_info() in a local variable (which you're
/effectively/ doing, since the formal parameter `s` is a
local variable within foo()), and suggests other approaches.
Sorry, but I really couldn't tell from your description why
you want to store this stuff in an instance attribute, so
can't guess whether another more-or-less obvious approach
would help.
For example, no cyclic trash is created if you add this
method to your class O:
    def get_traceback(self):
        self.e = sys.exc_info()
and inside foo() invoke:
    s.get_traceback()
instead of doing:
    s.e = sys.exc_info()
Is that unreasonable? Perhaps simpler is to define a
function like:
    def get_exc_info():
        return sys.exc_info()
and inside foo() do:
    s.e = get_exc_info()
No cyclic trash gets created that way either. These are the
kinds of things the manual has suggested doing for the last
10 years ;-)
----------------------------------------------------------------------