memory use debugging

Joshua Rodman joshua_rodman at yahoo.com
Tue Jan 29 07:35:54 EST 2002


I must admit I'm being a little lazy here.

I have a program which, when run, balloons to enormous size.
Depending upon input, it can take anywhere from 2 megs, to 40 megs,
to exhausting swap space on my little machine at over 200 megs.

I _could_ rewrite the program in various ways in an attempt
to see where the memory is going.  That's where I'm lazy.

I don't want to.

My main path is a big recursive tree building structure where
about three different types of things are all being allocated
at once.  I expect various aspects of the system to be 
automatically deallocated as they are plucked out of lists
and as they fall out of scope.  Changing the program to
exercise only one part of the application will be 
somewhat difficult, and may not reveal the problem,
which might be an out-and-out bug somewhere.

I'd really love a tool that would collect all objects in
each stack frame and tell me how large they are.  I'd
love a tool that would tell me how much memory I'm using
by type.  I would love a tool that would tell me how
much memory I'm using by object.  If I could find out
how much the entries and keys separately are taking
up in my one gigantic hashtable of a dictionary,
that would be _fantastic_!
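
To make the wish concrete, here is a rough sketch of "memory by type" and
"keys versus entries of one big dictionary". It assumes a far newer
interpreter than the 2.0 mentioned in this post: sys.getsizeof only exists
in later Python releases, and the helper names and the sample table are
hypothetical. The sizes are shallow (the object itself, not what it points
to), so treat the numbers as a lower bound rather than a full accounting.

```python
import sys
from collections import defaultdict


def shallow_sizes_by_type(objects):
    """Sum sys.getsizeof() per type name for an iterable of objects."""
    totals = defaultdict(int)
    for obj in objects:
        totals[type(obj).__name__] += sys.getsizeof(obj)
    return dict(totals)


def dict_key_value_sizes(d):
    """Return (bytes in keys, bytes in values), shallow sizes only."""
    key_bytes = sum(sys.getsizeof(k) for k in d)
    value_bytes = sum(sys.getsizeof(v) for v in d.values())
    return key_bytes, value_bytes


# Hypothetical stand-in for the one gigantic dictionary.
table = {"node%d" % i: [i, i + 1] for i in range(1000)}

key_bytes, value_bytes = dict_key_value_sizes(table)
print("keys:", key_bytes, "values:", value_bytes,
      "table itself:", sys.getsizeof(table))
print(shallow_sizes_by_type(list(table) + list(table.values())))
```

The dictionary's own hash table is reported separately from its keys and
values, which is the split asked about above.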

But I really would settle for a function which returns the
size of a specific argument.
Or I could even scrape by with something that returned
the size of the python heap at the moment of the call.

Do such things exist?  I can't find them.
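
For the "size of the python heap at the moment of the call" case, a minimal
sketch follows. It assumes the tracemalloc module, which ships only with
much later Python 3 releases and is not available in 2.0. The traced_bytes
helper and the sample tree are hypothetical names for illustration.
tracemalloc only counts allocations made through Python's allocators after
start() is called, so this is not the whole process size that 'top' shows.

```python
import tracemalloc


def traced_bytes():
    """Bytes currently allocated by Python code since tracing began."""
    current, _peak = tracemalloc.get_traced_memory()
    return current


tracemalloc.start()

before = traced_bytes()
# Hypothetical stand-in for the real tree-building code.
tree = [[i] * 100 for i in range(1000)]
after = traced_bytes()
print("grew by", after - before, "bytes")

# Dropping the last reference should bring the traced figure back down.
del tree
print("after del:", traced_bytes() - before, "bytes")
```

Calling such a helper before and after each phase of the program gives a
crude per-phase delta without staring at another window.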

I read about pymalloc, though I'm not sure if it will show
me what I need.  It seems my memory use never really goes
down.  I don't know if python isn't calling free, or if
nothing ever gets deallocated (that sounds unlikely).
I suspect pymalloc is going to teach me all about python's
memory allocation strategies, but show very little of how
my pure-python program is leaking.  

More importantly pymalloc is off the air:
   TCP connection to 'starship.python.net' failed: Connection reset by peer. 

I tried telling Plumbo to look for circular references.  I didn't
read the code enough to be sure I was using it correctly, but the only
big structure I've got is a branching tree.  I don't think anything is
circular.

I'm not sure if python 2.2 has anything to offer here (I'm using 2.0 as 
shipped with my copy of Loonix.)  Is the gc debug stuff relevant, or is
that for use with debugging the gc implementation itself?
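
One way to test the "nothing is circular" belief with the gc module is
sketched below. It is written for a current Python 3 rather than 2.0, and
the Node class, build_tree, and count_unreachable are hypothetical names.
The sketch relies on gc.collect() returning the number of unreachable
objects it found: if that number is above zero after the tree is dropped,
something in the structure was only reclaimable by the cycle collector.

```python
import gc


class Node:
    """Hypothetical tree node with a parent back-reference (a cycle)."""

    def __init__(self, parent=None):
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)


def count_unreachable(build):
    """Run build(), drop its result, and report cyclic garbage left behind."""
    gc.collect()          # start from a clean slate
    gc.disable()          # keep automatic collection from running mid-build
    try:
        build()
        return gc.collect()   # number of unreachable objects found
    finally:
        gc.enable()


def build_tree():
    root = Node()
    for _ in range(10):
        Node(root)        # each child points back at root, forming a cycle


print(count_unreachable(build_tree))
```

A tree whose nodes keep a parent pointer is circular even though it looks
like a plain branching structure, which is the case this sketch is meant
to catch.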

Is python _really_ without any sort of rudimentary tools for looking
at memory usage of actual python code without writing 
   for i in range(10000):
      doSomething()
while staring at 'top' in another window?

There are a few old mails and so on in the newsgroup archives which 
ask this question that went unanswered.  I feel a bit sad for them.
And me.

But there is hope! 

I'm sure I'll get at least one very helpful response.  And I'll be
able to write up my experiences and tool behaviors in a nice document.
And I'll be able to post it -- Where?

Is there some python wiki or manual / faq with edit capability?
I think that form of discussion is often more useful than the 
drinking-from-the-fire-hose strategy we refer to as 'subscribing
to an email list'.  I'm likely to have to unsubscribe from
this one before tomorrow lest I exhaust my quota.  :-)

Hope it was, at least a little, amusing.

-josh





More information about the Python-list mailing list