best way to discover this process's current memory usage, cross-platform?

Alex Martelli aleax at mail.comcast.net
Tue Nov 15 11:48:25 EST 2005


Neal Norwitz <nnorwitz at gmail.com> wrote:

> Alex Martelli wrote:
> >
> > So, I thought I'd turn to the "wisdom of crowds"... how would YOU guys
> > go about adding to your automated regression tests one that checks that
> > a certain memory leak has not recurred, as cross-platform as feasible?
> > In particular, how would you code _memsize() "cross-platformly"?  (I can
> > easily use C rather than Python if needed, adding it as an auxiliary
> > function for testing purposes to my existing extension).
> 
> If you are doing Unix, can you use getrusage(2)?

On Unix, I could; on Linux, nope.  According to man getrusage on Linux,

"""
The above struct was taken from BSD 4.3 Reno.  Not all fields are
meaningful under Linux.  Right now (Linux 2.4, 2.6) only the fields
ru_utime, ru_stime, ru_minflt, ru_majflt, and ru_nswap are
maintained.
"""
and indeed the memory-usage parts are zero.
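
(On Linux, by contrast, scraping /proc/self/status does seem to give
real numbers.  A minimal sketch of the idea -- the field names are per
proc(5), values are in kB, and memsize_linux is just my name for it:

def memsize_linux(field='VmSize:'):
    # best guess at this process's memory usage, in kB (Linux only);
    # pass field='VmRSS:' for the resident set size instead
    for line in open('/proc/self/status'):
        if line.startswith(field):
            return int(line.split()[1])
    return None

)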


> >>> import resource
> >>> r = resource.getrusage(resource.RUSAGE_SELF)
> >>> print r[2:5]
> 
> I get zeroes on my gentoo amd64 box.  Not sure why.  I thought maybe it
> was Python, but C gives the same results.

Yep -- at least, on Linux, this misbehavior is clearly documented in the
manpage; on Darwin, aka MacOSX, you _also_ get zeros but there is no
indication in the manpage leading you to expect that.

Unfortunately I don't have any "real Unix" box around -- only Linux and
Darwin... I could try booting up OpenBSD again to check that it works
there, but given that I know it doesn't work under the most widespread
unixoid systems, it wouldn't be much use anyway, sigh.


> Another possibility is to call sbrk(0) which should return the top of
> the heap.  You could then return this value and check it.  It requires
> a tiny C module, but should be easy and work on most unixes.  You can

As I said, I'm looking for leaks in a C-coded module, so it's no problem
to add some auxiliary C code to that module to help test it --
unfortunately, this approach doesn't work, see below...

> determine the direction the heap grows in by comparing it with id(0),
> which should have been allocated early in the interpreter's life.
> 
> I realize this isn't perfect as memory becomes fragmented, but might
> work.  Since 2.3 and beyond use pymalloc, fragmentation may not be much
> of an issue, since memory is allocated in a big hunk, then doled out
> as necessary.

But exactly because of that, sbrk(0) doesn't mean much.  Consider the
tiny extension which I've just uploaded to
http://www.aleax.it/Python/memtry.c -- it essentially exposes a type
that does a malloc when constructed and a free when freed, plus a
function sbrk0 which returns sbrk(0).  Here's what I see on my MacOSX
10.4, Python 2.4.1, gcc 4.1, with a little auxiliary memi.py module
that does:

from memtry import *
import os

def memsiz():
    # this process's virtual size ('vsz'), as reported by ps
    return int(os.popen('ps -p %d -o vsz|tail -1' % os.getpid()).read())

Helen:~/memtry alex$ python -ic 'import memi'
>>> memi.memsiz()
35824
>>> memi.sbrk0()
16809984
>>> a=memi.mem(999999)
>>> memi.sbrk0()
16809984
>>> memi.memsiz()
40900

See?  While the process's memory size grows as expected when the meg
gets allocated (memsiz jumps by 5000+ "units", from 35824 to 40900),
sbrk(0) just doesn't budge.

As the MacOSX "man sbrk" says,
"""
The brk and sbrk functions are historical curiosities left over from
earlier days before the advent of virtual memory management.
"""
and apparently it's now quite hard to make any USE of those quaint
oddities in the presence of any attempt, anywhere in any library
linked with the process, to do some "smart" memory allocation, &c.
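
(Incidentally, for anyone who wants to poke at sbrk(0) and malloc
without compiling anything: if you have Thomas Heller's ctypes around,
something roughly equivalent to memtry can be faked in pure Python.
A sketch only -- the names below are mine, and this is emphatically
NOT the real memtry.c:

import ctypes

libc = ctypes.CDLL(None)    # this process's own symbols (Unix only)
libc.malloc.restype = ctypes.c_void_p
libc.malloc.argtypes = [ctypes.c_size_t]
libc.free.argtypes = [ctypes.c_void_p]
libc.sbrk.restype = ctypes.c_void_p
libc.sbrk.argtypes = [ctypes.c_long]

def sbrk0():
    # current "top of heap", same idea as memtry.sbrk0()
    return libc.sbrk(0)

class mem(object):
    # malloc on construction, free on destruction, like memtry's type
    def __init__(self, nbytes):
        self._block = libc.malloc(nbytes)
    def __del__(self):
        libc.free(getattr(self, '_block', None))  # free(NULL) is harmless

)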

 
> These techniques could apply to Windows with some caveats.  If you are
> interested in Windows, see:
> http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnucmg/html/UCMGch09.asp
> 
> Can't think of anything fool-proof though.

Fool-proof is way beyond what I'm looking for now -- I'd settle for
"reasonably clean, works on Linux, Mac and Windows over 90% of the
time, and I can somehow detect when it isn't working" ;-)
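
(For the record, the Windows-side technique I'd start from, per that
MSDN chapter, is psapi.dll's GetProcessMemoryInfo.  An UNTESTED ctypes
sketch, since as I mention below I lack a Windows box to build on; the
name memsize_windows is just mine:

import ctypes

class PROCESS_MEMORY_COUNTERS(ctypes.Structure):
    _fields_ = [
        ('cb', ctypes.c_ulong),
        ('PageFaultCount', ctypes.c_ulong),
        ('PeakWorkingSetSize', ctypes.c_size_t),
        ('WorkingSetSize', ctypes.c_size_t),
        ('QuotaPeakPagedPoolUsage', ctypes.c_size_t),
        ('QuotaPagedPoolUsage', ctypes.c_size_t),
        ('QuotaPeakNonPagedPoolUsage', ctypes.c_size_t),
        ('QuotaNonPagedPoolUsage', ctypes.c_size_t),
        ('PagefileUsage', ctypes.c_size_t),
        ('PeakPagefileUsage', ctypes.c_size_t),
    ]

def memsize_windows():
    # WorkingSetSize of the current process, in bytes, or None
    pmc = PROCESS_MEMORY_COUNTERS()
    pmc.cb = ctypes.sizeof(pmc)
    ok = ctypes.windll.psapi.GetProcessMemoryInfo(
        ctypes.windll.kernel32.GetCurrentProcess(),
        ctypes.byref(pmc), pmc.cb)
    if not ok:
        return None
    return pmc.WorkingSetSize

)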


Since people DO need to keep an eye on their code's memory
consumption, I'm getting convinced that the major functional gap in
today's Python standard library is some minimal set of tools to help
with that task.  PySizer appears to be a start in the right direction
(although it may be at too early a stage to make sense for the
standard library of Python 2.5), but (unless I'm missing something
about it) it won't help with memory leaks not directly related to
Python.

Maybe we SHOULD have some function in sys that returns the best guess
at the current memory consumption of the whole process, implemented by
appropriate techniques on each platform -- right now, though, I'm
trying to find out what those appropriate techniques are on today's
most widespread unixoid systems, Linux and MacOSX.  (As I used to be a
Win32 API guru in a previous life, I'm confident I can find out about
_that_ platform by sweating enough blood on MSDN -- the problem here
is that I don't have any Windows machine with the appropriate
development system to build Python, so testing would be pretty hard,
but maybe I can interest somebody who DOES have such a setup... ;-)
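
Just to make the intent concrete, here's the rough shape such a
sys-level "best guess" function might take, stitched together from the
sketches above (all names mine, Linux/Mac/Windows only, kB as the
unit, very much a sketch rather than a proposal):

import os, sys

def memsize():
    # best guess at this process's total memory consumption, in kB
    if sys.platform.startswith('linux'):
        # scrape /proc, as sketched earlier
        for line in open('/proc/self/status'):
            if line.startswith('VmSize:'):
                return int(line.split()[1])
    elif sys.platform == 'darwin':
        # fall back on ps, as in memi.py above
        return int(os.popen('ps -p %d -o vsz|tail -1' % os.getpid()).read())
    elif sys.platform == 'win32':
        # the ctypes sketch above; GetProcessMemoryInfo gives bytes
        return memsize_windows() // 1024
    raise NotImplementedError('no memsize() for ' + sys.platform)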


Alex


