Limiting Python's Memory Use

wpostma at my-deja.com
Wed Aug 30 12:19:16 EDT 2000


Here are a few questions for those knowledgeable in the Python 1.5.2
source code internals. I could really use some help limiting the memory
that Python holds on to.  Since Python is reference counted and not
garbage-collected, it should be simple to avoid memory leaks, as long as
there are no refcounting bugs. I believe I've found the refcounting bugs
that would cause the system to release memory prematurely, but I may
still have orphans (leaked objects), or Python itself may consume an
unbounded amount of "working memory".

Here are my questions:


1. Suppose I were to, one at a time over 24 hours, store rows into a
disk database, using an auto-incrementing integer key. Over those 24
hours, the key value goes from 1 to over 200,000.  Would all 200,000
"integer objects" be created as needed and then freed, or do the
integer objects (all 200,000 of them) stick around for the duration of
the Python interpreter's lifetime?
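For what it's worth, part of this can be observed directly. In CPython,
a small range of integers (-5 through 256 in modern versions; a smaller
range in older ones) is cached as shared objects, while larger ints are
allocated on demand; in the 1.5/2.x series, freed int objects also went
onto a per-type free list that was never returned to the OS. A sketch of
the cache behavior, written for a current Python:

```python
# CPython shares one object for each cached small integer; int("...")
# is used so the comparison is not decided by constant folding.
a = int("256")
b = int("256")
print(a is b)    # True in CPython: both come from the small-int cache

c = int("100000")
d = int("100000")
print(c is d)    # False in CPython: larger ints are distinct objects
```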

2. Suppose over 200,000 unique string constants are handled: are any of
them freed, or are they kept around? I saw that a cache for commonly
used strings is created in the string object module, but I am unsure
how, or whether, that cache is limited in size.
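For comparison with later Pythons: CPython interns some strings
automatically (one-character strings, identifier-like constants), and
code can intern explicitly — via the intern() builtin in 1.5.2, or
sys.intern() in Python 3. To my knowledge the 1.5.2 interned-string
table was unbounded and never pruned. A sketch in modern Python:

```python
import sys

# explicit interning returns one canonical shared object per value
a = sys.intern("row_key_%d" % 12345)
b = sys.intern("row_key_%d" % 12345)
print(a is b)    # True: both names refer to the interned copy

# strings built at run time are otherwise distinct objects
c = "row_key_%d" % 12345
d = "row_key_%d" % 12345
print(c is d)    # False in CPython
```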

3. Is there any way to have Python report how much memory its modules
are using?
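As far as I know, Python 1.5.2 has no built-in accounting for this.
Later versions grew some tools; sys.getsizeof() (Python 2.6+) at least
lets you total the shallow sizes of a module's top-level objects, which
gives a rough lower bound:

```python
import sys

def module_shallow_size(mod):
    # shallow sizes only: the contents of containers are not followed,
    # so this undercounts, but it is cheap and needs no extra modules
    return sum(sys.getsizeof(v) for v in vars(mod).values())

print(module_shallow_size(sys))   # size in bytes (CPython-specific)
```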

4. Can you tell Python to "suck itself in" somehow, or place limits,
by recompiling Python, on how much "spare" memory it hangs on to?
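Short of patching the interpreter's free lists at the source level, the
operating system can at least bound the damage. On Unix the resource
module (present in 1.5.2) can cap the process's address space, so a
runaway interpreter gets a MemoryError instead of swallowing the
machine. A sketch — the 512 MB figure is an arbitrary stand-in:

```python
import resource

soft, hard = resource.getrlimit(resource.RLIMIT_AS)

# choose a cap (arbitrary figure here), never above the hard limit
cap = 512 * 1024 * 1024
if hard != resource.RLIM_INFINITY:
    cap = min(cap, hard)

# allocations past the cap now raise MemoryError rather than growing
resource.setrlimit(resource.RLIMIT_AS, (cap, hard))

refused = False
try:
    buf = bytearray(1024 * 1024 * 1024)   # ~1 GB, should exceed the cap
except MemoryError:
    refused = True

resource.setrlimit(resource.RLIMIT_AS, (soft, hard))   # restore
print(refused)
```

Note that RLIMIT_AS limits virtual address space and is honored on
Linux; some platforms largely ignore it.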


Some Details on What we're Doing:

We are now making heavy use of Python on our embedded system, a
PC-104 based platform. Our development targets have 32 MB of RAM,
and my goal is a product that is functional at only 16 MB if
necessary; 32 MB systems will be used only when cost factors permit.
These are not desktop PCs, and our embedded CPUs are neither cheap
nor user expandable.

So far we are well within our limits, using only 16 MB of RAM or less
in all cases, except when the system makes heavy use of the Python
interpreter. I believe it should be possible to get Python running and
using < 10 MB of heap memory for its own purposes, but I'm unsure how
to go about it.  We have noticed that our system consumes 8 MB of heap
at boot-up, and upon heavy use of Python, heap usage creeps up past
24 MB within 24 hours; had we actually been running the
field-configuration system at 16 MB, we would have run out of memory.
I believe, although I'm still working to establish this, that the
10+ MB of overnight growth in heap usage is primarily caused by the
Python scripts we're using. We're running a script which stores data
into a database and which, we hope, does not keep all this information
in memory.  However, a lot of different objects (>200,000 tuples,
lists, dictionaries, etc.) have been created, and hopefully disposed
of, during that time.  I suspect that Python isn't really releasing a
lot of those disposed objects. Instead, it seems to keep a cached list
of freed objects to re-use, saving the overhead of malloc'ing and
free'ing them.  It seems to me a better method, or at least a
"bounded" method, should be possible for this type of circumstance.
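That suspicion matches how CPython's allocator behaved at the time:
several built-in types kept per-type free lists, and freed blocks
generally went back to those lists or to the C allocator rather than to
the operating system. One way to watch this from outside is the
process's high-water mark, sketched here with the resource module (the
200,000 figure mirrors the overnight run):

```python
import resource

def peak_rss():
    # high-water mark of resident set size; units are platform
    # dependent (kilobytes on Linux, bytes on macOS)
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

before = peak_rss()
# build ~200,000 small container objects, as the overnight run does
data = [(i, [i], {i: i}) for i in range(200000)]
during = peak_rss()
del data
after = peak_rss()

# the high-water mark never falls: memory released by del goes back
# to the interpreter's free lists / C allocator, not to the OS
print(before, during, after)
```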

Any ideas?


Warren Postma
If you reply by email, use wpostma at ztr dot com







