Placing limits on Python's memory use?

Warren Postma embed at NOSPAM.geocities.com
Wed Aug 30 10:08:08 EDT 2000


Here are a few questions for those knowledgeable in Python 1.5.2 source code
internals. I could really use some help limiting the memory that Python
holds on to.   Since Python is reference counted rather than
garbage-collected, it should be simple to avoid memory leaks, as long as
there are no refcounting bugs. I believe I've fixed any refcounting bugs
that would cause the system to release memory prematurely, but I may still
have orphans (leaked objects), or Python itself may consume an unbounded
amount of "working memory".

Here are my questions:


1. Suppose I were to, one at a time over 24 hours, store rows into a disk
database, using an auto-incrementing integer key to the database. Over 24
hours, the key value goes from 1 to over 200,000.  Would all 200,000
"integer objects" be created as needed and then freed, or are the integer
objects (200,000 of them) kept around for the duration of the Python
interpreter's lifetime?
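For what it's worth, in CPython the answer seems to be a bit of both: small integers are cached as shared singletons, and freed integer objects go onto a free list that the interpreter re-uses but (in 1.5.2, as far as I can tell from the source) never returns to the OS. A quick sketch of the caching side, observed on a modern CPython — an implementation detail, not a language guarantee:

```python
# CPython integer caching, an implementation detail observed on
# modern CPython; 1.5.2 kept freed int objects on a free list that
# was never returned to the OS.
a = 100
b = 100
# Small integers are shared singleton objects, so no new allocation.
print(a is b)           # True on CPython

# Larger integers are allocated as needed; int() forces a fresh
# object each call, avoiding compile-time constant sharing.
big1 = int("200000")
big2 = int("200000")
print(big1 is big2)     # False on CPython: two distinct objects
```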

2. Suppose over 200,000 unique string constants were handled — are any of
them freed, or are they kept around? I saw that a cache for commonly used
strings is created in the string object module, but I am unsure how this
cache is limited in size.
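The cache in question is the interned-string table, and its behaviour is easy to observe directly. If I read the 1.5.2 source correctly, interned strings there are immortal (never freed); the builtin is intern() in 1.5.2 and sys.intern() on modern Pythons. A small sketch using the modern spelling:

```python
import sys

# CPython interns identifiers automatically, plus anything passed to
# sys.intern(); interned strings live in one shared table, so equal
# interned strings are the same object.
s1 = sys.intern("some/long/config/key")
s2 = sys.intern("some/long/config/key")
print(s1 is s2)  # True: both names refer to the one interned copy
```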

3. Is there any way to have Python report how much memory its modules are
using?
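(Editorially hedged aside: 1.5.2 offers nothing built in for this, but on much later Pythons, 3.4 and up, the tracemalloc module can attribute heap allocations to the source lines that made them. A sketch, for comparison only:)

```python
import tracemalloc

# Start tracing, make some allocations, then ask how much traced
# heap is in use and which source lines allocated the most.
tracemalloc.start()
data = [str(i) for i in range(10000)]   # sample allocations to measure
current, peak = tracemalloc.get_traced_memory()
print("current bytes:", current, "peak bytes:", peak)

snapshot = tracemalloc.take_snapshot()
for stat in snapshot.statistics("lineno")[:3]:
    print(stat)                         # top allocation sites by line
tracemalloc.stop()
```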

4. Can you tell Python to "suck itself in" somehow, or place limits, by
recompiling Python, on how many "spare" bits of memory it hangs on to?


Some Details on What we're Doing:

We are now making heavy use of Python on our embedded system, which is a
PC-104 based embedded system. Our development targets have 32 MB RAM, and my
goal is a product that is functional at only 16 MB RAM if necessary; 32
MB systems will be used only when cost factors permit.  These are not
desktop PCs, and our embedded CPUs are neither cheap nor user-expandable.

So far we are well in line, using 16 MB of RAM or less in all cases,
except when the system makes heavy use of the Python interpreter. I believe
it should be possible to get Python running in under 10 MB of heap memory
for its own purposes, but I'm unsure how to go about it.  We have noticed
that our system consumes 8 MB of heap at boot-up, and under heavy use of
Python the heap usage will creep up to over 24 MB within 24 hours; had we
actually been running the field-configuration system at 16 MB, we would
have run out of memory.

I believe, although I'm still working to establish this, that the 10+ MB
of overnight growth in heap usage is primarily caused by the Python scripts
we're using. We're running a script which stores data into a database, and
which, we hope, does not keep all this information in memory.  However, a
lot of different objects (>200,000 tuples, lists, dictionaries, etc.) have
been created, and hopefully disposed of, during that time.  I suspect that
Python isn't really releasing many of those disposed objects; instead, it
seems to keep a cached list of freed objects to re-use, saving the
processing overhead of malloc'ing and free'ing them.  It seems to me a
better, or at least a "bounded", method should be possible for this type
of circumstance.

Any ideas?


Warren Postma





