Possible to set cpython heap size?

Chris Mellon arkanes at gmail.com
Thu Feb 22 14:58:18 EST 2007


On 22 Feb 2007 11:28:52 -0800, Andy Watson <aldcwatson at gmail.com> wrote:
> On Feb 22, 10:53 am, a bunch of folks wrote:
>
> > Memory is basically free.
>
> This is true if you are simply scanning a file into memory.  However,
> I'm storing the contents in some in-memory data structures and doing
> some data manipulation.   This is my speculation:
>
> Several small objects per scanned line get allocated, and then
> unreferenced.  If the heap is relatively small, GC has to do some work
> in order to make space for subsequent scan results.  At some point, it
> realises it cannot keep up and has to extend the heap.  At this point,
> VM and physical memory is committed, since it needs to be used.  And
> this keeps going on.  At some point, GC will take a good deal of time
> to compact the heap, since I am loading in so much data and creating
> a lot of smaller objects.
>
> If I could have a heap that is larger and does not need to be
> dynamically extended, then the Python GC could work more efficiently.
>

I haven't even looked at Python memory management internals since 2.3,
and not in detail then, so I'm sure someone will correct me in the
case that I am wrong.

However, I believe that this is almost exactly how CPython GC does not
work. CPython is refcounted with a generational GC for cycle
detection. There's a memory pool that is used for object allocation
(more than one, I think, for different types of objects) and those can
be extended but they are not, to my knowledge, compacted.
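
To make the refcount-versus-cycle-collector distinction concrete, here
is a minimal sketch, assuming a current CPython 3 (the names Node,
freed_by_refcount and freed_by_cycle_gc are made up for illustration).
It uses a weak reference as a probe to see when an object really goes
away:

```python
import gc
import weakref


class Node:
    """Plain object that can take part in a reference cycle."""

    def __init__(self):
        self.other = None


def freed_by_refcount():
    """Return True if an acyclic object dies as soon as its last reference goes."""
    gc.disable()  # rule out the cycle collector for this check
    try:
        obj = Node()
        probe = weakref.ref(obj)
        del obj  # refcount drops to zero; CPython frees it right here
        return probe() is None
    finally:
        gc.enable()


def freed_by_cycle_gc():
    """Return (alive_before_collect, alive_after_collect) for a two-object cycle."""
    gc.disable()  # keep an automatic collection from running mid-test
    try:
        a, b = Node(), Node()
        a.other, b.other = b, a  # a and b now keep each other alive
        probe = weakref.ref(a)
        del a, b  # refcounts stay above zero because of the cycle
        alive_before = probe() is not None
        gc.collect()  # an explicit collection still works while disabled
        alive_after = probe() is not None
        return alive_before, alive_after
    finally:
        gc.enable()
```

On CPython the first function should report that the object is gone
without any collector pass, and the second should show the cycle
surviving until gc.collect() runs. Other Python implementations may
behave differently.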

If you're creating the same small objects for each scanned line, and
especially if they are tuples or new-style objects with __slots__,
then the memory use for those objects should be more or less constant.
Your memory growth is probably related to the information you're
saving, not to your scanned objects, and since those are long-lived
objects I simply don't see how heap pre-allocation could be helpful
there.
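
One way to check this claim rather than take it on faith is to measure
it. The following is a rough sketch, assuming Python 3.4 or later for
the tracemalloc module; Record and scan are hypothetical stand-ins for
the real per-line objects and parsing loop, not anything from the
original program:

```python
import tracemalloc


class Record:
    """Hypothetical per-line object; __slots__ avoids a per-instance __dict__."""

    __slots__ = ("key", "value")

    def __init__(self, key, value):
        self.key = key
        self.value = value


def scan(n_lines, keep):
    """Build one Record per fake line; return bytes still allocated at the end."""
    kept = []
    tracemalloc.start()
    try:
        for i in range(n_lines):
            rec = Record(i, "payload %d" % i)
            if keep:
                kept.append(rec)  # long-lived: memory grows with n_lines
            # otherwise rec is dropped on the next iteration and its
            # memory is returned to the allocator for reuse
        current, _peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return current
```

Comparing scan(50000, keep=False) with scan(50000, keep=True) should
show the first staying small regardless of line count and the second
growing with the amount of data retained, which is the growth that a
larger initial heap would not avoid.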


