memory leak with large list??

Sat Jan 25 19:29:22 EST 2003

On Sat, 25 Jan 2003 07:41:42 -0500, Tim Peters <tim.one at comcast.net> wrote:

>[someone]
>>> Then I generate a large list (12 million floats), either using map,
>>> a list comprehension, or preallocating a list, then filling it using a
>>> for loop.
>
>[Terry Reedy]
>> 12 million different floats use 12 million * sizeof float object
>> bytes. Float object is at least 8 bytes for float + object overhead
>> (about 4 bytes?)
>
>Each object has at least a pointer to the object's type object, and a
>refcount, for at least 8 bytes of overhead.  So a float object consumes at
>least 16 bytes.
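As a rough check of the arithmetic above, here is a minimal sketch assuming the 32-bit layout under discussion (4-byte refcount, 4-byte type pointer, 8-byte double). The names float_obj32 and total_bytes are made up for illustration and are not taken from the Python sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 32-bit-style float object: refcount + type pointer + value.
   uint32_t stands in for a 4-byte pointer so the size does not depend on
   the pointer width of the machine compiling this sketch. */
typedef struct {
    int32_t  refcnt;    /* 4 bytes */
    uint32_t type_ptr;  /* 4 bytes, stand-in for a 32-bit ob_type pointer */
    double   value;     /* 8 bytes */
} float_obj32;

/* Total bytes consumed by n objects of the given per-object size. */
static unsigned long long total_bytes(unsigned long long n, size_t per_object)
{
    return n * (unsigned long long)per_object;
}
```

With 16 bytes per object, total_bytes(12000000, sizeof(float_obj32)) comes to 192,000,000 bytes, before counting the list's own array of pointers.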

Why do you need to duplicate the type pointer for all those? I.e., if
you allocated space in decent-size arrays of object representations without
a type pointer, reserved a header slot in front of each array, and
allocated the whole thing so that array headers are aligned on, say, 128k
addresses, then you could look up the same type info with just an address
mask and gain 4 bytes per float. Ditto for any homogeneous allocation arenas.
IWT you could allocate big blocks from a memory-mapped file area if there's
no other way to get aligned large blocks. Or you could allocate a really
large virtual space and sacrifice space at the front if necessary to start
arena allocations aligned as desired. Or if some platform is stubborn,
virtualize id as a composite of arena-selector and offset fields, and use
zero offset to get to the header and the type info. Anyway, you know what
I'm saying.
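The address-mask idea can be sketched roughly as follows. This is only an illustration under stated assumptions: arena_header, slim_float, arena_new, arena_alloc, SLOT_OFFSET and the type_info stand-in are hypothetical names, OB_TYPE here is the proposed macro rather than anything existing in Python, and C11 aligned_alloc is used as one way to get a 128k-aligned block.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define ARENA_SIZE  ((size_t)128 * 1024)            /* size and alignment; power of two */
#define ARENA_MASK  (~((uintptr_t)ARENA_SIZE - 1))  /* clears the offset bits */
#define SLOT_OFFSET ((size_t)64)                    /* space reserved for the header */

typedef struct type_info { const char *name; } type_info;  /* stand-in for a type object */

typedef struct {
    const type_info *type;  /* one type pointer shared by every slot in the arena */
    size_t           used;  /* slots handed out so far */
} arena_header;

/* Object representation without its own type pointer. Where double is
   8-byte aligned the compiler pads this back to 16 bytes, so the saving
   depends on the platform ABI. */
typedef struct {
    int32_t refcnt;
    double  value;
} slim_float;

/* Allocate one arena whose start address is a multiple of ARENA_SIZE. */
static arena_header *arena_new(const type_info *type)
{
    arena_header *a = aligned_alloc(ARENA_SIZE, ARENA_SIZE);
    if (a == NULL)
        return NULL;
    a->type = type;
    a->used = 0;
    return a;
}

/* Hand out the next unused slot, or NULL when the arena is full. */
static slim_float *arena_alloc(arena_header *a)
{
    size_t capacity = (ARENA_SIZE - SLOT_OFFSET) / sizeof(slim_float);
    slim_float *slots = (slim_float *)((char *)a + SLOT_OFFSET);
    if (a->used >= capacity)
        return NULL;
    return &slots[a->used++];
}

/* Recover the type from any object address by masking down to the header. */
#define OB_TYPE(op) \
    (((const arena_header *)((uintptr_t)(op) & ARENA_MASK))->type)
```

Every slot handed out by arena_alloc lies inside the same aligned 128k block, so OB_TYPE on any of them masks down to the same header and returns the type passed to arena_new.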

Of course you could have a space (it wouldn't be an array per se) with
current-style objects with individual type slots too, just by making the
arena header type info say "see object-specific type slot". It's just that
freeing/allocating space is more complicated.

I see from below[1] that allocation is already from a pool of type-specific
free lists for float, so IWT those lists could be linked within the
arrays just mentioned (you just can't use the type slot, because there wouldn't
be any individual ones, so maybe use the value slot instead).
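Linking the free list through the value slot might look something like the sketch below. It is not the actual float allocator: slot, pool, pool_init, float_alloc and this float_dealloc are hypothetical, and a fixed static array stands in for the slot area of one arena.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A slot with no per-object type pointer. The value slot doubles as the
   free-list link while the slot is not in use. */
typedef struct slot {
    int32_t refcnt;
    union {
        double       value;      /* valid while the object is live */
        struct slot *next_free;  /* valid while the slot is on the free list */
    } u;
} slot;

#define N_SLOTS 1024
static slot  pool[N_SLOTS];      /* stands in for the slot area of one arena */
static slot *free_list = NULL;

/* Thread every slot onto the free list through its value slot. */
static void pool_init(void)
{
    free_list = NULL;
    for (size_t i = 0; i < N_SLOTS; i++) {
        pool[i].u.next_free = free_list;
        free_list = &pool[i];
    }
}

/* Pop a slot from the free list and store the value in it. */
static slot *float_alloc(double v)
{
    slot *op = free_list;
    if (op == NULL)
        return NULL;
    free_list = op->u.next_free;  /* read the link before overwriting it */
    op->refcnt = 1;
    op->u.value = v;
    return op;
}

/* Push the slot back; the old value is overwritten by the link. */
static void float_dealloc(slot *op)
{
    op->u.next_free = free_list;
    free_list = op;
}
```

A slot returned with float_dealloc is the next one handed out by float_alloc, which is the same last-in, first-out reuse as the quoted deallocator, only without touching a type slot.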

[...]
>
>> Deleting frees mem for reuse by Python but does not necessarily return
>> memory to system for reuse by other processes.  (Exact behavior is
>> *very* system specific, according to Tim Peters' war stories.)
>
>It's worse in this case:  int and float objects come out of special internal
>type-specific "free lists", and there's no bound on how large those can get.
>Here's the deallocator for floats:
>
>static void
>float_dealloc(PyFloatObject *op)
>{
>	if (PyFloat_CheckExact(op)) {
>		op->ob_type = (struct _typeobject *)free_list;
>		free_list = op;
>	}
>	else
>		op->ob_type->tp_free((PyObject *)op);
>}
>
>IOW, memory once allocated for a float can never be reused for any other
>kind of object, and isn't returned to the platform C library until Python
>shuts down.  The list memory did get returned to the platform C library, and
>in this case it looks like the latter did return that chunk to the OS.  If
>the OP had allocated some other "large object" after allocating the list,
>chances are good that the platform C would not have returned the list memory
>to the OS.  Even then, it doesn't much matter -- the OS will simply page out
>that part of the address space if it's not used again.  The VM highwater
>mark doesn't have a primary effect on performance.
>
You'd have to change the above to use the ref count slot or the first 4 bytes
of the value slot instead of op->ob_type to link the free list though,
since ob_type would be coming from, e.g., OB_TYPE(op) with the appropriate
address masking etc in an OB_TYPE macro.

I wonder what the performance hit would be, given processor speeds and caching etc.
Or has this been tried and rejected for Python?

Regards,
Bengt Richter