"High water" Memory fragmentation still a thing?

Skip Montanaro skip.montanaro at gmail.com
Fri Oct 3 16:01:50 EDT 2014


On Fri, Oct 3, 2014 at 1:36 PM, Croepha <croepha at gmail.com>
wrote:

> Long running Python jobs that consume a lot of memory while
> running may not return that memory to the operating system
> until the process actually terminates, even if everything is
> garbage collected properly.

(I see Antoine replied that this is mostly fixed starting in
3.3.  My response below was written with my Unix/Python 2
rose-colored glasses firmly affixed to my face.)

Unless you are blessed with a long-running program which does
everything exactly right, I'd be surprised if any such program
behaved the way you want. Python may be extreme in this regard,
since it often allocates small objects, and its special object
allocator (obmalloc - is it still used?) might mitigate the
problem by collecting allocations of similar size together
(malloc implementations probably do some of this as well), but
it's still going to have these issues.

The problem boils down to how the program dynamically allocates
and frees memory, and how the malloc subsystem interacts with
the kernel through the brk and sbrk system calls.  (Anywhere I
mention "brk", you can mentally replace it with "sbrk". They do
the same thing - ask for memory from or return memory to the
kernel - just in different units, memory addresses or bytes.)

In the general case, programmers call malloc (or calloc, or
realloc) to allocate a chunk of storage from the heap.  (I'm
ignoring anything special which Python has layered on top of
malloc.  It can mitigate problems, but I don't think it will
fundamentally change the way malloc interacts with the kernel.)
The malloc subsystem maintains a free list (recently freed bits
of memory) from which it can satisfy requests without traipsing
off to the kernel.  If it can't satisfy a request from the free
list, it will (in the most simpleminded malloc implementation)
call brk to grab a new (large) chunk of memory.  The kernel
simply moves the program's "break", effectively increasing or
decreasing the (virtual) size of the running program.  That
memory is then doled out to the caller by malloc.

Only if every chunk of memory in the last region allocated by a
call to brk ends up back on malloc's free list, *and* the
particular malloc implementation on your box is smart enough to
coalesce adjacent chunks of freed memory back into brk-sized
regions, can brk be called once again to reduce the program's
footprint.
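
If you want to watch this happen, here's a minimal sketch
(Linux/glibc-specific, not portable - and a modern malloc may
use mmap rather than brk for large requests) that samples the
program break with sbrk(0) around a small allocation:

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int main(void)
    {
        void *before = sbrk(0);      /* current program break */
        void *p = malloc(1024);      /* small request, served from the heap */
        void *after = sbrk(0);

        /* The break typically jumps by much more than 1024 bytes:
           malloc grabs a large region from the kernel and doles it out. */
        printf("break moved %ld bytes\n",
               (long)((char *)after - (char *)before));

        free(p);
        /* Usually unchanged: the freed chunk just goes on the free list. */
        printf("break after free: moved %ld bytes\n",
               (long)((char *)sbrk(0) - (char *)before));
        return 0;
    }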

Example...  I don't know how much memory malloc requests from
brk, so let's make things easy, and make the following
assumptions:

* Assume malloc calls brk to allocate 1MB chunks.

* Assume the program only ever calls malloc(1024).

* Assume malloc's own overhead (free list, etc) is zero.

So, starting afresh, having no free memory, the first time we
call malloc(1024), it calls brk asking for 1MB, then returns its
caller a pointer to that 1024-byte chunk. Now call malloc(1024)
1023 more times.  We have used up that entire 1MB chunk of
memory.  Now free each of those chunks by calling free() 1024
times. We are left, once again, with a 1MB chunk of free memory.
It might be stitched together by malloc into one single block of
memory, or it might appear on the free list as 1024 chunks of
memory.  Should malloc return it to the system with a call to
brk?  Maybe.  Maybe not.  Maybe the malloc subsystem goes
through all the necessary bookkeeping to hand that memory back
to the system, only to find that the next thing the program does
is make another malloc(1024) call.  Whoops. Now you have to call
brk again.  And system calls are expensive.
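
Here's a sketch of that scenario (again Linux/glibc-specific,
and the exact numbers depend on your allocator): allocate the
1024 chunks, free them all, and see whether the break comes
back down.  Whether it does depends on the allocator's trim
policy - glibc, for instance, only trims when enough free space
accumulates at the top of the heap.

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    #define CHUNKS 1024

    int main(void)
    {
        void *p[CHUNKS];
        void *start = sbrk(0);

        for (int i = 0; i < CHUNKS; i++)
            p[i] = malloc(1024);
        printf("after 1024 mallocs: break is %ld bytes above start\n",
               (long)((char *)sbrk(0) - (char *)start));

        for (int i = 0; i < CHUNKS; i++)
            free(p[i]);
        /* The memory is all free again, but whether the break drops
           back depends on the allocator's trim policy and threshold. */
        printf("after 1024 frees  : break is %ld bytes above start\n",
               (long)((char *)sbrk(0) - (char *)start));
        return 0;
    }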

Now, consider a similar case. You make 1024 calls to
malloc(1024). Then you free all of them except the 512th one.
Now that 1MB region is almost entirely back on malloc's free
list, except for one small chunk in the middle.  The region is
broken into three fragments: two free fragments separated by
one chunk which is still in use.  You can't return that region
to the system.  Well, perhaps you could, but only the top half,
above the chunk still in use, and you'd face the same dilemma
as before: should you return it or not?
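
The same sketch, modified to hold on to the 512th chunk (the
KEEP index below is just for illustration), shows why the
allocator's hands are tied: nearly the whole megabyte is free,
but the one live chunk in the middle pins everything below it.

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    #define CHUNKS 1024
    #define KEEP   511          /* keep the 512th chunk (0-based index) */

    int main(void)
    {
        void *p[CHUNKS];
        void *start = sbrk(0);

        for (int i = 0; i < CHUNKS; i++)
            p[i] = malloc(1024);

        for (int i = 0; i < CHUNKS; i++)
            if (i != KEEP)
                free(p[i]);

        /* Nearly 1MB is free, but the live chunk in the middle means
           the break can come down at most to just above it. */
        printf("one chunk still live: break is %ld bytes above start\n",
               (long)((char *)sbrk(0) - (char *)start));

        free(p[KEEP]);
        return 0;
    }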

Final consideration.  Suppose your program makes 1023 calls to
malloc(1024) and frees them all, but sometime during that work,
a low-level library far from the programmer's view also calls
malloc to get a 1KB chunk.  Even if your program (e.g., the
Python runtime) was perfectly well-behaved and returned all of
the 1023 chunks of memory it had requested, it has no control
over this low-level library.  You're still stuck with a
fragmented region of memory, entirely free except for one live
allocation somewhere in the middle.

Long story short, there's only so much you can do to try to
return memory to the system.  I am not up-to-date on the latest
malloc intelligence, but it's a challenging enough problem that
it was a long time before any malloc implementation attempted to
automatically return memory pages to the system.
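
(For reference, glibc's malloc does provide one knob for this:
malloc_trim() asks the allocator to give free memory at the top
of the heap back to the kernel.  Whether it can actually help
depends on exactly the fragmentation issues described above.
A minimal, glibc-specific sketch:

    #include <malloc.h>     /* malloc_trim() - glibc extension */
    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        void *p[1024];
        for (int i = 0; i < 1024; i++)
            p[i] = malloc(1024);
        for (int i = 0; i < 1024; i++)
            free(p[i]);

        /* Returns 1 if some memory was actually released to the system. */
        int released = malloc_trim(0);
        printf("malloc_trim released memory: %s\n", released ? "yes" : "no");
        return 0;
    }
)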

Finally, as this is not really Python-specific, you might want
to do some reading on how malloc is implemented. It can be
pretty arcane, but I'm sure it would make for fascinating
cocktail party conversation. <wink>

Skip


