[Python-ideas] Preventing out of memory conditions

Max Moroz maxmoroz at gmail.com
Wed Jan 2 13:06:14 CET 2013


On Mon, Dec 31, 2012 at 7:22 PM, Gregory P. Smith <greg at krypto.org> wrote:

> Within CPython the way the C API is today it is too late by the time the
> code to raise a MemoryError has been called so capturing all places that
> could occur is not easy.
> Implementing this at the C level malloc later makes
> more sense. Have it dip into a reserved low memory pool to satisfy the
> current request and send the process a signal indicating it is running low.
> This approach would also work with C extension modules or an embedded
> Python.

Regarding the C malloc solution, wouldn't a callback be preferable to
a signal? If I understood you correctly, a signal implies that a
different thread will handle it. At any reasonable size of the
emergency memory pool, there will be situations when the next memory
allocation is greater than that size, leading to the very same
problem you described later in your message when you talked about the
disadvantage of polling. In addition, if the signal processing is a
bit slow (perhaps simply because the thread scheduler is slow to
switch), then by the time enough memory is released it may be too
late - the next memory allocation may have already come in. Unless
I'm missing something, the (synchronous) callback seems strictly
better than the (asynchronous) signal.

As to your main point that this functionality should live inside C
malloc rather than pymalloc, I agree, but only if the objective is to
provide all-purpose, highly general handling of a "low memory
condition". (I'm not sure malloc knows enough about the OS to define
"low memory condition" well; it's certain that pymalloc doesn't.)

But I was going for a more modest goal. Rather than be warned of a
pending MemoryError exception, a developer could simply be notified
via a callback when the total memory used by his app exceeds a
certain limit. pymalloc could very easily call back a designated
function when the next memory allocation would exceed this threshold.
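
Just to illustrate the shape of what I have in mind (the registration
function below is purely hypothetical - nothing like it exists
today):

    import sys

    # A stand-in for whatever memory the app can afford to give up.
    cache = {}

    def on_memory_threshold(used_bytes, limit_bytes):
        # Invoked synchronously by pymalloc, on the thread doing the
        # allocation, before the allocation that crosses the limit.
        cache.clear()

    # Hypothetical registration call, shown only to sketch the idea
    # of the proposed pymalloc hook.
    sys.set_memory_limit_callback(1 * 1024 ** 3, on_memory_threshold)

Because the callback runs on the allocating thread, the allocation
that triggered it cannot race ahead of the cleanup.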

In many real-life situations, it's not that hard to estimate how much
RAM the application should be allowed to consume. Sure, the developer
would need to learn a little about the platforms his app runs on, and
use OS-specific rules to set the memory limit, but that effort is
modest, and the payoff is huge. Not to mention, a developer with
particularly technically savvy end users could even skip this work
entirely by letting the end users set the memory limit per session.
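
For example, on POSIX systems the limit could be derived from the
amount of physical RAM with nothing but the stdlib (the 50% fraction
and the fallback below are arbitrary - this is just a sketch):

    import os

    def default_memory_limit(fraction=0.5):
        # Budget a fraction of physical RAM, using POSIX sysconf
        # values. Other platforms would need their own lookup
        # (e.g. GlobalMemoryStatusEx on Windows).
        try:
            page_size = os.sysconf("SC_PAGE_SIZE")
            phys_pages = os.sysconf("SC_PHYS_PAGES")
            return int(page_size * phys_pages * fraction)
        except (ValueError, OSError, AttributeError):
            # Fall back to a conservative fixed budget.
            return 512 * 1024 ** 2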

There is a huge advantage to the pymalloc solution (with a set memory
limit) over the C malloc solution (with a generic low memory
condition). On my system, I don't want the application to use
(almost) all the available memory before it starts to manage its
cache. In fact, by the time physical memory use approaches my total
physical RAM, the system slows down considerably as many other
applications get swapped to disk by the OS. With a set memory limit,
I have much more granular control over the memory used by the
application.

Of course, the set memory limit could also be implemented inside C
malloc rather than inside pymalloc. But that would require developers
to rewrite the C runtime's memory manager on every platform and then
recompile their Python with it. The changes to pymalloc, on the other
hand, would be relatively small.

> I'd expect this already exists but I haven't looked for one.

All I found is this comment in the XEmacs documentation about
vm-limit.c:
http://www.xemacs.org/Documentation/21.5/html/internals_17.html, but
I'm not sure whether it's an XEmacs feature or whether malloc itself
supports it.

> Having a thread polling memory use is not generally wise as that is polling
> rather than event driven and could easily miss low memory situations before
> it is too late and a failure has already happened (allocation demand can
> come in large spikes depending on the application).

Precisely. That's the problem with the best existing solutions (e.g.,
http://stackoverflow.com/a/7332782/336527).
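
For reference, that pattern boils down to something like the
following (a Unix-only sketch using the stdlib resource module; the
interval and the callback are placeholders). A large allocation spike
between two polls can exhaust memory before the watcher thread ever
runs:

    import resource
    import threading
    import time

    def watch_memory(limit_bytes, on_low_memory, interval=1.0):
        # Poll the process's peak RSS and fire a callback once it
        # passes the limit. Note ru_maxrss is in kilobytes on Linux
        # (bytes on macOS), and anything allocated between two polls
        # goes unnoticed until the next poll.
        def _poll():
            while True:
                usage = resource.getrusage(resource.RUSAGE_SELF)
                rss = usage.ru_maxrss * 1024
                if rss > limit_bytes:
                    on_low_memory(rss)
                time.sleep(interval)

        threading.Thread(target=_poll, daemon=True).start()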

> OSes running processes in constrained environments or ones where the
> resources available can be reduced by the OS later may already send their
> own warning signals prior to outright killing the process but that should
> not preclude an application being able to monitor and constrain itself on
> its own without needing the OS to do it.

I was thinking about a regular desktop OS, which certainly doesn't
warn the process sufficiently in advance. The MemoryError exception
basically tells the process that it's going to die soon and that
there's nothing it can do about it.

Max


