The OOM-Killer vs. Python
Jim Dennis
jimd at vega.starshine.org
Mon Mar 25 04:58:15 EST 2002
In article <Xns91DBED451517cliechtigmxnet at 62.2.16.82>, Chris Liechti wrote:
> gerson.kurz at t-online.de (Gerson Kurz) wrote in
> news:3c9e2b5f.9993062@news.t-online.de:
>> I have a python-based SMTP server (see
>> http://p-nand-q.com/shicks.htm)
>> running on our server, and in general it has worked flawless (since
>> about nov 2001). However, in the recent week that dreaded linux OOM
>> killer twice killed the python process. [The machine has 768mb ram
>> call me oldfashioned but that SHOULD be enough for both Linux &
>> Python to get along, really. OK, its running KDE, and has only 256mb
>> for swap, but still...]
> maybe you have some other leaking app and the OOM killer just picks
> the largest. if you want a reliable server i wouldn't use it as
> workstation too :-)
>> Anyway, even though I believe that this is more of a fault of the
>> Linux Kernel VM quality than the script (the system has been
>> running fine for months and now two kills in one week - that smells
>> fishy
About two years ago (March of 2000) there was a huge flamewar
on the Linux kernel mailing list (LKML) about a proposed set of
patches to allow users (admins) to disable "overcommit."
When your program uses malloc (implicitly done by your Python
processes, of course) then the Linux kernel will return success
even if there isn't physically enough memory+swap to guarantee
that the whole block of memory can be supplied. This is called
"overcommit" (and is common among UNIX and other general purpose
operating systems).
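
A minimal Python sketch of the effect, assuming a Linux-like system where an anonymous mmap stands in for a large malloc. The helper name reserve_then_touch is made up for illustration and is not from the original thread. The mapping call succeeds up front; real memory is only needed as pages are first written:

```python
import mmap

PAGE = mmap.PAGESIZE  # size of one virtual memory page on this system


def reserve_then_touch(size_bytes, touch_pages):
    """Map size_bytes of anonymous memory, then write to at most touch_pages pages.

    Returns the number of pages actually written.
    """
    # Anonymous mapping (fileno -1): the request is accepted before any
    # page of it has been used.
    region = mmap.mmap(-1, size_bytes)
    try:
        touched = 0
        limit = min(size_bytes, touch_pages * PAGE)
        for offset in range(0, limit, PAGE):
            # The first write to a page is the point where the kernel has
            # to come up with real memory for it.
            region[offset] = 1
            touched += 1
        return touched
    finally:
        region.close()


if __name__ == "__main__":
    # Ask for 64 pages but only ever use 4 of them.
    print(reserve_then_touch(64 * PAGE, 4))
```

The sizes here are deliberately tiny so the sketch is safe to run; the same pattern with a request larger than RAM+swap is what overcommit lets through.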
Searching Google's Linux pages on "overcommit" or "disable overcommit"
or even "disable overcommit" and "patch" will quickly bring up various
subsets of that discussion.
Unfortunately I don't know the current status of the Linux sysctls that
control the "overcommit" vm features. I'm cross posting this to c.o.l.d.s
(comp.os.linux.development.system) in hopes of getting a clue.
I see a /proc/sys/vm/overcommit_memory entry on my 2.4.9 kernel, which
might be the magic sysctl node. You might be able to echo -1 into that
node (virtual file) to disable overcommit. That should cause malloc()s
to fail when there isn't enough RAM+swap to satisfy a request.
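
Before writing anything to that node it is worth looking at what it currently holds. A rough Python sketch follows; the function names are hypothetical, and the meanings in MODES come from later kernel documentation rather than from anything confirmed for 2.4.9, so treat them as an assumption:

```python
# Assumption: these meanings are from later kernel documentation; the
# 2.4.9 kernel discussed above may not implement all of them.
MODES = {
    0: "heuristic overcommit",
    1: "always overcommit",
    2: "strict accounting (no overcommit)",
}


def parse_overcommit(raw):
    """Turn the text read from the sysctl node into an integer."""
    return int(raw.strip())


def read_overcommit_mode(path="/proc/sys/vm/overcommit_memory"):
    """Return the current setting, or None if the node cannot be read."""
    try:
        with open(path) as node:
            return parse_overcommit(node.read())
    except OSError:
        return None


if __name__ == "__main__":
    mode = read_overcommit_mode()
    if mode is None:
        print("no readable overcommit_memory node on this system")
    else:
        print(mode, MODES.get(mode, "meaning unknown on this kernel"))
```

Changing the value means writing to the same file as root, which is what the echo above would do.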
Note that this might not actually solve your problem. It should
prevent the OOM killer from becoming active (since you won't truly
be "out-of-memory") but might cause programs to abend (abnormally
terminate) when their malloc()s return NULL with errno set to ENOMEM.
If you have swap active it might also push the system into VM thrashing,
and the whole system might seem much worse because many of the
(formerly innocuous) malloc()s will be going unsatisfied.
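
On the Python side a failed allocation shows up as a MemoryError exception rather than a NULL pointer. A small sketch of handling it instead of abending, assuming a 64-bit CPython build; allocate_or_none is a made-up name for illustration:

```python
def allocate_or_none(nbytes):
    """Try to allocate a zero-filled buffer; return None if the request is refused."""
    try:
        return bytearray(nbytes)
    except MemoryError:
        # The allocator said no. A server could drop the current
        # message or connection here instead of dying.
        return None


if __name__ == "__main__":
    small = allocate_or_none(4096)
    print("small request:", "ok" if small is not None else "refused")

    # Assumption: 2**62 bytes is more than any 64-bit address space can
    # hold, so this request is refused outright on such a build.
    huge = allocate_or_none(1 << 62)
    print("huge request:", "ok" if huge is not None else "refused")
```

Catching the exception only helps if the rest of the program can carry on with less memory.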
Personally I think the various libraries, compilers and applications
which blindly allocate far more memory than they actually use are
really the heart of the whole overcommit problem. That might not be
Python (or it might be indirectly due to Python's use of some libraries
on your system; possibly it could be glibc and libm bloat).
Unfortunately memory is a shared resource, so it only takes one bad
appl. (so to speak) to ruin the whole basket for all of us.