[Tutor] garbage collecting
Steven D'Aprano
steve at pearwood.info
Wed Jan 8 12:18:22 CET 2014
On Tue, Jan 07, 2014 at 11:41:53PM -0500, Keith Winston wrote:
> Iirc, Python periodically cleans memory of bits & pieces that are no longer
> being used. I periodically do something stupid -- I mean experimental --
> and end up with a semi-locked up system. Sometimes it comes back,
> sometimes everything after that point runs very slowly, etc.
You can control the garbage collector with the gc module, but you
shouldn't have to. It runs automatically: whenever your code is done
with an object, the garbage collector reclaims the memory used. In
particular, the garbage collector is not the solution to your problem.
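To see that reclamation really is automatic, here is a minimal sketch. It
assumes CPython, where reference counting frees most objects the moment the
last reference goes away and the gc module only has to deal with reference
cycles. The Node class and the demo_reclamation function are made-up names
for illustration.

```python
import gc
import weakref


class Node:
    """Tiny placeholder class; its instances support weak references."""


def demo_reclamation():
    # Case 1: an ordinary object. On CPython it is freed as soon as the
    # last reference disappears, without any help from the gc module.
    obj = Node()
    watcher = weakref.ref(obj)
    del obj
    freed_without_gc = watcher() is None

    # Case 2: an object that refers to itself (a reference cycle).
    # The cyclic collector is switched off temporarily so the effect of an
    # explicit gc.collect() is visible.
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        loop = Node()
        loop.me = loop
        watcher = weakref.ref(loop)
        del loop
        cycle_survives_del = watcher() is not None
        gc.collect()  # explicit collection breaks the cycle
        cycle_freed_by_collect = watcher() is None
    finally:
        if was_enabled:
            gc.enable()

    return freed_without_gc, cycle_survives_del, cycle_freed_by_collect


if __name__ == "__main__":
    print(demo_reclamation())
```

In normal use the collector is enabled and deals with such cycles on its own
schedule, so calling gc.collect() by hand is rarely needed.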
As Mark says, the symptoms you describe suggest you have run out of
memory and are hammering the swap space. This will slow everything
down by a lot. Swap on a spinning hard disk is many *thousands* of times
slower than real memory (RAM).
None of this is specific to Python, but here's a quick (and simplified)
explanation, as best I understand it myself. Modern operating systems,
which include Windows, Linux and OS X, have a concept of "virtual
memory", which you can consider to be a great big file on your hard
disk. (To be pedantic, it's a single file in Windows, usually a separate
partition on Linux, and I don't know about OS X.) When an application
requests some memory, if there is not enough memory available, the OS
will grab some chunk of memory which is not in use (we'll call it "B"),
copy it to the swap file (the virtual memory), then free it up for the
application to use. Then when the app tries to use memory B again,
the OS sees that it's in swap, grabs another chunk of real memory, copies
it to swap, then moves memory B back into RAM.
So think of virtual memory (swap space) as a great big cheap but
astonishingly slow storage area. When you run out of space in RAM, stuff
gets copied to swap. When you try to use that stuff in swap, something
else has to get moved to swap first, to free up space in RAM to move the
first chunk back into RAM. The OS is always doing this, and you hardly
ever notice.
Except, when an app asks for a lot of memory, or lots and lots and lots
of apps ask for a little bit all at the same time, you can end up with a
situation where the OS is spending nearly all its time just copying
stuff from RAM to swap and back again. So much time is being spent doing
this that the application itself hardly gets any time to run, and so
everything slows down to a crawl. You can often actually *hear this
happening*, as the hard drive goes nuts jumping backwards and forwards
copying data to and from swap. This is called "thrashing", and it isn't
fun for anyone.
The only cure for a thrashing system is to sit back and wait, possibly
shut down a few applications (which may temporarily make the thrashing
worse!) and eventually the whole thing will settle down again, in five
minutes or five days.
I'm not exaggerating -- I once made a mistake with a list, ran
something stupid like mylist = [None]*(10**9), and my system was
thrashing so hard the operating system stopped responding. I left it for
16 hours and it was still thrashing, so I shut the power off. That's a
fairly extreme example, though. Normally a thrashing system won't be
*that* unresponsive; it will just be slow.
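To put a rough number on that mistake, here is a small sketch that estimates
how much memory a list of a given length needs before you build it. It
measures a smaller sample list with sys.getsizeof and scales up. The function
name estimate_list_bytes and the sample size are my own choices, and the
figure only covers the list's own slots, not the objects they point to.

```python
import sys


def estimate_list_bytes(length, sample=10**6):
    """Estimate the size in bytes of a list with `length` slots.

    Builds a much smaller list, measures the cost per slot, and
    extrapolates. Only the list itself is counted, not its elements.
    """
    empty_size = sys.getsizeof([])
    sample_size = sys.getsizeof([None] * sample)
    per_slot = (sample_size - empty_size) / sample
    return empty_size + per_slot * length


if __name__ == "__main__":
    needed = estimate_list_bytes(10**9)
    print("about %.1f GB" % (needed / 1e9))
```

On a 64-bit CPython each slot is an 8-byte pointer, so [None]*(10**9) needs
roughly 8 GB for the list alone, more than many machines have in RAM.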
If you are using Linux, you can use the ulimit command to limit how much
memory the OS will allow an application to use before it refuses further
requests. The -v value is in kilobytes, so for example, with a cap of
about 10 MB:
[steve@ando ~]$ ulimit -v 10000
[steve@ando ~]$ python
Python 2.7.2 (default, May 18 2012, 18:25:10)
[GCC 4.1.2 20080704 (Red Hat 4.1.2-52)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> l = range(200000)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
MemoryError
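Something similar can be done from inside Python with the standard resource
module. The sketch below assumes Linux, where RLIMIT_AS is enforced. So that
the interpreter you are working in stays unrestricted, it starts a child
Python process, lowers the limit there, and attempts one allocation. The
helper name try_allocation and the sizes are made up for illustration.

```python
import subprocess
import sys

# Script run in the child process: cap the address space, then try to
# allocate a block of the requested size.
CHILD_SCRIPT = """
import resource
_, hard = resource.getrlimit(resource.RLIMIT_AS)
resource.setrlimit(resource.RLIMIT_AS, ({limit}, hard))
try:
    data = bytearray({request})
    print("allocated")
except MemoryError:
    print("MemoryError")
"""


def try_allocation(limit_bytes, request_bytes):
    """Return what the child reports when it tries to allocate
    `request_bytes` under an address-space cap of `limit_bytes`."""
    script = CHILD_SCRIPT.format(limit=limit_bytes, request=request_bytes)
    result = subprocess.run(
        [sys.executable, "-c", script],
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Cap the child at 1 GiB, then ask for 4 GiB.
    print(try_allocation(1 << 30, 4 << 30))
```

With the cap in place the oversized request fails quickly with MemoryError
instead of dragging the whole machine into swap.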
> I just saw where I could do os.system('python'), but in restarting the
> interpreter I'd lose everything currently loaded: my real question involves
> merely pushing the garbage collector into action, I think.
As Alan has said, os.system('python') doesn't restart the interpreter,
it pauses it and starts up a new one.
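A quick way to convince yourself of this is to compare process IDs. This
sketch uses subprocess.run rather than os.system, purely so the child's
output can be captured. Either way the parent waits while a separate
interpreter runs. The function name parent_and_child_pids is made up.

```python
import os
import subprocess
import sys


def parent_and_child_pids():
    """Return (pid of this interpreter, pid of a freshly started one)."""
    result = subprocess.run(
        [sys.executable, "-c", "import os; print(os.getpid())"],
        capture_output=True,
        text=True,
    )
    return os.getpid(), int(result.stdout)


if __name__ == "__main__":
    parent, child = parent_and_child_pids()
    print("parent:", parent, "child:", child)
```

The two numbers differ, and everything loaded in the parent is still there
when the child exits.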
--
Steven