[Tutor] garbage collecting

Steven D'Aprano steve at pearwood.info
Wed Jan 8 12:18:22 CET 2014


On Tue, Jan 07, 2014 at 11:41:53PM -0500, Keith Winston wrote:
> Iirc, Python periodically cleans memory of bits & pieces that are no longer
> being used. I periodically do something stupid -- I mean experimental --
> and end up with a semi-locked up system. Sometimes it comes back,
> sometimes everything after that point runs very slowly, etc.

You can control the garbage collector with the gc module, but you 
shouldn't have to. It runs automatically: whenever your code is done 
with an object, the garbage collector reclaims the memory used. In 
particular, the garbage collector is not the solution to your problem.
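
To see that "automatically" in action, here is a small sketch. It
assumes CPython, where most objects are freed by reference counting the
moment the last reference goes away, and gc only has to deal with
reference cycles. The Node class and the variable names are made up for
illustration.

```python
import gc
import weakref


class Node:
    """Trivial class whose instances can be weakly referenced."""
    def __init__(self):
        self.other = None


# Case 1: no cycle. Dropping the last reference frees the object at once.
a = Node()
a_ref = weakref.ref(a)
del a
freed_immediately = a_ref() is None

# Case 2: two objects that refer to each other (a reference cycle).
gc.disable()                 # keep the automatic cycle collector quiet for the demo
b, c = Node(), Node()
b.other, c.other = c, b
b_ref = weakref.ref(b)
del b, c
alive_before_collect = b_ref() is not None   # the cycle keeps both objects alive
gc.collect()                                 # explicitly run the cycle collector
freed_after_collect = b_ref() is None
gc.enable()

print(freed_immediately, alive_before_collect, freed_after_collect)
```

In normal use gc stays enabled and the cycle is cleaned up on its own a
little later, which is why calling gc.collect() by hand is rarely
needed.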

As Mark says, the symptoms you describe suggest you have run out of 
memory and are hammering the swap space. This will slow everything 
down by a lot. Virtual memory is *millions* of times slower than real 
memory (RAM).

None of this is specific to Python, but here's a quick (and simplified) 
explanation, as best I understand it myself. Modern operating systems, 
which include Windows, Linux and OS X, have a concept of "virtual 
memory", which you can consider to be a great big file on your hard 
disk. (To be pedantic, it's a single file in Windows, usually a separate 
partition on Linux, and I don't know about OS X.) When an application 
requests some memory, if there is not enough memory available, the OS 
will grab some chunk of memory which is not in use (we'll call it "B"), 
copy it to the swap file (the virtual memory), then free it up for the 
application to use. Then when something tries to use memory B again, 
the OS sees that it's in swap, grabs another chunk of real memory, copies 
it to swap, then moves memory B back into RAM.

So think of virtual memory (swap space) as a great big cheap but 
astonishingly slow storage area. When you run out of space in RAM, stuff 
gets copied to swap. When you try to use that stuff in swap, something 
else has to get moved to swap first, to free up space in RAM to move the 
first chunk back into RAM. The OS is always doing this, and you hardly 
ever notice.

Except, when an app asks for a lot of memory, or lots and lots and lots 
of apps ask for a little bit all at the same time, you can end up with a 
situation where the OS is spending nearly all its time just copying 
stuff from RAM to swap and back again. So much time is being spent doing 
this that the application itself hardly gets any time to run, and so 
everything slows down to a crawl. You can often actually *hear this 
happening*, as the hard drive goes nuts jumping backwards and forwards 
copying data to and from swap. This is called "thrashing", and it isn't 
fun for anyone.
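
If you suspect the machine is swapping, you can check how much swap is
in use. A rough sketch, assuming Linux, where /proc/meminfo reports
SwapTotal and SwapFree in kB. The helper names and the sample figures
below are made up for illustration.

```python
def parse_meminfo(text):
    """Map each /proc/meminfo field name to its integer value.

    Most fields are reported in kB; a few are plain counts.
    """
    info = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def swap_used_kb(info):
    """Swap currently in use, in kB (0 if the fields are missing)."""
    return info.get("SwapTotal", 0) - info.get("SwapFree", 0)


# Invented sample in the /proc/meminfo layout, so this runs anywhere.
sample = """\
MemTotal:       16314532 kB
MemAvailable:    1203456 kB
SwapTotal:       2097148 kB
SwapFree:         524288 kB
"""
used = swap_used_kb(parse_meminfo(sample))
print(used, "kB of swap in use")

# On a real Linux box you would read the live file instead:
#     with open("/proc/meminfo") as f:
#         used = swap_used_kb(parse_meminfo(f.read()))
```

A large and growing number here, together with a sluggish system, is a
good hint that you are thrashing.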

The only cure for a thrashing system is to sit back and wait, possibly 
shut down a few applications (which may temporarily make the thrashing 
worse!) and eventually the whole thing will settle down again, in five 
minutes or five days.

I'm not exaggerating -- I once made a mistake with a list, ran 
something stupid like mylist = [None]*(10**9), and my system was 
thrashing so hard the operating system stopped responding. I left it for 
16 hours and it was still thrashing, so I shut the power off. That's a 
fairly extreme example, though. Normally a thrashing system won't be 
*that* unresponsive; it will just be slow.
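
A back-of-the-envelope check before creating a huge list can save you
from this. The sketch below assumes CPython, where a list stores one
pointer per item on top of a small fixed header. The function name
estimate_list_bytes is made up for illustration, and the estimate
ignores the memory used by the items themselves.

```python
import struct
import sys

# Size of one pointer on this build (8 bytes on a typical 64-bit system).
POINTER_SIZE = struct.calcsize("P")


def estimate_list_bytes(n):
    """Rough size in bytes of a list with n slots, excluding the items."""
    return sys.getsizeof([]) + n * POINTER_SIZE


n = 10**9
gib = estimate_list_bytes(n) / 2**30
print("[None]*(10**9) needs roughly %.2f GiB" % gib)
```

If the figure is anywhere near the amount of RAM in the machine, the
list is going to end up in swap.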

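If you are using Linux, you can use the ulimit command to limit how much 
memory the OS will allow an application to use. Once the limit is 
reached, further memory requests fail instead of pushing the system 
into swap. So for example (the -v value is in kilobytes):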

[steve@ando ~]$ ulimit -v 10000
[steve@ando ~]$ python
Python 2.7.2 (default, May 18 2012, 18:25:10)
[GCC 4.1.2 20080704 (Red Hat 4.1.2-52)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> l = range(200000)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
MemoryError
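
A similar limit can be applied from Python itself with the standard
resource module (Unix only). This is a sketch, assuming Linux behaviour
for RLIMIT_AS; the helper name run_with_memory_limit, the 1 GiB cap and
the list size are made up for illustration. The limit is set in a child
process, so the interpreter you are working in is left alone.

```python
import resource
import subprocess
import sys


def run_with_memory_limit(code, limit_bytes):
    """Run `code` in a child Python whose address space is capped."""
    def apply_limit():
        # Runs in the child just before it starts Python.
        resource.setrlimit(resource.RLIMIT_AS, (limit_bytes, limit_bytes))

    return subprocess.run(
        [sys.executable, "-c", code],
        preexec_fn=apply_limit,
        capture_output=True,
        text=True,
    )


# About 1.6 GB of list slots on a 64-bit build, under a 1 GiB cap.
result = run_with_memory_limit("x = [None] * (2 * 10**8)", 1024**3)
print("exit status:", result.returncode)
print(result.stderr.strip().splitlines()[-1:])
```

With the cap in place the child should fail with MemoryError rather
than dragging the whole system into swap.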



> I just saw where I could do os.system('python'), but in restarting the
> interpreter I'd lose everything currently loaded: my real question involves
> merely pushing the garbage collector into action, I think.

As Alan has said, os.system('python') doesn't restart the interpreter; 
it pauses the current one and starts up a new one as a child process.
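
You can see this for yourself. The sketch below uses subprocess.run
rather than os.system, purely so that it runs non-interactively and the
child's output can be captured; the variable names are made up for
illustration. The parent waits while the child runs, and the child is a
separate process with its own ID.

```python
import os
import subprocess
import sys

data = list(range(5))   # state that lives in the parent interpreter

# Start a second interpreter and ask it for its process ID.
child = subprocess.run(
    [sys.executable, "-c", "import os; print(os.getpid())"],
    capture_output=True,
    text=True,
)
child_pid = int(child.stdout)

print("parent pid:", os.getpid())
print("child pid: ", child_pid)
print("parent still has its data:", data)
```

The parent's objects are untouched when the child exits, so nothing is
restarted and no memory held by the parent is released.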



-- 
Steven

