python resource management

Terry Reedy tjreedy at udel.edu
Mon Jan 19 18:27:21 EST 2009


S.Selvam Siva wrote:
> Hi all,
> 
> I am running a python script which parses nearly 22,000 html files 
> locally stored using BeautifulSoup.
> The problem is the memory usage linearly increases as the files are 
> being parsed.
> When the script has parsed 200 files or so, it consumes all the 
> available RAM and the CPU usage comes down to 0% (maybe due to 
> excessive paging).

I have to guess that you are somehow holding on to data associated with 
each file.
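
As an illustration of not holding on to per-file data, here is a minimal 
sketch. It assumes each file only needs to yield a small piece of plain 
data, and it uses the standard library's html.parser rather than 
BeautifulSoup so that it runs on its own. TitleParser, extract_title and 
the sample pages are hypothetical names, not anything from your script.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Hypothetical parser that collects only the <title> text."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def extract_title(html_text):
    # The parser is local to this call, so nothing outside the function
    # refers to it once the function returns.
    parser = TitleParser()
    parser.feed(html_text)
    parser.close()
    # Return a plain string, not the parser or any object that points
    # back into the parsed document.
    return str(parser.title)


# Stand-ins for the contents of files read one at a time (assumption).
pages = [
    "<html><head><title>One</title></head><body>a</body></html>",
    "<html><head><title>Two</title></head><body>b</body></html>",
]
titles = [extract_title(page) for page in pages]
```

Only the short strings in 'titles' outlive each call. If the loop instead 
appended the parser (or a soup object, or anything that refers back to one) 
to a list, every parsed document would stay in memory.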

> We tried 'del soup_object'  and used 'gc.collect()'. But, no improvement.

'del ob' only deletes the association between the name 'ob' and the 
object it was bound to.  The object itself cannot disappear until all 
references to it are gone, including ones held in lists, dicts, or 
attributes of other objects.
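
A small sketch of that point, assuming CPython (which frees an object as 
soon as its last reference goes away). Page and kept are hypothetical names 
used only for illustration; the weak reference is there just to observe 
whether the object still exists.

```python
import weakref


class Page:
    """Hypothetical stand-in for a parsed document object."""

    def __init__(self, name):
        self.name = name


kept = []                  # a container that outlives the per-file work
ob = Page("a.html")
alive = weakref.ref(ob)    # does not keep the object alive by itself
kept.append(ob)            # a second reference, easy to forget about

del ob                     # removes only the name 'ob'
still_there = alive() is not None   # 'kept' still refers to the object

kept.clear()               # drop the last remaining reference
gone = alive() is None     # on CPython the object is freed at this point
```

After 'del ob' the object is still reachable through 'kept', so it stays in 
memory. It goes away only when that last reference is dropped.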

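gc.collect() only reclaims objects that refer to each other in a cycle 
and that, as a group, are no longer reachable from anywhere else.  It 
does nothing for objects that are still referenced.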
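
To make the distinction concrete, here is a hedged sketch, again assuming 
CPython. Node and cache are hypothetical names. Automatic collection is 
switched off for the duration so that the explicit gc.collect() call is the 
only thing that can break the cycle.

```python
import gc
import weakref


class Node:
    """Hypothetical object that can point at another Node."""

    def __init__(self):
        self.other = None


gc.disable()   # keep the automatic collector out of the demonstration

# Case 1: an isolated cycle. Reference counts never reach zero, so only
# the cycle collector can reclaim these two objects.
a, b = Node(), Node()
a.other, b.other = b, a
watch_cycle = weakref.ref(a)
del a, b
cycle_alive_before = watch_cycle() is not None
gc.collect()
cycle_gone_after = watch_cycle() is None

# Case 2: a cycle that is still reachable from a live list. gc.collect()
# leaves it alone because the program can still get to it.
cache = []
c = Node()
c.other = c
cache.append(c)
watch_cached = weakref.ref(c)
del c
gc.collect()
cached_survives = watch_cached() is not None

gc.enable()
```

The first pair is freed by gc.collect() because nothing outside the cycle 
refers to it. The object in 'cache' survives, which matches a script whose 
memory keeps growing even though it calls gc.collect().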



