[Tutor] memory error

Alan Gauld alan.gauld at btinternet.com
Wed Jul 1 01:54:08 CEST 2015


On 30/06/15 16:10, Joshua Valdez wrote:
> So I wrote this script to go over a large wiki XML dump and pull out the
> pages I want. However, every time I run it the kernel displays 'Killed' I'm
> assuming this is a memory issue after reading around but I'm not sure where
> the memory problem is in my script

That's quite a big assumption.
How big is the wiki file? How much RAM do you have?
What do your system resource monitoring tools (e.g. top) say while the script runs?
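
If you would rather get rough numbers from inside Python, here is a minimal sketch. It assumes a Unix-like system (the resource module is not available on Windows), and the two helper names are made up for illustration. Compare the size of the dump on disk with the peak memory the process reports after parsing.

```python
import os
import resource  # standard library, Unix-only


def file_size_mb(path):
    """Return the size of the file at path in megabytes."""
    return os.path.getsize(path) / (1024.0 * 1024.0)


def peak_memory():
    """Return this process's peak resident set size as reported by the OS.

    The unit is platform dependent: Linux reports kilobytes,
    macOS reports bytes.
    """
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
```

Calling file_size_mb(sys.argv[1]) before parsing and peak_memory() afterwards (or just before the point where the script dies) should show whether memory really is the culprit.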

> and if there were any tricks to reduce
> the virtual memory usage.

Of course, but as always be sure what you are tweaking before you start. 
Otherwise you can waste a lot of time doing nothing useful.

> from bs4 import BeautifulSoup
> import sys
>
> pages_file = open('pages_file.txt', 'r')
>
....
>
> #####################################
>
> with open(sys.argv[1], 'r') as wiki:
>      soup = BeautifulSoup(wiki)
> wiki.closed

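Is that really what you mean? wiki.closed is just an attribute lookup,
so on a line by itself it does nothing. If you meant

wiki.close()

you don't need that either: the with statement closes the file for you.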

> wiki_page = soup.find_all("page")
> del soup
> for item in wiki_page:
>      title = item.title.get_text()
>      if title in page_titles:
>          print item
>          del title

-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos



