Taking data from a text file to parse html page

Roberto Bonvallet rbonvall at cern.ch
Thu Aug 24 11:41:39 EDT 2006


DH wrote:
>> > I'm trying to strip the HTML and other useless junk from an HTML page.
>> > I'd like to create something like an automated text editor, where it
>> > takes the keywords from a text file and removes them from the HTML page
>> > (replacing the words in the HTML page with blank space).
[...]
> I've looked into using BeautifulSoup but came to the conclusion that my
> idea would work better in the end.

You could use BeautifulSoup anyway for the junk-removal part and then do
your magic.  Even if it is not exactly what you want, it is a good idea to
try to reuse modules that are good at what they do.
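
Something along these lines might be a starting point.  It is only a rough
sketch, not a tested solution: it assumes the current bs4 package
(beautifulsoup4) and its built-in "html.parser" backend, and the function
names html_to_text, load_keywords and remove_keywords are made up for
illustration.  It also assumes the keyword file holds one keyword per line.

```python
import re


def html_to_text(html):
    """Junk-removal step: drop tags, scripts and styles, keep visible text."""
    # Imported here so the keyword helpers below work without bs4 installed.
    from bs4 import BeautifulSoup  # third-party package: beautifulsoup4

    soup = BeautifulSoup(html, "html.parser")
    for tag in soup(["script", "style"]):
        tag.decompose()  # remove the element and everything inside it
    return soup.get_text(separator=" ")


def load_keywords(path):
    """Read keywords from a text file (assumed format: one per line)."""
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


def remove_keywords(text, keywords):
    """Blank out each keyword (whole words, any case), then tidy whitespace."""
    if keywords:
        pattern = re.compile(
            r"\b(?:" + "|".join(re.escape(k) for k in keywords) + r")\b",
            re.IGNORECASE,
        )
        text = pattern.sub(" ", text)
    # Collapse the runs of blanks left behind by the removed words.
    return " ".join(text.split())
```

With the page source in a string called page, the two steps would chain as
remove_keywords(html_to_text(page), load_keywords("keywords.txt")), where
keywords.txt is a hypothetical file name.  The whole-word matching is a
design choice of this sketch: the keyword "know" leaves "Known" alone.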

-- 
Roberto Bonvallet


