Taking data from a text file to parse html page
Fredrik Lundh
fredrik at pythonware.com
Thu Aug 24 11:58:19 EDT 2006
DH wrote:
> I have a plain text file containing the html and words that I want
> removed(keywords) from the html file, after processing the html file it
> would save it as a plain text file.
>
> So the program would import the keywords, remove them from the html
> file and save the html file as something.txt.
>
> I would post the data but it's secret. I can post an example:
>
> index.html (html page)
>
> "
> <div><p><em>"Python has been an important part of Google since the
> beginning, and remains so as the system grows and evolves.
> "</em></p>
> <p>-- Peter Norvig, <a class="reference"
> "
>
> replace.txt (keywords)
> "
> <div id="quote" class="homepage-box">
>
> <div><p><em>"
>
> "</em></p>
>
> <p>-- Peter Norvig, <a class="reference"
>
> "
>
> something.txt(file after editing)
>
> "
>
> Python has been an important part of Google since the beginning, and
> remains so as the system grows and evolves.
> "
reading and writing files is described in the tutorial; see

    http://pytut.infogami.com/node9.html
    (scroll down to "Reading and Writing Files")
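
a minimal sketch of the file handling, not taken from the tutorial itself. the helper names (read_text, read_keywords, write_text) are made up for illustration, and it assumes utf-8 files with one keyword per non-blank line in replace.txt, as the example in the question suggests:

```python
def read_text(path):
    """Return the whole file as one string."""
    with open(path, encoding="utf-8") as f:  # utf-8 is an assumption
        return f.read()


def read_keywords(path):
    """Return the non-blank lines of the file, stripped of whitespace.

    Assumes one keyword per line; adjust if a keyword can span lines.
    """
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


def write_text(path, text):
    """Write the string to the file, replacing any existing content."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
```

with the file names from the question, that would be read_text("index.html"), read_keywords("replace.txt"), and write_text("something.txt", result) once the keywords have been removed.
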

to do the replacement, you can use repeated calls to the "replace" method

    http://pyref.infogami.com/str.replace

but that may cause problems if the replacement text contains things that
should be replaced.  for an efficient way to do a "parallel" replace, see:

    http://effbot.org/zone/python-replace.htm#multiple
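
to make the difference concrete, here is a rough sketch of both approaches. it is one common way to do a single-pass replace with the re module, and not necessarily identical to the recipe at the link above. the function names are hypothetical, and the mapping is assumed to be a dictionary of old text to new text (use "" as the new text to remove a keyword):

```python
import re


def replace_sequential(text, replacements):
    """Apply str.replace once per key, in dictionary order.

    Later replacements also see the output of earlier ones.
    """
    for old, new in replacements.items():
        text = text.replace(old, new)
    return text


def replace_parallel(text, replacements):
    """Replace all keys in a single pass over the original text."""
    if not replacements:
        return text
    # try longer keys first, so a key is not shadowed by its own prefix
    keys = sorted(replacements, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(key) for key in keys))
    return pattern.sub(lambda match: replacements[match.group(0)], text)


swap = {"spam": "eggs", "eggs": "spam"}
print(replace_sequential("spam and eggs", swap))  # spam and spam
print(replace_parallel("spam and eggs", swap))    # eggs and spam
```

for the keyword removal in the question, something like replace_parallel(html, dict.fromkeys(keywords, "")) would strip every keyword in one pass.
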
</F>