How to use handle_data() to extract text from an HTML file

koko kokohh at hotmail.com
Mon Oct 21 18:36:36 EDT 2002


I am having some trouble extracting text from an HTML file.
Example:

import htmllib, urllib, formatter

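class webParser(htmllib.HTMLParser):
    def __init__(self, base):
        # NullFormatter discards everything the parser sends to it
        htmllib.HTMLParser.__init__(self, formatter.NullFormatter())
        self.BaseURL = base

    def handle_data(self, text):
        # called by the parser for each run of text between tags
        htmllib.HTMLParser.handle_data(self, text)

url = "http://www.uchicago.edu"
u = urllib.urlopen(url)
urlread = u.read()
# this prints the raw HTML source, tags included
print urlread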

# but how can I output the text content of the HTML page at that link?
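
One possible approach, sketched under some assumptions: the class above is
never instantiated or fed the page, and NullFormatter throws away whatever
handle_data() passes along, so nothing is collected. The sketch below instead
stores each piece of text that handle_data() receives. It is written for
Python 3, where htmllib no longer exists, so it swaps in html.parser.HTMLParser.
The names TextExtractor, extract_text and sample are made up for illustration,
and skipping <script> and <style> bodies is a design choice of this sketch,
not something the original post asks for.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Hypothetical helper: collect the text found between tags."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []       # text pieces seen so far
        self._skip_depth = 0   # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # The parser calls this for every run of text between tags.
        # Keep it unless it is whitespace only or inside a skipped tag.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def extract_text(html):
    """Return the collected text pieces, one per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


# A small in-memory page stands in for the downloaded one.
sample = (
    "<html><body><h1>Hello</h1>"
    "<p>Some <b>bold</b> text.</p>"
    "<script>var x = 1;</script></body></html>"
)
print(extract_text(sample))
```

To run this on a downloaded page, the bytes returned by
urllib.request.urlopen(url).read() would first have to be decoded to a string;
which encoding to use depends on the page and is not covered here.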




