Parsing an HTML a tag

George buffer_88 at hotmail.com
Sat Sep 24 13:13:30 EDT 2005


How can I parse an HTML file and collect only the A tags? I have a
start for the code but am unable to figure out how to finish it.
HTML_parse gets the data that read_page reads from the URL document.
Thanks for the help.

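from HTMLParser import HTMLParser

class MyHTMLParser(HTMLParser):

    def handle_starttag(self, tag, attrs):
        # called for every start tag; tag is lowercased and attrs is a
        # list of (name, value) pairs (unfinished)
        pass

    def handle_endtag(self, tag):
        # called for every end tag (unfinished)
        pass

def HTML_parse(data):
    parser = MyHTMLParser()
    parser.feed(data)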

def read_page(url):
    "this function returns the entire content of the specified URL document"
    import urllib
    connect = urllib.urlopen(url)
    data = connect.read()
    connect.close()
    return data
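
One possible way to collect only the A tags is sketched below. It assumes Python 3, where HTMLParser lives in the html.parser module rather than the HTMLParser module. The names AnchorCollector and collect_hrefs and the sample string are made up for illustration and are not from the code above; keeping only the href attribute is also an assumption about what "collect" should mean.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect the href value of every A start tag (illustrative name)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this,
        # so <A HREF=...> and <a href=...> are handled the same way.
        if tag == "a":
            for name, value in attrs:
                # value is None for attributes written without a value
                if name == "href" and value is not None:
                    self.hrefs.append(value)


def collect_hrefs(data):
    """Return the href values of all A tags found in the HTML string data."""
    parser = AnchorCollector()
    parser.feed(data)
    parser.close()
    return parser.hrefs


# Made-up sample input: one link and one named anchor without an href.
sample = '<p><A HREF="http://www.python.org/">Python</A> and <a name="top">no link</a></p>'
print(collect_hrefs(sample))
```

Every other tag is ignored because handle_starttag only acts when tag is "a", so no handle_endtag override is needed for this purpose.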



