confused by HTMLParser class

alex23 wuwei23 at gmail.com
Tue May 27 22:19:19 EDT 2008


On May 28, 11:20 am, globalrev <skanem... at yahoo.se> wrote:
> tried all kinds of combos to get this to work.

Did you try searching this group? There were recent posts discussing
basic usage of HTMLParser.

Throwing random code together is the least likely way to actually get
it to work.

> x = MyHTMLParser(HTMLParser())
> site = urllib.urlopen("http://docs.python.org/lib/module-HTMLParser.html")
> for row in site:
>     print x.handle_starttag()

Why are you passing HTMLParser in to initialise MyHTMLParser?

Why are you iterating over site and expecting your instance of
MyHTMLParser to magically know about it?
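For what it's worth, here is a minimal sketch of how the pieces are meant to fit together. It uses the Python 3 module name (html.parser; the module was called HTMLParser in Python 2, which the quoted code targets). The start_tags attribute and the sample markup are made up for illustration. The parser takes no HTMLParser argument, it only learns about the page through feed(), and it calls handle_starttag() itself.

```python
from html.parser import HTMLParser  # Python 3 name; "HTMLParser" module in Python 2


class MyHTMLParser(HTMLParser):
    """Illustrative subclass that records every start tag it sees."""

    def __init__(self):
        super().__init__()
        self.start_tags = []  # hypothetical attribute, not part of HTMLParser

    def handle_starttag(self, tag, attrs):
        # The parser calls this while feed() runs; you never call it by hand.
        self.start_tags.append(tag)


parser = MyHTMLParser()  # no HTMLParser() instance passed in
parser.feed('<html><body><a href="x">link</a></body></html>')
parser.close()
print(parser.start_tags)  # ['html', 'body', 'a']
```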

Why haven't you read the urllib.urlopen docs, to see that .read()
returns the page data as a single string, which is what the parser's
feed() method expects?
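A rough sketch of the fetch-then-parse step, written for Python 3, where urllib.urlopen lives at urllib.request.urlopen and read() returns bytes that must be decoded first. The names TagCollector and collect_start_tags are made up for illustration, and the utf-8 default is an assumption; a real page may declare a different encoding.

```python
from html.parser import HTMLParser
from urllib.request import urlopen  # Python 3 home of Python 2's urllib.urlopen


class TagCollector(HTMLParser):
    """Hypothetical parser that remembers the start tags it encounters."""

    def __init__(self):
        super().__init__()
        self.start_tags = []

    def handle_starttag(self, tag, attrs):
        self.start_tags.append(tag)


def collect_start_tags(url, encoding="utf-8"):
    """Download url, decode the body, feed it to a parser, return the tags."""
    with urlopen(url) as response:
        # read() returns the whole body as bytes; decode before parsing.
        html = response.read().decode(encoding, errors="replace")
    parser = TagCollector()
    parser.feed(html)
    parser.close()
    return parser.start_tags
```

Calling collect_start_tags() with the URL from the quoted code should return the list of start tags on that page, assuming the address is still reachable.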

Why are you so resistant to reading some basic tutorials?


