Parsing HTML with JavaScript
John J. Lee
jjl at pobox.com
Fri May 13 16:29:19 EDT 2005
mtfulmer at tacobell.land writes:
> I am trying to extract some information from a few web pages, and I was
> using the HTMLParser module. It worked fine until it got to the
> javascript, at which it gave a parse error. Is there a good way to work
> around this or should I just preparse the file to remove the javascript
> manually? This is my first python program.
sgmllib is very similar to HTMLParser, but is more forgiving of markup
it can't make sense of, so it doesn't break so easily (sgmllib has some
problems with XHTML, though -- swings and roundabouts).
Or, try BeautifulSoup.
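
If you do want to go the pre-parsing route you mention, here is a rough
sketch of the idea. It assumes a current Python 3, where the HTMLParser
module lives at html.parser; the names strip_scripts and TextCollector and
the sample page are made up for illustration. A regex is a blunt tool for
this, so treat it as a starting point rather than a robust solution.

```python
import re
from html.parser import HTMLParser

# Matches a whole <script ...>...</script> element, case-insensitively and
# across newlines. This is a rough filter, not a real HTML tokenizer.
SCRIPT_RE = re.compile(r"<script\b[^>]*>.*?</script\s*>",
                       re.IGNORECASE | re.DOTALL)


def strip_scripts(html):
    """Return html with every <script> element removed (hypothetical helper)."""
    return SCRIPT_RE.sub("", html)


class TextCollector(HTMLParser):
    """Hypothetical helper: collects the non-blank text chunks it sees."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


# Made-up sample page, with JavaScript that contains markup-like characters.
page = """<html><body>
<h1>Prices</h1>
<script type="text/javascript">
if (a < b && c > d) { document.write("<b>hi</b>"); }
</script>
<p>Widget: 3.50</p>
</body></html>"""

parser = TextCollector()
parser.feed(strip_scripts(page))  # parse only what is left after stripping
parser.close()
print(parser.chunks)
```

Stripping the scripts before feeding the parser keeps the JavaScript away
from it entirely, at the cost of losing anything the script would have
written into the page.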
John