minidom and pulldom
John J. Lee
jjl at pobox.com
Sun Dec 14 19:37:12 EST 2003
pinto at map.com (David Pinto) writes:
> I'm trying to use either the minidom or pulldom to find table tags in
> html web pages. I've tried parsing two web pages that show up fine in
> my browser, but I get errors when I call minidom.parse, or try to get
> events with pulldom. Is there a parser that is as forgiving as web
> browsers?
Didn't this get answered just the other day?
minidom and pulldom are built on XML parsers. HTML is not XML.
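
To illustrate the point, here is a minimal sketch using only the standard library. The markup string is a hypothetical fragment made up for this example (it is not one of the pages from the question), and try_parse is just an illustrative helper name. The fragment has an unquoted attribute and unclosed tags, which browsers tolerate but an XML parser rejects.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# Hypothetical fragment: unquoted attribute value, unclosed <br>, <td> and <tr>.
sloppy_html = "<html><body><table border=1><tr><td>cell<br></table></body></html>"


def try_parse(markup):
    """Return a DOM document, or None if the markup is not well-formed XML."""
    try:
        return minidom.parseString(markup)
    except ExpatError as err:
        # expat reports the first well-formedness error it meets.
        print("not well-formed XML:", err)
        return None


result = try_parse(sloppy_html)
```

Here result ends up as None, because the parser gives up at the first construct that is legal in browser-style HTML but not in XML.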
If you want a tree, I recommend pushing the HTML through mxTidy or
uTidylib, and feeding the resulting XHTML to the XML API of your
choice.
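
As a rough sketch of the second half of that pipeline: the xhtml string below is a hand-written stand-in for what a tidy pass might produce, not real mxTidy or uTidylib output, and neither library is called here since their APIs are not shown in this thread. Real tidy output would normally also carry a DOCTYPE and namespace declaration, which are left out to keep the example small. Once the markup is well-formed, both minidom and pulldom can find the table tags.

```python
from xml.dom import minidom, pulldom

# Assumed, hand-written XHTML standing in for the cleaned-up page.
xhtml = (
    "<html><body>"
    '<table border="1"><tr><td>cell<br/></td></tr></table>'
    "<table><tr><td>other</td></tr></table>"
    "</body></html>"
)

# Tree approach: build the whole DOM, then look up elements by tag name.
doc = minidom.parseString(xhtml)
tables = doc.getElementsByTagName("table")
table_count = len(tables)

# Event approach: walk the stream and react to each <table> start tag.
borders = []
for event, node in pulldom.parseString(xhtml):
    if event == pulldom.START_ELEMENT and node.tagName == "table":
        # getAttribute returns an empty string when the attribute is absent.
        borders.append(node.getAttribute("border"))

print(table_count, borders)
```

The minidom version is the simpler choice when the page fits comfortably in memory. The pulldom loop avoids building the full tree, which may matter for very large pages.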
John