HTML Parsing

Sun Feb 25 07:35:06 EST 2007

John Machin wrote:
> One can even use ElementTree, if the HTML is well-formed. See below.
> However if it is as ill-formed as the sample (4th "td" element not
> closed; I've omitted it below), then the OP would be better off
> sticking with Beautiful Soup :-)

Or (since we were already talking about the best of both worlds) use lxml's HTML
parser, which can also cope with pretty disgusting, HTML-like tag soup.
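
A minimal sketch of what that could look like, assuming lxml is installed. The
sample markup is made up for illustration (it is not the OP's actual data) and
simply mimics the problem described above: a fourth "td" that is never closed.

```python
from lxml import etree

# Hypothetical sample: the fourth <td> is left unclosed on purpose.
broken = """
<table>
  <tr><td>1</td><td>2</td><td>3</td><td>4
  </tr>
</table>
"""

# etree.HTML() uses lxml's forgiving HTML parser and returns the root
# <html> element, adding the missing structure around the fragment.
root = etree.HTML(broken)

# Collect the text of every cell, wherever it ended up in the repaired tree.
cells = ["".join(td.itertext()).strip() for td in root.iter("td")]
print(cells)
```

The resulting tree supports the usual ElementTree-style calls (iter(), find(),
findall()), so code written against ElementTree should need little change.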

Stefan