Parsing HTML, extracting text and changing attributes.

Stefan Behnel stefan.behnel-n05pAM at web.de
Mon Jun 18 12:51:21 EDT 2007


Jay Loden wrote:
> Someone else mentioned lxml but as I understand it lxml will only work if
> it's valid XHTML that they're working with.

No, lxml does what the OP requested, and it is not limited to valid XHTML.
It even has a very good parser for broken HTML.

http://codespeak.net/lxml/dev/parsing.html#parsing-html
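
A minimal sketch of the three things in the subject line (parsing HTML,
extracting text, changing attributes), assuming a reasonably recent lxml.
The sample markup, the variable names and the "new.html" target are made up
for illustration and are not from the OP's data:

```python
from lxml import etree

# Deliberately broken HTML: unclosed <b>, <p> and <a>, no closing </html>.
broken = "<html><body><p>Hello <b>world<p><a href='old.html'>link</body>"

# HTMLParser recovers from malformed markup instead of raising an error.
parser = etree.HTMLParser()
root = etree.fromstring(broken, parser)

# Extract all text content, in document order.
text = "".join(root.itertext())

# Change an attribute on every <a> element.
for a in root.iter("a"):
    a.set("href", "new.html")

# Serialize the repaired tree back to HTML.
result = etree.tostring(root, method="html", encoding="unicode")
print(text)
print(result)
```

The exact tree you get back for broken input depends on how libxml2 chooses
to repair it, so check the serialized result before relying on its structure.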

Stefan


