Noob trying to parse bad HTML using xml.etree.ElementTree

Chris Angelico rosuav at gmail.com
Sun Dec 30 05:07:39 EST 2012


On Sun, Dec 30, 2012 at 8:52 PM, Morten Guldager
<morten.guldager at gmail.com> wrote:
> The question is whether it's possible to tweak xml.etree.ElementTree to accept and
> understand sloppy HTML, or whether you have suggestions for a similar easy-to-use
> framework, preferably among the included batteries?
>

Check out BeautifulSoup; it's fairly good at dealing with messy input.
Note that it's a third-party package rather than one of the included
batteries.
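
As a rough illustration, here is a minimal sketch of pulling link targets
out of sloppy markup. It assumes BeautifulSoup 4 (the third-party `bs4`
package) and falls back to the standard library's `html.parser` if that
isn't installed. The `extract_links` helper, the `_LinkCollector` class and
the sample markup are made up for this example; they are not from the
original question.

```python
from html.parser import HTMLParser

try:
    # Third-party package (assumed install: pip install beautifulsoup4).
    from bs4 import BeautifulSoup
except ImportError:
    BeautifulSoup = None


class _LinkCollector(HTMLParser):
    """Stdlib fallback: record href values as <a> start tags are seen."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self.links.append(href)


def extract_links(html):
    """Return the href of every <a> tag, tolerating malformed HTML."""
    if BeautifulSoup is not None:
        # "html.parser" selects the pure-Python parser bundled with Python.
        soup = BeautifulSoup(html, "html.parser")
        return [a["href"] for a in soup.find_all("a", href=True)]
    collector = _LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


# Hypothetical sloppy input: unclosed <p> and <a>, an unquoted attribute,
# a bare <br>, and no closing </html>. xml.etree.ElementTree would reject it.
SLOPPY = (
    '<html><body><p>First<p>Second <a href=one.html>one</a>'
    '<br><a href="two.html">two</body>'
)

if __name__ == "__main__":
    print(extract_links(SLOPPY))
```

Neither path tries to validate the document; both simply report the tags
they can recognise, which is usually what you want with messy input.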

ChrisA


