HTMLParser rejects real-life tagsoup
Rene Pijlman
reageer.in at de.nieuwsgroep
Wed Feb 12 17:09:38 EST 2003
Gerhard Häring:
>Rene Pijlman wrote:
>> I've been using the HTMLParser module to process external web
>> pages that I don't control. HTMLParser seems to be rather strict
>> [...]
>> Any suggestions on how to handle this? [...]
>
>I'd try tidying up the HTML first:
>http://www.lemburg.com/files/python/mxTidy.html
Great idea, it works fine now. Thanks!
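A minimal sketch of the tidy-then-parse approach, assuming Python 3's html.parser (the successor of the HTMLParser module). The names LinkCollector, tidy_html and extract_links are hypothetical, and tidy_html is only a placeholder for the real cleanup step; mxTidy's actual API is not shown here.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href values of all <a> tags seen while parsing."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def tidy_html(raw):
    # Hypothetical stand-in: a real cleanup tool would be called here.
    # This placeholder returns the markup unchanged.
    return raw


def extract_links(raw_html, tidy=tidy_html):
    """Clean the markup first, then hand the result to the parser."""
    parser = LinkCollector()
    parser.feed(tidy(raw_html))
    parser.close()
    return parser.links
```

Passing the cleanup step in as a callable keeps the parser code unchanged if the tidying tool is swapped for another one.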
--
René Pijlman
What do you want to learn? http://www.leren.nl
More information about the Python-list mailing list