urllib2.urlopen(url) pulling something other than HTML

Gabriel Genellina gagsl-py2 at yahoo.com.ar
Tue Aug 21 23:30:21 EDT 2007


On 21 Aug, 18:36, j... at pobox.com (John J. Lee) wrote:
> Gabriel Genellina <gagsl-... at yahoo.com.ar> writes:
>
> [...]
> > Don't even try to understand it - it's a mess. Use the HTMLParser
> > module instead.
>
> [...]
>
> Module sgmllib (and therefore module htmllib also) is more tolerant of
> bad HTML than module HTMLParser.

I had the impression it was the opposite; anyway, neither of them can
handle really bad HTML.
I just don't *like* htmllib.HTMLParser - but that's only a matter of
taste.
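
One rough way to check how tolerant a parser is would be to feed it a
deliberately broken snippet and see which tags it reports. The sketch
below is only an illustration under some assumptions: it targets Python 3,
where the HTMLParser module lives in html.parser and sgmllib/htmllib no
longer exist, so it says nothing about how the 2007 modules compare. The
names TagCollector, collect_tags and bad_html are made up for the example.

```python
from html.parser import HTMLParser  # Python 3 home of the old HTMLParser module


class TagCollector(HTMLParser):
    """Hypothetical helper: records every start tag the parser reports."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def collect_tags(markup):
    """Feed markup to a fresh parser and return the start tags it saw."""
    parser = TagCollector()
    parser.feed(markup)
    parser.close()  # flush any text still buffered at end of input
    return parser.tags


# Unclosed <p>, mis-nested <b>/<i>, unquoted attribute, nothing closed at the end.
bad_html = "<html><body><p>unclosed paragraph<b>bold<i>nested</b></i><a href=foo>link"
tags = collect_tags(bad_html)
print(tags)
```

The parser only reports events in document order; it does not repair the
nesting, so anything that needs a sane tree still has to cope with the
mess itself.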

--
Gabriel Genellina



