Beginner: HTML Parsing
Ian Bicking
ianb at colorstudy.com
Fri May 17 05:12:20 EDT 2002
On Fri, 2002-05-17 at 02:14, Kragen Sitaker wrote:
> "J. David Lashar" <dlashar at sprynet.com> writes:
> > As a beginner, I'm working through the O'Reilly books mentioned in an
> > earlier posting, but I haven't found much guidance on parsing an HTML file
> > once I've pulled it down with httplib. And I'm finding the Python Library
> > Reference to be a bit cryptic. Could someone point to resources or provide
> > examples?
>
> If possible, use Perl and HTML::Parser (or HTML::LinkExtor if that's
> what you want) instead. Python doesn't yet have anything nearly as
> good.
Well, if you're going to talk like that, you can't just stop there. How
is HTML::Parser better than, say, htmllib? Or mxTidy (to translate to
XHTML) with an XML parser?
I notice some extra features in the way the argspec is defined, which
seems convenient, but not huge. Is there something else I'm missing?
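For concreteness, here is a minimal sketch of the sort of link extraction mentioned above, using only Python's standard library. It assumes Python 3's html.parser module rather than htmllib (which was removed in Python 3), and the names LinkExtractor and extract_links are illustrative, not taken from this thread.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect href values from <a> tags (illustrative helper, not from the thread)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lowercased; attrs is a list of
        # (name, value) pairs, where value may be None for bare attributes.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def extract_links(html_text):
    """Return the href of every <a> tag in html_text, in document order."""
    parser = LinkExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.links
```

The page text would come from whatever fetched it (httplib in the original question); decoding the bytes to a string before calling extract_links is left to the caller.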
Ian