HTML parsing confusion
Jerry Hill
malaclypse2 at gmail.com
Wed Jan 23 10:33:43 EST 2008
On Jan 23, 2008 7:40 AM, Alnilam <alnilam at gmail.com> wrote:
> Skipping past html validation, and html to xhtml 'cleaning', and
> instead starting with the assumption that I have files that are valid
> XHTML, can anyone give me a good example of how I would use _ htmllib,
> HTMLParser, or ElementTree _ to parse out the text of one specific
> childNode, similar to the examples that I provided above using regex?
Have you looked at any of the tutorials or sample code for these
libraries? If you have a specific question, you will probably get more
specific help. I started writing up some sample code, but realized I
was mostly reprising the long tutorial on sgmllib here:
http://www.boddie.org.uk/python/HTML.html
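
For the ElementTree case specifically, here is a minimal sketch of one
way to pull the text out of a single child node. It assumes the input
really is well-formed XHTML in the standard XHTML namespace, and that a
Python recent enough to support the [@id='...'] predicate in find() is
available. The sample document, the id values, and the helper name
text_of are all made up for illustration; they are not from your files.

```python
import xml.etree.ElementTree as ET

# Hypothetical XHTML input; the ids and text are invented for this example.
XHTML = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Example</title></head>
  <body>
    <div id="header">Site header</div>
    <div id="content"><p>First paragraph.</p><p>Second <b>bold</b> paragraph.</p></div>
  </body>
</html>"""

# XHTML elements live in this namespace, so lookups must be qualified.
NS = {"x": "http://www.w3.org/1999/xhtml"}


def text_of(xhtml, element_id, child_index):
    """Return the text of the child_index-th child element of the <div>
    whose id is element_id, or None if either cannot be found.

    text_of is a hypothetical helper, not part of ElementTree.
    """
    root = ET.fromstring(xhtml)
    parent = root.find(".//x:div[@id='%s']" % element_id, NS)
    if parent is None:
        return None
    children = list(parent)  # direct child elements only
    if child_index >= len(children):
        return None
    # itertext() also collects text nested in inline tags such as <b>.
    return "".join(children[child_index].itertext())


print(text_of(XHTML, "content", 1))
```

Unlike a regex, this fails loudly on input that is not well-formed
(ET.fromstring raises ParseError), which is why the "valid XHTML"
assumption matters here.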
--
Jerry