Parsing HTML?

Thu Apr 3 02:39:51 EDT 2008

BeautifulSoup does what I need.  Still, I was hoping to find
something that would let me work with the DOM the way JavaScript
works with web browsers' DOM implementations.  Specifically, I'd
like to be able to read the innerHTML property of a DOM element.
Python's built-in HTMLParser is event-driven (SAX-style), so I don't
want to use that, and minidom doesn't appear to implement this part
of the DOM.
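
For reference, here is a rough sketch of the kind of thing I mean,
using only the standard library's xml.etree.ElementTree.  It assumes
the markup is well-formed XML (real-world HTML often isn't), and the
inner_html helper, the sample document, and the "main" id are all
made up for illustration.  ElementTree only supports a limited subset
of XPath, so this is an approximation of innerHTML, not the real
thing.

```python
import xml.etree.ElementTree as ET


def inner_html(element):
    """Approximate innerHTML: the element's leading text plus the
    serialized form of each child (hypothetical helper)."""
    parts = [element.text or ""]
    for child in element:
        # tostring() also emits the text that follows the child (its tail).
        parts.append(ET.tostring(child, encoding="unicode"))
    return "".join(parts)


# Made-up, well-formed sample document.
doc = ET.fromstring(
    '<html><body><div id="main">Hello <b>bold</b> world</div></body></html>'
)

# ElementTree understands a small XPath subset, including [@attr='value'].
target = doc.find(".//div[@id='main']")
print(inner_html(target))
```

This only helps when the page parses as XML; for tag soup, something
more forgiving like BeautifulSoup is still needed.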

On Wed, Apr 2, 2008 at 10:37 PM, Daniel Fetchinson
<fetchinson at googlemail.com> wrote:
> > I'm trying to parse an HTML file.  I want to retrieve all of the text
> > inside a certain tag that I find with XPath.  The DOM seems to make
> > this available with the innerHTML element, but I haven't found a way
> > to do it in Python.
>
> Have you tried http://www.google.com/search?q=python+html+parser ?
>
> HTH,
> Daniel
>