Parsing HTML?

benash at gmail.com
Thu Apr 3 02:39:51 EDT 2008


BeautifulSoup does what I need it to.  Still, I was hoping to find
something that would let me work with the DOM the way JavaScript does
in web browsers.  Specifically, I'd like to be able to access the
innerHTML property of a DOM element.  Python's built-in HTMLParser is
event-driven (SAX-style) rather than tree-based, so I don't want to
use that, and minidom doesn't appear to implement this property.
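
For what it's worth, something close to innerHTML can probably be
emulated on top of minidom by serializing each child of a node and
joining the results.  This is only a rough sketch: inner_html is a
made-up helper name, not part of minidom, and it assumes the input is
well-formed XML/XHTML, since minidom will not accept tag soup.

```python
from xml.dom import minidom


def inner_html(node):
    """Return the serialized markup of a node's children.

    Hypothetical helper approximating the browser innerHTML property;
    minidom itself does not provide it.
    """
    # toxml() serializes a node together with its whole subtree.
    return "".join(child.toxml() for child in node.childNodes)


# Assumes well-formed markup; real-world HTML may not parse here.
doc = minidom.parseString('<div id="main"><p>Hello <b>world</b></p><p>Bye</p></div>')
div = doc.documentElement
print(inner_html(div))
```

Special characters in text nodes come back escaped (for example
"&amp;"), which is roughly what a browser does as well.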

On Wed, Apr 2, 2008 at 10:37 PM, Daniel Fetchinson
<fetchinson at googlemail.com> wrote:
> > I'm trying to parse an HTML file.  I want to retrieve all of the text
>  > inside a certain tag that I find with XPath.  The DOM seems to make
>  > this available with the innerHTML element, but I haven't found a way
>  > to do it in Python.
>
>  Have you tried http://www.google.com/search?q=python+html+parser ?
>
>  HTH,
>  Daniel
>


