parse a table in HTML page.

Stefan Behnel stefan_ml at behnel.de
Tue Oct 28 16:15:34 EDT 2008


antonio_wn8 wrote:
> I need to read and parse a table in an HTML page.
> 
> I’m using the following script:
> http://trac.davidgrant.ca/browser/src/python/misc/siteuptime/TableParser.py
> 
> It works fine, aside from the link in the href, which is missing from the output.
> 
> Example:
> 
> String to parse:
> <tr><td><a href='vaffa.html'>elog</a></td><td>normal text</td></tr>
> 
> Output:
> [[['elog', 'normal text']]]

You should try lxml.html. It gives you various tools, such as XPath for
looking up specific elements, and helper functions for finding the links in
an HTML document.

http://codespeak.net/lxml/
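
As a rough sketch of what that could look like for the row quoted above: the
function name parse_table and the (text, hrefs) cell layout are made up for
illustration, and the row is wrapped in a <table> element here so that the
fragment is complete. It assumes lxml is installed.

```python
import lxml.html

# The row from the question, wrapped in <table> (an assumption of this sketch).
HTML = ("<table><tr><td><a href='vaffa.html'>elog</a></td>"
        "<td>normal text</td></tr></table>")


def parse_table(html):
    """Return one list per row; each cell is a (text, hrefs) tuple."""
    root = lxml.html.fromstring(html)
    rows = []
    # XPath: every table row below the parsed root element.
    for tr in root.xpath('.//tr'):
        cells = []
        # Direct cell children of the row, header or data.
        for cell in tr.xpath('./td | ./th'):
            text = cell.text_content().strip()
            # Collect the href of every link inside this cell.
            hrefs = [str(href) for href in cell.xpath('.//a/@href')]
            cells.append((text, hrefs))
        rows.append(cells)
    return rows


rows = parse_table(HTML)
print(rows)

# iterlinks() is one of the link helpers: it yields
# (element, attribute, link, pos) for each link in the tree.
all_links = [link for (_el, _attr, link, _pos) in
             lxml.html.fromstring(HTML).iterlinks()]
print(all_links)
```

Each cell keeps its text and its link targets side by side, so 'elog' stays
paired with 'vaffa.html' instead of losing the href.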

Stefan


