extracting html table rows into a list

Tue Dec 4 20:02:10 EST 2001

I just wanted to add that your XIST code is shorter than the equivalent
written with the HTMLParser class.
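
For comparison, here is a rough sketch of what the HTMLParser route might
look like. It assumes Python 3, where the class lives in the standard
html.parser module (in 2001 the module itself was called HTMLParser). The
names TableRowParser and table_rows, and the sample markup, are made up for
illustration. Unlike the Tidy-based XIST code quoted below, this sketch does
no cleanup, so it assumes every <tr>, <td> and <th> is closed explicitly.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the rows of the first <table> as lists of cell text."""

    def __init__(self):
        super().__init__()
        self.rows = []       # finished rows, each a list of cell strings
        self._depth = 0      # nesting depth of <table> tags
        self._done = False   # True once the first table has been closed
        self._row = None     # cells of the row currently being read
        self._cell = None    # text fragments of the cell being read

    def handle_starttag(self, tag, attrs):
        if self._done:
            return
        if tag == "table":
            self._depth += 1
        elif self._depth == 1:
            # Only rows and cells of the outermost table start new entries.
            if tag == "tr":
                self._row = []
            elif tag in ("td", "th") and self._row is not None:
                self._cell = []

    def handle_endtag(self, tag):
        if self._done or self._depth == 0:
            return
        if tag == "table":
            self._depth -= 1
            if self._depth == 0:
                # First table finished: stop collecting anything further.
                self._done = True
                self._row = None
                self._cell = None
        elif self._depth == 1:
            if tag in ("td", "th") and self._cell is not None:
                # Join the fragments and collapse runs of whitespace.
                self._row.append(" ".join("".join(self._cell).split()))
                self._cell = None
            elif tag == "tr" and self._row is not None:
                self.rows.append(self._row)
                self._row = None
                self._cell = None

    def handle_data(self, data):
        # Text inside a nested table ends up in the enclosing cell.
        if self._cell is not None:
            self._cell.append(data)


def table_rows(html_text):
    """Return the rows of the first table in html_text as a list of lists."""
    parser = TableRowParser()
    parser.feed(html_text)
    parser.close()
    return parser.rows


# Made-up sample markup, only to show the call.
sample = """
<table>
  <tr><th>Name</th><th>Kind</th></tr>
  <tr><td>XIST</td><td>library</td></tr>
</table>
"""
for row in table_rows(sample):
    print(row)
```

To run it against a live page you would still have to fetch the document
yourself (for instance with urllib.request) and decode it before calling
table_rows, which is the part parseTidyURL handles in the quoted code.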

"Walter Dörwald" <walter at livinglogic.de> wrote in message
news:mailman.1006449030.30723.python-list at python.org...
damien Wetzel wrote:

> Hi,
> does anybody have a script which parses a big table from an HTML file
> and creates a list of rows?

You could give XIST a try (http://www.livinglogic.de/Python/xist/)

Code might look like this:

from xist import parsers
from xist.ns import html

doc = parsers.parseTidyURL("http://www.freshmeat.net/",
                           defaultEncoding="latin-1")

firsttable = doc.find(type=html.table, searchchildren=1)[0]

rows = firsttable.find(type=html.tr)

for row in rows:
    print row.asPlainString()

HTH,
   Walter Dörwald