getting tables out

Phil Hunt philh at vision25.demon.co.uk
Sun May 23 15:58:42 EDT 1999


In article <87u2t4apxm.fsf at schwinger.harvard.edu>
           mspal at sangria.harvard.edu "Michael Spalinski" writes:
> I would like to write a Python script that would read an HTML document and
> extract table contents from it. Eg. each table could be a list of tuples
> with data from the rows. I thought htmllib would provide the basic tools
> for this, but I can't find any example that would be of use. 
> 
> So - does anyone have a Python snippet that looks for tables and gets at
> the data?

I'm not aware of anything that does this, but it shouldn't be
particularly hard to write one. Have it look for <table> tags,
and then within each table look for <tr>, <th> and <td>, and use
the contents of these to build up a list of lists: one inner list
per row, holding that row's cell contents.
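
As a rough sketch of that approach, something like the following might do.
It uses html.parser.HTMLParser from the current standard library instead
of htmllib (which is gone in Python 3), so that is a substitution on my
part. The class name TableExtractor and its attributes are made up for
illustration. It assumes tables are not nested and that cells and rows
have their closing tags.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Hypothetical helper: collect each <table> as a list of rows,
    where each row is a list of cell strings."""

    def __init__(self):
        super().__init__()
        self.tables = []   # one entry per <table> seen so far
        self._row = None   # cells of the current row, or None outside <tr>
        self._cell = None  # text pieces of the current cell, or None outside a cell

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Join the text fragments and trim surrounding whitespace.
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None
            self._cell = None  # drop any cell left open (assumption: rare)

    def handle_data(self, data):
        # Only keep text that appears inside a <td> or <th>.
        if self._cell is not None:
            self._cell.append(data)


if __name__ == "__main__":
    parser = TableExtractor()
    parser.feed(
        "<table>"
        "<tr><th>Name</th><th>Qty</th></tr>"
        "<tr><td>apple</td><td>3</td></tr>"
        "</table>"
    )
    print(parser.tables)
```

If you want tuples for the rows, as in the original question, convert
afterwards with [tuple(row) for row in table]. Older pages that leave off
</td> or </tr> would need extra handling that this sketch does not attempt.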

-- 
Phil Hunt....philh at vision25.demon.co.uk




