[Tutor] Question regarding parsing HTML with BeautifulSoup
Kent Johnson
kent37 at tds.net
Thu Jan 4 03:14:15 CET 2007
Shuai Jiang (Runiteking1) wrote:
> Hello,
>
> I'm working on a program that needs to parse a financial document on the
> internet using BeautifulSoup. Because of the nature of the information, it
> is all grouped as a table. I needed to get 3 types of info and have
> succeeded quite well using BeautifulSoup, but encountered problems on the
> third one.
>
> My question is: is there an easy way to parse a column of an HTML table
> using BeautifulSoup? I copied the table here and I need to extract the
> EPS. The numbers are every sixth one from the <tr> tag, e.g. 2.27, 1.86,
> 1.61...
Here is one way, found with a little experimenting at the command prompt:
In [1]: data = '''<table id="INCS" style="width:580px" class="f10y" cellspacing="0">
<snip the rest of your data>
   ...: </table>'''
In [3]: from BeautifulSoup import BeautifulSoup as BS
In [4]: soup=BS(data)
In [11]: for tr in soup.table.findAll('tr'):
   ....:     print tr.contents[11].string
   ....:
EPS
2.27
1.86
1.61
1.27
1.18
0.84
0.73
0.46
0.2
0.0
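
A note on the index 11: the whitespace between tags presumably counts as child nodes too, so the sixth cell of each row ends up at position 11 of tr.contents rather than 5. Counting the cells themselves is less fragile. Here is a rough sketch of that idea for Python 3 with the bs4 package, which is an assumption on my part since the session above uses the older BeautifulSoup 3 under Python 2. The sample table, the make_row helper and the column function are made up for illustration; only the EPS values come from the output above.

```python
# Sketch only: assumes Python 3 and bs4 (pip install beautifulsoup4),
# not the BeautifulSoup 3 module used in the session above.
from bs4 import BeautifulSoup


def make_row(cells):
    # Hypothetical helper: one <td> per line, so that whitespace
    # text nodes sit between the cells, as in typical page source.
    return "<tr>\n" + "\n".join("<td>%s</td>" % c for c in cells) + "\n</tr>"


# Made-up stand-in for the snipped table; the sixth cell holds the EPS.
SAMPLE = "<table id=\"INCS\">\n" + "\n".join([
    make_row(["c1", "c2", "c3", "c4", "c5", "EPS"]),
    make_row(["a", "b", "c", "d", "e", "2.27"]),
    make_row(["a", "b", "c", "d", "e", "1.86"]),
]) + "\n</table>"


def column(html, index):
    """Return the text of cell number `index` (0-based) from every row."""
    soup = BeautifulSoup(html, "html.parser")
    values = []
    for tr in soup.table.find_all("tr"):
        # Only direct <td>/<th> children, so whitespace nodes are skipped.
        cells = tr.find_all(["td", "th"], recursive=False)
        if len(cells) > index:
            values.append(cells[index].get_text(strip=True))
    return values


print(column(SAMPLE, 5))  # ['EPS', '2.27', '1.86']
```

With this approach the sixth column is simply index 5, and rows that have fewer cells are skipped instead of raising an IndexError.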
Kent