Beautiful parse joy - Oh what fun

rh0dium steven.klass at gmail.com
Tue May 16 17:11:25 EDT 2006


Hi all,

I am trying to parse an HTML table into a dictionary and I am having all
kinds of fun.  Can someone please help me out?

What I want is this:

dic={'Division Code':'SALS','Employee':'LOO ABLE'}

Here is what I have:

    html="""<table> <tr valign="top"><td width="24"><img
src="/icons/ecblank.gif" border="0" height="1" width="1" alt=""
/></td><td width="129"><b><font size="2" face="Arial">Division Code:
</font></b></td><td width="693"><font size="2"
face="Arial">SALS</font></td></tr> <tr valign="top"><td width="24"><img
src="/icons/ecblank.gif" border="0" height="1" width="1" alt="" /> <td
width="129"><b><font size="2" face="Arial">Employee:
</font></b></td> <td width="693"><font size="2"
face="Arial">LOO</font><b><font size="2" face="Arial"> </font></b><font
size="2" face="Arial">ABLE</font></td></tr></table> """


    from BeautifulSoup import BeautifulSoup
    soup = BeautifulSoup()
    soup.feed(html)

    dic={}
    for row in soup('table')[0]('tr'):
        column = row('td')
        # Label cell and value cell: read the first <font> tag in each
        key = column[1].findNext('font').string.strip()
        value = column[2].findNext('font').string.strip()
        print key, value
        dic[key] = value

    for key in dic.keys():
        print key,  dic[key]

The problem is that I am missing the last name, ABLE.  How can I get ALL
of the text in the cell?  Clearly something is wrong with the way I use
the font tag's string, but I am not sure what.
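
A likely cause, as far as I can tell: findNext('font') returns only the
first font tag inside the cell, and that tag holds just "LOO", so the
text in the later font tags is never read. One way around it is to
gather every text fragment in the cell and join them. The sketch below
does that with the standard library's html.parser (Python 3) instead of
BeautifulSoup, so it is an alternative technique and not the
BeautifulSoup API. RowTextParser, table_to_dict, and sample are
hypothetical names made up for illustration; sample is a trimmed copy of
the table above, and the code assumes the label is always the second
cell and the value the third.

```python
from html.parser import HTMLParser


class RowTextParser(HTMLParser):
    """Collect the full text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # one list of cell strings per row
        self._cell = None   # text fragments of the cell being read

    def _close_cell(self):
        # Join all fragments of the cell and collapse runs of whitespace.
        if self._cell is not None:
            self.rows[-1].append(" ".join("".join(self._cell).split()))
            self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._close_cell()
            self.rows.append([])
        elif tag == "td":
            # A new <td> also ends a previous cell that was never closed.
            self._close_cell()
            if not self.rows:
                self.rows.append([])
            self._cell = []

    def handle_endtag(self, tag):
        if tag in ("td", "tr"):
            self._close_cell()

    def handle_data(self, data):
        # Keep text only while inside a cell, whatever tags wrap it.
        if self._cell is not None:
            self._cell.append(data)


def table_to_dict(html):
    """Map the second cell of each row (minus ':') to the third cell."""
    parser = RowTextParser()
    parser.feed(html)
    parser.close()
    result = {}
    for cells in parser.rows:
        if len(cells) >= 3:
            result[cells[1].rstrip(":").strip()] = cells[2]
    return result


# Trimmed copy of the table from this post (assumed layout).
sample = (
    '<table><tr><td><img src="/icons/ecblank.gif" alt="" /></td>'
    '<td><b><font size="2">Division Code: </font></b></td>'
    '<td><font size="2">SALS</font></td></tr>'
    '<tr><td><img src="/icons/ecblank.gif" alt="" /> '
    '<td><b><font size="2">Employee: </font></b></td>'
    '<td><font size="2">LOO</font><b><font size="2"> </font></b>'
    '<font size="2">ABLE</font></td></tr></table>'
)

result = table_to_dict(sample)
print(result)  # {'Division Code': 'SALS', 'Employee': 'LOO ABLE'}
```

Joining the fragments first and splitting on whitespace afterwards is
what keeps the single space between LOO and ABLE, and treating a new td
as the end of the previous one copes with the unclosed td in the second
row.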

Please and thanks!!



