Beautiful parse joy - Oh what fun

George Sakkis george.sakkis at gmail.com
Wed May 17 13:54:44 EDT 2006


Here's one way to do it:

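import re
# Assumes the single-module BeautifulSoup releases (fetch/firstText API).
from BeautifulSoup import BeautifulSoup

# Pattern used to pick out text nodes.
_any_re = re.compile('.+')

# `html` is assumed to hold the page source as a string.
d = {}
for row in BeautifulSoup(html).fetch('tr'):
    columns = row.fetch('td')
    # Field name: first text in the second cell, minus trailing
    # whitespace and colon.
    field = columns[1].firstText(_any_re).rstrip(' \t\n:')
    # Value: every text chunk in the third cell, joined by single spaces.
    value = ' '.join(text.rstrip()
                     for text in columns[2].fetchText(_any_re))
    d[field] = value
print d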
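
If the old fetch/firstText API isn't available, roughly the same row-to-dict idea can be sketched with nothing but the standard library's html.parser. This is only a sketch under assumptions: RowCollector, table_to_dict and the sample table are made-up names and data, and it assumes flat (non-nested) tables where the label sits in the second cell and the value in the third, as in the snippet above.

```python
from html.parser import HTMLParser


class RowCollector(HTMLParser):
    """Collect the text chunks of every <td>, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []        # one list of cells per <tr>
        self._cell = None     # text chunks of the <td> being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == 'tr':
            self.rows.append([])
        elif tag == 'td' and self.rows:
            self._cell = []
            self.rows[-1].append(self._cell)

    def handle_endtag(self, tag):
        if tag == 'td':
            self._cell = None

    def handle_data(self, data):
        # Keep only non-blank text that sits inside a <td>.
        if self._cell is not None and data.strip():
            self._cell.append(data)


def table_to_dict(html):
    """Map the label in each row's second cell to the text of its third."""
    parser = RowCollector()
    parser.feed(html)
    parser.close()
    d = {}
    for columns in parser.rows:
        if len(columns) < 3 or not columns[1]:
            continue  # skip rows without a label cell and a value cell
        field = columns[1][0].rstrip(' \t\n:')
        value = ' '.join(text.rstrip() for text in columns[2])
        d[field] = value
    return d


# Made-up sample input, not the table from the original question.
sample = """
<table>
<tr><td>1</td><td>Name:</td><td>George</td></tr>
<tr><td>2</td><td>Address: </td><td>12 Main St <br>Springfield</td></tr>
</table>
"""

print(table_to_dict(sample))
```

Rows with fewer than three cells are skipped rather than raising IndexError, which is a choice of this sketch and not something the snippet above does.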

George



