HTML to dictionary
WEINHANDL Herbert
weinhand at unileoben.ac.at
Tue Feb 27 06:14:55 EST 2007
Tina I wrote:
> Hi everyone,
>
> I have a small, probably trivial even, problem. I have the following HTML:
>> <b>
>> METAR:
>> </b>
>> ENBR 270920Z 00000KT 9999 FEW018 02/M01 Q1004 NOSIG
>> <br />
...
BeautifulSoup is really fun to work with ;-)
> I have played around with BeautifulSoup but I'm stuck at stripping off
> the tags and chop it up to what I need to put in the dict. If someone
> can offer some hints or example to get me going I would greatly
> appreciate it.
>
> Thanks!
> Tina
#!/usr/bin/python
# -*- coding: utf-8 -*-
from BeautifulSoup import BeautifulSoup, Tag, NavigableString
html = """<html> <head><title>Title</title> </head>
<body>
<b> METAR: </b> ENBR 270920Z 00000KT 9999 ... <br />
<b> short-TAF:</b> ENBR 270800Z 270918 VRB05KT ... <br />
<b> long-TAF: </b> ENBR 271212 VRB05KT 9999 ... <br />
</body>
</html>
"""
soup = BeautifulSoup( html, convertEntities='html' )
bolds = soup.findAll( 'b' )
dict = {}
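for b in bolds :
    # b.next is the text node inside <b>...</b> (the key)
    key = b.next.strip()
    # b.next.next is the text node right after </b> (the value)
    val = b.next.next.strip()
    print 'key=', key
    print 'val=', val, '\n'
    dict[key] = val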
print dict
#---- end ----
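
BeautifulSoup 3, as imported above, targets Python 2. For anyone on Python 3, the same idea (text inside <b> is the key, the text up to the next <br /> is the value) can be sketched with only the standard library's html.parser. This is an illustrative variant, not the code above: the names BoldKeyParser, html_to_dict and SAMPLE_HTML are made up for this example, and it assumes the page always has the <b>key</b> value <br /> layout of the sample.

```python
from html.parser import HTMLParser

# Sample copied from the post; the "..." elisions are kept as they are.
SAMPLE_HTML = """<html> <head><title>Title</title> </head>
<body>
<b> METAR: </b> ENBR 270920Z 00000KT 9999 ... <br />
<b> short-TAF:</b> ENBR 270800Z 270918 VRB05KT ... <br />
<b> long-TAF: </b> ENBR 271212 VRB05KT 9999 ... <br />
</body>
</html>
"""


class BoldKeyParser(HTMLParser):
    """Hypothetical helper: map text inside <b> to the text that follows it."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.result = {}
        self._in_bold = False   # True while inside <b>...</b>
        self._key = None        # key whose value is being collected
        self._buf = []          # text pieces of the current key or value

    def flush(self):
        # Store the collected value under the current key, if there is one.
        if self._key is not None:
            self.result[self._key] = ''.join(self._buf).strip()
        self._key = None
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag == 'b':
            self.flush()            # a new key ends any pending value
            self._in_bold = True
        elif tag == 'br':
            self.flush()            # <br /> ends the value

    def handle_endtag(self, tag):
        if tag == 'b' and self._in_bold:
            self._key = ''.join(self._buf).strip()
            self._buf = []
            self._in_bold = False

    def handle_data(self, data):
        # Ignore text that belongs to neither a key nor a value.
        if self._in_bold or self._key is not None:
            self._buf.append(data)


def html_to_dict(html):
    parser = BoldKeyParser()
    parser.feed(html)
    parser.close()
    parser.flush()  # keep a trailing value that has no closing <br />
    return parser.result


if __name__ == '__main__':
    for key, val in html_to_dict(SAMPLE_HTML).items():
        print('key=', key)
        print('val=', val, '\n')
```

On the sample, html_to_dict(SAMPLE_HTML) should return a dictionary with the three keys 'METAR:', 'short-TAF:' and 'long-TAF:'.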
happy pythoning
Herbert