HTML to dictionary
Nikita the Spider
NikitaTheSpider at gmail.com
Tue Feb 27 12:41:24 EST 2007
In article <brednaJRF_OSnnnYRVnzvA at telenor.com>,
Tina I <tinaweb at bestemselv.com> wrote:
> Hi everyone,
>
> I have a small, probably trivial even, problem. I have the following HTML:
> > <b>
> > METAR:
> > </b>
> > ENBR 270920Z 00000KT 9999 FEW018 02/M01 Q1004 NOSIG
> > <br />
> > <b>
> > short-TAF:
> > </b>
> > ENBR 270800Z 270918 VRB05KT 9999 FEW020 SCT040
> > <br />
> > <b>
> > long-TAF:
> > </b>
> > ENBR 271212 VRB05KT 9999 FEW020 BKN030 TEMPO 2012 2000 SNRA VV010 BECMG
> > 2124 15012KT
> > <br />
>
> I need to make this into a dictionary like this:
>
> dictionary = {"METAR:" : "ENBR 270920Z 00000KT 9999 FEW018 02/M01 Q1004
> NOSIG" , "short-TAF:" : "ENBR 270800Z 270918 VRB05KT 9999 FEW020 SCT040"
> , "long-Taf:" : "ENBR 271212 VRB05KT 9999 FEW020 BKN030 TEMPO 2012 2000
> SNRA VV010 BECMG 2124 15012KT"}
Tina,
In addition to Beautiful Soup, which others have mentioned, Connelly
Barnes' HTMLData module will take (X)HTML and convert it into a
dictionary for you:
http://oregonstate.edu/~barnesc/htmldata/
The dictionary won't have the exact format you want, but I think it
would be fairly easy for you to convert to what you're looking for.
I use HTMLData a lot. Beautiful Soup is great for parsing iteratively,
but if I just want to throw some HTML at a function and get data back,
HTMLData is my tool of choice.
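If you would rather not install anything, here is a minimal sketch that
builds the dictionary you describe with only the standard library's
html.parser (Python 3). To be clear, this is neither HTMLData nor
Beautiful Soup. The names BoldLabelParser and html_to_dict are made up
for illustration, and the code assumes every label sits inside a <b>
element and that its value is the plain text up to the next <b>.

```python
from html.parser import HTMLParser


class BoldLabelParser(HTMLParser):
    """Hypothetical helper: map each <b> label to the text that follows it."""

    def __init__(self):
        super().__init__()
        self.result = {}
        self._in_bold = False
        self._label_parts = []
        self._key = None

    def handle_starttag(self, tag, attrs):
        if tag == "b":
            # A new label starts; collect its text until </b>.
            self._in_bold = True
            self._label_parts = []

    def handle_endtag(self, tag):
        if tag == "b":
            # The label is complete; following text belongs to this key.
            self._in_bold = False
            self._key = " ".join(self._label_parts)
            self.result[self._key] = ""

    def handle_data(self, data):
        # Collapse newlines and runs of spaces into single spaces.
        text = " ".join(data.split())
        if not text:
            return
        if self._in_bold:
            self._label_parts.append(text)
        elif self._key is not None:
            joined = self.result[self._key] + " " + text
            self.result[self._key] = joined.strip()


def html_to_dict(html):
    """Assumed entry point: parse the snippet and return the label dict."""
    parser = BoldLabelParser()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return parser.result


sample = """<b>
METAR:
</b>
ENBR 270920Z 00000KT 9999 FEW018 02/M01 Q1004 NOSIG
<br />
<b>
short-TAF:
</b>
ENBR 270800Z 270918 VRB05KT 9999 FEW020 SCT040
<br />
<b>
long-TAF:
</b>
ENBR 271212 VRB05KT 9999 FEW020 BKN030 TEMPO 2012 2000 SNRA VV010 BECMG
2124 15012KT
<br />
"""

weather = html_to_dict(sample)
print(weather["METAR:"])
# ENBR 270920Z 00000KT 9999 FEW018 02/M01 Q1004 NOSIG
```

The keys keep the trailing colon, as in your example. Any <br /> tags
are simply ignored, so a value that wraps across lines (like the
long-TAF one) comes back as a single string.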
Good luck with whatever you choose.
--
Philip
http://NikitaTheSpider.com/
Whole-site HTML validation, link checking and more