HTML to dictionary

Nikita the Spider NikitaTheSpider at gmail.com
Tue Feb 27 12:41:24 EST 2007


In article <brednaJRF_OSnnnYRVnzvA at telenor.com>,
 Tina I <tinaweb at bestemselv.com> wrote:

> Hi everyone,
> 
> I have a small, probably trivial even, problem. I have the following HTML:
> > <b>
> >  METAR:
> > </b>
> > ENBR 270920Z 00000KT 9999 FEW018 02/M01 Q1004 NOSIG
> > <br />
> > <b>
> >  short-TAF:
> > </b>
> > ENBR 270800Z 270918 VRB05KT 9999 FEW020 SCT040
> > <br />
> > <b>
> >  long-TAF:
> > </b>
> > ENBR 271212 VRB05KT 9999 FEW020 BKN030 TEMPO 2012 2000 SNRA VV010 BECMG 
> > 2124 15012KT
> > <br />
> 
> I need to make this into a dictionary like this:
> 
> dictionary = {"METAR:" : "ENBR 270920Z 00000KT 9999 FEW018 02/M01 Q1004 
> NOSIG" , "short-TAF:" : "ENBR 270800Z 270918 VRB05KT 9999 FEW020 SCT040" 
> , "long-Taf:" : "ENBR 271212 VRB05KT 9999 FEW020 BKN030 TEMPO 2012 2000 
> SNRA VV010 BECMG 2124 15012KT"}

Tina,
In addition to Beautiful Soup, which others have mentioned, Connelly 
Barnes' HTMLData module will take (X)HTML and convert it into a 
dictionary for you:
http://oregonstate.edu/~barnesc/htmldata/

The dictionary won't have the exact format you want, but I think it 
would be fairly easy for you to convert it to what you're looking for.

I use HTMLData a lot. Beautiful Soup is great for parsing iteratively, 
but if I just want to throw some HTML at a function and get data back, 
HTMLData is my tool of choice.
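
For comparison, here is a rough sketch of going straight from the sample 
HTML to the dictionary using only the standard library's html.parser. It 
does not use HTMLData or Beautiful Soup, and the names BoldLabelParser and 
html_to_dict are made up for illustration. It assumes every key sits 
inside a <b> element and its value is the text that follows, up to the 
next <b>.

```python
from html.parser import HTMLParser


class BoldLabelParser(HTMLParser):
    """Map the text of each <b> element to the text that follows it."""

    def __init__(self):
        super().__init__()
        self.result = {}
        self._in_bold = False
        self._label = None        # key currently collecting a value
        self._label_parts = []
        self._value_parts = []

    def _flush(self):
        # Store the finished key/value pair, collapsing runs of whitespace.
        if self._label is not None:
            value = " ".join(" ".join(self._value_parts).split())
            self.result[self._label] = value
        self._label = None
        self._value_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "b":
            self._flush()         # a new key ends the previous value
            self._in_bold = True
            self._label_parts = []

    def handle_endtag(self, tag):
        if tag == "b":
            self._in_bold = False
            self._label = " ".join("".join(self._label_parts).split())

    def handle_data(self, data):
        if self._in_bold:
            self._label_parts.append(data)
        elif self._label is not None:
            self._value_parts.append(data)

    def close(self):
        super().close()
        self._flush()             # store the last pair


def html_to_dict(html):
    parser = BoldLabelParser()
    parser.feed(html)
    parser.close()
    return parser.result
```

Calling html_to_dict() on the snippet you posted should give keys such as 
"METAR:" and "short-TAF:" with the report text as values. Tags other than 
<b> (the <br /> here) are simply ignored, so this is only suitable for 
markup as regular as your example.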

Good luck with whatever you choose.

-- 
Philip
http://NikitaTheSpider.com/
Whole-site HTML validation, link checking and more


