beautifulsoup

Kay Schluehr kay.schluehr at gmx.net
Thu Oct 30 02:39:08 EDT 2008


On 29 Oct., 17:45, luca72 <lucabe... at libero.it> wrote:
> Hello
> I am trying to use BeautifulSoup.
> I have this:
> sito = urllib.urlopen('http://www.prova.com/')
> esamino = BeautifulSoup(sito)
> luca = esamino.findAll('tr', align='center')
>
> print luca[0]
>
> >><tr align="center"><th width="5%"><a onclick="t('Only|G|BoT|05','#1');" href="#">#1</a></th><td width="10%">44.4MB</td><td width="90%" align="left"><font color="orange"> Pc-prova.rar </font></td></tr>
>
> I need to get the following information:
> 1)Only|G|BoT|05
> 2)#1
> 3)44.4MB
> 4)Pc-prova.rar
> With print luca[0].a.string I get #1
> With print luca[0].td.string I get 44.4MB
> Can you explain how to get the other two values?
> Thanks
> Luca

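The same way you got `luca`. Note that `luca` is the list returned by
findAll, so index it first and call find on a single row:

1,2) luca[0].find("a")["onclick"].split("'") returns
     ['t(', 'Only|G|BoT|05', ',', '#1', ');'], so the two values are
     at indices 1 and 3
3)   luca[0].find("td").string
4)   luca[0].find("font").string.strip()   (strip removes the spaces
     around the file name)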
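Putting the four lookups together, a minimal sketch might look like the
following. It assumes the current bs4 package (the 2008 thread used
BeautifulSoup 3, where find_all is spelled findAll) and parses the row
markup quoted in the question instead of fetching the page. The helper
name extract_fields is hypothetical, chosen only for illustration.

```python
from bs4 import BeautifulSoup  # assumption: modern bs4, not BeautifulSoup 3

# Row markup copied from the question; wrapped in <table> for context.
HTML = """<table><tr align="center"><th width="5%"><a onclick="t('Only|G|BoT|05','#1');" href="#">#1</a></th><td width="10%">44.4MB</td><td width="90%" align="left"><font color="orange"> Pc-prova.rar </font></td></tr></table>"""


def extract_fields(row):
    """Return (bot, pack, size, name) for one <tr> tag (hypothetical helper)."""
    # Splitting "t('Only|G|BoT|05','#1');" on "'" gives
    # ["t(", "Only|G|BoT|05", ",", "#1", ");"]: quoted values sit at odd indices.
    parts = row.find("a")["onclick"].split("'")
    bot, pack = parts[1], parts[3]
    size = row.find("td").string            # first <td> holds the size
    name = row.find("font").string.strip()  # drop the padding spaces
    return bot, pack, size, name


soup = BeautifulSoup(HTML, "html.parser")
rows = soup.find_all("tr", align="center")  # findAll in BeautifulSoup 3
fields = extract_fields(rows[0])
print(fields)
```

Splitting on the quote character is fragile if the onclick arguments
ever contain an apostrophe; for this page's format it is enough.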




