beautifulSoup 4.1

Sayth Renshaw flebber.crue at gmail.com
Fri Mar 20 03:18:33 EDT 2015


On Friday, 20 March 2015 15:20:41 UTC+11, Sayth Renshaw  wrote:
> Hi,
> 
> Probably a very easy question.
> 
> I have a section of HTML:
> 
> <tr>
> <td class="abbreviation">App</td>
> <td>Approaching</td>
> <td class="abbreviation">D/N</td>
> <td>Did nothing</td>
> <td class="abbreviation">DGO</td>
> <td>Didn't go on</td>
> <td class="abbreviation">DRO</td>
> <td>Didn't run on</td>
> <td class="abbreviation">H/In</td>
> <td>Hung in</td>
> <td class="abbreviation">H/Out</td>
> <td>Hung out</td></tr>
> <tr>
> 
> I can easily get the cells with class "abbreviation" out:
> 
> In [69]: soup.find_all("td", class_="abbreviation")
> Out[69]:
> [<td class="abbreviation">App</td>,
>  <td class="abbreviation">D/N</td>,
>  <td class="abbreviation">DGO</td>,
>  <td class="abbreviation">DRO</td>,
>  <td class="abbreviation">H/In</td>,
>  <td class="abbreviation">H/Out</td>,
>  <td class="abbreviation">H/S</td>,
>  <td class="abbreviation">J awk</td>,
>  <td class="abbreviation">k-up</td>,
> 
> Or just the values:
> 
> In [70]: tds = soup.find_all("td", class_="abbreviation")
> 
> In [71]: for entry in tds:
>    ....:     print(entry.contents[0])
>    ....:
> App
> D/N
> DGO
> DRO
> H/In
> H/Out
> H/S
> J awk
> k-up
> 
> But how can I get the value of the td that follows each one? That is, for
> 
> <td class="abbreviation">App</td> I would get <td>Approaching</td>
> 
> so that when creating a CSV I could print
> 
> App Approaching
> 
> _______________________
> Abbr   | Meaning      |
> _______________________
> App    | Approaching  |
> 
> 
> I know how to do the CSV writing, but I'm not quite a whizz with soup yet. I have been reading http://www.crummy.com/software/BeautifulSoup/bs4/doc/#searching-by-css-class
> 
> Thanks
> 
> Sayth
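
One way to get the table asked for above, without touching siblings at all, is to take every <td> in a row and pair them off two at a time. This is only a sketch: it assumes the cells strictly alternate abbreviation, meaning, as in the snippet, and the helper name row_pairs and the in-memory buffer are made up for illustration.

```python
import csv
import io

from bs4 import BeautifulSoup

# Shortened copy of the row from the message above.
html = """<table><tr>
<td class="abbreviation">App</td>
<td>Approaching</td>
<td class="abbreviation">D/N</td>
<td>Did nothing</td>
<td class="abbreviation">DGO</td>
<td>Didn't go on</td></tr></table>"""

soup = BeautifulSoup(html, "html.parser")


def row_pairs(row):
    """Pair cells as (abbreviation, meaning).

    Assumption: cells alternate abbreviation, meaning within the row.
    """
    cells = row.find_all("td")
    return [
        (abbr.get_text(strip=True), meaning.get_text(strip=True))
        for abbr, meaning in zip(cells[0::2], cells[1::2])
    ]


# Write to an in-memory buffer; a real script would open a file instead.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["Abbr", "Meaning"])
for row in soup.find_all("tr"):
    writer.writerows(row_pairs(row))

csv_text = buffer.getvalue()
print(csv_text)
```

If a row ever had a stray extra cell, the pairing would drift, so this is only safe while the layout stays strictly alternating.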

I find it odd that the next sibling is a "\n" and not the next <td>; otherwise .next_sibling would be the perfect solution.

In [72]: tds = soup.find("td", class_="abbreviation")

In [73]: tds.next_sibling
Out[73]: u'\n'

In [74]: tds
Out[74]: <td class="abbreviation">App</td>
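
The "\n" is the line break between the two <td> tags in the source, which the parser keeps as a text node, so .next_sibling returns it before the next tag. A minimal sketch of stepping over it with find_next_sibling("td"), assuming a shortened copy of the row above and the standard-library "html.parser" backend (the helper name abbreviation_pairs is made up for illustration):

```python
from bs4 import BeautifulSoup

# Shortened copy of the row from the original message.
html = """<table><tr>
<td class="abbreviation">App</td>
<td>Approaching</td>
<td class="abbreviation">D/N</td>
<td>Did nothing</td>
</tr></table>"""

soup = BeautifulSoup(html, "html.parser")


def abbreviation_pairs(soup):
    """Return (abbreviation, meaning) tuples using find_next_sibling."""
    pairs = []
    for cell in soup.find_all("td", class_="abbreviation"):
        # find_next_sibling only considers tags, so the whitespace
        # text node between the cells is skipped.
        meaning = cell.find_next_sibling("td")
        if meaning is not None:
            pairs.append(
                (cell.get_text(strip=True), meaning.get_text(strip=True))
            )
    return pairs


first = soup.find("td", class_="abbreviation")
print(repr(first.next_sibling))       # the whitespace text node
print(first.find_next_sibling("td"))  # the following <td> tag
print(abbreviation_pairs(soup))
```

Calling .next_sibling twice would also reach the <td> here, but that depends on exactly one text node sitting between the cells, so find_next_sibling is probably the less fragile choice.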

Sayth


