beautifulSoup 4.1
Sayth Renshaw
flebber.crue at gmail.com
Fri Mar 20 03:18:33 EDT 2015
On Friday, 20 March 2015 15:20:41 UTC+11, Sayth Renshaw wrote:
> Hi
>
> Probably a very easy question.
>
> I have a section of HTML:
>
> <tr>
> <td class="abbreviation">App</td>
> <td>Approaching</td>
> <td class="abbreviation">D/N</td>
> <td>Did nothing</td>
> <td class="abbreviation">DGO</td>
> <td>Didn't go on</td>
> <td class="abbreviation">DRO</td>
> <td>Didn't run on</td>
> <td class="abbreviation">H/In</td>
> <td>Hung in</td>
> <td class="abbreviation">H/Out</td>
> <td>Hung out</td></tr>
> <tr>
>
> I can easily get the class values out.
>
> In [69]: soup.find_all("td", class_="abbreviation")
> Out[69]:
> [<td class="abbreviation">App</td>,
> <td class="abbreviation">D/N</td>,
> <td class="abbreviation">DGO</td>,
> <td class="abbreviation">DRO</td>,
> <td class="abbreviation">H/In</td>,
> <td class="abbreviation">H/Out</td>,
> <td class="abbreviation">H/S</td>,
> <td class="abbreviation">J awk</td>,
> <td class="abbreviation">k-up</td>,
>
> Or just values
>
> In [70]: tds = soup.find_all("td", class_="abbreviation")
>
> In [71]: for entry in tds:
> ....:     print(entry.contents[0])
> ....:
> App
> D/N
> DGO
> DRO
> H/In
> H/Out
> H/S
> J awk
> k-up
>
> But how can I get the value of the following td? That is, for
>
> <td class="abbreviation">App</td> I would get <td>Approaching</td>
>
> so that when creating a csv I could use
>
> print App Approaching
>
> ______________________
> Abbr | Meaning |
> ______________________
> App | Approaching |
>
>
> I know how to do the csv writing, but I'm not quite a whiz with soup yet. I have been reading here: http://www.crummy.com/software/BeautifulSoup/bs4/doc/#searching-by-css-class
>
> Thanks
>
> Sayth
I just find it odd that the next sibling is a "\n" and not the next <td>; otherwise that would be the perfect solution.
In [72]: tds = soup.find("td", class_="abbreviation")
In [73]: tds.next_sibling
Out[73]: u'\n'
In [74]: tds
Out[74]: <td class="abbreviation">App</td>
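The "\n" appears because .next_sibling returns the very next node in the parse tree, and the line break between the two <td> tags is itself a text node. One possible way around it, as a minimal sketch, is find_next_sibling("td"), which skips over text nodes and returns the next matching tag. The html string, the "html.parser" choice, and the pairs list below are assumptions for illustration, trimmed from the snippet quoted above.

```python
from bs4 import BeautifulSoup

# Assumed sample: a shortened copy of the row quoted above.
html = """
<table>
<tr>
<td class="abbreviation">App</td>
<td>Approaching</td>
<td class="abbreviation">D/N</td>
<td>Did nothing</td>
</tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

pairs = []
for abbr in soup.find_all("td", class_="abbreviation"):
    # find_next_sibling("td") ignores the whitespace text node
    # and returns the next <td> tag on the same level.
    meaning = abbr.find_next_sibling("td")
    if meaning is not None:
        pairs.append((abbr.get_text(strip=True), meaning.get_text(strip=True)))

for abbr_text, meaning_text in pairs:
    print("{} | {}".format(abbr_text, meaning_text))
```

Each (abbreviation, meaning) tuple in pairs could then be passed to csv.writer.writerow() as one row.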
Sayth