python-parser running Beautiful Soup only spits out one line of 10. What have I gotten wrong here?

John Nagle nagle at animats.com
Sat Dec 25 13:36:22 EST 2010


    Your program is doing what you asked it to do.  It finds the
first table with class 'bp_ergebnis_tab_info'.  Then it ignores
that result, because the next search starts from "soup" (the whole
document) and not from "table".  So it finds the first "td" item in
the document and prints the contents of that.  Then it exits.  What
did you want it to do?
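
    To make the difference concrete, here is a minimal sketch.  It
is not BeautifulSoup: it uses xml.etree.ElementTree from the Python 3
standard library so that it runs anywhere, and the markup is an
invented, well-formed fragment, not the real page (whose structure I
am only assuming).  The point is the same, though: a search that
starts at the document root returns the first "td" anywhere, while a
search that starts at the table returns the first "td" in that table.

```python
import xml.etree.ElementTree as ET

# Invented sample markup (an assumption, not the real page).
SAMPLE = """
<html><body>
  <table><tr><td><strong>Schuldaten</strong></td></tr></table>
  <table class="bp_ergebnis_tab_info">
    <tr><td>Name</td><td>Example School</td></tr>
  </table>
</body></html>
"""

def inner_text(element):
    """Return all text inside an element, with surrounding whitespace removed."""
    return "".join(element.itertext()).strip()

root = ET.fromstring(SAMPLE.strip())

# Select the table by its class attribute.
table = root.find(".//table[@class='bp_ergebnis_tab_info']")

# Searching from the root ignores the table selected above.
first_td_in_document = root.find(".//td")

# Searching from the table stays inside that table.
first_td_in_table = table.find(".//td")

print(inner_text(first_td_in_document))
print(inner_text(first_td_in_table))
```

ElementTree only accepts well-formed XML, so for real-world HTML
a tolerant parser such as BeautifulSoup is still the better tool.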

    Try this.  It prints out the TD items on each
row of the table, in order.

import urllib2
from BeautifulSoup import BeautifulSoup
page = urllib2.urlopen("http://www.schulministerium.nrw.de/BP/SchuleSuchen?action=799.601437941842&SchulAdresseMapDO=142323")
soup = BeautifulSoup(page)
table = soup.find('table' ,attrs={'class':'bp_ergebnis_tab_info'})
for row in table.findAll('tr') : # for all TR items (table rows)
     for td in row.findAll('td') : # for TD items in row
         text = td.renderContents().strip()
         print(text)
     print('-----') # mark end of row
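
    If you would rather collect the cells than print them, the same
row-by-row loop can build one list per row.  This is only a sketch
under assumptions: it uses xml.etree.ElementTree from the Python 3
standard library instead of BeautifulSoup, the table fragment is
invented, and rows_as_lists is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Invented sample table (an assumption, not the real page).
SAMPLE = """
<table class="bp_ergebnis_tab_info">
  <tr><td>Name</td><td>Example School</td></tr>
  <tr><td>City</td><td>Example Town</td></tr>
</table>
"""

def rows_as_lists(table):
    """Return one list of stripped cell texts per table row."""
    rows = []
    for tr in table.iter("tr"):          # for all TR items (table rows)
        cells = ["".join(td.itertext()).strip()
                 for td in tr.iter("td")]  # for TD items in row
        rows.append(cells)
    return rows

table = ET.fromstring(SAMPLE.strip())
for cells in rows_as_lists(table):
    print(" | ".join(cells))
```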

				John Nagle

On 12/25/2010 9:58 AM, Martin Kaspar wrote:
> Hello dear community,
> I am trying to get a scraper up and running, and I keep running into
> problems.
>
> When I try what I have learned so far, I only get:
> <strong>Schuldaten</strong>
>
> Here is the code that I used:
>
> import urllib2
> from BeautifulSoup import BeautifulSoup
> page = urllib2.urlopen("http://www.schulministerium.nrw.de/BP/SchuleSuchen?action=799.601437941842&SchulAdresseMapDO=142323")
> soup = BeautifulSoup(page)
> table = soup.find('table' ,attrs={'class':'bp_ergebnis_tab_info'})
> first_td = soup.find('td')
> text = first_td.renderContents()
> trimmed_text = text.strip()
> print trimmed_text
>
>
> I run it in the template at http://scraperwiki.com/scrapers/new/python
>
> see the target: http://www.schulministerium.nrw.de/BP/SchuleSuchen?action=799.601437941842&SchulAdresseMapDO=142323
>
> What have I gotten wrong?
>
> Can anybody review the code?
>
> Many thanks in advance
>
> regards
> matze



