scraping from bundes-telefonbuch.de with python

Sat Jun 19 11:02:44 EDT 2010

On 06/19/2010 04:23 AM, davidgp wrote:
> opener = urllib2.build_opener()
> opener.addheaders = [('User-Agent', 'Mozilla/5.0 (compatible;
> Konqueror/3.5; Linux) KHTML/3.5.4 (like Gecko)')]
> urllib2.install_opener(opener)
> 
> data = urllib.urlencode({'F0': 'mySearchKeyword','B': 'T','F8': 'A ||
> G','W': '1','Z': '0','HA': '10','SAS_static_0_treffer_treffer': 'Suche
> starten','S': '1','translationtemplate': 'checkstrasse'})
> 
> url = 'http://www.bundes-telefonbuch.de/cgi-btbneu/chtml/chtml?WA=20'
> response = urllib2.urlopen(url, data)
> 
> this returns a page saying i have to reenter my search terms..
> what's going wrong here?

Most likely you need a cookie.  You'll probably have to set up a cookie
store for use with urllib2, then request the page that the search form
is on so that the cookie is generated, and then make your post with your
search terms.