Trying to read search results from Google's web page

Wojtek Walczak gminick at bzt.bzt
Wed Aug 20 09:58:24 EDT 2008


On Wed, 20 Aug 2008 05:42:34 -0700 (PDT), tedpottel at gmail.com wrote:

> the web page.  When I try to load in a url with the search results,
> http://www.google.com/search?hl=en&q=ted', I get a web page that says
> I do not have permissions.  Is there a way around this, or is Google
> just too smart????

Try to imitate a web browser. Add a 'User-Agent' header (with the
add_header method) to your HTTP request. If that doesn't help, try
adding more browser-specific headers. Also, take a look at
mechanize and its Browser class:

http://wwwsearch.sourceforge.net/mechanize/
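
A minimal sketch of the add_header approach, assuming Python 3's
urllib.request (in Python 2 the same Request class lives in urllib2).
The User-Agent string and the build_request/fetch helper names are
made up for illustration, and I can't promise that any particular
set of headers will be accepted:

```python
from urllib.request import Request, urlopen

# Hypothetical browser-like User-Agent string, for illustration only.
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0"


def build_request(url, user_agent=BROWSER_UA):
    """Return a Request that carries browser-like headers."""
    req = Request(url)
    # Replace the default "Python-urllib/x.y" User-Agent.
    req.add_header("User-Agent", user_agent)
    # One more header a browser would normally send; add others if needed.
    req.add_header("Accept-Language", "en")
    return req


def fetch(url):
    """Open the URL with the prepared request and return the raw body."""
    with urlopen(build_request(url)) as resp:
        return resp.read()
```

Calling fetch('http://www.google.com/search?hl=en&q=ted') would then
send the request with those headers instead of the library defaults.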

FYI and AFAIK, Google doesn't allow its search engine to be used
in this way. They even block IP addresses that keep abusing the
search engine with too many requests.

-- 
Regards,
Wojtek Walczak,
http://tosh.pl/gminick/


