trouble getting google through urllib

Will McGugan will at willmcgugan.com
Tue Dec 19 09:13:05 EST 2006


Duncan Booth wrote:

> >
> > Google doesn't like Python scripts. You will need to pretend to be a
> > browser by setting the user-agent string in the HTTP header.
> >
> and possibly also run the risk of having your system blocked by Google if
> they figure out you are lying to them?

It is possible. I wrote a 'googlewhack' (remember them?) script a while
ago, which downloaded as many Google pages as my ADSL connection could
handle, and they didn't punish me for it. Apparently, though, they do
issue short-term bans on IPs that abuse their service.

It is best to play nice of course. I would recommend using their
official APIs if possible!
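
For reference, setting the user-agent string mentioned above might look
roughly like the sketch below. It is only an illustration and assumes
Python 3's urllib.request (at the time of this thread the equivalent
module was urllib2). The user-agent value, the URL, and the helper names
build_request and fetch are made up for the example.

```python
import urllib.request

# Hypothetical values, chosen only for illustration.
USER_AGENT = "Mozilla/5.0 (compatible; example-script)"
URL = "http://www.google.com/search?q=python"


def build_request(url, user_agent=USER_AGENT):
    """Return a Request object that carries a custom User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url, user_agent=USER_AGENT, timeout=10):
    """Send the request and return the raw response body as bytes."""
    request = build_request(url, user_agent)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

Calling fetch(URL) would perform the actual download. Whether the server
accepts the request still depends on its own policies, so the point about
playing nice applies here too.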


Will McGugan
--
http://www.willmcgugan.com



