stealth screen scraping with python?

Dotan Cohen dotancohen at gmail.com
Fri May 11 15:39:35 EDT 2007


On 11 May 2007 12:32:55 -0700, different.engine at gmail.com
<different.engine at gmail.com> wrote:
> Folks:
>
> I am screen scraping a large volume of data from Yahoo Finance each
> evening, and parsing with Beautiful Soup.
>
> I was wondering if anyone could give me some pointers on how to make
> it less obvious to Yahoo that this is what I am doing, as I fear that
> they probably monitor for this type of activity, and will soon ban my
> IP.
>
> -DE
>

So long as you are sending a regular HTTP request, as a browser would,
they will have little way of telling you apart from ordinary traffic.
Just keep your queries down to no more than one every 3-5 seconds and
you should be fine. Rotate your IP, too, if you can.
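
A minimal sketch of the throttling part, assuming Python 3 and only the
standard library (urllib.request). The names fetch, throttled_fetch and
BROWSER_HEADERS, and the User-Agent string itself, are my own
placeholders and not anything Yahoo requires. It does not cover IP
rotation.

```python
import random
import time
import urllib.request

# Hypothetical header value: any browser-style User-Agent string would do here.
BROWSER_HEADERS = {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64)"}


def fetch(url, timeout=30):
    """Fetch one page with browser-like headers and return the raw bytes."""
    request = urllib.request.Request(url, headers=BROWSER_HEADERS)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


def throttled_fetch(urls, fetch_one=fetch, min_delay=3.0, max_delay=5.0,
                    sleep=time.sleep):
    """Yield (url, body) pairs, pausing min_delay..max_delay seconds between requests."""
    for index, url in enumerate(urls):
        if index > 0:
            # Randomise the pause so the requests are not evenly spaced.
            sleep(random.uniform(min_delay, max_delay))
        yield url, fetch_one(url)
```

Each body that throttled_fetch yields can be handed to Beautiful Soup
as before. The fetch_one and sleep parameters exist only so the loop
can be tried out without touching the network or actually waiting.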

Dotan Cohen

http://lyricslist.com/lyrics/artist_albums/110/carmen_eric.html
http://what-is-what.com/what_is/eula.html


