Network failure when using urllib2

jdvolz at gmail.com jdvolz at gmail.com
Mon Jan 8 19:30:03 EST 2007


I am fetching different web pages (never the same one) from a web
server.  Does that make a difference to whether they would try to
block me?  Also, if it were only that site blocking me, why does the
internet stop working in other programs when this happens in the
script?  It is almost as if something sees a lot of traffic from my
computer and cuts it off, thinking it is some kind of virus or worm.
I am starting to suspect my firewall.  Has anyone else had this
happen?
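
One thing I may try in the meantime is pausing between requests.  A
minimal sketch of the kind of throttled loop I have in mind (the URLs
and the two-second delay are placeholders, not my real values):

    import time
    import urllib2

    # Placeholder URLs; the real script walks pages spider-fashion.
    urls = ['http://example.com/a.html', 'http://example.com/b.html']

    for url in urls:
        try:
            page = urllib2.urlopen(url).read()
        except urllib2.URLError, e:
            print 'failed to fetch %s: %s' % (url, e)
            continue
        # ... process the page here ...
        time.sleep(2)  # pause between requests; the delay is a guess

If a delay like that makes the problem go away, it would point at
rate-based blocking rather than the firewall.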

I am going to read over that documentation you suggested to see if I
can get any ideas.  Thanks for the link.

Shuad

On Jan 8, 4:15 pm, "Ravi Teja" <webravit... at gmail.com> wrote:
> jdv... at gmail.com wrote:
> > I have a script that uses urllib2 to repeatedly lookup web pages (in a
> > spider sort of way).  It appears to function normally, but if it runs
> > too long I start to get 404 responses.  If I try to use the internet
> > through any other programs (Outlook, Firefox, etc.), it will also fail.
> > If I stop the script, the internet returns.
>
> > Has anyone observed this behavior before?  I am relatively new to
> > Python and would appreciate any suggestions.
>
> > Shuad
>
> I am assuming that you are fetching the full page every little while.
> You are not supposed to do that. The admin of the web site you are
> constantly hitting probably configured his server to block you
> temporarily when that happens. But don't feel bad :-). This is a common
> beginner's mistake.
>
> Read here on the proper way to do this:
> http://diveintopython.org/http_web_services/review.html
> especially 11.3.3, Last-Modified/If-Modified-Since, on the next page.
> 
> Ravi Teja.
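
For anyone who finds this thread later: the conditional GET that
section 11.3.3 describes looks roughly like this with urllib2 (a
minimal sketch, assuming a hypothetical URL and a Last-Modified value
saved from an earlier fetch):

    import urllib2

    # Hypothetical URL and a Last-Modified value kept from a prior fetch.
    url = 'http://example.com/page.html'
    last_modified = 'Mon, 08 Jan 2007 12:00:00 GMT'

    request = urllib2.Request(url)
    request.add_header('If-Modified-Since', last_modified)

    try:
        response = urllib2.urlopen(request)
        data = response.read()  # 200 OK: the page changed, body returned
        new_lm = response.info().getheader('Last-Modified')
        if new_lm:
            last_modified = new_lm  # remember for the next request
    except urllib2.HTTPError, e:
        if e.code == 304:
            pass  # 304 Not Modified: reuse the copy from last time
        else:
            raise

Note that urllib2 raises HTTPError for the 304 response because it is
not a 2xx status, so the except branch is where the "nothing changed"
case lands.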



