How to test a URL request in a "while True" loop

Brian D briandenzer at gmail.com
Wed Dec 30 15:00:29 EST 2009


On Dec 30, 12:31 pm, Philip Semanchuk <phi... at semanchuk.com> wrote:
> On Dec 30, 2009, at 11:00 AM, Brian D wrote:
>
> > I'm actually using mechanize, but that's too complicated for testing
> > purposes. Instead, I've simulated in a urllib2 sample below an attempt
> > to test for a valid URL request.
>
> > I'm attempting to craft a loop that will trap failed attempts to
> > request a URL (in cases where the connection intermittently fails),
> > and repeat the URL request a few times, stopping after the Nth attempt
> > is tried.
>
> > Specifically, in the example below, a bad URL is requested for the
> > first and second iterations. On the third iteration, a valid URL will
> > be requested. The valid URL will be requested until the 5th iteration,
> > when a break statement is reached to stop the loop. The 5th iteration
> > also restores the values to their original state for ease of repeat
> > execution.
>
> > What I don't understand is how to test for a valid URL request, and
> > then jump out of the "while True" loop to proceed to another line of
> > code below the loop. There's probably faulty logic in this approach. I
> > imagine I should wrap the URL request in a function, and perhaps store
> > the response as a global variable.
>
> > This is really more of a basic Python logic question than it is a
> > urllib2 question.
>
> Hi Brian,
> While I don't fully understand what you're trying to accomplish by  
> changing the URL to google.com after 3 iterations, I suspect that some  
> of your trouble comes from using "while True". Your code would be  
> clearer if the while clause actually stated the exit condition. Here's  
> a suggestion (untested):
>
> MAX_ATTEMPTS = 5
>
> count = 0
> while count < MAX_ATTEMPTS:
>     count += 1
>     try:
>        print 'attempt ' + str(count)
>        request = urllib2.Request(url, None, headers)
>        response = urllib2.urlopen(request)
>        if response:
>           print 'True response.'
>           break  # got a valid response; stop retrying
>     except urllib2.URLError:
>        print 'fail ' + str(count)
>
> You could also save the results (untested):
>
> MAX_ATTEMPTS = 5
>
> count = 0
> results = [ ]
> while count < MAX_ATTEMPTS:
>     count += 1
>     try:
>        print 'attempt ' + str(count)
>        request = urllib2.Request(url, None, headers)
>        f = urllib2.urlopen(request)
>        # Note that here I ignore the doc that says "None may be
>        # returned if no handler handles the request". Caveat emptor.
>        results.append(f.info())
>        f.close()
>     except urllib2.URLError:
>        # Even better, append actual reasons for the failure.
>        results.append(False)
>
> for result in results:
>     print result
>
> I guess if you're going to do the same number of attempts each time, a  
> for loop would be more expressive, but you probably get the idea.
>
> Hope this helps
> Philip

Nice to have options, Philip. Thanks! I'll give your solution a try in
mechanize as well. I really can't thank you enough for contributing to
helping me solve this issue. I love Python.
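[Editor's note] The retry-and-break pattern discussed in this thread can be sketched roughly as below. This is a modernised Python 3 rendering (urllib2 became urllib.request); the function name fetch_with_retries and the injectable opener parameter are illustrative assumptions, not from the original posts. Injecting the opener lets the loop logic be demonstrated without any real network access.

```python
import urllib.request
from urllib.error import URLError

def fetch_with_retries(url, opener=None, max_attempts=5):
    """Try to open `url` up to `max_attempts` times.

    Returns whatever the opener returns on the first success,
    or None if every attempt raises URLError.
    """
    open_url = opener if opener is not None else urllib.request.urlopen
    for attempt in range(1, max_attempts + 1):
        print('attempt', attempt)
        try:
            response = open_url(url)
        except URLError:
            print('fail', attempt)
        else:
            return response  # success: leave the loop immediately
    return None  # every attempt failed

# Demonstration with a stand-in opener that fails twice, then succeeds,
# so the loop logic can be exercised offline.
calls = []
def flaky_opener(url):
    calls.append(url)
    if len(calls) < 3:
        raise URLError('simulated connection failure')
    return 'fake response'

result = fetch_with_retries('http://example.com/', opener=flaky_opener)
```

Returning from inside the loop (or using break) is the "jump out" the original poster asked about; the for/else shape Philip alludes to ("a for loop would be more expressive") works the same way, with the else clause running only if the loop finishes without a break.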



More information about the Python-list mailing list