How to process multiple redirects?

David Wahler dwahler at gmail.com
Wed Oct 26 14:25:43 EDT 2005


Gonnasi wrote:
> I want to fetch some articles from nytimes.com for my Palm, and I want
> a clear, simple version of each article, since my Palm has only 8 MB of RAM.
>
> With wget, I can fetch a page such as
> "http://www.nytimes.com/2005/10/26/business/26fed.html?pagewanted=print",
> and while wget runs, I can see that the URL is redirected many times.
>
> When I run the code below in Python:
> >>> thing = urllib2.HTTPRedirectHandler()
> >>> opener = urllib2.build_opener(thing)
> >>> url = 'http://www.nytimes.com/2005/10/26/business/26fed.html?pagewanted=print'
> >>> page = opener.open(url)
>
> I just get an error message: "HTTP Error 302: The HTTP server returned a
> redirect error that would lead to an infinite loop. The last 30x error
> message was: Moved Temporarily"
>
> Why can't I fetch the page with Python when wget can?
>
> Thanks for your help in advance!
>
> --
> Gonnasi

Hi,

Your problem is that cookies are not being preserved from one request to
the next. nytimes.com redirects you to an automatic login page that
sets a cookie, and that cookie is required to view the original page.
Without it, the site sends you back to the login page again, and the
redirects repeat until urllib2 gives up with the error you saw. This
fixes the problem:

>>> import urllib2
>>> thing = urllib2.HTTPRedirectHandler()
>>> # HTTPCookieProcessor stores cookies and sends them back on later requests
>>> thing2 = urllib2.HTTPCookieProcessor()
>>> opener = urllib2.build_opener(thing, thing2)
>>> url = 'http://www.nytimes.com/2005/10/26/business/26fed.html?pagewanted=print'
>>> page = opener.open(url)
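
For anyone reading this on Python 3, where urllib2 was folded into
urllib.request, the same idea might look roughly like the sketch below.
It assumes the Python 3 standard library modules urllib.request and
http.cookiejar. build_cookie_opener is a hypothetical helper name used
only for illustration. The actual fetch is left commented out because
it needs network access and the site may behave differently today.

```python
# Sketch for Python 3: an opener that keeps cookies across redirects.
import http.cookiejar
import urllib.request


def build_cookie_opener(jar=None):
    """Return (opener, jar); the opener stores cookies and resends them."""
    # Compare against None explicitly: an empty CookieJar is falsy.
    if jar is None:
        jar = http.cookiejar.CookieJar()
    # build_opener() installs HTTPRedirectHandler by default, so only the
    # cookie processor has to be passed in explicitly.
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    return opener, jar


opener, jar = build_cookie_opener()

# Fetching requires network access, so it is shown but not executed here:
# url = 'http://www.nytimes.com/2005/10/26/business/26fed.html?pagewanted=print'
# page = opener.open(url)
# html = page.read()
```

Keeping a reference to the jar is optional, but it makes it easy to
inspect which cookies the server set along the redirect chain (for
example, by iterating over jar after a request).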

Hope this helps, 

-- David



