[Python-checkins] python/dist/src/Lib urllib2.py,1.39,1.40

jhylton@users.sourceforge.net
Sun, 04 May 2003 16:44:51 -0700


Update of /cvsroot/python/python/dist/src/Lib
In directory sc8-pr-cvs1:/tmp/cvs-serv22613

Modified Files:
	urllib2.py 
Log Message:
Repair redirect handling and raise URLError on host-not-found.

The latest changes to the redirect handler cannot have been tested:
they never computed newurl and so failed with a NameError.  The
if __name__ == "__main__": block has a test for redirects.
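
For illustration only, the repaired redirect logic can be sketched as
below.  This is a minimal sketch in modern Python, not the committed
code: resolve_redirect() and should_follow() are hypothetical helper
names, and headers is assumed to be a plain dict with lower-case keys
rather than the message object urllib2 actually passes in.

```python
from urllib.parse import urljoin


def resolve_redirect(base_url, headers):
    """Return the absolute redirect target, or None if no header names one.

    Assumption: headers is a plain dict with lower-case keys.
    """
    if "location" in headers:
        newurl = headers["location"]
    elif "uri" in headers:
        newurl = headers["uri"]
    else:
        return None
    # The header value may be relative, so resolve it against the
    # URL of the original request.
    return urljoin(base_url, newurl)


def should_follow(code, method):
    """Mirror the status-code/method test used in the patch."""
    return (code in (301, 302, 303, 307) and method in ("GET", "HEAD")
            or code in (302, 303) and method == "POST")
```

A relative Location value such as "c.html" received for
http://example.com/a/b.html resolves to http://example.com/a/c.html,
and a response with neither header yields None so that another handler
may try.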

Also, fix SF bug 723831.  A urlopen() that failed because the host was
not found raised socket.gaierror instead of URLError, unlike earlier
versions of urllib2.  The cause is that, starting with Python 2.2,
httplib establishes the connection at a different point.  Move the
try/except to endheaders(), which is where the connection now gets
established.
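
The effect of moving the try/except can be sketched as follows.  This
is a hedged illustration in modern Python, not the patched do_open():
FakeConnection is a hypothetical stand-in that, like the connection
described above, does nothing on the network until endheaders() is
called, and send_request() is an invented helper name.

```python
import socket
from urllib.error import URLError


class FakeConnection:
    """Hypothetical connection object that connects lazily."""

    def __init__(self, host):
        self.host = host
        self.request_line = None

    def putrequest(self, method, selector):
        # Only buffers the request line; no network activity here,
        # so no socket error can be raised at this point.
        self.request_line = "%s %s" % (method, selector)

    def endheaders(self):
        # The connection attempt happens here; simulate a failed
        # host lookup (the error number is illustrative).
        raise socket.gaierror(-2, "host not found")


def send_request(conn, selector):
    conn.putrequest("GET", selector)
    try:
        conn.endheaders()
    except OSError as err:  # socket.gaierror is a subclass of OSError
        # Translate the low-level error into URLError, keeping the
        # original exception as the reason.
        raise URLError(err)
```

With the try/except placed around putrequest() instead, the gaierror
raised from endheaders() would escape to the caller unwrapped, which
is the behaviour the bug report describes.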



Index: urllib2.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/Lib/urllib2.py,v
retrieving revision 1.39
retrieving revision 1.40
diff -C2 -d -r1.39 -r1.40
*** urllib2.py	24 Apr 2003 15:32:12 -0000	1.39
--- urllib2.py	4 May 2003 23:44:49 -0000	1.40
***************
*** 417,428 ****
          raise HTTPError if no-one else should try to handle this url.  Return
          None if you can't but another Handler might.
- 
          """
!         if (code in (301, 302, 303, 307) and req.method() in ("GET", "HEAD") or
!             code in (302, 303) and req.method() == "POST"):
!             # Strictly (according to RFC 2616), 302 in response to a POST
!             # MUST NOT cause a redirection without confirmation from the user
!            # (of urllib2, in this case).  In practice, essentially all clients
!             # do redirect in this case, so we do the same.
              return Request(newurl, headers=req.headers)
          else:
--- 417,439 ----
          raise HTTPError if no-one else should try to handle this url.  Return
          None if you can't but another Handler might.
          """
!         # XXX 301 and 302 errors must have a location or uri header.
!         # Not sure about the other error codes.
!         if "location" in headers:
!             newurl = headers["location"]
!         elif "uri" in headers:
!             newurl = headers["uri"]
!         else:
!             return
!         newurl = urlparse.urljoin(req.get_full_url(), newurl)
!         
!         m = req.get_method()
!         if (code in (301, 302, 303, 307) and m in ("GET", "HEAD")
!             or code in (302, 303) and m == "POST"):
!             # Strictly (according to RFC 2616), 302 in response to a
!             # POST MUST NOT cause a redirection without confirmation
!             # from the user (of urllib2, in this case).  In practice,
!             # essentially all clients do redirect in this case, so we
!             # do the same.
              return Request(newurl, headers=req.headers)
          else:
***************
*** 778,781 ****
--- 789,795 ----
  class AbstractHTTPHandler(BaseHandler):
  
+     # XXX Should rewrite do_open() to use the new httplib interface,
+     # which would be a little simpler.
+ 
      def do_open(self, http_class, req):
          host = req.get_host()
***************
*** 783,800 ****
              raise URLError('no host given')
  
!         try:
!             h = http_class(host) # will parse host:port
!             if req.has_data():
!                 data = req.get_data()
!                 h.putrequest('POST', req.get_selector())
!                 if not 'Content-type' in req.headers:
!                     h.putheader('Content-type',
!                                 'application/x-www-form-urlencoded')
!                 if not 'Content-length' in req.headers:
!                     h.putheader('Content-length', '%d' % len(data))
!             else:
!                 h.putrequest('GET', req.get_selector())
!         except socket.error, err:
!             raise URLError(err)
  
          scheme, sel = splittype(req.get_selector())
--- 797,811 ----
              raise URLError('no host given')
  
!         h = http_class(host) # will parse host:port
!         if req.has_data():
!             data = req.get_data()
!             h.putrequest('POST', req.get_selector())
!             if not 'Content-type' in req.headers:
!                 h.putheader('Content-type',
!                             'application/x-www-form-urlencoded')
!             if not 'Content-length' in req.headers:
!                 h.putheader('Content-length', '%d' % len(data))
!         else:
!             h.putrequest('GET', req.get_selector())
  
          scheme, sel = splittype(req.get_selector())
***************
*** 807,811 ****
          for k, v in req.headers.items():
              h.putheader(k, v)
!         h.endheaders()
          if req.has_data():
              h.send(data)
--- 818,825 ----
          for k, v in req.headers.items():
              h.putheader(k, v)
!         try:
!             h.endheaders()
!         except socket.error, err:
!             raise URLError(err)
          if req.has_data():
              h.send(data)