[ python-Bugs-745097 ] urllib2 doesn't handle urls without scheme

SourceForge.net noreply at sourceforge.net
Sun May 22 14:25:18 CEST 2005


Bugs item #745097, was opened at 2003-05-28 19:54
Message generated for change (Comment added) made by jjlee
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=745097&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Python Library
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Jack Jansen (jackjansen)
Assigned to: Nobody/Anonymous (nobody)
Summary: urllib2 doesn't handle urls without scheme

Initial Comment:
urllib2.urlopen does not handle URLs without a scheme, so the 
following code will not work:
    import urllib, urllib2
    url = urllib.pathname2url('/etc/passwd')
    urllib2.urlopen(url)
The same code does work with urllib.urlopen.
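
For readers on current Python, the report can be reproduced with a short sketch. It assumes Python 3, where urllib2.urlopen lives on as urllib.request.urlopen; the helper name try_open is made up for illustration and is not part of any library. Nothing is read from disk or the network here, because the scheme check fails while the request object is being built.

```python
import urllib.request


def try_open(url):
    """Return the ValueError message urlopen raises for `url`, or None on success.

    Hypothetical helper for illustration only.
    """
    try:
        with urllib.request.urlopen(url):
            return None
    except ValueError as exc:
        # A URL with no scheme is rejected before any handler is chosen.
        return str(exc)


# '/etc/passwd' has no scheme, so this prints an error message, not None.
print(try_open("/etc/passwd"))
```

The rejection is a ValueError rather than an I/O error, which suggests the scheme-less URL is treated as malformed input rather than as a file that could not be opened.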

----------------------------------------------------------------------

Comment By: John J Lee (jjlee)
Date: 2005-05-22 13:25

Message:
Logged In: YES 
user_id=261020

That sounds like a feature request to me, not a bug. 
 
I agree it's desirable to have a better pathname2url (I haven't 
submitted one partly because I'm scared of getting it wrong!). 
 
I disagree that it should be a method, since OpenerDirector has 
no knowledge of base URL (and urllib2.Request or the response 
class also seem like the wrong places for that method: the URLs 
they have aren't always the URL you want to use as the base 
URL).  It would be nice to have a couple of functions 
urlparse.urlfrompathname(pathname) and 
urlparse.absurlfrompathname(pathname, baseurl) (better 
names / places for those, anyone?). 
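
A rough sketch of what those two functions might look like, assuming Python 3 and POSIX-style pathnames only. The names urlfrompathname and absurlfrompathname are taken from the suggestion above and are hypothetical; neither exists in the standard library. Only quote and urljoin from urllib.parse are real calls.

```python
import posixpath
from urllib.parse import quote, urljoin


def urlfrompathname(pathname):
    """Convert an absolute POSIX pathname to a file: URL (hypothetical helper)."""
    if not posixpath.isabs(pathname):
        # Without a base URL, a relative pathname cannot be made absolute.
        raise ValueError("absolute pathname required: %r" % (pathname,))
    return "file://" + quote(pathname)


def absurlfrompathname(pathname, baseurl):
    """Resolve a pathname against an explicit base URL (hypothetical helper)."""
    # quote() percent-encodes characters such as spaces; '/' is left alone.
    return urljoin(baseurl, quote(pathname))


print(urlfrompathname("/etc/passwd"))
print(absurlfrompathname("docs/read me.txt",
                         "http://example.com/project/index.html"))
```

Keeping these as plain functions means the caller always states the base URL explicitly, so no scheme has to be guessed.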
 
Or you could resubmit this as a bug in urllib for allowing relative 
URLs without knowing the base URL ;-) 
 

----------------------------------------------------------------------

Comment By: Jack Jansen (jackjansen)
Date: 2005-05-20 00:53

Message:
Logged In: YES 
user_id=45365

I'm not convinced it isn't a bug. I agree that the URL '/etc/passwd' isn't 
always a file: URL, but I think that in that case urllib2 should get its own 
pathname2url() method that returns URLs with the file: prefix.

----------------------------------------------------------------------

Comment By: John J Lee (jjlee)
Date: 2005-05-19 21:24

Message:
Logged In: YES 
user_id=261020

Could somebody close this?


----------------------------------------------------------------------

Comment By: John J Lee (jjlee)
Date: 2003-11-30 23:24

Message:
Logged In: YES 
user_id=261020

Is it wise to allow this?  Maybe it's unlikely to cause bugs, but 
"/etc/passwd" could refer to any URI scheme, not only file:. 
 
Since it seems reasonable to only allow absolute URLs, I think 
it's a bad idea to guess the scheme is file: when given a 
relative URL. 
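
To illustrate the ambiguity, here is a small sketch, assuming Python 3's urllib.parse (the successor to the urlparse module). The base URLs are invented examples. It shows that the same reference resolves to a different scheme depending on the base it is joined with.

```python
from urllib.parse import urljoin, urlsplit

reference = "/etc/passwd"

# The reference itself carries no scheme at all.
print(repr(urlsplit(reference).scheme))

# Invented base URLs, one per scheme.
bases = [
    "http://example.com/a/b.html",
    "ftp://ftp.example.org/pub/",
    "file:///home/user/",
]

# The scheme of each result comes from the base, not from the reference.
resolved = [urljoin(base, reference) for base in bases]
for url in resolved:
    print(urlsplit(url).scheme, url)
```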

----------------------------------------------------------------------
