[issue9679] unicode DNS names in urllib, urlopen

John Nagle report at bugs.python.org
Wed Jun 13 20:51:10 CEST 2012


John Nagle <nagle at users.sourceforge.net> added the comment:

An "IRI library" is not needed to fix this problem.  It's already fixed in the sockets library and the http library.  We just need consistency in urllib2.

urllib2 functions which take a "url" parameter should apply "encodings.idna.ToASCII" to each label of the domain name.  

urllib2 functions which return a "url" value (such as "geturl()") should apply "encodings.idna.ToUnicode" to each label of the domain name.

Note that in both cases, the conversion function must be applied to each label (field between "."s) of the domain name only.  Applying it to the entire domain name or the entire URL will not work. 
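As a rough illustration of the per-label conversion described above, here is a minimal sketch.  It is written for Python 3, where "encodings.idna.ToASCII" returns bytes; the helper names "host_to_ascii" and "host_to_unicode" are hypothetical and are not part of urllib2 or any proposed patch.  Extracting the host from a full URL is deliberately left out.

```python
from encodings import idna


def host_to_ascii(host):
    """Apply ToASCII to each dot-separated label of a host name.

    Hypothetical helper for illustration only. Empty labels (for
    example, from a trailing dot) are passed through unchanged,
    because ToASCII rejects empty labels.
    """
    return ".".join(
        idna.ToASCII(label).decode("ascii") if label else ""
        for label in host.split(".")
    )


def host_to_unicode(host):
    """Apply ToUnicode to each dot-separated label of a host name.

    Hypothetical helper for illustration only.
    """
    return ".".join(
        idna.ToUnicode(label) if label else ""
        for label in host.split(".")
    )


if __name__ == "__main__":
    print(host_to_ascii("bücher.example"))
    print(host_to_unicode("xn--bcher-kva.example"))
```

Plain ASCII host names pass through both helpers unchanged, so applying them unconditionally should be harmless for existing callers.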

If there are future changes to domain syntax, those should go into "encodings.idna", which is the proper library for domain syntax issues.

----------
nosy: +nagle

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue9679>
_______________________________________

