Internationalized domain names not working with URLopen

Hemanth H.M hemanth.hm at gmail.com
Wed Jun 13 11:32:17 EDT 2012


Well, not really! It does not work with '☃.net':

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.6/urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "/usr/lib/python2.6/urllib2.py", line 391, in open
    response = self._open(req, data)
  File "/usr/lib/python2.6/urllib2.py", line 409, in _open
    '_open', req)
  File "/usr/lib/python2.6/urllib2.py", line 369, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.6/urllib2.py", line 1170, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "/usr/lib/python2.6/urllib2.py", line 1116, in do_open
    h = http_class(host, timeout=req.timeout) # will parse host:port
  File "/usr/lib/python2.6/httplib.py", line 661, in __init__
    self._set_hostport(host, port)
  File "/usr/lib/python2.6/httplib.py", line 686, in _set_hostport
    raise InvalidURL("nonnumeric port: '%s'" % host[i+1:])
httplib.InvalidURL: nonnumeric port:
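
For what it's worth, here is a minimal sketch of the Punycode route that John asks about below, in the spirit of the linked answer. It assumes the only non-ASCII part of the URL is the hostname and uses Python's built-in 'idna' codec. The helper name to_ascii_url is made up for illustration, it drops any user:password part, and I have not verified it against '☃.net'.

```python
try:
    from urllib.parse import urlsplit, urlunsplit  # Python 3
except ImportError:
    from urlparse import urlsplit, urlunsplit      # Python 2

def to_ascii_url(url):
    """Return url with its hostname IDNA-encoded (hypothetical helper).

    Expects a unicode string. Only the hostname is converted; a non-ASCII
    path or query would still need percent-escaping, which is not done here.
    """
    parts = urlsplit(url)
    # The built-in 'idna' codec turns each non-ASCII label into 'xn--...'.
    host = parts.hostname.encode('idna').decode('ascii')
    netloc = host
    if parts.port is not None:
        # Re-attach an explicit port, if the URL had one.
        netloc = '%s:%d' % (host, parts.port)
    return urlunsplit((parts.scheme, netloc, parts.path,
                       parts.query, parts.fragment))

s1 = u'http://\u043f\u0440\u0438\u043c\u0435\u0440.\u0438\u0441\u043f\u044b\u0442\u0430\u043d\u0438\u0435/'
print(to_ascii_url(s1))
```

The resulting ASCII URL is what would then be handed to urllib2.urlopen() on Python 2 (or urllib.request.urlopen() on Python 3).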


On Wed, Jun 13, 2012 at 12:17 PM, Виталий Волков <hash.3g at gmail.com> wrote:

> The answer in this topic should help you solve the issue.
>
>
> http://stackoverflow.com/questions/8152161/open-persian-url-domains-with-urllib2?answertab=active#tab-top
>
>
> Regards.
>
>
> 2012/6/13 John Nagle <nagle at animats.com>
>
>> I'm trying to open
>>
>> http://пример.испытание <http://xn--e1afmkfd.xn--80akhbyknj4f>
>>
>> with
>>
>> urllib2.urlopen(s1)
>>
>> in Python 2.7 on Windows 7. This produces a Unicode exception:
>>
>> >>> s1
>> u'http://\u043f\u0440\u0438\u043c\u0435\u0440.\u0438\u0441\u043f\u044b\u0442\u0430\u043d\u0438\u0435'
>> >>> fd = urllib2.urlopen(s1)
>> Traceback (most recent call last):
>>  File "<stdin>", line 1, in <module>
>>  File "C:\python27\lib\urllib2.py", line 126, in urlopen
>>    return _opener.open(url, data, timeout)
>>  File "C:\python27\lib\urllib2.py", line 394, in open
>>    response = self._open(req, data)
>>  File "C:\python27\lib\urllib2.py", line 412, in _open
>>    '_open', req)
>>  File "C:\python27\lib\urllib2.py", line 372, in _call_chain
>>    result = func(*args)
>>  File "C:\python27\lib\urllib2.py", line 1199, in http_open
>>    return self.do_open(httplib.HTTPConnection, req)
>>  File "C:\python27\lib\urllib2.py", line 1168, in do_open
>>    h.request(req.get_method(), req.get_selector(), req.data, headers)
>>  File "C:\python27\lib\httplib.py", line 955, in request
>>    self._send_request(method, url, body, headers)
>>  File "C:\python27\lib\httplib.py", line 988, in _send_request
>>    self.putheader(hdr, value)
>>  File "C:\python27\lib\httplib.py", line 935, in putheader
>>    hdr = '%s: %s' % (header, '\r\n\t'.join([str(v) for v in values]))
>> UnicodeEncodeError: 'ascii' codec can't encode characters in position
>> 0-5: ordinal not in range(128)
>> >>>
>>
>> The HTTP library is trying to put the URL in the header as ASCII.  Why
>> isn't "urllib2" handling that?
>>
>> What does "urllib2" want?  Percent escapes?  Punycode?
>>
>>                                John Nagle
>> --
>> http://mail.python.org/mailman/listinfo/python-list
>>


-- 
*'I am what I am because of who we all are'*
h3manth.com <http://www.h3manth.com>
*-- Hemanth HM *