Problem: neither urllib2.quote nor urllib.quote encodes unicode string arguments
Jerry Hill
malaclypse2 at gmail.com
Fri Oct 3 19:37:18 EDT 2008
On Fri, Oct 3, 2008 at 5:38 PM, Valery Khamenya <khamenya at gmail.com> wrote:
> Hi all
> things like urllib.quote(u"пиво Müller ") fail with error message:
> <type 'exceptions.KeyError'>: u'\u043f'
> Similarly with urllib2.
> Anyone got a hint?? I need it to form the URI containing non-ascii chars.
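Do you know what, exactly, you'd like the result to be? The encoding
of unicode characters into URIs is not well defined, and urllib.quote
expects a byte string, so you have to pick an encoding and encode the
unicode string yourself before quoting it. My understanding is that the
most common choice is to percent-encode the UTF-8 bytes, like this: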
>>> u = u"Müller"
>>> import urllib
>>> urllib.quote(u.encode('utf8'))
'M%C3%BCller'
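If the receiving end expects a different encoding, you can encode your
unicode string with that instead, for example Latin-1 (which can
represent "Müller" but not the Cyrillic "пиво"):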
>>> urllib.quote(u.encode('latin-1'))
'M%FCller'
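If you do this in more than one place, it may be worth wrapping the two steps in a small helper. The following is only a sketch: quote_unicode and unquote_unicode are names I made up, not part of urllib, and the default of UTF-8 is my assumption about what the server on the other end expects. The try/except import is there so the same code also runs on Python 3, where quote lives in urllib.parse.

```python
# Sketch only: quote_unicode / unquote_unicode are hypothetical helpers,
# not functions provided by urllib.
try:
    from urllib import quote, unquote as unquote_bytes            # Python 2
except ImportError:
    from urllib.parse import quote, unquote_to_bytes as unquote_bytes  # Python 3


def quote_unicode(text, encoding='utf-8'):
    """Encode a unicode string to bytes, then percent-encode the bytes."""
    return quote(text.encode(encoding))


def unquote_unicode(quoted, encoding='utf-8'):
    """Reverse of quote_unicode: undo the percent-encoding, then decode."""
    return unquote_bytes(quoted).decode(encoding)


# The string from the original question, written with escapes so the
# example does not depend on the source file's encoding.
original = u"\u043f\u0438\u0432\u043e M\u00fcller "

print(quote_unicode(original))
# %D0%BF%D0%B8%D0%B2%D0%BE%20M%C3%BCller%20

print(quote_unicode(u"M\u00fcller", 'latin-1'))
# M%FCller

# Decoding only round-trips if both sides agree on the encoding.
assert unquote_unicode(quote_unicode(original)) == original
```

Note that quote_unicode(original, 'latin-1') raises UnicodeEncodeError, because Latin-1 has no Cyrillic characters. That is one reason UTF-8 tends to be the safer default.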
--
Jerry