utf8 encoding problem
Erik Max Francis
max at alcyone.com
Thu Jan 22 06:07:31 EST 2004
Wichert Akkerman wrote:
> I'm struggling with what should be a trivial problem but I can't seem to
> come up with a proper solution: I am working on a CGI that takes utf-8
> input from a browser. The input is nicely encoded so you get something
> like this:
>
> firstname=t%C3%A9s
>
> where %C3%A9 is a single character in utf-8 encoding. Passing this
> through urllib.unquote does not help:
>
> >>> urllib.unquote(u't%C3%A9st')
> u't%C3%A9st'
Unquote it as a normal (byte) string first, then decode the resulting UTF-8 bytes to Unicode.
>>> import urllib
>>> x = 't%C3%A9s'
>>> y = urllib.unquote(x)
>>> y
't\xc3\xa9s'
>>> z = unicode(y, 'utf-8')
>>> z
u't\xe9s'
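The same two steps can be wrapped in a small helper. This is only a sketch, not a drop-in CGI solution: the name decode_field is made up for illustration, the fallback import is an assumption so that it also runs on interpreters where the byte-level unquote lives in urllib.parse, and it assumes the value arrives as a plain percent-encoded string rather than a unicode object.

```python
try:
    # Interpreters where unquote() lives in urllib and returns a byte string.
    from urllib import unquote as unquote_bytes
except ImportError:
    # Newer interpreters: the byte-returning variant is in urllib.parse.
    from urllib.parse import unquote_to_bytes as unquote_bytes


def decode_field(raw):
    """Percent-decode a form value, then decode the bytes as UTF-8.

    decode_field is a hypothetical helper name. It does not translate
    '+' to a space, so it only covers the %XX escapes discussed here.
    """
    raw_bytes = unquote_bytes(raw)    # 't%C3%A9s' -> the bytes t, 0xC3, 0xA9, s
    return raw_bytes.decode('utf-8')  # UTF-8 bytes -> Unicode text


z = decode_field('t%C3%A9s')
```

Here z ends up as the same Unicode string as in the session above, with U+00E9 as its second character.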
--
__ Erik Max Francis && max at alcyone.com && http://www.alcyone.com/max/
/ \ San Jose, CA, USA && 37 20 N 121 53 W && &tSftDotIotE
\__/ I do not promise to consider race or religion in my appointments.
I promise only that I will not consider them. -- John F. Kennedy