Unicode and UrlEncode!
Jeff Epler
jepler at unpythonic.net
Wed Mar 17 18:39:31 EST 2004
You should convert the byte string 'french' from whatever encoding
it's in (latin-1, according to your coding: directive) to Unicode, and
then encode it as UTF-8, if you are going to tell Google it's in Unicode.
Example:
"s\xe9dimentation".decode("latin-1").encode("utf-8")
Or, you can tell Python that the string is a Unicode literal, and it
will do the .decode() step for you:
u"sédimentation".encode("utf-8")
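
For anyone reading this under Python 3, here is a rough sketch of the same
decode-then-encode steps, followed by the URL-escaping the subject line asks
about. It assumes the input bytes are Latin-1 (0xE9 is "é" there) and that
the escaped value is going into a query string; the variable names are only
illustrative.

```python
from urllib.parse import quote

# Bytes as they would arrive from a Latin-1 source; 0xE9 is "é" in Latin-1.
latin1_bytes = b"s\xe9dimentation"

# Step 1: bytes -> text, using the encoding the bytes are really in.
text = latin1_bytes.decode("latin-1")

# Step 2: text -> bytes, in the encoding you will announce to the server.
utf8_bytes = text.encode("utf-8")

# Step 3: percent-escape the UTF-8 bytes for use in a URL.
escaped = quote(utf8_bytes)
print(escaped)  # s%C3%A9dimentation
```

Skipping step 1 and escaping the Latin-1 bytes directly would give
s%E9dimentation, which is not valid UTF-8 and is the mismatch described
above.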
"" is always a bytestring literal, and u"" is always a unicode string
literal. If you have "<sequence of bytes>" then the string's value at
runtime is "<sequence of bytes>", and if you have u"<sequence of bytes>"
then the string's value is "<sequence of bytes>".decode(<file encoding>).
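
The same distinction can be sketched in Python 3 terms, where the names
changed: what is written "" above corresponds to b"" (bytes) there, and u""
corresponds to a plain "" (str). The snippet below is only an illustration
under that assumption; it shows that one piece of text maps to different
byte sequences depending on the encoding, which is why the decode step has
to name the right one.

```python
# A text string (code points), written with an escape so the source file's
# own encoding does not matter for this example.
text = "s\u00e9dimentation"

# The same text as bytes in two different encodings.
as_latin1 = text.encode("latin-1")   # é becomes the single byte 0xE9
as_utf8 = text.encode("utf-8")       # é becomes the two bytes 0xC3 0xA9

print(len(text), len(as_latin1), len(as_utf8))  # 13 13 14

# Decoding with the matching encoding recovers the original text.
assert as_latin1.decode("latin-1") == text
assert as_utf8.decode("utf-8") == text

# Decoding with the wrong encoding does not: UTF-8 bytes read as Latin-1
# come back as mojibake instead of the original text.
wrong = as_utf8.decode("latin-1")
print(wrong == text)  # False
```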
Jeff