utf8 silly question
Jeff Epler
jepler at unpythonic.net
Tue Jun 21 14:35:02 EDT 2005
If you want to work with Unicode, then write a Unicode literal:
us = u"\N{COPYRIGHT SIGN} some text"
You can also write this as
us = unichr(169) + u" some text"
When you have a Unicode string, you can convert it to a particular
encoding stored in a byte string with
bs = us.encode("utf-8")
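As a rough sketch of the round trip (the names by_name, by_escape and back are only for illustration), the following shows that the two spellings give the same character and that encoding followed by decoding returns the original Unicode string. It sticks to u"" literals and length checks so it should behave the same on interpreters that accept that literal syntax:

```python
# Two spellings of the same one-character Unicode string (code point 169).
by_name = u"\N{COPYRIGHT SIGN}"
by_escape = u"\xa9"

us = by_name + u" some text"

# Encoding produces a byte string; in UTF-8 the copyright sign takes
# two bytes, so the result is one element longer than the Unicode string.
bs = us.encode("utf-8")

# Decoding with the same encoding gives the Unicode string back.
back = bs.decode("utf-8")
```

The point is that .encode() goes from Unicode to bytes and .decode() goes from bytes to Unicode, always with an explicitly named encoding.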
It's generally a mistake to use the .encode() method on a byte string,
but that's what code like
bs = "\xa9 some text"
bs = bs.encode("utf-8")
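does. To encode a byte string, Python first has to decode it implicitly
using the default encoding (normally ASCII). That can lull you into
believing the code works if the test data contains only US-ASCII, and
then it breaks with a UnicodeDecodeError when you go into production and
meet non-ASCII strings.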
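A minimal sketch of the safer pattern, assuming the incoming bytes are Latin-1 (that encoding is my assumption for the example; use whatever your data really is). The ASCII decode is written out explicitly here to show the step that fails, and the names raw and ascii_decode_failed are hypothetical:

```python
# Build a byte string containing the byte 0xA9 (the copyright sign in
# Latin-1). Assumption: the real-world input uses Latin-1.
raw = u"\xa9 some text".encode("latin-1")

# Decoding as ASCII fails on the non-ASCII byte. This is the step that
# goes wrong once the data is no longer pure US-ASCII.
try:
    raw.decode("ascii")
    ascii_decode_failed = False
except UnicodeDecodeError:
    ascii_decode_failed = True

# The explicit route: decode with the real encoding first, then encode
# the resulting Unicode string to UTF-8.
us = raw.decode("latin-1")
bs = us.encode("utf-8")
```

Naming the source encoding yourself means the conversion does not depend on the default encoding or on what the test data happened to contain.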
Jeff