[XML-SIG] Re: XML Unicode and UTF-8

Fredrik Lundh fredrik at pythonware.com
Sat Aug 7 16:42:56 CEST 2004


Neil Youngman wrote:

> Yes, but it's being written out through a UTF-8 codec to a file which
> specifies 'charset="utf-8"'. AIUI the python UTF-8 codec can detect that it's
> got a unicode string and convert it to utf-8 with no further programmer
> intervention.

Python's UTF-8 codec takes a Unicode object and generates an 8-bit string
object.  If you attempt to "encode" an 8-bit string object, it is first
converted to a Unicode object using the default (ASCII) encoding.  That
conversion only works if the 8-bit string contains ASCII characters only;
anything else makes the encode call fail with a decoding error.

There's no such thing as an 8-bit Unicode string.
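
As a rough illustration of the behaviour described above, here is a minimal
sketch that emulates it under Python 3, where bytes objects no longer have an
encode method at all.  The helper name py2_style_encode is hypothetical and
not part of Python; it only mimics the implicit ASCII decode step that the
older 8-bit string type performed before encoding.

```python
# Emulation sketch (assumption: run under Python 3).
# py2_style_encode is a hypothetical helper, not a standard library function.

def py2_style_encode(value, encoding="utf-8"):
    """Encode value roughly the way the old 8-bit/Unicode string types did."""
    if isinstance(value, bytes):
        # An 8-bit string is first converted to Unicode with the default
        # ASCII codec; this raises UnicodeDecodeError on any byte >= 0x80.
        value = value.decode("ascii")
    # Only a Unicode object is actually handed to the target codec.
    return value.encode(encoding)

text = "caf\u00e9"                        # Unicode object with one non-ASCII char
utf8_bytes = py2_style_encode(text)       # Unicode -> 8-bit UTF-8 string
ascii_bytes = py2_style_encode(b"cafe")   # pure-ASCII bytes pass through intact

try:
    # Already-encoded, non-ASCII bytes cannot be "encoded" a second time.
    py2_style_encode(utf8_bytes)
    double_encode_failed = False
except UnicodeDecodeError:
    double_encode_failed = True
```

The practical upshot is to keep text as Unicode objects inside the program
and encode exactly once, at the point where it is written out.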

</F>




