[issue6233] ElementTree (py3k) doesn't properly encode characters that can't be represented in the specified encoding

Fredrik Lundh report at bugs.python.org
Wed Jun 24 23:51:26 CEST 2009


Fredrik Lundh <fredrik at effbot.org> added the comment:

That's backwards, unless I'm missing something here: charrefs represent 
Unicode characters, not UTF-8 byte values.  The character "LATIN SMALL 
LETTER A WITH TILDE" with the character value 227 should be represented as 
"&#227;" if serialized to an encoding that doesn't support non-ASCII 
characters.
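
As an illustration (not part of the original comment), Python's built-in "xmlcharrefreplace" error handler shows the distinction: the character reference carries the Unicode code point, while the UTF-8 encoding of the same character is a two-byte sequence. The helper name below is made up for this sketch.

```python
def to_ascii_with_charrefs(text):
    # Hypothetical helper: encode to ASCII, replacing anything ASCII
    # cannot represent with a numeric character reference.
    return text.encode("ascii", "xmlcharrefreplace")

ch = "\u00e3"  # LATIN SMALL LETTER A WITH TILDE

print(ord(ch))                     # 227
print(to_ascii_with_charrefs(ch))  # b'&#227;'
print(ch.encode("utf-8"))          # b'\xc3\xa3'
```

Writing the UTF-8 byte values as charrefs ("&#195;&#163;") would instead name two different characters, U+00C3 and U+00A3.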

And there's no need to use REs to filter things under 3.x; those parts of 
ET 1.2 are there for pre-2.0 compatibility.
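
A rough sketch of what regex-free escaping might look like under 3.x follows. This is not the escape function referred to in this thread, and it is not ElementTree's actual code; the function name and default encoding are assumptions made for the example.

```python
def escape_cdata_sketch(text, encoding="us-ascii"):
    # Hypothetical example, not the ElementTree implementation.
    # Escape markup characters with plain str.replace; "&" must go first
    # so the ampersands introduced by the other entities are left alone.
    text = text.replace("&", "&amp;")
    text = text.replace("<", "&lt;")
    text = text.replace(">", "&gt;")
    # Let the codec machinery turn unencodable characters into charrefs,
    # so no regular expression is needed to find them.
    return text.encode(encoding, "xmlcharrefreplace")

print(escape_cdata_sketch("a < \u00e3 & b"))
print(escape_cdata_sketch("\u00e3\u20ac", "iso-8859-1"))
```

With an encoding that can represent the character, such as ISO-8859-1 for U+00E3, the character is written directly and only the unencodable ones become charrefs.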

Did you try running the tests with the escape function I posted?

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue6233>
_______________________________________

