[issue1767933] Badly formed XML using etree and utf-16
Amaury Forgeot d'Arc
report at bugs.python.org
Sat Oct 2 11:56:31 CEST 2010
Amaury Forgeot d'Arc <amauryfa at gmail.com> added the comment:
Python 3.1 improves the situation: the file looks more like utf-16, except that the BOM ("\xff\xfe") is repeated throughout the output, probably on every internal call to file.write().
Here is a test script that should work on both 2.7 and 3.1.
from io import BytesIO
from xml.etree.ElementTree import ElementTree
# Read content
content = "<?xml version='1.0' encoding='UTF-16'?><html></html>"
input = BytesIO(content.encode('utf-16'))
tree = ElementTree()
tree.parse(input)
# Write content
output = BytesIO()
tree.write(output, encoding="utf-16")
assert output.getvalue().decode('utf-16') == content
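The following sketch illustrates how a repeated BOM can arise. It does not inspect ElementTree itself; it only shows that encoding each chunk separately with the 'utf-16' codec prepends a BOM every time, while a stateful incremental encoder emits it once. The helper count_boms and the chunk list are hypothetical names introduced for illustration, and the script assumes Python 3.

```python
import codecs

# BOM in native byte order, which is what the plain 'utf-16' codec emits.
BOM = codecs.BOM_UTF16


def count_boms(data):
    """Count BOMs at even byte offsets (UTF-16 code units are 2 bytes wide)."""
    return sum(1 for i in range(0, len(data) - 1, 2) if data[i:i + 2] == BOM)


# Hypothetical pieces that a serializer might pass to write() one at a time.
chunks = ["<?xml version='1.0' encoding='UTF-16'?>", "<html>", "</html>"]

# Encoding every chunk independently: each call starts with its own BOM.
per_call = b"".join(chunk.encode("utf-16") for chunk in chunks)

# An incremental encoder keeps state and writes the BOM only on first use.
encoder = codecs.getincrementalencoder("utf-16")()
incremental = b"".join(encoder.encode(chunk) for chunk in chunks)

print(count_boms(per_call), count_boms(incremental))
```

Running count_boms(output.getvalue()) on the result of the test script above would be one way to check whether a given Python version still repeats the BOM.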
----------
stage: unit test needed -> needs patch
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue1767933>
_______________________________________