small inconsistency in ElementTree (1.2.6)

Damjan gdamjan at gmail.com
Thu Dec 8 22:16:18 EST 2005


Attached is the smallest test case that shows that ElementTree returns a
string object if the text in the tree is only ASCII, but returns a
unicode object otherwise.

This would make sense if the string object and the unicode object were
interchangeable, but they are not. One example: the translate method is
completely different.
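
To make the translate difference concrete, here is a minimal sketch (it
is separate from the test case in this message). It assumes nothing about
ElementTree and is written so that it runs on both Python 2 and Python 3;
on Python 2, b"..." is a plain str. All variable names are made up for
illustration.

```python
# Byte strings (plain str on Python 2) want a 256-entry lookup table.
table = bytearray(range(256))
table[ord("a")] = ord("A")
byte_result = b" ascii ".translate(bytes(table))

# Unicode strings want a mapping from code point to replacement.
mapping = {ord(u"a"): u"A"}
unicode_result = u" ascii ".translate(mapping)

# Handing the unicode-style mapping to a byte string raises TypeError,
# so code that calls translate has to know which type it received.
try:
    b" ascii ".translate(mapping)
    mixing_fails = False
except TypeError:
    mixing_fails = True
```

Both calls upper-case the "a", but the arguments are not interchangeable.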

I've tested with cElementTree (1.0.2) too; it has the same behaviour.

Any suggestions?
Do I need to check the output of ElementTree every time, or is there some
hidden switch to change this behaviour?

from elementtree import ElementTree

xml = """\
<?xml version="1.0" encoding="UTF-8"?>
<root>
  <p1> ascii </p1>
  <p2> \xd0\xba\xd0\xb8\xd1\x80\xd0\xb8\xd0\xbb\xd0\xb8\xd1\x86\xd0\xb0 </p2>
</root>
"""

tree = ElementTree.fromstring(xml)
p1, p2 = tree.getchildren()
print "type(p1.text):", type(p1.text)
print "type(p2.text):", type(p2.text)
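
For reference, the "check the output every time" approach could look
roughly like the sketch below. This is not a switch in ElementTree; the
helper name to_unicode is hypothetical, and the sketch assumes that any
byte string coming back from the tree is pure ASCII. It runs on Python 2
and Python 3 (on Python 2, bytes is just another name for str).

```python
def to_unicode(text, encoding="ascii"):
    """Return text as a unicode string; None is passed through unchanged."""
    # Hypothetical helper, not part of ElementTree.
    if isinstance(text, bytes):  # on Python 2 this matches plain str
        return text.decode(encoding)
    return text  # already unicode, or None for an element without text
```

With this, to_unicode(p1.text) and to_unicode(p2.text) would both be
unicode objects, so a single translate mapping works for either.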



