[XML-SIG] printing Unicode xml to StringIO

J.R. van Ossenbruggen Jacco.van.Ossenbruggen@cwi.nl
Thu, 27 Dec 2001 16:06:55 +0100


I'm using xml.dom.ext.Print() to print a DOM node to a StringIO object, 
which fails when the data cannot be interpreted as ASCII.  After a lot of 
XML hacking, I think the problem is not in the XML code but in StringIO:

Python 2.2+ (#4, Dec 27 2001, 12:46:04) 
[GCC 2.96 20000731 (Red Hat Linux 7.1 2.96-81)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import StringIO
>>> s=StringIO.StringIO()
>>> u=u'\xc9'
>>> s.write(u)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/local/lib/python2.2/StringIO.py", line 139, in write
    s = str(s)
UnicodeError: ASCII encoding error: ordinal not in range(128)

What is going wrong here?  According to the manual, StringIO should be able
to handle utf-8 strings... 
Also note that the different versions of Python on my system
all seem to behave differently on the code above.
For example, in 2.1 the write succeeds, but the latin-1 encode seems
to go wrong:

Python 2.1.1 (#1, Aug 13 2001, 19:37:40) 
[GCC 2.96 20000731 (Red Hat Linux 7.1 2.96-96)] on linux2
Type "copyright", "credits" or "license" for more information.
>>> import StringIO
>>> s=StringIO.StringIO()
>>> u=u'\xc9'
>>> s.write(u)
>>> s.getvalue().encode('latin-1')  # Expected 'É' here:
'\xc9'
>>> s.getvalue()                    # Still it claims to be unicode:
u'\xc9'
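
For what it's worth, one workaround I can think of is to encode the
unicode string explicitly before writing it, so the buffer never has to
fall back on the default ASCII codec.  Below is a minimal sketch of that
idea, not a confirmed fix.  It assumes a byte-oriented buffer is
acceptable and uses io.BytesIO, which only exists in later Python
releases (on 2.2 the same idea would presumably be to write the encoded
string to StringIO.StringIO).  The helper name write_encoded is
hypothetical.

```python
import io


def write_encoded(buf, text, encoding="utf-8"):
    """Encode text explicitly, then write the resulting bytes to buf.

    Hypothetical helper: doing the encoding ourselves means the buffer
    never attempts an implicit ASCII conversion.
    Returns the number of bytes written.
    """
    data = text.encode(encoding)
    buf.write(data)
    return len(data)


buf = io.BytesIO()
n = write_encoded(buf, u"\xc9")   # U+00C9, LATIN CAPITAL LETTER E WITH ACUTE
raw = buf.getvalue()              # the encoded bytes held by the buffer
decoded = raw.decode("utf-8")     # decoding gives back the original text
```

With this approach the buffer holds encoded bytes rather than unicode, so
whoever reads it back has to decode with the same encoding.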

Any help would be appreciated,

	Jacco