minidom and åäö once again :-P

Martin v. Löwis loewis at informatik.hu-berlin.de
Wed Apr 17 11:24:46 EDT 2002


Magnus Heino <magnus.heino at pleon.sigma.se> writes:

> >>> d.toxml()
> '<?xml version="1.0" ?>\n<test>\xe5\xe4\xf6</test>'

This is not what I'm getting. I get

...
    writer.write(data)
  File "/usr/local/lib/python2.2/StringIO.py", line 139, in write
    s = str(s)
UnicodeError: ASCII encoding error: ordinal not in range(128)

Apparently, you've changed the system default encoding, so that the
conversion from Unicode string to byte string silently succeeded
instead of raising an error.

The .toxml() method currently does not expect this to happen; instead,
it means to return a Unicode object as output. The problem is that
StringIO attempts a str() conversion of the Unicode object. That is a
bug in Python 2.2 which has been corrected in Python 2.2.1 (where
StringIO supports Unicode again).
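
To illustrate the difference between the two results, here is a minimal
sketch. It assumes a current Python 3 interpreter rather than the 2.2
setup discussed here, and the variable names are made up for the
example. Encoding the text with the ASCII codec fails in the same way as
the traceback above, while Latin-1 produces the bytes from the quoted
output. That is consistent with a Latin-1 default encoding, although
other single-byte encodings map these characters the same way.

```python
# Sketch only: Python 3 syntax, hypothetical variable names.
text = '\u00e5\u00e4\u00f6'  # the three characters from <test>

# The ASCII codec cannot represent these characters, so encoding fails.
try:
    text.encode('ascii')
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# Latin-1 maps each character to a single byte: b'\xe5\xe4\xf6'.
latin1_bytes = text.encode('latin-1')

# UTF-8 needs two bytes per character here.
utf8_bytes = text.encode('utf-8')
```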

In any case, your best bet is to use an explicitly-encoding stream:

>>> import codecs, StringIO
>>> s = StringIO.StringIO()
>>> d.writexml(codecs.getwriter('utf-8')(s))
>>> s.getvalue()
'<?xml version="1.0" ?>\n<test>\xc3\xa5\xc3\xa4\xc3\xb6</test>'

HTH,
Martin


