Any reason why cStringIO in 2.5 behaves different from 2.4?

Stefan Behnel stefan.behnel-n05pAM at web.de
Thu Jul 26 10:06:15 EDT 2007


Stefan Scholl wrote:
> Stefan Behnel <stefan.behnel-n05pAM at web.de> wrote:
>> Stefan Scholl wrote:
>>> Well, http://docs.python.org/lib/module-xml.sax.html is missing
>>> the fact, that I can't use Unicode with parseString().
>>>
>>> This parseString() uses cStringIO.
>> Well, Python unicode is not a valid *byte* encoding for XML.
>>
>> lxml.etree can parse unicode, if you really want, but otherwise, you should
>> maybe stick to well-formed XML.
> 
> The XML is well-formed. Works perfect in Python 2.4 with Python
> unicode and Python sax parser.

The XML is *not* well-formed if you pass Python unicode instead of a
byte-encoded string. An XML parser works on a byte stream in a known
encoding, not on already decoded characters. Read the XML spec.

It would be well-formed if you added the proper XML declaration, but the
encoding you would have to declare is system-specific: it depends on how your
Python build stores unicode internally (UCS-4 or UTF-16, BE or LE). So don't
even try.
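
What does work is encoding the unicode string to bytes yourself before handing
it to the parser. A minimal sketch, assuming the document carries no XML
declaration that names a different encoding (UTF-8 is the default then); the
names TextCollector and parse_unicode_xml are made up for this example:

```python
import xml.sax
from xml.sax.handler import ContentHandler


class TextCollector(ContentHandler):
    """Hypothetical handler that gathers all character data."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.chunks = []

    def characters(self, content):
        # The parser may deliver text in several pieces.
        self.chunks.append(content)


def parse_unicode_xml(text):
    """Encode a unicode string to UTF-8 bytes, then parse the bytes."""
    data = text.encode("utf-8")  # explicit, well-defined byte encoding
    handler = TextCollector()
    xml.sax.parseString(data, handler)
    return u"".join(handler.chunks)


doc = u"<greeting>Gr\u00fc\u00dfe</greeting>"
result = parse_unicode_xml(doc)
```

That way the parser sees a byte stream in an encoding the XML spec defines,
regardless of how the interpreter stores unicode internally.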

Stefan


