Any reason why cStringIO in 2.5 behaves different from 2.4?

Michael L Torrie torriem at chem.byu.edu
Sat Jul 28 19:34:06 EDT 2007


Stefan Scholl wrote:
> Don't let the subject line fool you. I'm OK with cStringIO. The
> thread is now about xml.sax's parseString().

Giving you the benefit of the doubt here, despite the fact that Stefan
Behnel has stated this over and over again and you just haven't listened.

xml.sax's use of parseString() is exactly correct.  xml.sax should
*never* parse Python unicode strings, as by definition XML must be
encoded as a *byte stream*, which is what a Python string is.
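
To make that concrete, here is a minimal sketch of encoding the text
yourself and handing the resulting bytes to parseString().  The
TextCollector handler and the parse_unicode_xml() helper are names I
made up for illustration, and I am assuming UTF-8 with a matching
encoding declaration in the document:

```python
import xml.sax


class TextCollector(xml.sax.ContentHandler):
    """Hypothetical handler that just gathers character data."""

    def __init__(self):
        xml.sax.ContentHandler.__init__(self)
        self.chunks = []

    def characters(self, content):
        # The parser hands back decoded text, whatever the input encoding was.
        self.chunks.append(content)


def parse_unicode_xml(text, encoding="utf-8"):
    """Hypothetical helper: encode the text first, then parse the bytes."""
    data = text.encode(encoding)  # the byte stream the parser expects
    handler = TextCollector()
    xml.sax.parseString(data, handler)
    return u"".join(handler.chunks)


document = (u"<?xml version='1.0' encoding='utf-8'?>"
            u"<greeting>gr\u00fc\u00df dich</greeting>")
result = parse_unicode_xml(document)
```

The point is only that the caller picks the encoding explicitly, so the
parser never has to guess how the text was stored.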

A Python /unicode/ string could be held internally in any number of
ways: 2, 3, 4, or even 8 bytes per character if the implementation
demanded it (a bit contrived, I admit).  Since the XML parser is only
ever intended to parse *XML*, why should it ever know what to do with
Python unicode strings, which could be stored any number of ways, making
byte-parsing impossible?
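
As a rough illustration (this shows serialized encodings, not the
interpreter's actual internal layout, which I am not claiming anything
about), the same four characters come out as three different byte
sequences depending on the codec chosen:

```python
text = u"<a/>"

# Byte length of the same four-character text under three standard codecs.
sizes = {}
for codec in ("utf-8", "utf-16-le", "utf-32-le"):
    sizes[codec] = len(text.encode(codec))

# The encoded forms differ byte for byte, not just in length.
same_bytes = text.encode("utf-8") == text.encode("utf-16-le")
```

Here sizes ends up as 4, 8 and 16 bytes respectively, and same_bytes is
False, so a byte-oriented parser cannot work without knowing which
encoding it was given.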

So your code is faulty in its assumptions, not xml.sax.
