Python and XML

Lars Marius Garshol larsga at garshol.priv.no
Wed Feb 2 06:55:30 EST 2000


* Fredrik Lundh
| 
| quick answer: the built-in string type only handles 8-bit
| characters, and most XML parsers (e.g. xmllib) cannot handle
| anything that doesn't use an 8-bit encoding.  in other words, ASCII,
| ISO Latin, UTF-8 (etc) works fine, but 16-bit encodings don't.

It's worth noting that Pyexpat supports UTF-16, and sends output to
Python applications as UTF-8. RXP also supports UTF-16, as well as the
rest of the ISO 8859-x character sets. I assume it, too, sends output
as UTF-8.
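
To make the Pyexpat point concrete, here is a minimal sketch of feeding
a UTF-16 document to expat and ending up with UTF-8. It assumes a
present-day Python 3 and its standard xml.parsers.expat module, where
the parser hands character data back as str objects, so the UTF-8
encoding step is done explicitly here. The helper name and the sample
document are made up for illustration.

```python
import xml.parsers.expat


def parse_utf16_text(xml_bytes):
    """Return the character data of a document as one string.

    Hypothetical helper for illustration; expat detects UTF-16
    from the byte order mark at the start of the input.
    """
    chunks = []
    parser = xml.parsers.expat.ParserCreate()
    # Character data may arrive in several pieces, so collect them all.
    parser.CharacterDataHandler = chunks.append
    parser.Parse(xml_bytes, True)  # True marks this as the final chunk
    return "".join(chunks)


# Sample document (an assumption, not from the original post),
# encoded as UTF-16 with a byte order mark.
doc = (
    '<?xml version="1.0" encoding="UTF-16"?>'
    "<greeting>bl\u00e5b\u00e6r</greeting>"
).encode("utf-16")

text = parse_utf16_text(doc)

# In Python 3 the text is Unicode; encode it to get the UTF-8 bytes
# that an application in 2000 would have received directly.
utf8_bytes = text.encode("utf-8")
```

The same helper works unchanged on UTF-8 or ISO 8859-1 input, since
expat picks the decoder from the byte order mark or the encoding
declaration.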

--Lars M.


