XML international characters

"Martin v. Löwis" martin at v.loewis.de
Fri Mar 10 16:58:45 EST 2006


Andreas R. wrote:
> When parsing XML documents containing international characters, such as
> the Norwegian characters Æ, Ø, Å, I get an exception in Python's SAX
> module. What is the correct way to parse such characters in Python? I've
> searched for methods to somehow escape the characters, without any luck
> so far.

The correct way is to provide correct XML. If you get a parse error,
it really means that there is an error in your XML file. Most likely,
the encoding of the characters is inconsistent with the declared
encoding. Note that the default encoding of XML (in the absence of a
declaration) is UTF-8.
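
To illustrate the point about declared versus actual encoding, here is a
minimal sketch, assuming Python 3 and the standard xml.sax module. The
names TextCollector, parse_text and the <navn> element are made up for
this example and are not taken from the original poster's code. It parses
the same text three ways: as UTF-8 bytes with no declaration, as
ISO-8859-1 bytes with a matching declaration, and as ISO-8859-1 bytes with
no declaration (which the parser reads as UTF-8 and should reject).

```python
import xml.sax


class TextCollector(xml.sax.ContentHandler):
    """Hypothetical handler that gathers all character data."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def parse_text(data: bytes) -> str:
    """Parse an XML document given as bytes and return its text content."""
    handler = TextCollector()
    xml.sax.parseString(data, handler)
    return "".join(handler.chunks)


# The Norwegian letters AE, O-slash, A-ring, written as escapes so this
# file does not depend on the encoding of the source file itself.
text = "<navn>\u00c6\u00d8\u00c5</navn>"

# No declaration: the bytes must be UTF-8.
ok_utf8 = text.encode("utf-8")

# ISO-8859-1 bytes with a declaration that says so.
ok_latin1 = (b'<?xml version="1.0" encoding="iso-8859-1"?>\n'
             + text.encode("iso-8859-1"))

# ISO-8859-1 bytes with no declaration: read as UTF-8, so not well-formed.
mismatched = text.encode("iso-8859-1")

for label, data in [("utf-8, undeclared", ok_utf8),
                    ("latin-1, declared", ok_latin1),
                    ("latin-1, undeclared", mismatched)]:
    try:
        print(label, "->", ascii(parse_text(data)))
    except xml.sax.SAXParseException as exc:
        print(label, "-> parse error:", exc.getMessage())
```

If the third case matches the error you see, either add an encoding
declaration that matches the bytes in the file, or re-save the file as
UTF-8.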

Regards,
Martin


