What is wrong? The minidom or the XML file?

Erik Max Francis max at alcyone.com
Wed Mar 10 20:06:41 EST 2004


Anthony Liu wrote:

> Another question: If I insert some Chinese characters
> in the sample XML document, then the same Python
> code can no longer parse it. The Python code chokes
> as soon as it hits the first Chinese character.
> 
> Python says:
> 
> ExpatError: not well-formed (invalid token): line 3,
> column 7
> 
> The problem remains even if I try encoding="UTF-16" or
> encoding="GB2312" or encoding="GBK" in the xml
> document.
> 
> Note that GB2312 and GBK are Chinese encodings.

If you're getting errors, then the encoding specified in the XML
declaration doesn't match the bytes you've pasted in.  It matters where
these Chinese characters come from, since they are stored in some
particular encoding and you haven't said which one; if your document is
declared as UTF-8, you need to paste UTF-8 in.
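
To make the mismatch concrete, here is a minimal sketch, assuming a
recent Python 3 and a made-up two-character sample (the names `text`,
`template`, `good` and `bad` are mine, not from your code).  It builds
the same document twice with a UTF-8 declaration, once with UTF-8 bytes
and once with GBK bytes, and hands both to minidom:

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# Hypothetical sample content; any Chinese text would do.
text = "中文"
template = '<?xml version="1.0" encoding="UTF-8"?>\n<root>{}</root>'

# Declaration says UTF-8 and the bytes really are UTF-8: parses fine.
good = template.format(text).encode("utf-8")
parsed = minidom.parseString(good).documentElement.firstChild.data

# Declaration says UTF-8 but the bytes are GBK: the parser rejects them.
bad = template.format(text).encode("gbk")
try:
    minidom.parseString(bad)
    mismatch_error = None
except ExpatError as exc:
    mismatch_error = str(exc)

# One possible repair: decode with the encoding the bytes are actually in,
# then re-encode to the encoding the declaration promises.
fixed = bad.decode("gbk").encode("utf-8")
reparsed = minidom.parseString(fixed).documentElement.firstChild.data
```

The second parse should fail with the same kind of ExpatError you are
seeing, and the repaired bytes should parse again.  This only shows the
UTF-8 case; whether your editor actually saved GBK is a guess on my
part.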

-- 
 __ Erik Max Francis && max at alcyone.com && http://www.alcyone.com/max/
/  \ San Jose, CA, USA && 37 20 N 121 53 W && &tSftDotIotE
\__/ Love is the true price of love.
    -- George Herbert


