[XML-SIG] Errors when using PrettyPrint Class (xml.dom.ext) and latin-1 characters (iso-8859-1) ...

Mike Williams mike.williams at globalgraphics.com
Mon Dec 12 18:27:23 CET 2005


Hi,

Michel Charest did utter on 12/12/2005 15:51:

> COMMENT: As can be seen, when using Method1 (default encoding with
> iso8859-1), I get a UnicodeDecodeError. And, when using Method2,
> explicitly encoding using unicode("élève", 'latin-1'), the PrettyPrint
> class does not raise an exception, but it garbles (does not correctly
> interpret) my latin-1 string (i.e. élève).

The xml.dom.ext PrettyPrint can only handle text node strings that are
either 7-bit ASCII byte strings or Unicode objects.  Method1 fails
because your latin1 encoded string contains 8-bit values that are not
valid utf-8 sequences, as the error message reports.  Method2 is in fact
working - the output you see is the utf-8 encoding of your text node
string.  If you open the generated output in an editor that reads it as
utf-8 you should see your original string.
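
To make the two cases concrete, here is a minimal sketch using only
built-in string methods, not PyXML itself.  It assumes the source text
starts out as latin-1 bytes; the variable names are mine, and the b""
literals need a newer Python than 2.4 (on 2.4, drop the b prefix).

```python
# Assumption: the text starts life as ISO-8859-1 (latin-1) bytes.
latin1_bytes = b"\xe9l\xe8ve"  # "élève" encoded as latin-1

# Method1: the 8-bit bytes decode neither as ASCII nor as UTF-8.
failed = []
for codec in ("ascii", "utf-8"):
    try:
        latin1_bytes.decode(codec)
    except UnicodeDecodeError:
        failed.append(codec)

# Method2: decode to a Unicode string first, then serialize as UTF-8.
text = latin1_bytes.decode("latin-1")
utf8_bytes = text.encode("utf-8")

# What a latin-1 viewer makes of the UTF-8 bytes: two characters for
# each accented letter, which looks garbled.
as_seen_in_latin1 = utf8_bytes.decode("latin-1")

# What a UTF-8 viewer makes of the same bytes: the original string.
as_seen_in_utf8 = utf8_bytes.decode("utf-8")
```

The bytes in utf8_bytes are the same in both views; only the
interpretation by the viewer differs.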

> EXTRA DETAILS:
> ==============
> * Running on Windows XP (sp2)
> * Python 2.4.2
> * PyXML 0.8.4
> * 4Suite 1.0b1
> * I have tried many other encoding formats such as utf8, utf-16, utf16-le,
> etc. with no luck !

The xml.dom.ext PrettyPrint can only produce utf-8 output.  There is a
bug report and patch on SourceForge to let it produce utf-16 output.
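
If you need another encoding in the meantime, one possible workaround
is to capture the utf-8 output and transcode it yourself.  The sketch
below is not part of PyXML: transcode_xml is a hypothetical helper, and
it assumes the serialized document is well-formed utf-8 whose encoding
declaration, if present, sits at the very start.

```python
import re


def transcode_xml(utf8_bytes, target_encoding):
    """Re-encode UTF-8 XML bytes and update a leading XML declaration.

    Hypothetical helper for illustration; not a PyXML function.
    """
    text = utf8_bytes.decode("utf-8")
    # Rewrite only the encoding pseudo-attribute of a declaration at
    # the start of the document, keeping its original quote style.
    text = re.sub(
        r"""^(<\?xml[^>]*?encoding=)(["'])[^"']*\2""",
        lambda m: m.group(1) + m.group(2) + target_encoding + m.group(2),
        text,
        count=1,
    )
    # Characters the target encoding cannot represent become numeric
    # character references (&#NNNN;).  That is only valid XML in text
    # and attribute values, not in element names or comments.
    return text.encode(target_encoding, "xmlcharrefreplace")


sample = u"<?xml version='1.0' encoding='UTF-8'?>\n<mot>\xe9l\xe8ve</mot>\n"
latin1_doc = transcode_xml(sample.encode("utf-8"), "iso-8859-1")
```

With PrettyPrint you would write into an in-memory stream first and
pass the collected bytes to the helper.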

TTFN

Mike
-- 
I was just getting used to yesterday when today came.


