[XML-SIG] Processing xml files with ISO 8859-1 chars

Thomas B. Passin tpassin@home.com
Wed, 7 Nov 2001 19:19:25 -0500


[Martin v. Loewis]

> > It seems that this xml file should have caused an exception, since it is
> > not well-formed: the actual encoding does not match the presumed
> > encoding (namely, utf-8).  The fact that the parse partially
> > succeeded is disturbing.
> 
> Indeed. IMO, Expat should detect the error, but it doesn't, instead it
> treats all contents >128 as proper UTF-8 (remember that all markup is
> ASCII). So Expat passes it to the application (pyexpat), which invokes
> the UTF-8 decoder, which fails. Due to a bug, this exception is lost,
> but the entire chunk of data reported by expat isn't reported to the
> Python application, either.
> 
> This is now fixed in pyexpat.c 1.42; thanks for the report.

Excellent. Thanks, Martin.
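
For anyone following along, here is a minimal sketch of the mismatch discussed above, assuming a current Python with the standard xml.parsers.expat module. The helper name parse_text and the sample document are made up for illustration. Which exception is raised for the undeclared case may depend on the Expat/pyexpat version, so the sketch catches both likely candidates.

```python
import xml.parsers.expat


def parse_text(data):
    """Parse a bytes document with Expat and return its character data."""
    chunks = []
    parser = xml.parsers.expat.ParserCreate()
    # Expat may deliver character data in several pieces, so collect them all.
    parser.CharacterDataHandler = chunks.append
    parser.Parse(data, True)
    return "".join(chunks)


# Hypothetical sample: "e acute" is the single byte 0xE9 in ISO 8859-1.
latin1_body = "<doc>caf\u00e9</doc>".encode("iso-8859-1")

# Without an encoding declaration the parser presumes UTF-8, and a lone
# 0xE9 byte is not a valid UTF-8 sequence, so the parse should fail.
try:
    parse_text(latin1_body)
    undeclared_ok = True
except (xml.parsers.expat.ExpatError, UnicodeError):
    undeclared_ok = False

# Declaring the actual encoding makes the same bytes well-formed.
declared = b'<?xml version="1.0" encoding="ISO-8859-1"?>' + latin1_body
text = parse_text(declared)
```

The practical fix on the document side is the one shown in the last two lines: declare the encoding the file actually uses, or re-encode the file as UTF-8.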

Tom P