Ignoring incorrect XML encoding declarations

Jeff Hinrichs jlh at cox.net
Wed Jan 15 19:10:37 EST 2003


Read the file into a string, call .replace('UTF-16','UTF-8') on it to fix
the declaration, then let SAX at it.
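
A minimal sketch of that idea, assuming the whole file fits in memory and
really is UTF-8. The names fix_declaration and TextCollector are made up for
illustration; only xml.sax.parseString and xml.sax.ContentHandler come from
the standard library. The replacement is limited to the XML declaration so
that a literal "UTF-16" in the document body is left alone.

```python
import xml.sax


class TextCollector(xml.sax.ContentHandler):
    """Hypothetical handler that just gathers all character data."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        # SAX may deliver text in several pieces, so collect them all.
        self.chunks.append(content)


def fix_declaration(raw):
    """Return raw (bytes) with UTF-16 swapped for UTF-8 in the XML declaration.

    Assumes the bytes are actually UTF-8 and only the declaration is wrong.
    Input without a leading XML declaration is returned unchanged.
    """
    if raw.startswith(b"<?xml"):
        end = raw.find(b"?>")
        if end != -1:
            # Touch only the declaration, and only its first match.
            head = raw[:end].replace(b"UTF-16", b"UTF-8", 1)
            return head + raw[end:]
    return raw


def parse_mislabelled(raw, handler):
    """Fix the declaration, then let SAX at it."""
    xml.sax.parseString(fix_declaration(raw), handler)
```

To use it, read the file in binary mode (open(path, 'rb').read()) and pass
the bytes to parse_mislabelled along with your own handler.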

"Peter Scott" <sketerpot at chase3000.com> wrote in message
news:b04roj$lif9o$1 at ID-174764.news.dfncis.de...
> I have an XML file which I'm trying to parse with xml.sax, but, for
> reasons beyond my control, it uses the UTF-8 character encoding but has
> this incorrect line at the beginning:
> <?xml version="1.0" encoding="UTF-16"?>
> The parser wisely spots this error and throws a SAXParseException, and I
> can't parse the file. I've tried just catching the exception and
> printing an error, then keeping on parsing, but it didn't work. Is there
> any way I can get the SAX parser to ignore the 'encoding="UTF-16"' and
> parse the file with the real encoding?
>
> Thanks,
> Peter
>





