[XML-SIG] 'utf8' codec can't decode byte 0xc3 - bug in xmlproc?

Mike Brown mike at skew.org
Thu Aug 4 20:59:17 CEST 2005


Anders wrote:
> I'm having a hard time debugging this error:
> 
> <somefile>:<row>:<char>: character set conversion problem: 'utf8' codec can't decode byte 0xc3 in position 65535: unexpected end of data
> 
> The file I'm trying to parse with xmlproc contains no illegal UTF-8 byte 
> sequences, and this error does not occur when I switch to pyexpat. This 
> is a hexdump of the row it's complaining about:
> 00020030  64 65 73 20 6c c3 a8 76  72 65 73 20 42 6f 72 64  |des l..vres Bord|
> There's nothing wrong with this byte sequence as far as I can see.
> 
> Has anyone else experienced this problem and found a solution? All help 
> appreciated.

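Apparently it's a buffering issue: the chunk being decoded is only 2^16 bytes
long, and its last byte (position 65535) is that c3. Since c3 is the lead byte
of a two-byte UTF-8 sequence (c3 a8 encodes è), the decoder reaches the end of
the chunk before it sees the continuation byte, hence "unexpected end of data".
What does your Python code look like? What platform/OS is this on, and what
versions of Python and PyXML?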
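
To illustrate the kind of failure suspected here, the following is a minimal
sketch in modern Python 3, not xmlproc's actual code. The sample data, the
CHUNK size, and both helper functions are assumptions made up for the example.
It decodes a byte stream in 2^16-byte chunks, first naively (each chunk on its
own) and then with an incremental decoder that carries a dangling lead byte
over to the next chunk.

```python
import codecs

# Hypothetical input: 65535 ASCII bytes, then "è" (bytes c3 a8), so the
# lead byte c3 sits at position 65535, the last byte of the first chunk.
data = b"a" * 65535 + "è".encode("utf-8") + b"vres"
CHUNK = 2 ** 16  # assumed buffer size, matching the position in the error

chunks = [data[i:i + CHUNK] for i in range(0, len(data), CHUNK)]


def decode_naively(chunks):
    """Decode every chunk independently; fails if a character is split."""
    return "".join(chunk.decode("utf-8") for chunk in chunks)


def decode_incrementally(chunks):
    """Decode with a stateful decoder that buffers incomplete sequences."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    parts = [decoder.decode(chunk) for chunk in chunks]
    # final=True makes the decoder raise if bytes are still pending.
    parts.append(decoder.decode(b"", final=True))
    return "".join(parts)


try:
    decode_naively(chunks)
    naive_error = None
except UnicodeDecodeError as exc:
    naive_error = exc  # byte 0xc3 at position 65535, unexpected end of data

text = decode_incrementally(chunks)  # succeeds; "è" is reassembled
```

If the parser feeds fixed-size buffers to a one-shot decode like the naive
version, valid UTF-8 can still trigger this error whenever a multi-byte
character straddles a buffer boundary, which would also explain why pyexpat
is unaffected.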
