[Expat-bugs] [ expat-Bugs-624251 ] Problem parsing Latin-1 symbols
noreply@sourceforge.net
Wed, 16 Oct 2002 12:23:49 -0700
Bugs item #624251, was opened at 2002-10-16 15:03
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=110127&aid=624251&group_id=10127
Category: None
Group: None
Status: Open
>Resolution: Rejected
Priority: 5
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Nobody/Anonymous (nobody)
Summary: Problem parsing Latin-1 symbols
Initial Comment:
Expat 1.95-5
Getting an extra character from the parser when parsing
extended ASCII characters (161 - 255 decimal).
The XML_CharacterDataHandler function reports 2
characters for every 1 extended character encountered.
Below is a small XML file demonstrating the problem.
The character handler function reports two characters (0xC2
and 0xA9) when the XML file contains only one (0xA9).
<?xml version="1.0" encoding="ISO-8859-1"
standalone="yes"?>
<data>©</data>
Platform: Windows 2000 exe statically linked to expat.
----------------------------------------------------------------------
>Comment By: Karl Waclawek (kwaclaw)
Date: 2002-10-16 15:23
Message:
Logged In: YES
user_id=290026
Expat reports characters encoded as UTF-8 or UTF-16.
It does not generate ISO-8859-1 output.
What you are reporting looks like UTF-8 encoding,
which means the character 0xA9 will be encoded
in two bytes. This does not appear to be a bug.
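If the application needs ISO-8859-1 bytes, it can convert the
UTF-8 text it receives in the character data handler. The following
is only a minimal sketch: utf8_to_latin1 is a hypothetical helper,
not part of Expat, and it assumes each buffer passed to it contains
only complete UTF-8 sequences. It shows that the two bytes 0xC2 0xA9
decode back to the single Latin-1 byte 0xA9.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of Expat): convert UTF-8 text, such as
 * the data passed to a character data handler, into ISO-8859-1 bytes.
 * Returns the number of bytes written to out, or (size_t)-1 if the
 * input is malformed, contains a character above U+00FF, or does not
 * fit in out_cap bytes. */
size_t utf8_to_latin1(const char *in, size_t in_len,
                      unsigned char *out, size_t out_cap)
{
    size_t i = 0, n = 0;

    assert(in != NULL && out != NULL);

    while (i < in_len) {
        unsigned char b = (unsigned char)in[i];
        unsigned int cp;

        if (b < 0x80) {
            /* Plain ASCII: one byte, same value in both encodings. */
            cp = b;
            i += 1;
        } else if ((b == 0xC2 || b == 0xC3) && i + 1 < in_len &&
                   ((unsigned char)in[i + 1] & 0xC0) == 0x80) {
            /* Two-byte sequence covering U+0080..U+00FF:
             * 110xxxxx 10yyyyyy -> xxxxxyyyyyy */
            cp = ((b & 0x1Fu) << 6) | ((unsigned char)in[i + 1] & 0x3Fu);
            i += 2;
        } else {
            /* Malformed input or a character outside Latin-1. */
            return (size_t)-1;
        }

        if (n >= out_cap)
            return (size_t)-1;
        out[n++] = (unsigned char)cp;
    }
    return n;
}
```

Called on the two bytes reported for the sample file (0xC2, 0xA9),
this helper writes the single byte 0xA9, which matches the byte in
the original ISO-8859-1 document.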
----------------------------------------------------------------------