[ expat-Bugs-579196 ] Memory corruption with non-ASCII names

noreply@sourceforge.net noreply@sourceforge.net
Tue Jul 9 13:55:02 2002


Bugs item #579196, was opened at 2002-07-09 13:47
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=110127&aid=579196&group_id=10127

Category: None
Group: None
>Status: Closed
Resolution: Accepted
Priority: 6
Submitted By: Karl Waclawek (kwaclaw)
Assigned to: Karl Waclawek (kwaclaw)
Summary: Memory corruption with non-ASCII names

Initial Comment:
I ran into a problem with Expat overwriting my
application memory. This happens when the content
model in an element declaration contains names
that are non-ASCII (e.g. Japanese).

This bug is hard to find, because it will not 
always bite.

I could trace this down to the following section
in the function doProlog, under switch case
XML_ROLE_CONTENT_ELEMENT_PLUS:
...
        el = getElementType(parser, enc, s, nxt);
        if (!el)
          return XML_ERROR_NO_MEMORY;
        dtd.scaffold[myindex].name = el->name;
        dtd.contentStringLen +=  nxt - s + 1;
...
dtd.contentStringLen is supposed to be incremented
by the length of el->name, including its terminator.
However, for non-ASCII names, the input length,
nxt - s + 1, is not the same as the encoded length:
the function poolStoreString, called within
getElementType, converts the name from the input
encoding to the working encoding of Expat (UTF-8 or
UTF-16).

Specifically, in my test case, using a DTD encoded
in UTF-16BE and a working encoding of UTF-8, this
problem manifested itself in my app crashing badly.
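
To illustrate the mismatch, here is a minimal standalone sketch, not
Expat code. Both helper names are hypothetical. It assumes well-formed
UTF-16BE input, a UTF-8 working encoding with one-byte XML_Char, and
that s and nxt are byte pointers, so nxt - s is a byte count.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: what the old code adds to contentStringLen
   for a name occupying nbytes bytes of input (nxt - s + 1). */
static size_t input_estimate(size_t nbytes)
{
    return nbytes + 1;
}

/* Hypothetical helper: number of UTF-8 bytes needed to store a
   well-formed UTF-16BE string of nbytes bytes (terminator excluded). */
static size_t utf8_len_of_utf16be(const unsigned char *s, size_t nbytes)
{
    size_t len = 0;
    size_t i;

    assert(nbytes % 2 == 0);  /* UTF-16 code units are two bytes each */

    for (i = 0; i + 1 < nbytes; i += 2) {
        unsigned cu = ((unsigned)s[i] << 8) | s[i + 1];

        if (cu < 0x80) {
            len += 1;                    /* ASCII */
        } else if (cu < 0x800) {
            len += 2;
        } else if (cu >= 0xD800 && cu <= 0xDBFF) {
            len += 4;                    /* surrogate pair: 4 UTF-8 bytes */
            i += 2;                      /* skip the low surrogate */
        } else {
            len += 3;                    /* rest of the BMP, e.g. Japanese */
        }
    }
    return len;
}
```

For the three-character name U+65E5 U+672C U+8A9E (6 bytes in
UTF-16BE), input_estimate gives 7, while the UTF-8 copy needs
9 bytes plus a terminator, i.e. 10. A buffer sized from the estimate
would be 3 bytes short for this one name.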

Therefore I suggest this fix:
...
        const XML_Char *name;
        int nameLen;
...
        el = getElementType(parser, enc, s, nxt);
        if (!el)
          return XML_ERROR_NO_MEMORY;
        name = el->name;
        dtd.scaffold[myindex].name = name;
        nameLen = 0;
        for (; name[nameLen++]; );
        dtd.contentStringLen +=  nameLen;
...
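
As a sanity check on the loop above, here is a minimal standalone
sketch, not part of the patch. stored_name_len is a hypothetical
name, and plain char stands in for XML_Char, which assumes a UTF-8
build. Because of the post-increment, the loop counts the terminator
as well, so the result is the string length plus one.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper around the counting loop from the suggested
   fix; char stands in for XML_Char (assumes a UTF-8 build). */
static int stored_name_len(const char *name)
{
    int nameLen = 0;

    /* nameLen is incremented even when the terminator is read,
       so the final value includes the terminator. */
    for (; name[nameLen++]; )
        ;
    return nameLen;
}

/* Equivalent computation using strlen, for comparison. */
static size_t stored_name_len_strlen(const char *name)
{
    return strlen(name) + 1;
}
```

For the UTF-8 bytes of the same three-character Japanese name,
"\xE6\x97\xA5\xE6\x9C\xAC\xE8\xAA\x9E", both functions return 10,
which is the amount actually stored for the name.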

Karl


----------------------------------------------------------------------

>Comment By: Fred L. Drake, Jr. (fdrake)
Date: 2002-07-09 16:54

Message:
Logged In: YES 
user_id=3066

Karl checked this in as lib/xmlparse.c revision 1.49.

----------------------------------------------------------------------

Comment By: Fred L. Drake, Jr. (fdrake)
Date: 2002-07-09 14:08

Message:
Logged In: YES 
user_id=3066

I like this change; feel free to check it in and close the bug.

----------------------------------------------------------------------
