[Expat-bugs] [ expat-Bugs-1990430 ] Parser crash with specially formatted UTF-8 sequences

SourceForge.net noreply at sourceforge.net
Fri Jun 13 00:08:17 CEST 2008


Bugs item #1990430, was opened at 2008-06-10 22:45
Message generated for change (Comment added) made by petervalchev
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1990430&group_id=10127

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: www.libexpat.org
Group: None
Status: Open
>Resolution: None
Priority: 5
Private: Yes
Submitted By: Peter Valchev (petervalchev)
>Assigned to: Fred L. Drake, Jr. (fdrake)
Summary: Parser crash with specially formatted UTF-8 sequences

Initial Comment:
I have discovered a way to crash libexpat's XML parser with certain specially formatted UTF-8 sequences. All applications that link with expat and use it to process user-provided XML files are affected. As far as I can see, the issue is not exploitable; it is only a denial of service.

This is the patch that I have come up with, also attached to this email:

+++ lib/xmltok_impl.c 2007-12-21 11:11:42.054417000 -0800
@@ -1745,6 +1745,9 @@
 switch (BYTE_TYPE(enc, ptr)) {
 #define LEAD_CASE(n) \
 case BT_LEAD ## n: \
+ if (end - ptr < n) { \
+   return; \
+ } \
 ptr += n; \
 break;
 LEAD_CASE(2) LEAD_CASE(3) LEAD_CASE(4)

The parser's updatePosition function, which keeps track of the current position, advances ptr by 2, 3, or 4 to skip past multibyte character combinations. When fewer than that many bytes remain before 'end', ptr in the "while (ptr != end)" loop jumps past the terminating condition, so the loop keeps reading past 'end' into out-of-bounds memory until the process crashes.
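
To illustrate the overshoot, here is a minimal standalone sketch in C. It is not expat's code: lead_len and scan_checked are hypothetical names, and lead_len is only a simplified stand-in for what BYTE_TYPE reports for UTF-8 lead bytes. The sketch assumes the same loop shape ("while (ptr != end)") and shows how a bounds check like the one in the patch keeps ptr from stepping past 'end'.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of expat): number of bytes a UTF-8
 * lead byte claims for its sequence; anything else counts as 1. */
static int lead_len(unsigned char b)
{
    if ((b & 0xE0) == 0xC0) return 2;
    if ((b & 0xF0) == 0xE0) return 3;
    if ((b & 0xF8) == 0xF0) return 4;
    return 1;
}

/* Walks [ptr, end) one character at a time, as a position tracker
 * would. Stores the number of complete characters in *chars and
 * returns the number of bytes consumed. */
static size_t scan_checked(const char *ptr, const char *end, size_t *chars)
{
    const char *start = ptr;

    assert(ptr <= end);
    *chars = 0;
    while (ptr != end) {
        int n = lead_len((unsigned char)*ptr);
        /* Same idea as the patch: if the lead byte promises more bytes
         * than remain, stop. Without this check, ptr += n would land
         * beyond end and "ptr != end" would never become false. */
        if (end - ptr < n)
            break;
        ptr += n;
        ++*chars;
    }
    return (size_t)(ptr - start);
}
```

Called on the three bytes 'a', 0xE2, 0x82 (a 3-byte sequence cut off after its second byte), scan_checked consumes only the first byte and stops at the truncated sequence instead of advancing past the buffer.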

In general, this parser does not appear to be the most robust and could be the source of some security issues.

A fault file is attached. To reproduce, compile examples/outline.c and run against it. This patch may not be 100% complete...

Contact:
Peter Valchev <pvalchev at google.com>

----------------------------------------------------------------------

>Comment By: Peter Valchev (petervalchev)
Date: 2008-06-12 16:08

Message:
Logged In: YES 
user_id=2114255
Originator: YES

Thanks.

Actually, rechecking this area again, I think I found another issue,
which does not seem to be covered by the patch I provided :( I am attaching
the new test case.

I haven't found many other issues at this point... I stumbled upon this
one fairly quickly and my above observations were more general than
anything. If I do find more I'll be sure to tell you.

File Added: expat-fault2.xml

----------------------------------------------------------------------

Comment By: Karl Waclawek (kwaclaw)
Date: 2008-06-11 08:46

Message:
Logged In: YES 
user_id=290026
Originator: NO

Can reproduce. The problem is that this code can be called *after* an
error has been found (to report line and column number). Therefore it
should not rely on correct byte counts for multibyte characters.

Patch applied in xmltok_impl.c rev. 1.14.

Would you please also report all the other issues you have found?

----------------------------------------------------------------------
