From noreply at sourceforge.net  Sun Nov  8 12:06:53 2009
From: noreply at sourceforge.net (SourceForge.net)
Date: Sun, 08 Nov 2009 11:06:53 +0000
Subject: [Expat-bugs] [ expat-Bugs-2894085 ] expat: buffer over-read and crash in big2_toUtf8()
Message-ID:

Bugs item #2894085, was opened at 2009-11-08 12:06
Message generated for change (Tracker Item Submitted) made by iankko
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=110127&aid=2894085&group_id=10127

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: XML::Parser (inactive)
Group: None
Status: Open
Resolution: None
Priority: 5
Private: Yes
Submitted By: Jan Lieskovsky (iankko)
Assigned to: Nobody/Anonymous (nobody)
Summary: expat: buffer over-read and crash in big2_toUtf8()

Initial Comment:
Hello SourceForge expat maintainers,

originally CVE-2009-3720 was reported in expat:
[1] http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2009-3720

Non-public, original bug report for CVE-2009-3720:
[2] http://sourceforge.net/tracker/?func=detail&aid=1990430&group_id=10127&atid=110127

And the relevant patch for CVE-2009-3720:
[3] http://expat.cvs.sourceforge.net/viewvc/expat/expat/lib/xmltok_impl.c?r1=1.13&r2=1.15&view=patch

While the above patch [3] solves the issue in expat itself, in various other packages that embed expat (PyXML, 4Suite), and when expat is called via perl-XML-Parser-Expat, it does not help when the same reproducer is run via the perl-XML-Twig module. In this case the crash (buffer over-read) occurs in expat's big2_toUtf8() routine, more precisely in the DEFINE_UTF16_TO_UTF8(big2_) macro at lib/xmltok.c:626.

I have investigated the issue in more detail and assume the crash occurs in the

540 E ## toUtf8(const ENCODING *enc, \
...

routine, as present in expat-2.0.1/lib/xmltok.c (at line 540).
I assume the problematic line of the code is this one (lib/xmltok.c):

545 for (from = *fromP; from != fromLim; from += 2) { \

'from' points to the start of the XML data we are about to parse; 'fromLim' is the upper bound, the point where parsing should end. In each pass of the for loop we increment 'from' by two, because on lines:

548 unsigned char lo = GET_LO(from); \
549 unsigned char hi = GET_HI(from); \

we consumed both bytes at 'from'. This works perfectly when the addresses of 'from' and 'fromLim' are aligned, i.e. both are multiples of 2 (more precisely, when fromLim - from is even).

The problem arises when 'fromLim' is not divisible by two (for example 165218551). In that case 'from' can never equal 'fromLim': in the last round from == fromLim - 1, so we increment it by two, and now from == fromLim + 1, having skipped the limit. The loop keeps incrementing 'from' (trying to reach the from == fromLim condition) without bound, until the operating system detects an access to a memory location that doesn't belong to the process and kills it.

----------------------------------------------------------------------

From noreply at sourceforge.net  Sun Nov  8 12:09:59 2009
From: noreply at sourceforge.net (SourceForge.net)
Date: Sun, 08 Nov 2009 11:09:59 +0000
Subject: [Expat-bugs] [ expat-Bugs-2894085 ] expat: buffer over-read and crash in big2_toUtf8()

Message generated for change (Comment added) made by iankko

>Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-08 12:09

Message:
Here is my further analysis of the issue (some of the information may duplicate the above, but there is additional detail as well):

While running "perl XML-Parser-Expat.pl" reports an error on expat packages with the CVE-2009-3720 fix, running "perl XML-Twig.pl" still crashes:

$ perl XML-Twig.pl
Segmentation fault (core dumped)

gdb output:

...
Core was generated by `perl XML-Twig.pl'.
Program terminated with signal 11, Segmentation fault.
[New process 23957]
#0  0x009e9cb9 in big2_toUtf8 (enc=0xa00900, fromP=0xbffa17b0, fromLim=0x8ceca2f "", toP=0xbffa179c, toLim=0x88115f4 "\201") at lib/xmltok.c:634
634     DEFINE_UTF16_TO_UTF8(big2_)

The problem is present in expat-2.0.1/lib/xmltok.c, in the DEFINE_UTF16_TO_UTF8() macro:

538 #define DEFINE_UTF16_TO_UTF8(E) \
539 static void PTRCALL \
540 E ## toUtf8(const ENCODING *enc, \
541             const char **fromP, const char *fromLim, \
542             char **toP, const char *toLim) \
543 { \
544   const char *from; \
545   for (from = *fromP; from != fromLim; from += 2) { \
546     int plane; \
547     unsigned char lo2; \
548     unsigned char lo = GET_LO(from); \
549     unsigned char hi = GET_HI(from); \
550     switch (hi) { \
551     case 0: \
552       if (lo < 0x80) { \
553         if (*toP == toLim) { \
554           *fromP = from; \
555           return; \
556         } \
557         *(*toP)++ = lo; \
558         break; \
559       } \
560       /* fall through */ \
561     case 0x1: case 0x2: case 0x3: \
562     case 0x4: case 0x5: case 0x6: case 0x7: \
563       if (toLim - *toP < 2) { \
564         *fromP = from; \
565         return; \
566       } \
567       *(*toP)++ = ((lo >> 6) | (hi << 2) | UTF8_cval2); \
568       *(*toP)++ = ((lo & 0x3f) | 0x80); \
569       break; \
570     default: \
571       if (toLim - *toP < 3) { \
572         *fromP = from; \
573         return; \
574       } \

"from" should point to the start of the data, and "fromLim" is the upper bound at which the for loop above should stop. In each pass of the loop we increment "from" by 2, because we have already consumed both of its bytes:

548     unsigned char lo = GET_LO(from); \
549     unsigned char hi = GET_HI(from); \

and can move on. The problem arises when the address of "fromLim" is not aligned with the address of "from", i.e. the distance between them is not a multiple of two. In that case (assume from == fromLim - 1) we increment "from" (because it != fromLim), step over "fromLim", and keep looping until the OS detects the buffer over-read and kills the process. Running "perl XML-Twig.pl" demonstrates this issue.
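The termination problem described above can be illustrated with a small stand-alone sketch. This is not expat code: it walks byte offsets instead of pointers, the function names passes_not_equal() and passes_whole_units() and the cap parameter are hypothetical, and the guarded variant is only one possible way to stop at the limit, not necessarily the fix that belongs in xmltok.c.

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors the shape of the loop on line 545: stride 2, terminate only on
 * equality. `len` plays the role of fromLim - *fromP. The hypothetical
 * `cap` argument stops the demo after `cap` passes, standing in for the
 * point where the real code would run off the end of the buffer. */
static size_t passes_not_equal(size_t len, size_t cap)
{
    size_t from = 0, passes = 0;
    for (; from != len && passes < cap; from += 2)
        passes++;
    return passes;
}

/* One possible guard (an assumption, not the upstream fix): only enter
 * the loop body while a whole 2-byte unit remains before the limit. */
static size_t passes_whole_units(size_t len)
{
    size_t from = 0, passes = 0;
    for (; len - from >= 2; from += 2) {
        assert(from + 2 <= len); /* both bytes read in this pass are in bounds */
        passes++;
    }
    return passes;
}
```

With an even length such as 4, both variants make two passes. With an odd length such as 3, passes_not_equal() steps from offset 2 to offset 4 and never sees from == len, so only the cap ends it, while passes_whole_units() makes one pass and leaves the trailing byte unread.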
I patched expat-2.0.1 to report which branch the code went through. After finding that, while processing "pythontest1.xml", it loops in "case 0:" for "hi", I added calls to print the values of the "from" and "fromLim" variables. Here is the output:

fromLim (end) has value = 165218551
from has value = 165218548
Went by default branch
fromLim (end) has value = 165218551
from has value = 165218552
fromLim (end) has value = 165218551
from has value = 165218554
...
from has value = 165416942
fromLim (end) has value = 165218551
from has value = 165416944
seg fault

So at startup from < fromLim. We increment "from" by 2, so the distance is < 3; we go through the "default:" branch ("Went by default branch"), detect that "from" still isn't equal to "fromLim", and increment "from" by two again. From then on we are in an endless loop until the OS kills the process.

Further note:
-------------
When you add one more character (even a space) to 'pythontest1.xml', save it, and try to process it again, a syntax error is reported while processing the XML file:

$ perl XML-Twig.pl

syntax error at line 1, column 1, byte 2 at /usr/lib/perl5/vendor_perl/5.10.0/i386-linux-thread-multi/XML/Parser.pm line 187
 at XML-Twig.pl line 4
 at XML-Twig.pl line 4

----------------------------------------------------------------------

From noreply at sourceforge.net  Sun Nov  8 12:34:30 2009
From: noreply at sourceforge.net (SourceForge.net)
Date: Sun, 08 Nov 2009 11:34:30 +0000
Subject: [Expat-bugs] [ expat-Bugs-2894085 ] expat: buffer over-read and crash in big2_toUtf8()

Message generated for change (Comment added) made by iankko

>Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-08 12:34

Message:
Reproducer:
=========
The invalid XML file (containing a UTF-8 character) on which the crash occurs can be retrieved from:

https://bugzilla.redhat.com/attachment.cgi?id=366572

To reproduce the crash, create an XML-Twig.pl script in the form of:
===============================================

use XML::Twig;
my $twig=XML::Twig->new();           # create the twig
$twig->parsefile('pythontest1.xml'); # build it
#my_process( $twig);                 # my_process isn't a valid XML::Twig routine, so leave this commented out
#$twig->print;                       # output the twig

And run the reproducer as:
===================
perl XML-Twig.pl -> Segmentation fault (core dumped)

Investigating the crash in gdb leads to:

# gdb /usr/bin/perl core.28422
...
Core was generated by `perl XML-Twig.pl'.
Program terminated with signal 11, Segmentation fault.
[New process 28422]
#0  0x00fcd76f in big2_toUtf8 (enc=0xfdf860, fromP=0xbf8c57ac, fromLim=0x9c8bb4b "", toP=0xbf8c57bc, toLim=0x9868a28 "\005") at lib/xmltok.c:626
626     DEFINE_UTF16_TO_UTF8(big2_)
(gdb) bt
#0  0x00fcd76f in big2_toUtf8 (enc=0xfdf860, fromP=0xbf8c57ac, fromLim=0x9c8bb4b "", toP=0xbf8c57bc, toLim=0x9868a28 "\005") at lib/xmltok.c:626
#1  0x00fc1ac8 in reportDefault (parser=0x982cac8, enc=0xfdf860, s=0x9cabb3e "", end=0x9c8bb4b "") at lib/xmlparse.c:5128
#2  0x00fc7f2a in doProlog (parser=0x982cac8, enc=0xfdf860, s=0x9c8bb48 "", end=0x9c8bb4b "", tok=-15, next=0x9c8bb4b "", nextPtr=0x982cae0, haveMore=0 '\0') at lib/xmlparse.c:4497
#3  0x00fc9d05 in prologProcessor (parser=0x982cac8, s=0x9c8bb48 "", end=0x9c8bb4b "", nextPtr=0x982cae0) at lib/xmlparse.c:3551
#4  0x00fc150b in XML_ParseBuffer (parser=0x982cac8, len=0, isFinal=1) at lib/xmlparse.c:1562
#5  0x007d1f35 in ?? () from /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so
#6  0x007d2ab4 in XS_XML__Parser__Expat_ParseStream () from /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so
#7  0x065ad51d in Perl_pp_entersub () from /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so
#8  0x065a698f in Perl_runops_standard () from /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so
#9  0x0654c20e in perl_run () from /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so
#10 0x0804921e in main ()

----------------------------------------------------------------------

From noreply at sourceforge.net  Sun Nov  8 12:40:41 2009
From: noreply at sourceforge.net (SourceForge.net)
Date: Sun, 08 Nov 2009 11:40:41 +0000
Subject: [Expat-bugs] [ expat-Bugs-2894085 ] expat: buffer over-read and crash in big2_toUtf8()

Message generated for change (Comment added) made by iankko

>Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-08 12:40

Message:
To verify that the issue isn't present in (and isn't the fault of) XML-Parser-Expat, create the following XML-Parser-Expat.pl file:

use XML::Parser::Expat;
$parser = new XML::Parser::Expat;
$parser->setHandlers('Start' => \&sh,
                     'End' => \&eh,
                     'Char' => \&ch);
#open(FOO, 'pythontest1.xml') or die "Couldn't open";
#$parser->parse(*FOO);
$parser->parsefile('pythontest1.xml');
close(FOO);

and run it as:

perl XML-Parser-Expat.pl

This results in:

# perl XML-Parser-Expat.pl

no element found at line 2, column 1, byte 3 at XML-Parser-Expat.pl line 9

Further note:
-----------------
If you modify the mentioned 'pythontest1.xml' file by adding one more character to it, expat handles it properly (in this case the 'from' and 'fromLim' addresses are aligned, so the parsing ends in finite time):

Added an "a" character at the end of pythontest1.xml (i.e. it looks like ^@a).
This returns:

# perl XML-Twig.pl

syntax error at line 1, column 0, byte 0 at /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/XML/Parser.pm line 187

----------------------------------------------------------------------

From noreply at sourceforge.net  Sun Nov  8 12:49:42 2009
From: noreply at sourceforge.net (SourceForge.net)
Date: Sun, 08 Nov 2009 11:49:42 +0000
Subject: [Expat-bugs] [ expat-Bugs-2894085 ] expat: buffer over-read and crash in big2_toUtf8()

Message generated for change (Comment added) made by iankko

>Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-08 12:49

Message:
Here is the valgrind output (showing that it is a buffer over-read) at the moment of the crash:

==28534== Memcheck, a memory error detector.
==28534== Copyright (C) 2002-2006, and GNU GPL'd, by Julian Seward et al.
==28534== Using LibVEX rev 1658, a library for dynamic binary translation.
==28534== Copyright (C) 2004-2006, and GNU GPL'd, by OpenWorks LLP.
==28534== Using valgrind-3.2.1, a dynamic binary instrumentation framework.
==28534== Copyright (C) 2000-2006, and GNU GPL'd, by Julian Seward et al.
==28534== For more details, rerun with: -v ==28534== ==28534== Conditional jump or move depends on uninitialised value(s) ==28534== at 0x457077C: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== ==28534== Conditional jump or move depends on uninitialised value(s) ==28534== at 0x4570733: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: 
main (in /usr/bin/perl) ==28534== ==28534== Conditional jump or move depends on uninitialised value(s) ==28534== at 0x4570740: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== ==28534== Conditional jump or move depends on uninitialised value(s) ==28534== at 0x660AAB5: Perl_utf8n_to_uvuni (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x6600CB1: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x6604DB4: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x66094E0: Perl_regexec_flags (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65AB011: Perl_pp_match (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654703D: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654B79F: Perl_call_sv (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x4016E7E: (within 
/usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x4564AE8: reportDefault (xmlparse.c:5130) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== ==28534== Conditional jump or move depends on uninitialised value(s) ==28534== at 0x6600CB4: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x6604DB4: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x66094E0: Perl_regexec_flags (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65AB011: Perl_pp_match (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654703D: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654B79F: Perl_call_sv (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x4016E7E: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x4564AE8: reportDefault (xmlparse.c:5130) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== ==28534== Invalid read of size 1 ==28534== at 0x457076F: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: 
Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== Address 0x4E676A0 is 0 bytes after a block of size 65,536 alloc'd ==28534== at 0x40053C0: malloc (vg_replace_malloc.c:149) ==28534== by 0x6595C1E: Perl_safesysmalloc (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x4015A3C: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x4566262: XML_GetBuffer (xmlparse.c:1634) ==28534== by 0x400EE5C: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== ==28534== Invalid read of size 1 ==28534== at 0x4570772: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 
0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== Address 0x4E676A1 is 1 bytes after a block of size 65,536 alloc'd ==28534== at 0x40053C0: malloc (vg_replace_malloc.c:149) ==28534== by 0x6595C1E: Perl_safesysmalloc (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x4015A3C: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x4566262: XML_GetBuffer (xmlparse.c:1634) ==28534== by 0x400EE5C: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== ==28534== Invalid read of size 1 ==28534== at 0x4570891: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) 
==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== Address 0x4E83005 is not stack'd, malloc'd or (recently) free'd ==28534== ==28534== Invalid read of size 1 ==28534== at 0x45708B0: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== Address 0x4E83004 is not stack'd, malloc'd or (recently) free'd ==28534== ==28534== Process terminating with default action of signal 11 (SIGSEGV): dumping core ==28534== Access not within mapped region at address 0x5283000 ==28534== at 0x457076F: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within 
/usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== ==28534== ERROR SUMMARY: 4417417 errors from 9 contexts (suppressed: 30 from 1) ==28534== malloc/free: in use at exit: 4,192,790 bytes in 95,243 blocks. ==28534== malloc/free: 141,904 allocs, 46,661 frees, 12,100,734 bytes allocated. ==28534== For counts of detected errors, rerun with: -v ==28534== searching for pointers to 95,243 not-freed blocks. ==28534== checked 4,132,360 bytes. ==28534== ==28534== LEAK SUMMARY: ==28534== definitely lost: 1,415 bytes in 33 blocks. ==28534== possibly lost: 0 bytes in 0 blocks. ==28534== still reachable: 4,191,375 bytes in 95,210 blocks. ==28534== suppressed: 0 bytes in 0 blocks. ==28534== Use --leak-check=full to see details of leaked memory. 
----------------------------------------------------------------------

Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-08 12:40

Message:
To verify that the issue isn't present in, and isn't the fault of, XML-Parser-Expat, create the following XML-Parser-Expat.pl file:

  use XML::Parser::Expat;
  $parser = new XML::Parser::Expat;
  $parser->setHandlers('Start' => \&sh, 'End' => \&eh, 'Char' => \&ch);
  #open(FOO, 'pythontest1.xml') or die "Couldn't open";
  #$parser->parse(*FOO);
  $parser->parsefile('pythontest1.xml');
  close(FOO);

and run it as:

  perl XML-Parser-Expat.pl

This results in:

  # perl XML-Parser-Expat.pl
  no element found at line 2, column 1, byte 3 at XML-Parser-Expat.pl line 9

Further note:
-----------------
Even when you modify the mentioned 'pythontest1.xml' file, i.e. add one more character to it, it's properly parsed by expat (in this case the 'from' and 'fromLim' addresses are aligned, so the parsing ends 'in finite time'):

Added an "a" character at the end of pythontest1.xml (i.e. it looks like ^@a). This returns:

  # perl XML-Twig.pl
  syntax error at line 1, column 0, byte 0 at /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/XML/Parser.pm line 187

----------------------------------------------------------------------

Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-08 12:34

Message:
Reproducer:
=========
The invalid XML file (containing UTF-8 character) on which the crash occurs can be retrieved from:
https://bugzilla.redhat.com/attachment.cgi?id=366572

To reproduce the crash, create an XML-Twig.pl script in the form of:
===============================================
  use XML::Twig;
  my $twig=XML::Twig->new(); # create the twig
  $twig->parsefile('pythontest1.xml'); # build it
  #my_process( $twig); # my_process isn't a valid XML::Twig routine, so leave this commented out
  #$twig->print; # output the twig

And run the reproducer as:
===================
  perl XML-Twig.pl -> Segmentation fault (core dumped)

Investigating the crash in gdb leads to:

  # gdb /usr/bin/perl core.28422
  ...
  Core was generated by `perl XML-Twig.pl'.
  Program terminated with signal 11, Segmentation fault.
  [New process 28422]
  #0 0x00fcd76f in big2_toUtf8 (enc=0xfdf860, fromP=0xbf8c57ac, fromLim=0x9c8bb4b "", toP=0xbf8c57bc, toLim=0x9868a28 "\005") at lib/xmltok.c:626
  626 DEFINE_UTF16_TO_UTF8(big2_)
  (gdb) bt
  #0 0x00fcd76f in big2_toUtf8 (enc=0xfdf860, fromP=0xbf8c57ac, fromLim=0x9c8bb4b "", toP=0xbf8c57bc, toLim=0x9868a28 "\005") at lib/xmltok.c:626
  #1 0x00fc1ac8 in reportDefault (parser=0x982cac8, enc=0xfdf860, s=0x9cabb3e "", end=0x9c8bb4b "") at lib/xmlparse.c:5128
  #2 0x00fc7f2a in doProlog (parser=0x982cac8, enc=0xfdf860, s=0x9c8bb48 "", end=0x9c8bb4b "", tok=-15, next=0x9c8bb4b "", nextPtr=0x982cae0, haveMore=0 '\0') at lib/xmlparse.c:4497
  #3 0x00fc9d05 in prologProcessor (parser=0x982cac8, s=0x9c8bb48 "", end=0x9c8bb4b "", nextPtr=0x982cae0) at lib/xmlparse.c:3551
  #4 0x00fc150b in XML_ParseBuffer (parser=0x982cac8, len=0, isFinal=1) at lib/xmlparse.c:1562
  #5 0x007d1f35 in ?? () from /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so
  #6 0x007d2ab4 in XS_XML__Parser__Expat_ParseStream () from /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so
  #7 0x065ad51d in Perl_pp_entersub () from /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so
  #8 0x065a698f in Perl_runops_standard () from /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so
  #9 0x0654c20e in perl_run () from /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so
  #10 0x0804921e in main ()

----------------------------------------------------------------------

Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-08 12:09

Message:
Here is my further analysis of the issue (some of the information might be duplicated, but there is also some additional information):

While running "perl XML-Parser-Expat.pl" reports an error on expat packages with CVE-2009-3720 fixed, running "perl XML-Twig.pl" still crashes:

  $ perl XML-Twig.pl
  Segmentation fault (core dumped)

gdb output:

  ...
  Core was generated by `perl XML-Twig.pl'.
  Program terminated with signal 11, Segmentation fault.
  [New process 23957]
  #0 0x009e9cb9 in big2_toUtf8 (enc=0xa00900, fromP=0xbffa17b0, fromLim=0x8ceca2f "", toP=0xbffa179c, toLim=0x88115f4 "\201") at lib/xmltok.c:634
  634 DEFINE_UTF16_TO_UTF8(big2_)

The problem is present in expat-2.0.1/lib/xmltok.c in the toUtf8() macro:

  538 #define DEFINE_UTF16_TO_UTF8(E) \
  539 static void PTRCALL \
  540 E ## toUtf8(const ENCODING *enc, \
  541             const char **fromP, const char *fromLim, \
  542             char **toP, const char *toLim) \
  543 { \
  544   const char *from; \
  545   for (from = *fromP; from != fromLim; from += 2) { \
  546     int plane; \
  547     unsigned char lo2; \
  548     unsigned char lo = GET_LO(from); \
  549     unsigned char hi = GET_HI(from); \
  550     switch (hi) { \
  551     case 0: \
  552       if (lo < 0x80) { \
  553         if (*toP == toLim) { \
  554           *fromP = from; \
  555           return; \
  556         } \
  557         *(*toP)++ = lo; \
  558         break; \
  559       } \
  560       /* fall through */ \
  561     case 0x1: case 0x2: case 0x3: \
  562     case 0x4: case 0x5: case 0x6: case 0x7: \
  563       if (toLim - *toP < 2) { \
  564         *fromP = from; \
  565         return; \
  566       } \
  567       *(*toP)++ = ((lo >> 6) | (hi << 2) | UTF8_cval2); \
  568       *(*toP)++ = ((lo & 0x3f) | 0x80); \
  569       break; \
  570     default: \
  571       if (toLim - *toP < 3) { \
  572         *fromP = from; \
  573         return; \
  574       } \

"from" should point to the start of the data, and "fromLim" is the upper bound up to which the above for loop should run. In each pass of the for loop, we increment "from" by 2 because we have already consumed both of its parts:

  548 unsigned char lo = GET_LO(from); \
  549 unsigned char hi = GET_HI(from); \

and can move on. But the problem arises when the address of "fromLim" is not aligned with the address of "from", i.e. it's not a multiple of two. In that case (assume from == fromLim - 1) we will increment "from" (because it != fromLim) but step past the "fromLim" limit and end up in an infinite loop until the OS recognizes the buffer over-read and kills the process. Running "perl XML-Twig.pl" demonstrates this issue.

I patched expat-2.0.1 to report which branch the code goes through. After finding out that, while processing "pythontest1.xml", it loops in "case 0:" for "hi", I added functions to print the values of the "from" and "fromLim" variables. Here is the output:

  fromLim (end) has value = 165218551
  from has value = 165218548
  Went by default branch
  fromLim (end) has value = 165218551
  from has value = 165218552
  fromLim (end) has value = 165218551
  from has value = 165218554
  ...
  from has value = 165416942
  fromLim (end) has value = 165218551
  from has value = 165416944
  seg fault

So at startup from < fromLim. We increment "from" by 2, so the distance is < 3 -> we go through the "default:" break part ("Went by default branch"), detect that "from" still isn't equal to "fromLim", and increment "from" by two again. From then on we are in an endless loop until the process is killed by the OS.
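
The termination problem described above can be shown in isolation. What follows is only a minimal sketch, not expat code: steps_with_inequality_test() and steps_with_even_limit() are hypothetical helpers, and plain offsets stand in for the 'from' and 'fromLim' pointers so the demonstration never leaves valid memory. The first helper uses the same "!=" test as line 545, with an artificial step cap added so it returns instead of crashing. The second rounds the limit down to an even distance before looping, which is one possible guard and is an assumption here, not necessarily the fix adopted upstream.

```c
#include <assert.h>
#include <stddef.h>

/* Same termination test as the loop at xmltok.c line 545: the loop
 * stops only when pos == lim. max_steps is an artificial cap (not
 * present in expat) so that this demo terminates.
 * Returns the number of 2-byte steps taken, or -1 if the cap was hit,
 * i.e. pos stepped over lim without ever being equal to it. */
long steps_with_inequality_test(size_t pos, size_t lim, long max_steps)
{
  long steps = 0;
  for (; pos != lim; pos += 2) {
    if (steps == max_steps)
      return -1;
    steps++;
  }
  return steps;
}

/* One possible guard (an assumption, see above): shrink the limit so
 * that its distance from pos is even, then loop as before. A trailing
 * odd byte is simply left unconsumed. */
long steps_with_even_limit(size_t pos, size_t lim)
{
  long steps = 0;
  assert(pos <= lim);
  lim = pos + (((lim - pos) >> 1) << 1);
  for (; pos != lim; pos += 2)
    steps++;
  return steps;
}
```

With a distance of 3 bytes, as in the debug output (165218548 to 165218551), steps_with_inequality_test(0, 3, 1000) returns -1 because the offset goes 0, 2, 4, ... and never equals 3, while steps_with_even_limit(0, 3) returns 1. With an even distance of 4 both return 2.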

Further note:
-------------
When you add one more character (even a space) to 'pythontest1.xml', save it, and try to process it again, a syntax error is reported while processing the XML file:

  $ perl XML-Twig.pl
  syntax error at line 1, column 1, byte 2 at /usr/lib/perl5/vendor_perl/5.10.0/i386-linux-thread-multi/XML/Parser.pm line 187
  at XML-Twig.pl line 4
  at XML-Twig.pl line 4

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=110127&aid=2894085&group_id=10127

From noreply at sourceforge.net Sun Nov 8 14:06:44 2009
From: noreply at sourceforge.net (SourceForge.net)
Date: Sun, 08 Nov 2009 13:06:44 +0000
Subject: [Expat-bugs] [ expat-Bugs-2894085 ] expat: buffer over-read and crash in big2_toUtf8()
Message-ID:

Bugs item #2894085, was opened at 2009-11-08 12:06
Message generated for change (Settings changed) made by iankko
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=110127&aid=2894085&group_id=10127

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: XML::Parser (inactive)
Group: None
Status: Open
Resolution: None
Priority: 5
Private: Yes
Submitted By: Jan Lieskovsky (iankko)
>Assigned to: Karl Waclawek (kwaclaw)
Summary: expat: buffer over-read and crash in big2_toUtf8()

Initial Comment:
Hello SourceForge expat maintainers,

originally CVE-2009-3720 was reported in expat:
[1] http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2009-3720

Non-public, original bug report for CVE-2009-3720:
[2] http://sourceforge.net/tracker/?func=detail&aid=1990430&group_id=10127&atid=110127

And the relevant patch for CVE-2009-3720:
[3] http://expat.cvs.sourceforge.net/viewvc/expat/expat/lib/xmltok_impl.c?r1=1.13&r2=1.15&view=patch

While the above patch [3] solves the issue in expat itself and in various other packages (PyXML, 4Suite) which embed expat, or when called via perl-XML-Parser-Expat, it does not help when using the same reproducer via the perl-XML-Twig module. In this case the crash (buffer over-read) occurs in expat's big2_toUtf8() routine, more exactly in the DEFINE_UTF16_TO_UTF8(big2_) macro at lib/xmltok.c:626.

I have investigated the issue in more detail, assuming the crash occurs in the

  540 E ## toUtf8(const ENCODING *enc, \...)

routine, as present in expat-2.0.1/lib/xmltok.c (at line 540). I assume the problematic line of the code is this one (lib/xmltok.c):

  545 for (from = *fromP; from != fromLim; from += 2) { \

'from' is a pointer to the start of the XML data we are about to parse; 'fromLim' is the upper bound, the point where parsing should end. In each pass of the for loop we increment 'from' by two (because on lines:

  548 unsigned char lo = GET_LO(from); \
  549 unsigned char hi = GET_HI(from); \

we consumed both parts of from).

This works perfectly when the addresses of 'from' and 'fromLim' are aligned, i.e. both are multiples of 2. But the problem arises when the value of 'fromLim' is not divisible by two (for example 165218551). In that case 'from' can never equal 'fromLim': in the last round 'from' == 'fromLim - 1', so we increment it by two, but now we have already skipped the limit ('from' == 'fromLim + 1'), and we keep incrementing it (in the effort to reach the from == fromLim condition) in an infinite loop, until the operating system recognizes that we tried to access a memory location which doesn't belong to us and kills the process.

----------------------------------------------------------------------

Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-08 12:49

Message:
Here is the valgrind output (proving it's a buffer over-read) at the moment of the crash:

==28534== Memcheck, a memory error detector.
==28534== Copyright (C) 2002-2006, and GNU GPL'd, by Julian Seward et al.
==28534== Using LibVEX rev 1658, a library for dynamic binary translation.
==28534== Copyright (C) 2004-2006, and GNU GPL'd, by OpenWorks LLP.
==28534== Using valgrind-3.2.1, a dynamic binary instrumentation framework.
==28534== Copyright (C) 2000-2006, and GNU GPL'd, by Julian Seward et al.
==28534== For more details, rerun with: -v ==28534== ==28534== Conditional jump or move depends on uninitialised value(s) ==28534== at 0x457077C: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== ==28534== Conditional jump or move depends on uninitialised value(s) ==28534== at 0x4570733: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: 
main (in /usr/bin/perl) ==28534== ==28534== Conditional jump or move depends on uninitialised value(s) ==28534== at 0x4570740: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== ==28534== Conditional jump or move depends on uninitialised value(s) ==28534== at 0x660AAB5: Perl_utf8n_to_uvuni (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x6600CB1: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x6604DB4: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x66094E0: Perl_regexec_flags (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65AB011: Perl_pp_match (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654703D: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654B79F: Perl_call_sv (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x4016E7E: (within 
/usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x4564AE8: reportDefault (xmlparse.c:5130) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== ==28534== Conditional jump or move depends on uninitialised value(s) ==28534== at 0x6600CB4: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x6604DB4: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x66094E0: Perl_regexec_flags (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65AB011: Perl_pp_match (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654703D: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654B79F: Perl_call_sv (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x4016E7E: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x4564AE8: reportDefault (xmlparse.c:5130) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== ==28534== Invalid read of size 1 ==28534== at 0x457076F: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: 
Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== Address 0x4E676A0 is 0 bytes after a block of size 65,536 alloc'd ==28534== at 0x40053C0: malloc (vg_replace_malloc.c:149) ==28534== by 0x6595C1E: Perl_safesysmalloc (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x4015A3C: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x4566262: XML_GetBuffer (xmlparse.c:1634) ==28534== by 0x400EE5C: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== ==28534== Invalid read of size 1 ==28534== at 0x4570772: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 
0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== Address 0x4E676A1 is 1 bytes after a block of size 65,536 alloc'd ==28534== at 0x40053C0: malloc (vg_replace_malloc.c:149) ==28534== by 0x6595C1E: Perl_safesysmalloc (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x4015A3C: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x4566262: XML_GetBuffer (xmlparse.c:1634) ==28534== by 0x400EE5C: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== ==28534== Invalid read of size 1 ==28534== at 0x4570891: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) 
==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== Address 0x4E83005 is not stack'd, malloc'd or (recently) free'd ==28534== ==28534== Invalid read of size 1 ==28534== at 0x45708B0: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== Address 0x4E83004 is not stack'd, malloc'd or (recently) free'd ==28534== ==28534== Process terminating with default action of signal 11 (SIGSEGV): dumping core ==28534== Access not within mapped region at address 0x5283000 ==28534== at 0x457076F: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within 
/usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== ==28534== ERROR SUMMARY: 4417417 errors from 9 contexts (suppressed: 30 from 1) ==28534== malloc/free: in use at exit: 4,192,790 bytes in 95,243 blocks. ==28534== malloc/free: 141,904 allocs, 46,661 frees, 12,100,734 bytes allocated. ==28534== For counts of detected errors, rerun with: -v ==28534== searching for pointers to 95,243 not-freed blocks. ==28534== checked 4,132,360 bytes. ==28534== ==28534== LEAK SUMMARY: ==28534== definitely lost: 1,415 bytes in 33 blocks. ==28534== possibly lost: 0 bytes in 0 blocks. ==28534== still reachable: 4,191,375 bytes in 95,210 blocks. ==28534== suppressed: 0 bytes in 0 blocks. ==28534== Use --leak-check=full to see details of leaked memory. 
----------------------------------------------------------------------

Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-08 12:40

Message:
To verify that the issue isn't present in, and isn't the fault of, XML-Parser-Expat, create the following XML-Parser-Expat.pl file:

use XML::Parser::Expat;
$parser = new XML::Parser::Expat;
$parser->setHandlers('Start' => \&sh, 'End' => \&eh, 'Char' => \&ch);
#open(FOO, 'pythontest1.xml') or die "Couldn't open";
#$parser->parse(*FOO);
$parser->parsefile('pythontest1.xml');
close(FOO);

and run it as:

perl XML-Parser-Expat.pl

This results in:

# perl XML-Parser-Expat.pl

no element found at line 2, column 1, byte 3 at XML-Parser-Expat.pl line 9

Further note:
-----------------
Even when you modify the mentioned 'pythontest1.xml' file, i.e. add one more character to it, it's handled properly by expat (in this case the 'from' and 'fromLim' addresses are aligned, so the parsing ends 'in finite time'):

Added an "a" character at the end of pythontest1.xml (i.e. it looks like ^@a).

This returns:

# perl XML-Twig.pl

syntax error at line 1, column 0, byte 0 at /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/XML/Parser.pm line 187

----------------------------------------------------------------------

Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-08 12:34

Message:
Reproducer:
=========
The invalid XML file, containing UTF-8 character, the crash occurs on, can be retrieved from:

https://bugzilla.redhat.com/attachment.cgi?id=366572

To reproduce the crash, create an XML-Twig.pl script in the form of:
===============================================

use XML::Twig;
my $twig=XML::Twig->new(); # create the twig
$twig->parsefile('pythontest1.xml'); # build it
#my_process( $twig); # my_process isn't a valid XML::Twig routine, so leave this commented out
#$twig->print; # output the twig

And run the reproducer as:
===================

perl XML-Twig.pl -> Segmentation fault (core dumped)

Investigating the crash in gdb leads to:

# gdb /usr/bin/perl core.28422
...
Core was generated by `perl XML-Twig.pl'.
Program terminated with signal 11, Segmentation fault.
[New process 28422]
#0 0x00fcd76f in big2_toUtf8 (enc=0xfdf860, fromP=0xbf8c57ac, fromLim=0x9c8bb4b "", toP=0xbf8c57bc, toLim=0x9868a28 "\005") at lib/xmltok.c:626
626 DEFINE_UTF16_TO_UTF8(big2_)
(gdb) bt
#0 0x00fcd76f in big2_toUtf8 (enc=0xfdf860, fromP=0xbf8c57ac, fromLim=0x9c8bb4b "", toP=0xbf8c57bc, toLim=0x9868a28 "\005") at lib/xmltok.c:626
#1 0x00fc1ac8 in reportDefault (parser=0x982cac8, enc=0xfdf860, s=0x9cabb3e "", end=0x9c8bb4b "") at lib/xmlparse.c:5128
#2 0x00fc7f2a in doProlog (parser=0x982cac8, enc=0xfdf860, s=0x9c8bb48 "", end=0x9c8bb4b "", tok=-15, next=0x9c8bb4b "", nextPtr=0x982cae0, haveMore=0 '\0') at lib/xmlparse.c:4497
#3 0x00fc9d05 in prologProcessor (parser=0x982cac8, s=0x9c8bb48 "", end=0x9c8bb4b "", nextPtr=0x982cae0) at lib/xmlparse.c:3551
#4 0x00fc150b in XML_ParseBuffer (parser=0x982cac8, len=0, isFinal=1) at lib/xmlparse.c:1562
#5 0x007d1f35 in ?? () from /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so
#6 0x007d2ab4 in XS_XML__Parser__Expat_ParseStream () from /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so
#7 0x065ad51d in Perl_pp_entersub () from /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so
#8 0x065a698f in Perl_runops_standard () from /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so
#9 0x0654c20e in perl_run () from /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so
#10 0x0804921e in main ()

----------------------------------------------------------------------

Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-08 12:09

Message:
Here is my further analysis of the issue (some of the information may be duplicated, but there are also new details):

While running "perl XML-Parser-Expat.pl" reports an error on expat packages with the CVE-2009-3720 fix, running "perl XML-Twig.pl" still crashes:

$ perl XML-Twig.pl
Segmentation fault (core
dumped)

gdb output:
...
Core was generated by `perl XML-Twig.pl'.
Program terminated with signal 11, Segmentation fault.
[New process 23957]
#0 0x009e9cb9 in big2_toUtf8 (enc=0xa00900, fromP=0xbffa17b0, fromLim=0x8ceca2f "", toP=0xbffa179c, toLim=0x88115f4 "\201") at lib/xmltok.c:634
634 DEFINE_UTF16_TO_UTF8(big2_)

The problem is present in expat-2.0.1/lib/xmltok.c, in the DEFINE_UTF16_TO_UTF8() macro that generates the toUtf8() routines:

538 #define DEFINE_UTF16_TO_UTF8(E) \
539 static void PTRCALL \
540 E ## toUtf8(const ENCODING *enc, \
541             const char **fromP, const char *fromLim, \
542             char **toP, const char *toLim) \
543 { \
544   const char *from; \
545   for (from = *fromP; from != fromLim; from += 2) { \
546     int plane; \
547     unsigned char lo2; \
548     unsigned char lo = GET_LO(from); \
549     unsigned char hi = GET_HI(from); \
550     switch (hi) { \
551     case 0: \
552       if (lo < 0x80) { \
553         if (*toP == toLim) { \
554           *fromP = from; \
555           return; \
556         } \
557         *(*toP)++ = lo; \
558         break; \
559       } \
560       /* fall through */ \
561     case 0x1: case 0x2: case 0x3: \
562     case 0x4: case 0x5: case 0x6: case 0x7: \
563       if (toLim - *toP < 2) { \
564         *fromP = from; \
565         return; \
566       } \
567       *(*toP)++ = ((lo >> 6) | (hi << 2) | UTF8_cval2); \
568       *(*toP)++ = ((lo & 0x3f) | 0x80); \
569       break; \
570     default: \
571       if (toLim - *toP < 3) { \
572         *fromP = from; \
573         return; \
574       } \

"from" points to the start of the data, and "fromLim" is the upper bound at which the for loop above should stop. In each pass of the for loop, we increment "from" by 2 because we have already consumed both of its bytes:

548 unsigned char lo = GET_LO(from); \
549 unsigned char hi = GET_HI(from); \

and can move on. The problem arises when the address of "fromLim" is not aligned with the address of "from", i.e. when the distance between them is not a multiple of two.
In that case (assume from == fromLim - 1) we increment "from" (because it != fromLim), step over "fromLim", and end up in a loop whose exit condition is never met. It keeps reading past the end of the buffer until the OS detects the invalid access and kills the process. Running "perl XML-Twig.pl" demonstrates this issue.

I patched expat-2.0.1 to report which branch the code takes and, after finding out that while processing "pythontest1.xml" it loops in "case 0:" for "hi", added code to print the values of the "from" and "fromLim" variables. Here is the output:

fromLim (end) has value = 165218551
from has value = 165218548
Went by default branch
fromLim (end) has value = 165218551
from has value = 165218552
fromLim (end) has value = 165218551
from has value = 165218554
...
from has value = 165416942
fromLim (end) has value = 165218551
from has value = 165416944
seg fault

So at startup from < fromLim. We increment "from" by 2, so the remaining distance is < 3 and we go through the "default:" branch ("Went by default branch"), find that "from" still isn't equal to "fromLim", and increment "from" by two again. From then on we are in an endless loop until the process is killed by the OS.
Further note:
-------------
When you add one more character (even a space) to 'pythontest1.xml', save it, and try to process it again, a syntax error is reported instead of a crash:

$ perl XML-Twig.pl

syntax error at line 1, column 1, byte 2 at /usr/lib/perl5/vendor_perl/5.10.0/i386-linux-thread-multi/XML/Parser.pm line 187
 at XML-Twig.pl line 4
 at XML-Twig.pl line 4

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=110127&aid=2894085&group_id=10127

From noreply at sourceforge.net Sun Nov 8 15:23:11 2009
From: noreply at sourceforge.net (SourceForge.net)
Date: Sun, 08 Nov 2009 14:23:11 +0000
Subject: [Expat-bugs] [ expat-Bugs-2894085 ] expat: buffer over-read and crash in big2_toUtf8()
Message-ID:

Bugs item #2894085, was opened at 2009-11-08 12:06
Message generated for change (Comment added) made by iankko
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=110127&aid=2894085&group_id=10127

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.
Category: XML::Parser (inactive)
Group: None
Status: Open
Resolution: None
Priority: 5
Private: Yes
Submitted By: Jan Lieskovsky (iankko)
Assigned to: Karl Waclawek (kwaclaw)
Summary: expat: buffer over-read and crash in big2_toUtf8()

Initial Comment:
Hello SourceForge expat maintainers,

originally CVE-2009-3720 was reported in expat:
[1] http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2009-3720

Non-public, original bug report for CVE-2009-3720:
[2] http://sourceforge.net/tracker/?func=detail&aid=1990430&group_id=10127&atid=110127

And the relevant patch for CVE-2009-3720:
[3] http://expat.cvs.sourceforge.net/viewvc/expat/expat/lib/xmltok_impl.c?r1=1.13&r2=1.15&view=patch

While the above patch [3] solves the issue in expat itself and in various other packages (PyXML, 4Suite) which embed expat, and also when expat is called via perl-XML-Parser-Expat, it does not help when the same reproducer is run via the perl-XML-Twig module. In this case the crash (buffer over-read) occurs in expat's big2_toUtf8() routine, more exactly in the DEFINE_UTF16_TO_UTF8(big2_) macro at lib/xmltok.c:626.

I have investigated the issue in more detail and assume the crash occurs in the

540 E ## toUtf8(const ENCODING *enc, \...)

routine, as present in expat-2.0.1/lib/xmltok.c (at line 540). I assume the problematic line of code is this one (lib/xmltok.c):

545 for (from = *fromP; from != fromLim; from += 2) { \

'from' is a pointer to the start of the XML data we are about to parse; 'fromLim' is the upper bound, the point where parsing should end. In each pass of the for loop we increment 'from' by two (because on lines:

548 unsigned char lo = GET_LO(from); \
549 unsigned char hi = GET_HI(from); \

we consumed both bytes at 'from'). This works perfectly when the addresses of 'from' and 'fromLim' are aligned with each other, i.e. the distance between them is a multiple of two.
But the problem arises when 'fromLim' lies at an odd distance from 'from' (for example fromLim == 165218551, which is not divisible by two). In that case 'from' can never become equal to 'fromLim': in the last round from == fromLim - 1, so we increment it by two, but now we have already skipped past the limit (from == fromLim + 1), and we keep incrementing it (in the effort to reach the from == fromLim condition) in an infinite loop, until the operating system recognizes that we tried to access a memory location which doesn't belong to us and kills the process.

----------------------------------------------------------------------

>Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-08 15:23

Message:
The following patch seems to fix the issue for me (under the assumption that the patch for CVE-2009-3720 is also applied):

$ cat expat-toUtf8.patch
--- expat-2.0.1/lib/xmltok.c.orig 2006-11-26 18:34:46.000000000 +0100
+++ expat-2.0.1/lib/xmltok.c 2009-11-08 15:12:27.000000000 +0100
@@ -543,6 +543,9 @@ E ## toUtf8(const ENCODING *enc, \
 { \
   const char *from; \
   for (from = *fromP; from != fromLim; from += 2) { \
+    /* Stop parsing if from && fromLim addresses aren't aligned */ \
+    if (from == fromLim - 1) \
+      goto after; \
     int plane; \
     unsigned char lo2; \
     unsigned char lo = GET_LO(from); \
@@ -596,6 +599,8 @@ E ## toUtf8(const ENCODING *enc, \
     } \
   } \
   *fromP = from; \
+after: \
+  *fromP = from + 1; \
 }

 #define DEFINE_UTF16_TO_UTF16(E) \

The output is then:

# perl XML-Twig.pl

no element found at line 2, column 1, byte 3 at /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/XML/Parser.pm line 187

However, I am not sure whether we should also check for the case when the addresses of 'to' and 'toLim' aren't aligned. We already do so in the utf8_toUtf16() routine:

340 static void PTRCALL
341 utf8_toUtf16(const ENCODING *enc,

at line:

358 case BT_LEAD4:
359 {
360 unsigned long n;
361 if (to + 1 == toLim)
362 goto after;
...
377 after:
378 *fromP = from;
379 *toP = to;
380 }

So the resulting patch would then check both cases, from == fromLim - 1 || to == toLim - 1. I will attach it in the next comment; opinions appreciated.

----------------------------------------------------------------------

Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-08 12:49

Message:
Here is the valgrind output (proving it's a buffer over-read) at the moment of the crash:

==28534== Memcheck, a memory error detector.
==28534== Copyright (C) 2002-2006, and GNU GPL'd, by Julian Seward et al.
==28534== Using LibVEX rev 1658, a library for dynamic binary translation.
==28534== Copyright (C) 2004-2006, and GNU GPL'd, by OpenWorks LLP.
==28534== Using valgrind-3.2.1, a dynamic binary instrumentation framework.
==28534== Copyright (C) 2000-2006, and GNU GPL'd, by Julian Seward et al.
==28534== For more details, rerun with: -v
==28534==
==28534== Conditional jump or move depends on uninitialised value(s)
==28534==    at 0x457077C: big2_toUtf8 (xmltok.c:626)
==28534==    by 0x4564AC7: reportDefault (xmlparse.c:5128)
==28534==    by 0x456AF29: doProlog (xmlparse.c:4497)
==28534==    by 0x456CD04: prologProcessor (xmlparse.c:3551)
==28534==    by 0x456450A: XML_ParseBuffer (xmlparse.c:1562)
==28534==    by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so)
==28534==    by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so)
==28534==    by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==28534==    by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==28534==    by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==28534==    by 0x804921D: main (in /usr/bin/perl)
==28534==
==28534== Conditional jump or move depends on uninitialised value(s)
==28534==    at 0x4570733: big2_toUtf8 (xmltok.c:626)
==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== ==28534== Conditional jump or move depends on uninitialised value(s) ==28534== at 0x4570740: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== ==28534== Conditional jump or move depends on uninitialised value(s) ==28534== at 0x660AAB5: Perl_utf8n_to_uvuni (in 
/usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x6600CB1: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x6604DB4: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x66094E0: Perl_regexec_flags (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65AB011: Perl_pp_match (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654703D: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654B79F: Perl_call_sv (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x4016E7E: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x4564AE8: reportDefault (xmlparse.c:5130) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== ==28534== Conditional jump or move depends on uninitialised value(s) ==28534== at 0x6600CB4: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x6604DB4: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x66094E0: Perl_regexec_flags (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65AB011: Perl_pp_match (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654703D: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654B79F: Perl_call_sv (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x4016E7E: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x4564AE8: 
reportDefault (xmlparse.c:5130) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== ==28534== Invalid read of size 1 ==28534== at 0x457076F: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== Address 0x4E676A0 is 0 bytes after a block of size 65,536 alloc'd ==28534== at 0x40053C0: malloc (vg_replace_malloc.c:149) ==28534== by 0x6595C1E: Perl_safesysmalloc (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x4015A3C: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x4566262: XML_GetBuffer (xmlparse.c:1634) ==28534== by 0x400EE5C: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in 
/usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== ==28534== Invalid read of size 1 ==28534== at 0x4570772: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== Address 0x4E676A1 is 1 bytes after a block of size 65,536 alloc'd ==28534== at 0x40053C0: malloc (vg_replace_malloc.c:149) ==28534== by 0x6595C1E: Perl_safesysmalloc (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x4015A3C: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x4566262: XML_GetBuffer (xmlparse.c:1634) ==28534== by 0x400EE5C: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: 
Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== ==28534== Invalid read of size 1 ==28534== at 0x4570891: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== Address 0x4E83005 is not stack'd, malloc'd or (recently) free'd ==28534== ==28534== Invalid read of size 1 ==28534== at 0x45708B0: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: 
Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== Address 0x4E83004 is not stack'd, malloc'd or (recently) free'd ==28534== ==28534== Process terminating with default action of signal 11 (SIGSEGV): dumping core ==28534== Access not within mapped region at address 0x5283000 ==28534== at 0x457076F: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== ==28534== ERROR SUMMARY: 4417417 errors from 9 contexts (suppressed: 30 from 1) ==28534== malloc/free: in use at exit: 4,192,790 bytes in 95,243 blocks. ==28534== malloc/free: 141,904 allocs, 46,661 frees, 12,100,734 bytes allocated. ==28534== For counts of detected errors, rerun with: -v ==28534== searching for pointers to 95,243 not-freed blocks. ==28534== checked 4,132,360 bytes. ==28534== ==28534== LEAK SUMMARY: ==28534== definitely lost: 1,415 bytes in 33 blocks. ==28534== possibly lost: 0 bytes in 0 blocks. ==28534== still reachable: 4,191,375 bytes in 95,210 blocks. 
==28534== suppressed: 0 bytes in 0 blocks. ==28534== Use --leak-check=full to see details of leaked memory. ---------------------------------------------------------------------- Comment By: Jan Lieskovsky (iankko) Date: 2009-11-08 12:40 Message: To verify, the issue isn't present in / it isn't fault ot XML-Parser-Expat create following XML-Parser-Expat.pl file: use XML::Parser::Expat; $parser = new XML::Parser::Expat; $parser->setHandlers('Start' => \&sh, 'End' => \&eh, 'Char' => \&ch); #open(FOO, 'pythontest1.xml') or die "Couldn't open"; #$parser->parse(*FOO); $parser->parsefile('pythontest1.xml'); close(FOO); and run it as: perl XML-Parser-Expat.pl This results in: # perl XML-Parser-Expat.pl no element found at line 2, column 1, byte 3 at XML-Parser-Expat.pl line 9 Further note: ----------------- Even when you modify mentioned 'pythontest1.xml' file, i.e. add one more character to it, it's properly parsed by expat (in this case 'from' and 'fromLim' addresses are aligned so the parsing ends 'in finite time'): Added "a" characted at the end of pythontest1.xml (i.e. it looks like ^@a). 
This returns:

# perl XML-Twig.pl

syntax error at line 1, column 0, byte 0 at /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/XML/Parser.pm line 187

----------------------------------------------------------------------

Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-08 12:34

Message:
Reproducer:
=========
The invalid XML file, containing a UTF-8 character, on which the crash
occurs can be retrieved from:

https://bugzilla.redhat.com/attachment.cgi?id=366572

To reproduce the crash, create an XML-Twig.pl script in the form of:
===============================================

use XML::Twig;
my $twig=XML::Twig->new();            # create the twig
$twig->parsefile('pythontest1.xml');  # build it
#my_process( $twig);  # my_process isn't a valid XML::Twig routine, so leave this commented out
#$twig->print;        # output the twig

And run the reproducer as:
===================

perl XML-Twig.pl -> Segmentation fault (core dumped)

Investigating the crash in gdb leads to:

# gdb /usr/bin/perl core.28422
...
Core was generated by `perl XML-Twig.pl'.
Program terminated with signal 11, Segmentation fault.
[New process 28422]
#0  0x00fcd76f in big2_toUtf8 (enc=0xfdf860, fromP=0xbf8c57ac, fromLim=0x9c8bb4b "", toP=0xbf8c57bc, toLim=0x9868a28 "\005") at lib/xmltok.c:626
626     DEFINE_UTF16_TO_UTF8(big2_)
(gdb) bt
#0  0x00fcd76f in big2_toUtf8 (enc=0xfdf860, fromP=0xbf8c57ac, fromLim=0x9c8bb4b "", toP=0xbf8c57bc, toLim=0x9868a28 "\005") at lib/xmltok.c:626
#1  0x00fc1ac8 in reportDefault (parser=0x982cac8, enc=0xfdf860, s=0x9cabb3e "", end=0x9c8bb4b "") at lib/xmlparse.c:5128
#2  0x00fc7f2a in doProlog (parser=0x982cac8, enc=0xfdf860, s=0x9c8bb48 "", end=0x9c8bb4b "", tok=-15, next=0x9c8bb4b "", nextPtr=0x982cae0, haveMore=0 '\0') at lib/xmlparse.c:4497
#3  0x00fc9d05 in prologProcessor (parser=0x982cac8, s=0x9c8bb48 "", end=0x9c8bb4b "", nextPtr=0x982cae0) at lib/xmlparse.c:3551
#4  0x00fc150b in XML_ParseBuffer (parser=0x982cac8, len=0, isFinal=1) at lib/xmlparse.c:1562
#5  0x007d1f35 in ?? () from /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so
#6  0x007d2ab4 in XS_XML__Parser__Expat_ParseStream () from /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so
#7  0x065ad51d in Perl_pp_entersub () from /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so
#8  0x065a698f in Perl_runops_standard () from /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so
#9  0x0654c20e in perl_run () from /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so
#10 0x0804921e in main ()

----------------------------------------------------------------------

Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-08 12:09

Message:
Here is my further analysis of the issue (some of the information might be
a duplicate, but there is also some additional detail):

While running "perl XML-Parser-Expat.pl" reports an error on expat packages
with the CVE-2009-3720 fix, running "perl XML-Twig.pl" still crashes:

$ perl XML-Twig.pl
Segmentation fault (core dumped)

gdb output:

...
Core was generated by `perl XML-Twig.pl'.
Program terminated with signal 11, Segmentation fault.
[New process 23957]
#0  0x009e9cb9 in big2_toUtf8 (enc=0xa00900, fromP=0xbffa17b0, fromLim=0x8ceca2f "", toP=0xbffa179c, toLim=0x88115f4 "\201") at lib/xmltok.c:634
634     DEFINE_UTF16_TO_UTF8(big2_)

The problem is present in expat-2.0.1/lib/xmltok.c in the toUtf8() macro:

538 #define DEFINE_UTF16_TO_UTF8(E) \
539 static void PTRCALL \
540 E ## toUtf8(const ENCODING *enc, \
541             const char **fromP, const char *fromLim, \
542             char **toP, const char *toLim) \
543 { \
544   const char *from; \
545   for (from = *fromP; from != fromLim; from += 2) { \
546     int plane; \
547     unsigned char lo2; \
548     unsigned char lo = GET_LO(from); \
549     unsigned char hi = GET_HI(from); \
550     switch (hi) { \
551     case 0: \
552       if (lo < 0x80) { \
553         if (*toP == toLim) { \
554           *fromP = from; \
555           return; \
556         } \
557         *(*toP)++ = lo; \
558         break; \
559       } \
560       /* fall through */ \
561     case 0x1: case 0x2: case 0x3: \
562     case 0x4: case 0x5: case 0x6: case 0x7: \
563       if (toLim - *toP < 2) { \
564         *fromP = from; \
565         return; \
566       } \
567       *(*toP)++ = ((lo >> 6) | (hi << 2) | UTF8_cval2); \
568       *(*toP)++ = ((lo & 0x3f) | 0x80); \
569       break; \
570     default: \
571       if (toLim - *toP < 3) { \
572         *fromP = from; \
573         return; \
574       } \

"from" should point to the start of the data and "fromLim" is the upper
bound at which the for loop above should stop. In each pass of the loop we
increment "from" by 2, because both of its bytes have already been consumed:

548     unsigned char lo = GET_LO(from); \
549     unsigned char hi = GET_HI(from); \

and we can move further. The problem arises when the address of "fromLim"
is not aligned with the address of "from", i.e. their distance is not a
multiple of two. In that case (assume from == fromLim - 1) we increment
"from" (because it != fromLim), step over "fromLim", and end up in an
infinite loop until the OS detects the buffer over-read and kills the
process. Running "perl XML-Twig.pl" demonstrates this issue.
I patched expat-2.0.1 to be more verbose about which branch the code goes
through and, after finding out that when processing "pythontest1.xml" it
loops in "case 0:" for "hi", added functions to print out the values of
the "from" and "fromLim" variables. Here is the output:

fromLim (end) has value = 165218551
from has value = 165218548
Went by default branch
fromLim (end) has value = 165218551
from has value = 165218552
fromLim (end) has value = 165218551
from has value = 165218554
...
from has value = 165416942
fromLim (end) has value = 165218551
from has value = 165416944
seg fault

So at startup from < fromLim and the distance between them is < 3. We go
through the "default:" branch ("Went by default branch"), detect that
"from" still isn't equal to "fromLim", and increment "from" by two again.
From then on we are in an endless loop until the process is killed by the OS.

Further note:
-------------
When you add one more character (even a space) to 'pythontest1.xml', save
it and try to process it again, a syntax error is reported for the XML file:

$ perl XML-Twig.pl

syntax error at line 1, column 1, byte 2 at /usr/lib/perl5/vendor_perl/5.10.0/i386-linux-thread-multi/XML/Parser.pm line 187
 at XML-Twig.pl line 4
 at XML-Twig.pl line 4

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=110127&aid=2894085&group_id=10127

From noreply at sourceforge.net Sun Nov 8 16:21:25 2009
From: noreply at sourceforge.net (SourceForge.net)
Date: Sun, 08 Nov 2009 15:21:25 +0000
Subject: [Expat-bugs] [ expat-Bugs-2894085 ] expat: buffer over-read and crash in big2_toUtf8()
Message-ID:

Bugs item #2894085, was opened at 2009-11-08 12:06
Message generated for change (Comment added) made by iankko
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=110127&aid=2894085&group_id=10127

Please note that this message will contain a full copy of the comment thread, including the initial issue
submission, for this request, not just the latest update.

Category: XML::Parser (inactive)
Group: None
Status: Open
Resolution: None
Priority: 5
Private: Yes
Submitted By: Jan Lieskovsky (iankko)
Assigned to: Karl Waclawek (kwaclaw)
Summary: expat: buffer over-read and crash in big2_toUtf8()

Initial Comment:
Hello SourceForge expat maintainers,

originally CVE-2009-3720 was reported in expat:
[1] http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2009-3720

Non-public, original bug report for CVE-2009-3720:
[2] http://sourceforge.net/tracker/?func=detail&aid=1990430&group_id=10127&atid=110127

And relevant patch for CVE-2009-3720:
[3] http://expat.cvs.sourceforge.net/viewvc/expat/expat/lib/xmltok_impl.c?r1=1.13&r2=1.15&view=patch

While the above patch [3] solves the issue in expat itself and in various
other packages (PyXML, 4Suite) which embed expat, or when expat is called
via perl-XML-Parser-Expat, it does not help when using the same reproducer
via the perl-XML-Twig module. In this case the crash (buffer over-read)
occurs in expat's big2_toUtf8() routine, more exactly in the
DEFINE_UTF16_TO_UTF8(big2_) macro in lib/xmltok.c:626.

I have investigated the issue in more detail, assuming the crash occurs in the

540 E ## toUtf8(const ENCODING *enc, \...)

routine, as present in expat-2.0.1/lib/xmltok.c (at line 540).

Assuming the problematic line of the code is this one (lib/xmltok.c):

545   for (from = *fromP; from != fromLim; from += 2) { \

'from' is a pointer to the start of the XML data we are about to parse,
and 'fromLim' is the upper bound, the point where parsing should end.
In each pass of the for loop we increment 'from' by two (because on lines:

548     unsigned char lo = GET_LO(from); \
549     unsigned char hi = GET_HI(from); \

we consumed both parts of from). This works perfectly when the addresses
of 'from' and 'fromLim' are aligned, i.e. both are multiples of '2'.
But the problem arises when 'fromLim' has a value not divisible by two
(for example 165218551). In that case 'from' can never become equal to
'fromLim': in the last round from == fromLim - 1, so we increment it by
two, but now we have already 'skipped' the limit (from == fromLim + 1),
and we keep incrementing it (in the effort to reach the from == fromLim
condition) in an infinite loop, until the operating system recognizes
that we tried to access a memory location which doesn't belong to us
and kills the process.

----------------------------------------------------------------------

>Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-08 16:21

Message:
Grrr, when changing the content of pythontest1.xml to contain:

^@space or
^@spacea

(substitute space for ' '), the crash is back (the pointers are mangled
again in the same function :(). Now stopping to fuzz with this, because
otherwise we will never fix it.

----------------------------------------------------------------------

Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-08 15:23

Message:
The following patch seems to fix the issue for me (under the assumption
that the patch for CVE-2009-3720 is also applied):

$ cat expat-toUtf8.patch
--- expat-2.0.1/lib/xmltok.c.orig 2006-11-26 18:34:46.000000000 +0100
+++ expat-2.0.1/lib/xmltok.c 2009-11-08 15:12:27.000000000 +0100
@@ -543,6 +543,9 @@ E ## toUtf8(const ENCODING *enc, \
 { \
   const char *from; \
   for (from = *fromP; from != fromLim; from += 2) { \
+    /* Stop parsing if from && fromLim addresses aren't aligned */ \
+    if (from == fromLim - 1) \
+      goto after; \
     int plane; \
     unsigned char lo2; \
     unsigned char lo = GET_LO(from); \
@@ -596,6 +599,8 @@ E ## toUtf8(const ENCODING *enc, \
     } \
   } \
   *fromP = from; \
+after: \
+  *fromP = from + 1; \
 }

 #define DEFINE_UTF16_TO_UTF16(E) \

The output is then:

# perl XML-Twig.pl

no element found at line 2, column 1, byte 3 at /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/XML/Parser.pm line 187

But I am not sure whether we should also check for the case when the
addresses of 'to' and 'toLim' aren't aligned (we do so in the
utf8_toUtf16() routine:

340 static void PTRCALL
341 utf8_toUtf16(const ENCODING *enc,

at line:

358     case BT_LEAD4:
359       {
360         unsigned long n;
361         if (to + 1 == toLim)
362           goto after;
...
377 after:
378   *fromP = from;
379   *toP = to;
380 }

So the resulting patch would then check both cases,
from == fromLim - 1 || to == toLim - 1. I will attach it in the next
comment; opinions appreciated.

----------------------------------------------------------------------

Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-08 12:49

Message:
Here is the valgrind output (proving it's a buffer over-read) at the
moment of the crash:

==28534== Memcheck, a memory error detector.
==28534== Copyright (C) 2002-2006, and GNU GPL'd, by Julian Seward et al.
==28534== Using LibVEX rev 1658, a library for dynamic binary translation.
==28534== Copyright (C) 2004-2006, and GNU GPL'd, by OpenWorks LLP.
==28534== Using valgrind-3.2.1, a dynamic binary instrumentation framework.
==28534== Copyright (C) 2000-2006, and GNU GPL'd, by Julian Seward et al.
==28534== For more details, rerun with: -v ==28534== ==28534== Conditional jump or move depends on uninitialised value(s) ==28534== at 0x457077C: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== ==28534== Conditional jump or move depends on uninitialised value(s) ==28534== at 0x4570733: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: 
main (in /usr/bin/perl) ==28534== ==28534== Conditional jump or move depends on uninitialised value(s) ==28534== at 0x4570740: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== ==28534== Conditional jump or move depends on uninitialised value(s) ==28534== at 0x660AAB5: Perl_utf8n_to_uvuni (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x6600CB1: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x6604DB4: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x66094E0: Perl_regexec_flags (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65AB011: Perl_pp_match (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654703D: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654B79F: Perl_call_sv (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x4016E7E: (within 
/usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x4564AE8: reportDefault (xmlparse.c:5130) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== ==28534== Conditional jump or move depends on uninitialised value(s) ==28534== at 0x6600CB4: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x6604DB4: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x66094E0: Perl_regexec_flags (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65AB011: Perl_pp_match (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654703D: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654B79F: Perl_call_sv (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x4016E7E: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x4564AE8: reportDefault (xmlparse.c:5130) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== ==28534== Invalid read of size 1 ==28534== at 0x457076F: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: 
Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== Address 0x4E676A0 is 0 bytes after a block of size 65,536 alloc'd ==28534== at 0x40053C0: malloc (vg_replace_malloc.c:149) ==28534== by 0x6595C1E: Perl_safesysmalloc (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x4015A3C: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x4566262: XML_GetBuffer (xmlparse.c:1634) ==28534== by 0x400EE5C: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== ==28534== Invalid read of size 1 ==28534== at 0x4570772: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 
0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== Address 0x4E676A1 is 1 bytes after a block of size 65,536 alloc'd ==28534== at 0x40053C0: malloc (vg_replace_malloc.c:149) ==28534== by 0x6595C1E: Perl_safesysmalloc (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x4015A3C: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x4566262: XML_GetBuffer (xmlparse.c:1634) ==28534== by 0x400EE5C: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== ==28534== Invalid read of size 1 ==28534== at 0x4570891: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) 
==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== Address 0x4E83005 is not stack'd, malloc'd or (recently) free'd ==28534== ==28534== Invalid read of size 1 ==28534== at 0x45708B0: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== Address 0x4E83004 is not stack'd, malloc'd or (recently) free'd ==28534== ==28534== Process terminating with default action of signal 11 (SIGSEGV): dumping core ==28534== Access not within mapped region at address 0x5283000 ==28534== at 0x457076F: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within 
/usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== ==28534== ERROR SUMMARY: 4417417 errors from 9 contexts (suppressed: 30 from 1) ==28534== malloc/free: in use at exit: 4,192,790 bytes in 95,243 blocks. ==28534== malloc/free: 141,904 allocs, 46,661 frees, 12,100,734 bytes allocated. ==28534== For counts of detected errors, rerun with: -v ==28534== searching for pointers to 95,243 not-freed blocks. ==28534== checked 4,132,360 bytes. ==28534== ==28534== LEAK SUMMARY: ==28534== definitely lost: 1,415 bytes in 33 blocks. ==28534== possibly lost: 0 bytes in 0 blocks. ==28534== still reachable: 4,191,375 bytes in 95,210 blocks. ==28534== suppressed: 0 bytes in 0 blocks. ==28534== Use --leak-check=full to see details of leaked memory. 
---------------------------------------------------------------------- Comment By: Jan Lieskovsky (iankko) Date: 2009-11-08 12:40 Message: To verify, the issue isn't present in / it isn't fault ot XML-Parser-Expat create following XML-Parser-Expat.pl file: use XML::Parser::Expat; $parser = new XML::Parser::Expat; $parser->setHandlers('Start' => \&sh, 'End' => \&eh, 'Char' => \&ch); #open(FOO, 'pythontest1.xml') or die "Couldn't open"; #$parser->parse(*FOO); $parser->parsefile('pythontest1.xml'); close(FOO); and run it as: perl XML-Parser-Expat.pl This results in: # perl XML-Parser-Expat.pl no element found at line 2, column 1, byte 3 at XML-Parser-Expat.pl line 9 Further note: ----------------- Even when you modify mentioned 'pythontest1.xml' file, i.e. add one more character to it, it's properly parsed by expat (in this case 'from' and 'fromLim' addresses are aligned so the parsing ends 'in finite time'): Added "a" characted at the end of pythontest1.xml (i.e. it looks like ^@a). This returns: # perl XML-Twig.pl syntax error at line 1, column 0, byte 0 at /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/XML/Parser.pm line 187 ---------------------------------------------------------------------- Comment By: Jan Lieskovsky (iankko) Date: 2009-11-08 12:34 Message: Reproducer: ========= The invalid XML file, containing UTF-8 character, the crash occurs on, can be retrieved from: https://bugzilla.redhat.com/attachment.cgi?id=366572 To reproduce the crash, create XML-Twig.pl script in the form of: =============================================== use XML::Twig; my $twig=XML::Twig->new(); # create the twig $twig->parsefile('pythontest1.xml'); # build it #my_process( $twig); # my_process isn't valid XML::Twig routine, so let this commented out #$twig->print; # output the twig And run the reproducer as: =================== perl XML-Twig.pl -> Segmentation fault (core dumped) Investigating the crash in gdb leads to: # gdb /usr/bin/perl core.28422 ... 
Core was generated by `perl XML-Twig.pl'. Program terminated with signal 11, Segmentation fault. [New process 28422] #0 0x00fcd76f in big2_toUtf8 (enc=0xfdf860, fromP=0xbf8c57ac, fromLim=0x9c8bb4b "", toP=0xbf8c57bc, toLim=0x9868a28 "\005") at lib/xmltok.c:626 626 DEFINE_UTF16_TO_UTF8(big2_) (gdb) bt #0 0x00fcd76f in big2_toUtf8 (enc=0xfdf860, fromP=0xbf8c57ac, fromLim=0x9c8bb4b "", toP=0xbf8c57bc, toLim=0x9868a28 "\005") at lib/xmltok.c:626 #1 0x00fc1ac8 in reportDefault (parser=0x982cac8, enc=0xfdf860, s=0x9cabb3e "", end=0x9c8bb4b "") at lib/xmlparse.c:5128 #2 0x00fc7f2a in doProlog (parser=0x982cac8, enc=0xfdf860, s=0x9c8bb48 "", end=0x9c8bb4b "", tok=-15, next=0x9c8bb4b "", nextPtr=0x982cae0, haveMore=0 '\0') at lib/xmlparse.c:4497 #3 0x00fc9d05 in prologProcessor (parser=0x982cac8, s=0x9c8bb48 "", end=0x9c8bb4b "", nextPtr=0x982cae0) at lib/xmlparse.c:3551 #4 0x00fc150b in XML_ParseBuffer (parser=0x982cac8, len=0, isFinal=1) at lib/xmlparse.c:1562 #5 0x007d1f35 in ?? () from /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so #6 0x007d2ab4 in XS_XML__Parser__Expat_ParseStream () from /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so #7 0x065ad51d in Perl_pp_entersub () from /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so #8 0x065a698f in Perl_runops_standard () from /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so #9 0x0654c20e in perl_run () from /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so #10 0x0804921e in main () ---------------------------------------------------------------------- Comment By: Jan Lieskovsky (iankko) Date: 2009-11-08 12:09 Message: Here is my further issue analysis (some of the information might be duplicate, but there is also additional one): While running "perl XML-Parser-Expat.pl" reports error on fixed CVE-2009-3720 expat packages, running "perl XML-Twig.pl" still crashes: $ perl XML-Twig.pl Segmentation fault (core 
dumped) gdb output: ... Core was generated by `perl XML-Twig.pl'. Program terminated with signal 11, Segmentation fault. [New process 23957] #0 0x009e9cb9 in big2_toUtf8 (enc=0xa00900, fromP=0xbffa17b0, fromLim=0x8ceca2f "", toP=0xbffa179c, toLim=0x88115f4 "\201") at lib/xmltok.c:634 634 DEFINE_UTF16_TO_UTF8(big2_) The problem is present in expat-2.0.1/lib/xmltok.c in toUtf8() macro: 538 #define DEFINE_UTF16_TO_UTF8(E) \ 539 static void PTRCALL \ 540 E ## toUtf8(const ENCODING *enc, \ 541 const char **fromP, const char *fromLim, \ 542 char **toP, const char *toLim) \ 543 { \ 544 const char *from; \ 545 for (from = *fromP; from != fromLim; from += 2) { \ 546 int plane; \ 547 unsigned char lo2; \ 548 unsigned char lo = GET_LO(from); \ 549 unsigned char hi = GET_HI(from); \ 550 switch (hi) { \ 551 case 0: \ 552 if (lo < 0x80) { \ 553 if (*toP == toLim) { \ 554 *fromP = from; \ 555 return; \ 556 } \ 557 *(*toP)++ = lo; \ 558 break; \ 559 } \ 560 /* fall through */ \ 561 case 0x1: case 0x2: case 0x3: \ 562 case 0x4: case 0x5: case 0x6: case 0x7: \ 563 if (toLim - *toP < 2) { \ 564 *fromP = from; \ 565 return; \ 566 } \ 567 *(*toP)++ = ((lo >> 6) | (hi << 2) | UTF8_cval2); \ 568 *(*toP)++ = ((lo & 0x3f) | 0x80); \ 569 break; \ 570 default: \ 571 if (toLim - *toP < 3) { \ 572 *fromP = from; \ 573 return; \ 574 } \ "from" should point to start of the data and "fromLim" represents upper bound till above for cycle should loop. In each pass of the for loop, we increment the "from" value by 2 because we have already eaten its both parts: 548 unsigned char lo = GET_LO(from); \ 549 unsigned char hi = GET_HI(from); \ and can move further. But the problem arises, when the address of "fromLim" is not aligned with the address of "from", i.e. it's not multiple of two. 
In that case (assume from == fromLim -1) we will increment from value (because it != fromLim) but cross the limit value for the "fromLim" and end up in an infinite loop till the OS recognizes buffer over read and kills the process. Running "perl XML-Twig.pl" demonstrates this issue. Patched expat-2.0.1 to be more verbose which branch the code went through, and after finding out that by processing "pythontest1.xml" it loops in "case 0:" for "hi", added functions to print out the values of "from" and "fromLim" variables. Here is the output: fromLim (end) has value = 165218551 from has value = 165218548 Went by default branch fromLim (end) has value = 165218551 from has value = 165218552 fromLim (end) has value = 165218551 from has value = 165218554 ... from has value = 165416942 fromLim (end) has value = 165218551 from has value = 165416944 seg fault So at startup from < fromLim, we increment from with 2, so the distance is < 3 -> we go to "default:" break part ("Went by the default branch"), detect "from" still isn't equal to "fromLim" and increment "from" value again by two. From now we end up in endless loop, killed by OS. 
Further note:
-------------
When you add one more character (even a space) to 'pythontest1.xml', save it and process it again, a syntax error is reported for the XML file instead:

$ perl XML-Twig.pl

syntax error at line 1, column 1, byte 2 at /usr/lib/perl5/vendor_perl/5.10.0/i386-linux-thread-multi/XML/Parser.pm line 187
 at XML-Twig.pl line 4
 at XML-Twig.pl line 4

----------------------------------------------------------------------

From noreply at sourceforge.net  Mon Nov  9 18:10:44 2009
From: noreply at sourceforge.net (SourceForge.net)
Date: Mon, 09 Nov 2009 17:10:44 +0000
Subject: [Expat-bugs] [ expat-Bugs-2894085 ] expat: buffer over-read and crash in big2_toUtf8()
Message-ID:

Bugs item #2894085, was opened at 2009-11-08 12:06
Message generated for change (Settings changed) made by iankko

Category: XML::Parser (inactive)
Group: None
Status: Open
Resolution: None
>Priority: 9
Private: Yes
Submitted By: Jan Lieskovsky (iankko)
>Assigned to: Fred L. Drake, Jr. (fdrake)
Summary: expat: buffer over-read and crash in big2_toUtf8()

----------------------------------------------------------------------

>Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-09 18:10

Message:
Just to make my report complete: this issue is present in all versions of expat from 1.95.5 up to the latest stable one, 2.0.1.

----------------------------------------------------------------------

Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-08 16:21

Message:
Grrr, when changing the content of pythontest1.xml to contain:

^@space or ^@spacea

(substitute ' ' for "space"), the crash is back (the pointers are mangled again in the same function :(). I am going to stop fuzzing with this now, because otherwise we will never fix it.
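Because the crash reappears for other inputs, a guard that does not depend on one particular file may be easier to reason about. The following is a minimal sketch, not expat code and not the patch proposed in this report: utf16be_to_utf8_bmp() is a hypothetical, simplified converter that handles only big-endian UTF-16 code units outside the surrogate range. The assumption it illustrates is that rounding the input length down to a whole number of 2-byte units before the loop keeps the loop inside the buffer regardless of parity.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified converter (not expat's big2_toUtf8).
   Returns the number of UTF-8 bytes written to "to"; *consumed receives
   the number of input bytes actually used. */
size_t utf16be_to_utf8_bmp(const unsigned char *from, size_t fromLen,
                           unsigned char *to, size_t toLen,
                           size_t *consumed)
{
  size_t in = 0, out = 0;
  /* Assumed guard: drop a trailing odd byte so that only whole
     2-byte units are ever read. */
  size_t evenLen = fromLen & ~(size_t)1;

  assert(from != NULL && to != NULL && consumed != NULL);
  while (in < evenLen) {
    unsigned cp = ((unsigned)from[in] << 8) | from[in + 1];
    size_t need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : 3;
    if (cp >= 0xD800 && cp <= 0xDFFF)
      break;                  /* surrogates are out of scope for this sketch */
    if (toLen - out < need)
      break;                  /* output full: stop before a partial write */
    if (need == 1) {
      to[out++] = (unsigned char)cp;
    } else if (need == 2) {
      to[out++] = (unsigned char)(0xC0 | (cp >> 6));
      to[out++] = (unsigned char)(0x80 | (cp & 0x3F));
    } else {
      to[out++] = (unsigned char)(0xE0 | (cp >> 12));
      to[out++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
      to[out++] = (unsigned char)(0x80 | (cp & 0x3F));
    }
    in += 2;
  }
  *consumed = in;
  return out;
}
```

With a 3-byte input such as 00 41 00, this sketch writes one byte ('A') and reports 2 bytes consumed, leaving the odd trailing byte for the caller to deal with. Whether the caller should treat that leftover byte as an error or wait for more data is a separate design decision that the sketch does not address.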
----------------------------------------------------------------------

Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-08 15:23

Message:
The following patch seems to fix the issue for me (assuming the patch for CVE-2009-3720 is also applied):

$ cat expat-toUtf8.patch
--- expat-2.0.1/lib/xmltok.c.orig  2006-11-26 18:34:46.000000000 +0100
+++ expat-2.0.1/lib/xmltok.c  2009-11-08 15:12:27.000000000 +0100
@@ -543,6 +543,9 @@ E ## toUtf8(const ENCODING *enc, \
 { \
   const char *from; \
   for (from = *fromP; from != fromLim; from += 2) { \
+    /* Stop parsing if from && fromLim addresses aren't aligned */ \
+    if (from == fromLim - 1) \
+      goto after; \
     int plane; \
     unsigned char lo2; \
     unsigned char lo = GET_LO(from); \
@@ -596,6 +599,8 @@ E ## toUtf8(const ENCODING *enc, \
     } \
   } \
   *fromP = from; \
+after: \
+  *fromP = from + 1; \
 }
 
 #define DEFINE_UTF16_TO_UTF16(E) \

The output is then:

# perl XML-Twig.pl

no element found at line 2, column 1, byte 3 at /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/XML/Parser.pm line 187

But I am not sure whether we should also check for the case where the addresses of 'to' and 'toLim' aren't aligned, as is done in the utf8_toUtf16() routine:

340 static void PTRCALL
341 utf8_toUtf16(const ENCODING *enc,

at line:

358     case BT_LEAD4:
359       {
360         unsigned long n;
361         if (to + 1 == toLim)
362           goto after;
...
377 after:
378   *fromP = from;
379   *toP = to;
380 }

The resulting patch would then check both cases, from == fromLim - 1 || to == toLim - 1. I will attach it in the next comment; opinions appreciated.

----------------------------------------------------------------------

Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-08 12:49

Message:
Here is the valgrind output (proving it's a buffer over-read) at the moment of the crash:

==28534== Memcheck, a memory error detector.
==28534== Copyright (C) 2002-2006, and GNU GPL'd, by Julian Seward et al.
==28534== Using LibVEX rev 1658, a library for dynamic binary translation.
==28534== Copyright (C) 2004-2006, and GNU GPL'd, by OpenWorks LLP.
==28534== Using valgrind-3.2.1, a dynamic binary instrumentation framework.
==28534== Copyright (C) 2000-2006, and GNU GPL'd, by Julian Seward et al.
==28534== For more details, rerun with: -v
==28534==
==28534== Conditional jump or move depends on uninitialised value(s)
==28534==    at 0x457077C: big2_toUtf8 (xmltok.c:626)
==28534==    by 0x4564AC7: reportDefault (xmlparse.c:5128)
==28534==    by 0x456AF29: doProlog (xmlparse.c:4497)
==28534==    by 0x456CD04: prologProcessor (xmlparse.c:3551)
==28534==    by 0x456450A: XML_ParseBuffer (xmlparse.c:1562)
==28534==    by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so)
==28534==    by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so)
==28534==    by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==28534==    by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==28534==    by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==28534==    by 0x804921D: main (in /usr/bin/perl)
==28534==
==28534== Conditional jump or move depends on uninitialised value(s)
==28534==    at 0x4570733: big2_toUtf8 (xmltok.c:626)
==28534==    by 0x4564AC7: reportDefault (xmlparse.c:5128)
==28534==    by 0x456AF29: doProlog (xmlparse.c:4497)
==28534==    by 0x456CD04: prologProcessor (xmlparse.c:3551)
==28534==    by 0x456450A: XML_ParseBuffer (xmlparse.c:1562)
==28534==    by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so)
==28534==    by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so)
==28534==    by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==28534==    by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==28534==    by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==28534==    by 0x804921D: main (in /usr/bin/perl)
==28534==
==28534== Conditional jump or move depends on uninitialised value(s)
==28534==    at 0x4570740: big2_toUtf8 (xmltok.c:626)
==28534==    by 0x4564AC7: reportDefault (xmlparse.c:5128)
==28534==    by 0x456AF29: doProlog (xmlparse.c:4497)
==28534==    by 0x456CD04: prologProcessor (xmlparse.c:3551)
==28534==    by 0x456450A: XML_ParseBuffer (xmlparse.c:1562)
==28534==    by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so)
==28534==    by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so)
==28534==    by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==28534==    by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==28534==    by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==28534==    by 0x804921D: main (in /usr/bin/perl)
==28534==
==28534== Conditional jump or move depends on uninitialised value(s)
==28534==    at 0x660AAB5: Perl_utf8n_to_uvuni (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==28534==    by 0x6600CB1: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==28534==    by 0x6604DB4: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==28534==    by 0x66094E0: Perl_regexec_flags (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==28534==    by 0x65AB011: Perl_pp_match (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==28534==    by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==28534==    by 0x654703D: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==28534==    by 0x654B79F: Perl_call_sv (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==28534==    by 0x4016E7E: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so)
==28534==    by 0x4564AE8: reportDefault (xmlparse.c:5130)
==28534==    by 0x456AF29: doProlog (xmlparse.c:4497)
==28534==    by 0x456CD04: prologProcessor (xmlparse.c:3551)
==28534==
==28534== Conditional jump or move depends on uninitialised value(s)
==28534==    at 0x6600CB4: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==28534==    by 0x6604DB4: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==28534==    by 0x66094E0: Perl_regexec_flags (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==28534==    by 0x65AB011: Perl_pp_match (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==28534==    by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==28534==    by 0x654703D: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==28534==    by 0x654B79F: Perl_call_sv (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==28534==    by 0x4016E7E: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so)
==28534==    by 0x4564AE8: reportDefault (xmlparse.c:5130)
==28534==    by 0x456AF29: doProlog (xmlparse.c:4497)
==28534==    by 0x456CD04: prologProcessor (xmlparse.c:3551)
==28534==    by 0x456450A: XML_ParseBuffer (xmlparse.c:1562)
==28534==
==28534== Invalid read of size 1
==28534==    at 0x457076F: big2_toUtf8 (xmltok.c:626)
==28534==    by 0x4564AC7: reportDefault (xmlparse.c:5128)
==28534==    by 0x456AF29: doProlog (xmlparse.c:4497)
==28534==    by 0x456CD04: prologProcessor (xmlparse.c:3551)
==28534==    by 0x456450A: XML_ParseBuffer (xmlparse.c:1562)
==28534==    by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so)
==28534==    by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so)
==28534==    by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==28534==    by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==28534==    by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==28534==    by 0x804921D: main (in /usr/bin/perl)
==28534==  Address 0x4E676A0 is 0 bytes after a block of size 65,536 alloc'd
==28534==    at 0x40053C0: malloc (vg_replace_malloc.c:149)
==28534==    by 0x6595C1E: Perl_safesysmalloc (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==28534==    by 0x4015A3C: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so)
==28534==    by 0x4566262: XML_GetBuffer (xmlparse.c:1634)
==28534==    by 0x400EE5C: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so)
==28534==    by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so)
==28534==    by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==28534==    by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==28534==    by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==28534==    by 0x804921D: main (in /usr/bin/perl)
==28534==
==28534== Invalid read of size 1
==28534==    at 0x4570772: big2_toUtf8 (xmltok.c:626)
==28534==    by 0x4564AC7: reportDefault (xmlparse.c:5128)
==28534==    by 0x456AF29: doProlog (xmlparse.c:4497)
==28534==    by 0x456CD04: prologProcessor (xmlparse.c:3551)
==28534==    by 0x456450A: XML_ParseBuffer (xmlparse.c:1562)
==28534==    by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so)
==28534==    by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so)
==28534==    by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==28534==    by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==28534==    by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==28534==    by 0x804921D: main (in /usr/bin/perl)
==28534==  Address 0x4E676A1 is 1 bytes after a block of size 65,536 alloc'd
==28534==    at 0x40053C0: malloc (vg_replace_malloc.c:149)
==28534==    by 0x6595C1E: Perl_safesysmalloc (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==28534==    by 0x4015A3C: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so)
==28534==    by 0x4566262: XML_GetBuffer (xmlparse.c:1634)
==28534==    by 0x400EE5C: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so)
==28534==    by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so)
==28534==    by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==28534==    by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==28534==    by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==28534==    by 0x804921D: main (in /usr/bin/perl)
==28534==
==28534== Invalid read of size 1
==28534==    at 0x4570891: big2_toUtf8 (xmltok.c:626)
==28534==    by 0x4564AC7: reportDefault (xmlparse.c:5128)
==28534==    by 0x456AF29: doProlog (xmlparse.c:4497)
==28534==    by 0x456CD04: prologProcessor (xmlparse.c:3551)
==28534==    by 0x456450A: XML_ParseBuffer (xmlparse.c:1562)
==28534==    by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so)
==28534==    by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so)
==28534==    by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==28534==    by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==28534==    by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==28534==    by 0x804921D: main (in /usr/bin/perl)
==28534==  Address 0x4E83005 is not stack'd, malloc'd or (recently) free'd
==28534==
==28534== Invalid read of size 1
==28534==    at 0x45708B0: big2_toUtf8 (xmltok.c:626)
==28534==    by 0x4564AC7: reportDefault (xmlparse.c:5128)
==28534==    by 0x456AF29: doProlog (xmlparse.c:4497)
==28534==    by 0x456CD04: prologProcessor (xmlparse.c:3551)
==28534==    by 0x456450A: XML_ParseBuffer (xmlparse.c:1562)
==28534==    by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so)
==28534==    by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so)
==28534==    by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==28534==    by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==28534==    by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==28534==    by 0x804921D: main (in /usr/bin/perl)
==28534==  Address 0x4E83004 is not stack'd, malloc'd or (recently) free'd
==28534==
==28534== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==28534==  Access not within mapped region at address 0x5283000
==28534==    at 0x457076F: big2_toUtf8 (xmltok.c:626)
==28534==    by 0x4564AC7: reportDefault (xmlparse.c:5128)
==28534==    by 0x456AF29: doProlog (xmlparse.c:4497)
==28534==    by 0x456CD04: prologProcessor (xmlparse.c:3551)
==28534==    by 0x456450A: XML_ParseBuffer (xmlparse.c:1562)
==28534==    by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so)
==28534==    by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so)
==28534==    by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==28534==    by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==28534==    by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==28534==    by 0x804921D: main (in /usr/bin/perl)
==28534==
==28534== ERROR SUMMARY: 4417417 errors from 9 contexts (suppressed: 30 from 1)
==28534== malloc/free: in use at exit: 4,192,790 bytes in 95,243 blocks.
==28534== malloc/free: 141,904 allocs, 46,661 frees, 12,100,734 bytes allocated.
==28534== For counts of detected errors, rerun with: -v
==28534== searching for pointers to 95,243 not-freed blocks.
==28534== checked 4,132,360 bytes.
==28534==
==28534== LEAK SUMMARY:
==28534==    definitely lost: 1,415 bytes in 33 blocks.
==28534==      possibly lost: 0 bytes in 0 blocks.
==28534==    still reachable: 4,191,375 bytes in 95,210 blocks.
==28534==         suppressed: 0 bytes in 0 blocks.
==28534== Use --leak-check=full to see details of leaked memory.
----------------------------------------------------------------------

Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-08 12:40

Message:
To verify that the issue isn't present in, and isn't the fault of, XML-Parser-Expat, create the following XML-Parser-Expat.pl file:

use XML::Parser::Expat;

$parser = new XML::Parser::Expat;
$parser->setHandlers('Start' => \&sh,
                     'End'   => \&eh,
                     'Char'  => \&ch);
#open(FOO, 'pythontest1.xml') or die "Couldn't open";
#$parser->parse(*FOO);
$parser->parsefile('pythontest1.xml');
close(FOO);

and run it as:

perl XML-Parser-Expat.pl

This results in:

# perl XML-Parser-Expat.pl

no element found at line 2, column 1, byte 3 at XML-Parser-Expat.pl line 9

Further note:
-----------------
Even when you modify the mentioned 'pythontest1.xml' file, i.e. add one more character to it, expat handles it without crashing (in this case the 'from' and 'fromLim' addresses are aligned, so the parsing ends 'in finite time'):

Added an "a" character at the end of pythontest1.xml (i.e. it looks like ^@a).

This returns:

# perl XML-Twig.pl

syntax error at line 1, column 0, byte 0 at /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/XML/Parser.pm line 187

----------------------------------------------------------------------

Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-08 12:34

Message:
Reproducer:
=========
The invalid XML file on which the crash occurs, containing a UTF-8 character, can be retrieved from:

https://bugzilla.redhat.com/attachment.cgi?id=366572

To reproduce the crash, create an XML-Twig.pl script of the form:
===============================================
use XML::Twig;

my $twig=XML::Twig->new();           # create the twig
$twig->parsefile('pythontest1.xml'); # build it
#my_process( $twig);                 # my_process isn't a valid XML::Twig routine, so leave this commented out
#$twig->print;                       # output the twig

And run the reproducer as:
===================

perl XML-Twig.pl -> Segmentation fault (core dumped)

Investigating the crash in gdb leads to:

# gdb /usr/bin/perl core.28422
...
Core was generated by `perl XML-Twig.pl'.
Program terminated with signal 11, Segmentation fault.
[New process 28422]
#0  0x00fcd76f in big2_toUtf8 (enc=0xfdf860, fromP=0xbf8c57ac, fromLim=0x9c8bb4b "", toP=0xbf8c57bc, toLim=0x9868a28 "\005") at lib/xmltok.c:626
626     DEFINE_UTF16_TO_UTF8(big2_)
(gdb) bt
#0  0x00fcd76f in big2_toUtf8 (enc=0xfdf860, fromP=0xbf8c57ac, fromLim=0x9c8bb4b "", toP=0xbf8c57bc, toLim=0x9868a28 "\005") at lib/xmltok.c:626
#1  0x00fc1ac8 in reportDefault (parser=0x982cac8, enc=0xfdf860, s=0x9cabb3e "", end=0x9c8bb4b "") at lib/xmlparse.c:5128
#2  0x00fc7f2a in doProlog (parser=0x982cac8, enc=0xfdf860, s=0x9c8bb48 "", end=0x9c8bb4b "", tok=-15, next=0x9c8bb4b "", nextPtr=0x982cae0, haveMore=0 '\0') at lib/xmlparse.c:4497
#3  0x00fc9d05 in prologProcessor (parser=0x982cac8, s=0x9c8bb48 "", end=0x9c8bb4b "", nextPtr=0x982cae0) at lib/xmlparse.c:3551
#4  0x00fc150b in XML_ParseBuffer (parser=0x982cac8, len=0, isFinal=1) at lib/xmlparse.c:1562
#5  0x007d1f35 in ?? () from /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so
#6  0x007d2ab4 in XS_XML__Parser__Expat_ParseStream () from /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so
#7  0x065ad51d in Perl_pp_entersub () from /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so
#8  0x065a698f in Perl_runops_standard () from /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so
#9  0x0654c20e in perl_run () from /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so
#10 0x0804921e in main ()

----------------------------------------------------------------------

From noreply at sourceforge.net  Mon Nov  9 18:38:14 2009
From: noreply at sourceforge.net (SourceForge.net)
Date: Mon, 09 Nov 2009 17:38:14 +0000
Subject: [Expat-bugs] [ expat-Bugs-2894085 ] expat: buffer over-read and crash in big2_toUtf8()
Message-ID:

Bugs item #2894085, was opened at 2009-11-08 06:06
Message generated for change (Comment added) made by kwaclaw

Category: XML::Parser (inactive)
Group: None
Status: Open
Resolution: None
Priority: 9
Private: Yes
Submitted By: Jan Lieskovsky (iankko)
Assigned to: Fred L. Drake, Jr. (fdrake)
Summary: expat: buffer over-read and crash in big2_toUtf8()

----------------------------------------------------------------------

>Comment By: Karl Waclawek (kwaclaw)
Date: 2009-11-09 12:38

Message:
Can you attach the file that allows us to reproduce this?

----------------------------------------------------------------------

Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-09 12:10

Message:
Just to make my report complete: this issue is present in all versions of expat from 1.95.5 up to the latest stable one, 2.0.1.

----------------------------------------------------------------------

Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-08 10:21

Message:
Grrr, when changing the content of pythontest1.xml to contain:

^@space or ^@spacea

(substitute ' ' for "space"), the crash is back (the pointers are mangled again in the same function :(). I am going to stop fuzzing with this now, because otherwise we will never fix it.

---------------------------------------------------------------------- Comment By: Jan Lieskovsky (iankko) Date: 2009-11-08 09:23 Message: The following patch seems to fix the issue for me (under assumption patch for CVE-2009-3720 is also applied): $ cat expat-toUtf8.patch --- expat-2.0.1/lib/xmltok.c.orig 2006-11-26 18:34:46.000000000 +0100 +++ expat-2.0.1/lib/xmltok.c 2009-11-08 15:12:27.000000000 +0100 @@ -543,6 +543,9 @@ E ## toUtf8(const ENCODING *enc, \ { \ const char *from; \ for (from = *fromP; from != fromLim; from += 2) { \ + /* Stop parsing if from && fromLim addresses aren't aligned */ \ + if (from == fromLim - 1) \ + goto after; \ int plane; \ unsigned char lo2; \ unsigned char lo = GET_LO(from); \ @@ -596,6 +599,8 @@ E ## toUtf8(const ENCODING *enc, \ } \ } \ *fromP = from; \ +after: \ + *fromP = from + 1; \ } #define DEFINE_UTF16_TO_UTF16(E) \ The output is then: # perl XML-Twig.pl no element found at line 2, column 1, byte 3 at /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/XML/Parser.pm line 187 But not sure, we shouldn't check also for case, when addresses of 'to' and 'toLim' aren't aligned (we are doing so in utf8_toUtf16() routine: 340 static void PTRCALL 341 utf8_toUtf16(const ENCODING *enc, at line: 358 case BT_LEAD4: 359 { 360 unsigned long n; 361 if (to + 1 == toLim) 362 goto after; ... 377 after: 378 *fromP = from; 379 *toP = to; 380 } So the resulting patch would then check both cases from == fromLim -1 || to == toLim - 1, will attach it in next comment - opinions appreciated. ---------------------------------------------------------------------- Comment By: Jan Lieskovsky (iankko) Date: 2009-11-08 06:49 Message: Here is the valgrind output (proving it's buffer over-read) in the moment of crash: ==28534== Memcheck, a memory error detector. ==28534== Copyright (C) 2002-2006, and GNU GPL'd, by Julian Seward et al. ==28534== Using LibVEX rev 1658, a library for dynamic binary translation. 
==28534== Copyright (C) 2004-2006, and GNU GPL'd, by OpenWorks LLP. ==28534== Using valgrind-3.2.1, a dynamic binary instrumentation framework. ==28534== Copyright (C) 2000-2006, and GNU GPL'd, by Julian Seward et al. ==28534== For more details, rerun with: -v ==28534== ==28534== Conditional jump or move depends on uninitialised value(s) ==28534== at 0x457077C: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== ==28534== Conditional jump or move depends on uninitialised value(s) ==28534== at 0x4570733: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 
0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== ==28534== Conditional jump or move depends on uninitialised value(s) ==28534== at 0x4570740: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== ==28534== Conditional jump or move depends on uninitialised value(s) ==28534== at 0x660AAB5: Perl_utf8n_to_uvuni (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x6600CB1: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x6604DB4: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x66094E0: Perl_regexec_flags (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65AB011: Perl_pp_match (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654703D: (within 
/usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654B79F: Perl_call_sv (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x4016E7E: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x4564AE8: reportDefault (xmlparse.c:5130) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== ==28534== Conditional jump or move depends on uninitialised value(s) ==28534== at 0x6600CB4: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x6604DB4: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x66094E0: Perl_regexec_flags (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65AB011: Perl_pp_match (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654703D: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654B79F: Perl_call_sv (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x4016E7E: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x4564AE8: reportDefault (xmlparse.c:5130) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== ==28534== Invalid read of size 1 ==28534== at 0x457076F: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within 
/usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== Address 0x4E676A0 is 0 bytes after a block of size 65,536 alloc'd ==28534== at 0x40053C0: malloc (vg_replace_malloc.c:149) ==28534== by 0x6595C1E: Perl_safesysmalloc (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x4015A3C: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x4566262: XML_GetBuffer (xmlparse.c:1634) ==28534== by 0x400EE5C: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== ==28534== Invalid read of size 1 ==28534== at 0x4570772: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 
0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== Address 0x4E676A1 is 1 bytes after a block of size 65,536 alloc'd ==28534== at 0x40053C0: malloc (vg_replace_malloc.c:149) ==28534== by 0x6595C1E: Perl_safesysmalloc (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x4015A3C: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x4566262: XML_GetBuffer (xmlparse.c:1634) ==28534== by 0x400EE5C: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== ==28534== Invalid read of size 1 ==28534== at 0x4570891: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) 
==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== Address 0x4E83005 is not stack'd, malloc'd or (recently) free'd ==28534== ==28534== Invalid read of size 1 ==28534== at 0x45708B0: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== Address 0x4E83004 is not stack'd, malloc'd or (recently) free'd ==28534== ==28534== Process terminating with default action of signal 11 (SIGSEGV): dumping core ==28534== Access not within mapped region at address 0x5283000 ==28534== at 0x457076F: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault 
(xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== ==28534== ERROR SUMMARY: 4417417 errors from 9 contexts (suppressed: 30 from 1) ==28534== malloc/free: in use at exit: 4,192,790 bytes in 95,243 blocks. ==28534== malloc/free: 141,904 allocs, 46,661 frees, 12,100,734 bytes allocated. ==28534== For counts of detected errors, rerun with: -v ==28534== searching for pointers to 95,243 not-freed blocks. ==28534== checked 4,132,360 bytes. ==28534== ==28534== LEAK SUMMARY: ==28534== definitely lost: 1,415 bytes in 33 blocks. ==28534== possibly lost: 0 bytes in 0 blocks. ==28534== still reachable: 4,191,375 bytes in 95,210 blocks. ==28534== suppressed: 0 bytes in 0 blocks. ==28534== Use --leak-check=full to see details of leaked memory. 
----------------------------------------------------------------------

Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-08 06:40

Message:
To verify that the issue isn't present in, and isn't the fault of, XML-Parser-Expat, create the following XML-Parser-Expat.pl file:

use XML::Parser::Expat;

$parser = new XML::Parser::Expat;
$parser->setHandlers('Start' => \&sh,
                     'End' => \&eh,
                     'Char' => \&ch);
#open(FOO, 'pythontest1.xml') or die "Couldn't open";
#$parser->parse(*FOO);
$parser->parsefile('pythontest1.xml');
close(FOO);

and run it as:

perl XML-Parser-Expat.pl

This results in:

# perl XML-Parser-Expat.pl

no element found at line 2, column 1, byte 3 at XML-Parser-Expat.pl line 9

Further note:
-----------------
Even when you modify the mentioned 'pythontest1.xml' file, i.e. add one more character to it, it's properly parsed by expat (in this case the 'from' and 'fromLim' addresses are aligned, so the parsing ends 'in finite time'):

Added an "a" character at the end of pythontest1.xml (i.e. it looks like ^@a). This returns:

# perl XML-Twig.pl

syntax error at line 1, column 0, byte 0 at /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/XML/Parser.pm line 187

----------------------------------------------------------------------

Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-08 06:34

Message:
Reproducer:
=========
The invalid XML file (containing a UTF-8 character) on which the crash occurs can be retrieved from:

https://bugzilla.redhat.com/attachment.cgi?id=366572

To reproduce the crash, create an XML-Twig.pl script of the form:
===============================================
use XML::Twig;

my $twig=XML::Twig->new(); # create the twig
$twig->parsefile('pythontest1.xml'); # build it
#my_process( $twig); # my_process isn't a valid XML::Twig routine, so leave this commented out
#$twig->print; # output the twig

And run the reproducer as:
===================
perl XML-Twig.pl -> Segmentation fault (core dumped)

Investigating the crash in gdb leads to:

# gdb /usr/bin/perl core.28422
...
Core was generated by `perl XML-Twig.pl'. Program terminated with signal 11, Segmentation fault. [New process 28422] #0 0x00fcd76f in big2_toUtf8 (enc=0xfdf860, fromP=0xbf8c57ac, fromLim=0x9c8bb4b "", toP=0xbf8c57bc, toLim=0x9868a28 "\005") at lib/xmltok.c:626 626 DEFINE_UTF16_TO_UTF8(big2_) (gdb) bt #0 0x00fcd76f in big2_toUtf8 (enc=0xfdf860, fromP=0xbf8c57ac, fromLim=0x9c8bb4b "", toP=0xbf8c57bc, toLim=0x9868a28 "\005") at lib/xmltok.c:626 #1 0x00fc1ac8 in reportDefault (parser=0x982cac8, enc=0xfdf860, s=0x9cabb3e "", end=0x9c8bb4b "") at lib/xmlparse.c:5128 #2 0x00fc7f2a in doProlog (parser=0x982cac8, enc=0xfdf860, s=0x9c8bb48 "", end=0x9c8bb4b "", tok=-15, next=0x9c8bb4b "", nextPtr=0x982cae0, haveMore=0 '\0') at lib/xmlparse.c:4497 #3 0x00fc9d05 in prologProcessor (parser=0x982cac8, s=0x9c8bb48 "", end=0x9c8bb4b "", nextPtr=0x982cae0) at lib/xmlparse.c:3551 #4 0x00fc150b in XML_ParseBuffer (parser=0x982cac8, len=0, isFinal=1) at lib/xmlparse.c:1562 #5 0x007d1f35 in ?? () from /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so #6 0x007d2ab4 in XS_XML__Parser__Expat_ParseStream () from /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so #7 0x065ad51d in Perl_pp_entersub () from /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so #8 0x065a698f in Perl_runops_standard () from /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so #9 0x0654c20e in perl_run () from /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so #10 0x0804921e in main () ---------------------------------------------------------------------- Comment By: Jan Lieskovsky (iankko) Date: 2009-11-08 06:09 Message: Here is my further issue analysis (some of the information might be duplicate, but there is also additional one): While running "perl XML-Parser-Expat.pl" reports error on fixed CVE-2009-3720 expat packages, running "perl XML-Twig.pl" still crashes: $ perl XML-Twig.pl Segmentation fault (core 
dumped)

gdb output:

...
Core was generated by `perl XML-Twig.pl'.
Program terminated with signal 11, Segmentation fault.
[New process 23957]
#0 0x009e9cb9 in big2_toUtf8 (enc=0xa00900, fromP=0xbffa17b0, fromLim=0x8ceca2f "", toP=0xbffa179c, toLim=0x88115f4 "\201") at lib/xmltok.c:634
634 DEFINE_UTF16_TO_UTF8(big2_)

The problem is present in expat-2.0.1/lib/xmltok.c in the toUtf8() macro:

538 #define DEFINE_UTF16_TO_UTF8(E) \
539 static void PTRCALL \
540 E ## toUtf8(const ENCODING *enc, \
541             const char **fromP, const char *fromLim, \
542             char **toP, const char *toLim) \
543 { \
544   const char *from; \
545   for (from = *fromP; from != fromLim; from += 2) { \
546     int plane; \
547     unsigned char lo2; \
548     unsigned char lo = GET_LO(from); \
549     unsigned char hi = GET_HI(from); \
550     switch (hi) { \
551     case 0: \
552       if (lo < 0x80) { \
553         if (*toP == toLim) { \
554           *fromP = from; \
555           return; \
556         } \
557         *(*toP)++ = lo; \
558         break; \
559       } \
560       /* fall through */ \
561     case 0x1: case 0x2: case 0x3: \
562     case 0x4: case 0x5: case 0x6: case 0x7: \
563       if (toLim - *toP < 2) { \
564         *fromP = from; \
565         return; \
566       } \
567       *(*toP)++ = ((lo >> 6) | (hi << 2) | UTF8_cval2); \
568       *(*toP)++ = ((lo & 0x3f) | 0x80); \
569       break; \
570     default: \
571       if (toLim - *toP < 3) { \
572         *fromP = from; \
573         return; \
574       } \

"from" should point to the start of the data, and "fromLim" represents the upper bound up to which the above for loop should run. In each pass of the for loop, we increment the "from" value by 2 because we have already consumed both of its parts:

548 unsigned char lo = GET_LO(from); \
549 unsigned char hi = GET_HI(from); \

and can move further. But the problem arises when the address of "fromLim" is not aligned with the address of "from", i.e. the distance between them is not a multiple of two. In that case (assume from == fromLim - 1) we will increment the from value (because it != fromLim) but cross the "fromLim" limit and end up in an infinite loop, until the OS recognizes the buffer over-read and kills the process. Running "perl XML-Twig.pl" demonstrates this issue.

I patched expat-2.0.1 to be more verbose about which branch the code went through, and after finding out that while processing "pythontest1.xml" it loops in "case 0:" for "hi", I added functions to print out the values of the "from" and "fromLim" variables. Here is the output:

fromLim (end) has value = 165218551
from has value = 165218548
Went by default branch
fromLim (end) has value = 165218551
from has value = 165218552
fromLim (end) has value = 165218551
from has value = 165218554
...
from has value = 165416942
fromLim (end) has value = 165218551
from has value = 165416944
seg fault

So at startup from < fromLim; we increment from by 2, so the distance is < 3 -> we go to the "default:" break part ("Went by default branch"), detect that "from" still isn't equal to "fromLim", and increment the "from" value again by two. From now on we are in an endless loop, until killed by the OS.

Further note:
-------------
When you add one more character (even a space) to 'pythontest1.xml', save it and try to process it again, a syntax error is reported while processing the XML file:

$ perl XML-Twig.pl

syntax error at line 1, column 1, byte 2 at /usr/lib/perl5/vendor_perl/5.10.0/i386-linux-thread-multi/XML/Parser.pm line 187
 at XML-Twig.pl line 4
 at XML-Twig.pl line 4

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=110127&aid=2894085&group_id=10127

From noreply at sourceforge.net Mon Nov 9 20:09:04 2009
From: noreply at sourceforge.net (SourceForge.net)
Date: Mon, 09 Nov 2009 19:09:04 +0000
Subject: [Expat-bugs] [ expat-Bugs-2894085 ] expat: buffer over-read and crash in big2_toUtf8()
Message-ID:

Bugs item #2894085, was opened at 2009-11-08 06:06
Message generated for change (Comment added) made by kwaclaw
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=110127&aid=2894085&group_id=10127

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: XML::Parser (inactive)
Group: None
Status: Open
Resolution: None
>Priority: 5
Private: Yes
Submitted By: Jan Lieskovsky (iankko)
Assigned to: Fred L. Drake, Jr.
(fdrake)
Summary: expat: buffer over-read and crash in big2_toUtf8()

Initial Comment:
Hello SourceForge expat maintainers,

originally CVE-2009-3720 was reported in expat:
[1] http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2009-3720

Non-public, original bug report for CVE-2009-3720:
[2] http://sourceforge.net/tracker/?func=detail&aid=1990430&group_id=10127&atid=110127

And the relevant patch for CVE-2009-3720:
[3] http://expat.cvs.sourceforge.net/viewvc/expat/expat/lib/xmltok_impl.c?r1=1.13&r2=1.15&view=patch

While the above patch [3] solves the issue in expat itself and in various other packages (PyXML, 4Suite) which embed expat, or when expat is called via perl-XML-Parser-Expat, it does not help when using the same reproducer via the perl-XML-Twig module. In this case the crash (buffer over-read) occurs in expat's big2_toUtf8() routine - more exactly in the DEFINE_UTF16_TO_UTF8(big2_) macro at lib/xmltok.c:626.

I have investigated the issue in more detail, assuming the crash occurs in the

540 E ## toUtf8(const ENCODING *enc, \...)

routine, as present in expat-2.0.1/lib/xmltok.c (at line 540). Assuming the problematic line of code is this one (lib/xmltok.c):

545 for (from = *fromP; from != fromLim; from += 2) { \

'from' is a pointer to the start of the XML data we are about to parse; 'fromLim' is the upper bound, the point where parsing should end. In each pass of the for loop we increment the 'from' value by two (because on lines:

548 unsigned char lo = GET_LO(from); \
549 unsigned char hi = GET_HI(from); \

we consumed both parts of from). This works perfectly when the addresses of 'from' and 'fromLim' are aligned, i.e. both are multiples of '2'.
But the problem arises when 'fromLim' is not divisible by two (for example 165218551). In that case 'from' can never equal 'fromLim': in the last round from == 'fromLim - 1', so we increment it by two, but now we have already 'skipped' the limit (from == fromLim + 1), and we keep incrementing it (in an effort to reach the from == fromLim condition) in an infinite loop, until the operating system recognizes that we tried to access a memory location which doesn't belong to us and kills the process.

----------------------------------------------------------------------

>Comment By: Karl Waclawek (kwaclaw)
Date: 2009-11-09 14:09

Message:
This is not a showstopper issue that happens all over the place. Priority reset to default.

----------------------------------------------------------------------

Comment By: Karl Waclawek (kwaclaw)
Date: 2009-11-09 12:38

Message:
Can you attach the file that allows us to reproduce this?

----------------------------------------------------------------------

Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-09 12:10

Message:
Just to make my report complete - this issue is present in all versions of expat from 1.95.5 up to the latest stable one - 2.0.1

----------------------------------------------------------------------

Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-08 10:21

Message:
Grrr, when changing the content of pythontest1.xml to contain:

^@space

or

^@spacea

(replace "space" with ' '), the crash is back (the pointers are mangled again in the same function :(). Stopping the fuzzing now, because we will never fix it.

----------------------------------------------------------------------

Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-08 09:23

Message:
The following patch seems to fix the issue for me (assuming the patch for CVE-2009-3720 is also applied):

$ cat expat-toUtf8.patch
--- expat-2.0.1/lib/xmltok.c.orig 2006-11-26 18:34:46.000000000 +0100
+++ expat-2.0.1/lib/xmltok.c 2009-11-08 15:12:27.000000000 +0100
@@ -543,6 +543,9 @@ E ## toUtf8(const ENCODING *enc, \
 { \
   const char *from; \
   for (from = *fromP; from != fromLim; from += 2) { \
+    /* Stop parsing if from && fromLim addresses aren't aligned */ \
+    if (from == fromLim - 1) \
+      goto after; \
     int plane; \
     unsigned char lo2; \
     unsigned char lo = GET_LO(from); \
@@ -596,6 +599,8 @@ E ## toUtf8(const ENCODING *enc, \
     } \
   } \
   *fromP = from; \
+after: \
+  *fromP = from + 1; \
 }
 
 #define DEFINE_UTF16_TO_UTF16(E) \

The output is then:

# perl XML-Twig.pl

no element found at line 2, column 1, byte 3 at /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/XML/Parser.pm line 187

But I am not sure whether we should also check for the case where the addresses of 'to' and 'toLim' aren't aligned. We already do so in the utf8_toUtf16() routine:

340 static void PTRCALL
341 utf8_toUtf16(const ENCODING *enc,

at line:

358 case BT_LEAD4:
359 {
360 unsigned long n;
361 if (to + 1 == toLim)
362 goto after;
...
377 after:
378 *fromP = from;
379 *toP = to;
380 }

So the resulting patch would then check both cases, from == fromLim - 1 || to == toLim - 1. I will attach it in the next comment - opinions appreciated.

----------------------------------------------------------------------

Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-08 06:49

Message:
Here is the valgrind output (proving it's a buffer over-read) at the moment of the crash:

==28534== Memcheck, a memory error detector.
==28534== Copyright (C) 2002-2006, and GNU GPL'd, by Julian Seward et al.
==28534== Using LibVEX rev 1658, a library for dynamic binary translation.
==28534== Copyright (C) 2004-2006, and GNU GPL'd, by OpenWorks LLP. ==28534== Using valgrind-3.2.1, a dynamic binary instrumentation framework. ==28534== Copyright (C) 2000-2006, and GNU GPL'd, by Julian Seward et al. ==28534== For more details, rerun with: -v ==28534== ==28534== Conditional jump or move depends on uninitialised value(s) ==28534== at 0x457077C: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== ==28534== Conditional jump or move depends on uninitialised value(s) ==28534== at 0x4570733: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 
0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== ==28534== Conditional jump or move depends on uninitialised value(s) ==28534== at 0x4570740: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== ==28534== Conditional jump or move depends on uninitialised value(s) ==28534== at 0x660AAB5: Perl_utf8n_to_uvuni (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x6600CB1: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x6604DB4: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x66094E0: Perl_regexec_flags (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65AB011: Perl_pp_match (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654703D: (within 
/usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654B79F: Perl_call_sv (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x4016E7E: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x4564AE8: reportDefault (xmlparse.c:5130) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== ==28534== Conditional jump or move depends on uninitialised value(s) ==28534== at 0x6600CB4: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x6604DB4: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x66094E0: Perl_regexec_flags (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65AB011: Perl_pp_match (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654703D: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654B79F: Perl_call_sv (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x4016E7E: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x4564AE8: reportDefault (xmlparse.c:5130) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== ==28534== Invalid read of size 1 ==28534== at 0x457076F: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within 
/usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== Address 0x4E676A0 is 0 bytes after a block of size 65,536 alloc'd ==28534== at 0x40053C0: malloc (vg_replace_malloc.c:149) ==28534== by 0x6595C1E: Perl_safesysmalloc (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x4015A3C: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x4566262: XML_GetBuffer (xmlparse.c:1634) ==28534== by 0x400EE5C: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== ==28534== Invalid read of size 1 ==28534== at 0x4570772: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 
0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== Address 0x4E676A1 is 1 bytes after a block of size 65,536 alloc'd ==28534== at 0x40053C0: malloc (vg_replace_malloc.c:149) ==28534== by 0x6595C1E: Perl_safesysmalloc (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x4015A3C: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x4566262: XML_GetBuffer (xmlparse.c:1634) ==28534== by 0x400EE5C: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== ==28534== Invalid read of size 1 ==28534== at 0x4570891: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) 
==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== Address 0x4E83005 is not stack'd, malloc'd or (recently) free'd ==28534== ==28534== Invalid read of size 1 ==28534== at 0x45708B0: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== Address 0x4E83004 is not stack'd, malloc'd or (recently) free'd ==28534== ==28534== Process terminating with default action of signal 11 (SIGSEGV): dumping core ==28534== Access not within mapped region at address 0x5283000 ==28534== at 0x457076F: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault 
(xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== ==28534== ERROR SUMMARY: 4417417 errors from 9 contexts (suppressed: 30 from 1) ==28534== malloc/free: in use at exit: 4,192,790 bytes in 95,243 blocks. ==28534== malloc/free: 141,904 allocs, 46,661 frees, 12,100,734 bytes allocated. ==28534== For counts of detected errors, rerun with: -v ==28534== searching for pointers to 95,243 not-freed blocks. ==28534== checked 4,132,360 bytes. ==28534== ==28534== LEAK SUMMARY: ==28534== definitely lost: 1,415 bytes in 33 blocks. ==28534== possibly lost: 0 bytes in 0 blocks. ==28534== still reachable: 4,191,375 bytes in 95,210 blocks. ==28534== suppressed: 0 bytes in 0 blocks. ==28534== Use --leak-check=full to see details of leaked memory. 
----------------------------------------------------------------------

Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-08 06:40

Message:
To verify that the issue is not present in, and is not the fault of, XML-Parser-Expat, create the following XML-Parser-Expat.pl file:

use XML::Parser::Expat;
$parser = new XML::Parser::Expat;
$parser->setHandlers('Start' => \&sh, 'End' => \&eh, 'Char' => \&ch);
#open(FOO, 'pythontest1.xml') or die "Couldn't open";
#$parser->parse(*FOO);
$parser->parsefile('pythontest1.xml');
close(FOO);

and run it as:

perl XML-Parser-Expat.pl

This results in:

# perl XML-Parser-Expat.pl
no element found at line 2, column 1, byte 3 at XML-Parser-Expat.pl line 9

Further note:
-----------------
When you modify the mentioned 'pythontest1.xml' file, i.e. add one more character to it, it is handled properly by expat (in this case the 'from' and 'fromLim' addresses are aligned, so the parsing ends 'in finite time'):

Added an "a" character at the end of pythontest1.xml (i.e. it looks like ^@a). This returns:

# perl XML-Twig.pl
syntax error at line 1, column 0, byte 0 at /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/XML/Parser.pm line 187

----------------------------------------------------------------------

Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-08 06:34

Message:
Reproducer:
=========
The invalid XML file on which the crash occurs (it contains a UTF-8 character) can be retrieved from:

https://bugzilla.redhat.com/attachment.cgi?id=366572

To reproduce the crash, create an XML-Twig.pl script in the form of:
===============================================
use XML::Twig;
my $twig=XML::Twig->new(); # create the twig
$twig->parsefile('pythontest1.xml'); # build it
#my_process( $twig); # my_process isn't a valid XML::Twig routine, so leave this commented out
#$twig->print; # output the twig

And run the reproducer as:
===================
perl XML-Twig.pl
-> Segmentation fault (core dumped)

Investigating the crash in gdb leads to:

# gdb /usr/bin/perl core.28422
...
Core was generated by `perl XML-Twig.pl'.
Program terminated with signal 11, Segmentation fault.
[New process 28422]
#0 0x00fcd76f in big2_toUtf8 (enc=0xfdf860, fromP=0xbf8c57ac, fromLim=0x9c8bb4b "", toP=0xbf8c57bc, toLim=0x9868a28 "\005") at lib/xmltok.c:626
626 DEFINE_UTF16_TO_UTF8(big2_)
(gdb) bt
#0 0x00fcd76f in big2_toUtf8 (enc=0xfdf860, fromP=0xbf8c57ac, fromLim=0x9c8bb4b "", toP=0xbf8c57bc, toLim=0x9868a28 "\005") at lib/xmltok.c:626
#1 0x00fc1ac8 in reportDefault (parser=0x982cac8, enc=0xfdf860, s=0x9cabb3e "", end=0x9c8bb4b "") at lib/xmlparse.c:5128
#2 0x00fc7f2a in doProlog (parser=0x982cac8, enc=0xfdf860, s=0x9c8bb48 "", end=0x9c8bb4b "", tok=-15, next=0x9c8bb4b "", nextPtr=0x982cae0, haveMore=0 '\0') at lib/xmlparse.c:4497
#3 0x00fc9d05 in prologProcessor (parser=0x982cac8, s=0x9c8bb48 "", end=0x9c8bb4b "", nextPtr=0x982cae0) at lib/xmlparse.c:3551
#4 0x00fc150b in XML_ParseBuffer (parser=0x982cac8, len=0, isFinal=1) at lib/xmlparse.c:1562
#5 0x007d1f35 in ?? () from /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so
#6 0x007d2ab4 in XS_XML__Parser__Expat_ParseStream () from /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so
#7 0x065ad51d in Perl_pp_entersub () from /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so
#8 0x065a698f in Perl_runops_standard () from /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so
#9 0x0654c20e in perl_run () from /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so
#10 0x0804921e in main ()

----------------------------------------------------------------------

Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-08 06:09

Message:
Here is my further analysis of the issue (some of the information may be duplicated, but there is also some new material):

While running "perl XML-Parser-Expat.pl" reports an error on expat packages with the CVE-2009-3720 fix, running "perl XML-Twig.pl" still crashes:

$ perl XML-Twig.pl
Segmentation fault (core
dumped)

gdb output:
...
Core was generated by `perl XML-Twig.pl'.
Program terminated with signal 11, Segmentation fault.
[New process 23957]
#0 0x009e9cb9 in big2_toUtf8 (enc=0xa00900, fromP=0xbffa17b0, fromLim=0x8ceca2f "", toP=0xbffa179c, toLim=0x88115f4 "\201") at lib/xmltok.c:634
634 DEFINE_UTF16_TO_UTF8(big2_)

The problem is present in expat-2.0.1/lib/xmltok.c, in the toUtf8() macro:

538 #define DEFINE_UTF16_TO_UTF8(E) \
539 static void PTRCALL \
540 E ## toUtf8(const ENCODING *enc, \
541             const char **fromP, const char *fromLim, \
542             char **toP, const char *toLim) \
543 { \
544   const char *from; \
545   for (from = *fromP; from != fromLim; from += 2) { \
546     int plane; \
547     unsigned char lo2; \
548     unsigned char lo = GET_LO(from); \
549     unsigned char hi = GET_HI(from); \
550     switch (hi) { \
551     case 0: \
552       if (lo < 0x80) { \
553         if (*toP == toLim) { \
554           *fromP = from; \
555           return; \
556         } \
557         *(*toP)++ = lo; \
558         break; \
559       } \
560       /* fall through */ \
561     case 0x1: case 0x2: case 0x3: \
562     case 0x4: case 0x5: case 0x6: case 0x7: \
563       if (toLim - *toP < 2) { \
564         *fromP = from; \
565         return; \
566       } \
567       *(*toP)++ = ((lo >> 6) | (hi << 2) | UTF8_cval2); \
568       *(*toP)++ = ((lo & 0x3f) | 0x80); \
569       break; \
570     default: \
571       if (toLim - *toP < 3) { \
572         *fromP = from; \
573         return; \
574       } \

"from" should point to the start of the data, and "fromLim" is the upper bound up to which the above for loop should run. In each pass of the for loop we increment "from" by 2, because we have already consumed both of its parts:

548 unsigned char lo = GET_LO(from); \
549 unsigned char hi = GET_HI(from); \

and can move further. But a problem arises when the address of "fromLim" is not aligned with the address of "from", i.e. the distance between them is not a multiple of two.
In that case (assume from == fromLim - 1) we increment "from" (because it != fromLim), step over the "fromLim" limit, and end up in an infinite loop until the OS detects the buffer over-read and kills the process. Running "perl XML-Twig.pl" demonstrates this issue.

I patched expat-2.0.1 to report which branch the code went through, and after finding out that when processing "pythontest1.xml" it loops in "case 0:" for "hi", I added functions to print the values of the "from" and "fromLim" variables. Here is the output:

fromLim (end) has value = 165218551
from has value = 165218548
Went by default branch
fromLim (end) has value = 165218551
from has value = 165218552
fromLim (end) has value = 165218551
from has value = 165218554
...
from has value = 165416942
fromLim (end) has value = 165218551
from has value = 165416944
seg fault

So at the start from < fromLim. We increment "from" by 2, so the distance is < 3 -> we go through the "default:" break part ("Went by default branch"), find that "from" still isn't equal to "fromLim", and increment "from" by two again. From then on we are in an endless loop until the process is killed by the OS.
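The overshoot described above can be reproduced without reading any memory. The following is a minimal sketch, not expat code: it models "from" and "fromLim" as byte offsets, and the names steps_with_inequality_test and bytes_consumed_with_even_limit are hypothetical. The second function shows one possible guard (rounding the limit down to an even byte count); it is given as an assumption about how such a loop could be bounded, not as the patch proposed in this report.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: counts the iterations of a loop shaped like
 *   for (from = 0; from != len; from += 2)
 * and gives up after max_steps iterations. Offsets stand in for the
 * pointers, so nothing is ever read out of bounds.
 * Returns -1 if the exit condition was not reached within max_steps. */
long steps_with_inequality_test(size_t len, long max_steps)
{
    size_t from;
    long steps = 0;

    assert(max_steps >= 0);
    for (from = 0; from != len; from += 2) {
        if (steps == max_steps)
            return -1;  /* the real loop would keep running past the buffer */
        steps++;
    }
    return steps;
}

/* Hypothetical guard: round the limit down to an even byte count before
 * looping, so an odd trailing byte is simply left unconsumed.
 * Returns the number of bytes the loop would consume. */
size_t bytes_consumed_with_even_limit(size_t len)
{
    size_t lim = len - (len % 2);
    size_t from;

    for (from = 0; from != lim; from += 2) {
        /* a real converter would read bytes from and from + 1 here */
    }
    return from;
}
```

With a distance of 3 bytes, as in the debug output above (165218551 - 165218548), the first function gives up after max_steps iterations, while the second stops after 2 bytes and leaves the odd trailing byte unconsumed.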
Further note:
-------------
When you add one more character (even a space) to 'pythontest1.xml', save it and try to process it again, a syntax error is reported while processing the XML file:

$ perl XML-Twig.pl
syntax error at line 1, column 1, byte 2 at /usr/lib/perl5/vendor_perl/5.10.0/i386-linux-thread-multi/XML/Parser.pm line 187
at XML-Twig.pl line 4
at XML-Twig.pl line 4

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=110127&aid=2894085&group_id=10127

From noreply at sourceforge.net Mon Nov 9 20:41:41 2009
From: noreply at sourceforge.net (SourceForge.net)
Date: Mon, 09 Nov 2009 19:41:41 +0000
Subject: [Expat-bugs] [ expat-Bugs-2894085 ] expat: buffer over-read and crash in big2_toUtf8()
Message-ID:

Bugs item #2894085, was opened at 2009-11-08 12:06
Message generated for change (Comment added) made by iankko
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=110127&aid=2894085&group_id=10127

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: XML::Parser (inactive)
Group: None
Status: Open
Resolution: None
Priority: 5
Private: Yes
Submitted By: Jan Lieskovsky (iankko)
Assigned to: Fred L. Drake, Jr.
(fdrake)
Summary: expat: buffer over-read and crash in big2_toUtf8()

Initial Comment:
Hello SourceForge expat maintainers,

originally CVE-2009-3720 was reported in expat:
[1] http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2009-3720

Non-public, original bug report for CVE-2009-3720:
[2] http://sourceforge.net/tracker/?func=detail&aid=1990430&group_id=10127&atid=110127

And relevant patch for CVE-2009-3720:
[3] http://expat.cvs.sourceforge.net/viewvc/expat/expat/lib/xmltok_impl.c?r1=1.13&r2=1.15&view=patch

While the above patch [3] solves the issue in expat itself and in various other packages (PyXML, 4Suite) which embed expat, and also when expat is called via perl-XML-Parser-Expat, it does not help when the same reproducer is used via the perl-XML-Twig module. In this case the crash (buffer over-read) occurs in expat's big2_toUtf8() routine, more exactly in the DEFINE_UTF16_TO_UTF8(big2_) macro at lib/xmltok.c:626.

I have investigated the issue in more detail and assume the crash occurs in the

540 E ## toUtf8(const ENCODING *enc, \...)

routine, as present in expat-2.0.1/lib/xmltok.c (at line 540). I assume the problematic line of code is this one (lib/xmltok.c):

545 for (from = *fromP; from != fromLim; from += 2) { \

'from' is a pointer to the start of the XML data we are about to parse; 'fromLim' is the upper bound, the point where parsing should end. In each pass of the for loop we increment 'from' by two (because on lines:

548 unsigned char lo = GET_LO(from); \
549 unsigned char hi = GET_HI(from); \

we consumed both parts of from). This works perfectly when the addresses of 'from' and 'fromLim' are aligned, i.e. both are multiples of 2.
But a problem arises when the value of 'fromLim' is not divisible by two (for example 165218551). In that case 'from' can never equal 'fromLim': in the last round from == 'fromLim - 1', so we increment it by two, but now we have already 'skipped' the limit (from == fromLim + 1), and we keep incrementing it (in an effort to reach the from == fromLim condition) in an infinite loop, until the operating system recognizes that we tried to access a memory location which doesn't belong to us and kills the process.

----------------------------------------------------------------------

>Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-09 20:41

Message:
The malformed XML file, pythontest1.xml, can be downloaded here:

https://bugzilla.redhat.com/attachment.cgi?id=366572

Don't be surprised: it really contains only "^@" characters.
(From https://bugzilla.redhat.com/show_bug.cgi?id=CVE-2009-3720.)

The XML-Twig.pl script code is as follows:

--- Script start ---
use XML::Twig;
my $twig=XML::Twig->new(); # create the twig
$twig->parsefile('pythontest1.xml'); # build it
#my_process( $twig); # my_process isn't a valid XML::Twig routine, so leave this commented out
#$twig->print; # output the twig
--- Script end ---

Run it as: perl ./XML-Twig.pl

I apologize for asking you to download the PoC. I tried three times to attach it here but was unsuccessful (due to the *.txt attachment format and the attachment size < 256 K requirement). I changed the suffix of both files to *.txt, and both of them are smaller than 256 K, but it still did not work. Maybe I am just doing something wrong.

Thanks, Jan.

----------------------------------------------------------------------

Comment By: Karl Waclawek (kwaclaw)
Date: 2009-11-09 20:09

Message:
This is not a showstopper issue that happens all over the place. Priority reset to default.

----------------------------------------------------------------------

Comment By: Karl Waclawek (kwaclaw)
Date: 2009-11-09 18:38

Message:
Can you attach the file that allows us to reproduce this?
----------------------------------------------------------------------

Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-09 18:10

Message:
Just to make my report complete: this issue is present in all versions of expat from 1.95.5 up to the latest stable one, 2.0.1.

----------------------------------------------------------------------

Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-08 16:21

Message:
Grrr, when changing the content of pythontest1.xml to contain:

^@space or ^@spacea

(substitute space for ' '), the crash is back (the pointers are mangled again in the same function :(). I will stop fuzzing this now, because otherwise we will never fix it.

----------------------------------------------------------------------

Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-08 15:23

Message:
The following patch seems to fix the issue for me (under the assumption that the patch for CVE-2009-3720 is also applied):

$ cat expat-toUtf8.patch
--- expat-2.0.1/lib/xmltok.c.orig 2006-11-26 18:34:46.000000000 +0100
+++ expat-2.0.1/lib/xmltok.c 2009-11-08 15:12:27.000000000 +0100
@@ -543,6 +543,9 @@ E ## toUtf8(const ENCODING *enc, \
 { \
   const char *from; \
   for (from = *fromP; from != fromLim; from += 2) { \
+    /* Stop parsing if from && fromLim addresses aren't aligned */ \
+    if (from == fromLim - 1) \
+      goto after; \
     int plane; \
     unsigned char lo2; \
     unsigned char lo = GET_LO(from); \
@@ -596,6 +599,8 @@ E ## toUtf8(const ENCODING *enc, \
     } \
   } \
   *fromP = from; \
+after: \
+  *fromP = from + 1; \
 }

 #define DEFINE_UTF16_TO_UTF16(E) \

The output is then:

# perl XML-Twig.pl
no element found at line 2, column 1, byte 3 at /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/XML/Parser.pm line 187

But I am not sure whether we should also check for the case when the addresses of 'to' and 'toLim' aren't aligned. We do so in the utf8_toUtf16() routine:

340 static void PTRCALL
341 utf8_toUtf16(const ENCODING *enc,

at line:

358 case BT_LEAD4:
359 {
360 unsigned long n;
361 if (to + 1 == toLim)
362 goto after;
...
377 after:
378 *fromP = from;
379 *toP = to;
380 }

So the resulting patch would then check both cases, from == fromLim - 1 || to == toLim - 1. I will attach it in the next comment - opinions appreciated.

----------------------------------------------------------------------

Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-08 12:49

Message:
Here is the valgrind output (proving it is a buffer over-read) at the moment of the crash:

==28534== Memcheck, a memory error detector.
==28534== Copyright (C) 2002-2006, and GNU GPL'd, by Julian Seward et al.
==28534== Using LibVEX rev 1658, a library for dynamic binary translation.
==28534== Copyright (C) 2004-2006, and GNU GPL'd, by OpenWorks LLP.
==28534== Using valgrind-3.2.1, a dynamic binary instrumentation framework.
==28534== Copyright (C) 2000-2006, and GNU GPL'd, by Julian Seward et al.
==28534== For more details, rerun with: -v
==28534==
==28534== Conditional jump or move depends on uninitialised value(s)
==28534== at 0x457077C: big2_toUtf8 (xmltok.c:626)
==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128)
==28534== by 0x456AF29: doProlog (xmlparse.c:4497)
==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551)
==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562)
==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so)
==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so)
==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==28534== by 0x804921D: main (in /usr/bin/perl)
==28534==
==28534== Conditional jump or move depends on uninitialised value(s)
==28534== at 0x4570733: big2_toUtf8 (xmltok.c:626)
==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== ==28534== Conditional jump or move depends on uninitialised value(s) ==28534== at 0x4570740: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== ==28534== Conditional jump or move depends on uninitialised value(s) ==28534== at 0x660AAB5: Perl_utf8n_to_uvuni (in 
/usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x6600CB1: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x6604DB4: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x66094E0: Perl_regexec_flags (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65AB011: Perl_pp_match (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654703D: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654B79F: Perl_call_sv (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x4016E7E: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x4564AE8: reportDefault (xmlparse.c:5130) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== ==28534== Conditional jump or move depends on uninitialised value(s) ==28534== at 0x6600CB4: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x6604DB4: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x66094E0: Perl_regexec_flags (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65AB011: Perl_pp_match (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654703D: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654B79F: Perl_call_sv (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x4016E7E: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x4564AE8: 
reportDefault (xmlparse.c:5130) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== ==28534== Invalid read of size 1 ==28534== at 0x457076F: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== Address 0x4E676A0 is 0 bytes after a block of size 65,536 alloc'd ==28534== at 0x40053C0: malloc (vg_replace_malloc.c:149) ==28534== by 0x6595C1E: Perl_safesysmalloc (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x4015A3C: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x4566262: XML_GetBuffer (xmlparse.c:1634) ==28534== by 0x400EE5C: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in 
/usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== ==28534== Invalid read of size 1 ==28534== at 0x4570772: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== Address 0x4E676A1 is 1 bytes after a block of size 65,536 alloc'd ==28534== at 0x40053C0: malloc (vg_replace_malloc.c:149) ==28534== by 0x6595C1E: Perl_safesysmalloc (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x4015A3C: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x4566262: XML_GetBuffer (xmlparse.c:1634) ==28534== by 0x400EE5C: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: 
Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== ==28534== Invalid read of size 1 ==28534== at 0x4570891: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== Address 0x4E83005 is not stack'd, malloc'd or (recently) free'd ==28534== ==28534== Invalid read of size 1 ==28534== at 0x45708B0: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: 
Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== Address 0x4E83004 is not stack'd, malloc'd or (recently) free'd ==28534== ==28534== Process terminating with default action of signal 11 (SIGSEGV): dumping core ==28534== Access not within mapped region at address 0x5283000 ==28534== at 0x457076F: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== ==28534== ERROR SUMMARY: 4417417 errors from 9 contexts (suppressed: 30 from 1) ==28534== malloc/free: in use at exit: 4,192,790 bytes in 95,243 blocks. ==28534== malloc/free: 141,904 allocs, 46,661 frees, 12,100,734 bytes allocated. ==28534== For counts of detected errors, rerun with: -v ==28534== searching for pointers to 95,243 not-freed blocks. ==28534== checked 4,132,360 bytes. ==28534== ==28534== LEAK SUMMARY: ==28534== definitely lost: 1,415 bytes in 33 blocks. ==28534== possibly lost: 0 bytes in 0 blocks. ==28534== still reachable: 4,191,375 bytes in 95,210 blocks. 
==28534== suppressed: 0 bytes in 0 blocks.
==28534== Use --leak-check=full to see details of leaked memory.

----------------------------------------------------------------------

Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-08 12:40

Message:
To verify that the issue is not present in (and is not the fault of) XML-Parser-Expat, create the following XML-Parser-Expat.pl file:

use XML::Parser::Expat;
$parser = new XML::Parser::Expat;
$parser->setHandlers('Start' => \&sh, 'End' => \&eh, 'Char' => \&ch);
#open(FOO, 'pythontest1.xml') or die "Couldn't open";
#$parser->parse(*FOO);
$parser->parsefile('pythontest1.xml');
close(FOO);

and run it as:

perl XML-Parser-Expat.pl

This results in:

# perl XML-Parser-Expat.pl
no element found at line 2, column 1, byte 3 at XML-Parser-Expat.pl line 9

Further note:
-----------------
When you modify the mentioned 'pythontest1.xml' file, i.e. add one more character to it, it is properly parsed by expat (in this case the 'from' and 'fromLim' addresses are aligned, so the parsing ends 'in finite time'):

Added an "a" character at the end of pythontest1.xml (i.e. it looks like ^@a).
This returns:

# perl XML-Twig.pl
syntax error at line 1, column 0, byte 0 at /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/XML/Parser.pm line 187

----------------------------------------------------------------------

Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-08 12:34

Message:
Reproducer:
=========
The invalid XML file (containing a UTF-8 character) on which the crash occurs can be retrieved from:

https://bugzilla.redhat.com/attachment.cgi?id=366572

To reproduce the crash, create an XML-Twig.pl script in the form of:
===============================================
use XML::Twig;
my $twig=XML::Twig->new(); # create the twig
$twig->parsefile('pythontest1.xml'); # build it
#my_process( $twig); # my_process isn't a valid XML::Twig routine, so leave this commented out
#$twig->print; # output the twig

And run the reproducer as:
===================
perl XML-Twig.pl -> Segmentation fault (core dumped)

Investigating the crash in gdb leads to:

# gdb /usr/bin/perl core.28422
...
Core was generated by `perl XML-Twig.pl'.
Program terminated with signal 11, Segmentation fault.
[New process 28422]
#0 0x00fcd76f in big2_toUtf8 (enc=0xfdf860, fromP=0xbf8c57ac, fromLim=0x9c8bb4b "", toP=0xbf8c57bc, toLim=0x9868a28 "\005") at lib/xmltok.c:626
626 DEFINE_UTF16_TO_UTF8(big2_)
(gdb) bt
#0 0x00fcd76f in big2_toUtf8 (enc=0xfdf860, fromP=0xbf8c57ac, fromLim=0x9c8bb4b "", toP=0xbf8c57bc, toLim=0x9868a28 "\005") at lib/xmltok.c:626
#1 0x00fc1ac8 in reportDefault (parser=0x982cac8, enc=0xfdf860, s=0x9cabb3e "", end=0x9c8bb4b "") at lib/xmlparse.c:5128
#2 0x00fc7f2a in doProlog (parser=0x982cac8, enc=0xfdf860, s=0x9c8bb48 "", end=0x9c8bb4b "", tok=-15, next=0x9c8bb4b "", nextPtr=0x982cae0, haveMore=0 '\0') at lib/xmlparse.c:4497
#3 0x00fc9d05 in prologProcessor (parser=0x982cac8, s=0x9c8bb48 "", end=0x9c8bb4b "", nextPtr=0x982cae0) at lib/xmlparse.c:3551
#4 0x00fc150b in XML_ParseBuffer (parser=0x982cac8, len=0, isFinal=1) at lib/xmlparse.c:1562
#5 0x007d1f35 in ??
() from /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so
#6 0x007d2ab4 in XS_XML__Parser__Expat_ParseStream () from /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so
#7 0x065ad51d in Perl_pp_entersub () from /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so
#8 0x065a698f in Perl_runops_standard () from /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so
#9 0x0654c20e in perl_run () from /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so
#10 0x0804921e in main ()

----------------------------------------------------------------------

Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-08 12:09

Message:
Here is my further analysis of the issue (some of the information might be duplicated, but there are also additional details):

While running "perl XML-Parser-Expat.pl" reports an error on expat packages with the CVE-2009-3720 fix applied, running "perl XML-Twig.pl" still crashes:

$ perl XML-Twig.pl
Segmentation fault (core dumped)

gdb output:

...
Core was generated by `perl XML-Twig.pl'.
Program terminated with signal 11, Segmentation fault.
[New process 23957]
#0 0x009e9cb9 in big2_toUtf8 (enc=0xa00900, fromP=0xbffa17b0, fromLim=0x8ceca2f "", toP=0xbffa179c, toLim=0x88115f4 "\201") at lib/xmltok.c:634
634 DEFINE_UTF16_TO_UTF8(big2_)

The problem is present in expat-2.0.1/lib/xmltok.c, in the toUtf8() routine generated by the DEFINE_UTF16_TO_UTF8 macro:

538 #define DEFINE_UTF16_TO_UTF8(E) \
539 static void PTRCALL \
540 E ## toUtf8(const ENCODING *enc, \
541             const char **fromP, const char *fromLim, \
542             char **toP, const char *toLim) \
543 { \
544   const char *from; \
545   for (from = *fromP; from != fromLim; from += 2) { \
546     int plane; \
547     unsigned char lo2; \
548     unsigned char lo = GET_LO(from); \
549     unsigned char hi = GET_HI(from); \
550     switch (hi) { \
551     case 0: \
552       if (lo < 0x80) { \
553         if (*toP == toLim) { \
554           *fromP = from; \
555           return; \
556         } \
557         *(*toP)++ = lo; \
558         break; \
559       } \
560       /* fall through */ \
561     case 0x1: case 0x2: case 0x3: \
562     case 0x4: case 0x5: case 0x6: case 0x7: \
563       if (toLim - *toP < 2) { \
564         *fromP = from; \
565         return; \
566       } \
567       *(*toP)++ = ((lo >> 6) | (hi << 2) | UTF8_cval2); \
568       *(*toP)++ = ((lo & 0x3f) | 0x80); \
569       break; \
570     default: \
571       if (toLim - *toP < 3) { \
572         *fromP = from; \
573         return; \
574       } \

"from" should point to the start of the data, and "fromLim" is the upper bound up to which the for loop above should run. In each pass of the for loop, we increment "from" by 2, because we have already consumed both of its bytes:

548     unsigned char lo = GET_LO(from); \
549     unsigned char hi = GET_HI(from); \

and can move on. The problem arises when the address of "fromLim" is not aligned with the address of "from", i.e. their difference is not a multiple of two. In that case (assume from == fromLim - 1) we increment "from" (because it != fromLim), step over "fromLim", and end up in an infinite loop until the OS detects the buffer over-read and kills the process. Running "perl XML-Twig.pl" demonstrates this issue.
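
The stepping behaviour described above can be sketched in isolation. This is only an illustration under stated assumptions, not Expat code: both helper names are hypothetical, plain offsets stand in for the "from" and "fromLim" pointers, and a step cap is added so that the demonstration itself cannot run away.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of Expat: counts 2-byte steps using the
   same "from != lim" exit test as the for loop in the macro. max_steps is
   a safety cap for this demo only; -1 is returned if the cap is reached
   before from == lim ever becomes true. */
static long steps_with_not_equal(size_t from, size_t lim, long max_steps)
{
  long steps = 0;
  for (; from != lim; from += 2) {
    if (steps == max_steps)
      return -1;
    steps++;
  }
  return steps;
}

/* Hypothetical alternative: continue only while a whole 2-byte unit
   remains before the limit. Assumes from <= lim on entry. */
static long steps_with_room_check(size_t from, size_t lim)
{
  long steps = 0;
  assert(from <= lim);
  for (; lim - from >= 2; from += 2)
    steps++;
  return steps;
}
```

With an odd distance such as from = 165218548 and lim = 165218551, steps_with_not_equal() with a cap of 1000 hits the cap and returns -1, while steps_with_room_check() stops after one step with one byte left over. Whether a check of this kind belongs in the macro or in its callers is a separate question that this sketch does not settle.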
I patched expat-2.0.1 to report which branch the code went through. After finding that, when processing "pythontest1.xml", it loops in "case 0:" for "hi", I added code to print the values of the "from" and "fromLim" variables. Here is the output:

fromLim (end) has value = 165218551
from has value = 165218548
Went by default branch
fromLim (end) has value = 165218551
from has value = 165218552
fromLim (end) has value = 165218551
from has value = 165218554
...
from has value = 165416942
fromLim (end) has value = 165218551
from has value = 165416944
seg fault

So at startup from < fromLim. We increment "from" by 2, so the distance is < 3 -> we go to the "default:" break part ("Went by default branch"), detect that "from" still isn't equal to "fromLim", and increment "from" by two again. From then on we are in an endless loop, until killed by the OS.

Further note:
-------------
When you add one more character (even a space) to 'pythontest1.xml', save it, and try to process it again, a syntax error is reported while processing the XML file:

$ perl XML-Twig.pl
syntax error at line 1, column 1, byte 2 at /usr/lib/perl5/vendor_perl/5.10.0/i386-linux-thread-multi/XML/Parser.pm line 187
at XML-Twig.pl line 4
at XML-Twig.pl line 4

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=110127&aid=2894085&group_id=10127

From noreply at sourceforge.net Tue Nov 10 23:38:08 2009
From: noreply at sourceforge.net (SourceForge.net)
Date: Tue, 10 Nov 2009 22:38:08 +0000
Subject: [Expat-bugs] [ expat-Bugs-2895533 ] found a resource leak
Message-ID:

Bugs item #2895533, was opened at 2009-11-10 22:38
Message generated for change (Tracker Item Submitted) made by ettlmartin
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=110127&aid=2895533&group_id=10127

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for
this request, not just the latest update.

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: orbitcowboy (ettlmartin)
Assigned to: Nobody/Anonymous (nobody)
Summary: found a resource leak

Initial Comment:
During a check with the static code analysis tool cppcheck, I found a resource leak. I reported it to the wxWidgets developers. They told me to contact you:

http://trac.wxwidgets.org/ticket/11432
http://trac.wxwidgets.org/ticket/11194

Best regards

Orbitcowboy

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=110127&aid=2895533&group_id=10127

From noreply at sourceforge.net Thu Nov 12 17:53:01 2009
From: noreply at sourceforge.net (SourceForge.net)
Date: Thu, 12 Nov 2009 16:53:01 +0000
Subject: [Expat-bugs] [ expat-Bugs-2895533 ] found a resource leak
Message-ID:

Bugs item #2895533, was opened at 2009-11-10 17:38
Message generated for change (Settings changed) made by kwaclaw
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=110127&aid=2895533&group_id=10127

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: None
Group: None
>Status: Closed
Resolution: None
Priority: 5
Private: No
Submitted By: orbitcowboy (ettlmartin)
Assigned to: Nobody/Anonymous (nobody)
Summary: found a resource leak

Initial Comment:
During a check with the static code analysis tool cppcheck, I found a resource leak. I reported it to the wxWidgets developers. They told me to contact you:

http://trac.wxwidgets.org/ticket/11432
http://trac.wxwidgets.org/ticket/11194

Best regards

Orbitcowboy

----------------------------------------------------------------------

>Comment By: Karl Waclawek (kwaclaw)
Date: 2009-11-12 11:53

Message:
The attached patch was not made against current CVS.
Applied a modified patch - see readfilemap.c rev. 1.15.

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=110127&aid=2895533&group_id=10127

From noreply at sourceforge.net Thu Nov 12 19:37:06 2009
From: noreply at sourceforge.net (SourceForge.net)
Date: Thu, 12 Nov 2009 18:37:06 +0000
Subject: [Expat-bugs] [ expat-Bugs-2894085 ] expat: buffer over-read and crash in big2_toUtf8()
Message-ID:

Bugs item #2894085, was opened at 2009-11-08 06:06
Message generated for change (Comment added) made by kwaclaw
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=110127&aid=2894085&group_id=10127

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: XML::Parser (inactive)
Group: None
Status: Open
Resolution: None
Priority: 5
Private: Yes
Submitted By: Jan Lieskovsky (iankko)
Assigned to: Fred L. Drake, Jr. (fdrake)
Summary: expat: buffer over-read and crash in big2_toUtf8()

Initial Comment:
Hello SourceForge expat maintainers,

originally CVE-2009-3720 was reported in expat:
[1] http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2009-3720

Non-public, original bug report for CVE-2009-3720:
[2] http://sourceforge.net/tracker/?func=detail&aid=1990430&group_id=10127&atid=110127

And the relevant patch for CVE-2009-3720:
[3] http://expat.cvs.sourceforge.net/viewvc/expat/expat/lib/xmltok_impl.c?r1=1.13&r2=1.15&view=patch

While the above patch [3] solves the issue in expat itself and in various other packages (PyXML, 4Suite) which embed expat, or when called via perl-XML-Parser-Expat, it does not help when using the same reproducer via the perl-XML-Twig module. In this case the crash (buffer over-read) occurs in expat's big2_toUtf8() routine - more exactly in the DEFINE_UTF16_TO_UTF8(big2_) macro in lib/xmltok.c:626.
I have investigated the issue in more detail, assuming the crash occurs in the

540 E ## toUtf8(const ENCODING *enc, \...)

routine, as present in expat-2.0.1/lib/xmltok.c (at line 540). Assuming the problematic line of code is this one (lib/xmltok.c):

545 for (from = *fromP; from != fromLim; from += 2) { \

'from' is a pointer to the start of the XML data we are about to parse; 'fromLim' is the upper bound, the point where parsing should end. In each pass of the for loop we increment 'from' by two (because on lines:

548 unsigned char lo = GET_LO(from); \
549 unsigned char hi = GET_HI(from); \

we consumed both parts of 'from'). This works perfectly when the addresses of 'from' and 'fromLim' are aligned, i.e. both are multiples of 2. But the problem arises when 'fromLim' has a value not divisible by two (for example 165218551). In that case 'from' can never equal 'fromLim': in the last round from == fromLim - 1, so we increment it by two, but now we have already skipped past the limit (from == fromLim + 1), and we keep incrementing it (in an effort to reach the from == fromLim condition) in an infinite loop, until the operating system recognizes that we tried to access a memory location that doesn't belong to us and kills the process.

----------------------------------------------------------------------

>Comment By: Karl Waclawek (kwaclaw)
Date: 2009-11-12 13:37

Message:
I have a hard time reproducing this directly with Expat. It simply works for me using the two files attached to the Bugzilla issue. I don't use Perl at all, and I am doing my work on Windows.

Can you tell me if you use a specially compiled version of Expat, and how Perl configures Expat when calling it?

Maybe Fred is better at debugging this, as he is more of a Unix guy.

Btw, my feeling is that this is more related to a bug in the parsing initialization, as these macros have no safeguards at all and rely on the calling code to prevent anomalous situations.
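
To make the idea of a caller-side safeguard concrete, here is a minimal sketch, assuming the caller knows the width of one code unit in bytes (2 for the big2_ and little2_ encodings). The function whole_units_limit() is hypothetical and does not exist in Expat; it only illustrates how a caller could trim a byte range to whole code units before handing it to a converter whose loop compares with "!=".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical caller-side helper, not part of Expat: returns the largest
   limit <= lim such that (limit - from) is a multiple of unit bytes.
   Assumes from and lim point into the same buffer and from <= lim. */
static const char *whole_units_limit(const char *from, const char *lim,
                                     size_t unit)
{
  size_t len;
  assert(unit > 0 && from <= lim);
  len = (size_t)(lim - from);
  /* Drop the trailing partial unit, if there is one. */
  return from + (len - len % unit);
}
```

For a 3-byte range and unit = 2, the returned limit is from + 2, so a loop that advances two bytes at a time reaches it exactly. The trailing byte would still have to be reported or buffered by the caller; how Expat actually deals with that is not shown here.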
----------------------------------------------------------------------

Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-09 14:41

Message:
The malformed XML file, pythontest1.xml, can be downloaded here:

https://bugzilla.redhat.com/attachment.cgi?id=366572

Don't be surprised, it really contains only "^@" characters.

From https://bugzilla.redhat.com/show_bug.cgi?id=CVE-2009-3720.

The XML-Twig.pl script code is as follows:

--- Script start ---
use XML::Twig;
my $twig=XML::Twig->new(); # create the twig
$twig->parsefile('pythontest1.xml'); # build it
#my_process( $twig); # my_process isn't a valid XML::Twig routine, so leave this commented out
#$twig->print; # output the twig
--- Script end ---

Run it as:

perl ./XML-Twig.pl

I apologize for asking you to download the PoC. I tried three times to attach it here but was unsuccessful (due to the *.txt attachment format and attachment size < 256 K requirements). I changed the suffix of both files to *.txt, and both of them are smaller than 256 K, but it still did not work; maybe I am just doing something wrong.

Thanks, Jan.

----------------------------------------------------------------------

Comment By: Karl Waclawek (kwaclaw)
Date: 2009-11-09 14:09

Message:
This is not a showstopper issue that happens all over the place. Priority reset to default.

----------------------------------------------------------------------

Comment By: Karl Waclawek (kwaclaw)
Date: 2009-11-09 12:38

Message:
Can you attach the file that allows us to reproduce this?
----------------------------------------------------------------------

Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-09 12:10

Message:
Just to make my report complete: this issue is present in all versions of expat from 1.95.5 up to the latest stable one, 2.0.1.

----------------------------------------------------------------------

Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-08 10:21

Message:
Grrr, when changing the content of pythontest1.xml to contain:

^@space or ^@spacea

(substitute ' ' for "space"), the crash is back (the pointers are mangled again in the same function :(). Now I am going to stop fuzzing with this, because we will never fix it.

----------------------------------------------------------------------

Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-08 09:23

Message:
The following patch seems to fix the issue for me (under the assumption that the patch for CVE-2009-3720 is also applied):

$ cat expat-toUtf8.patch
--- expat-2.0.1/lib/xmltok.c.orig 2006-11-26 18:34:46.000000000 +0100
+++ expat-2.0.1/lib/xmltok.c 2009-11-08 15:12:27.000000000 +0100
@@ -543,6 +543,9 @@ E ## toUtf8(const ENCODING *enc, \
 { \
   const char *from; \
   for (from = *fromP; from != fromLim; from += 2) { \
+    /* Stop parsing if from && fromLim addresses aren't aligned */ \
+    if (from == fromLim - 1) \
+      goto after; \
     int plane; \
     unsigned char lo2; \
     unsigned char lo = GET_LO(from); \
@@ -596,6 +599,8 @@ E ## toUtf8(const ENCODING *enc, \
     } \
   } \
   *fromP = from; \
+after: \
+  *fromP = from + 1; \
 }

 #define DEFINE_UTF16_TO_UTF16(E) \

The output is then:

# perl XML-Twig.pl
no element found at line 2, column 1, byte 3 at /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/XML/Parser.pm line 187

But I am not sure whether we should also check for the case when the addresses of 'to' and 'toLim' aren't aligned. We do so in the utf8_toUtf16() routine:

340 static void PTRCALL
341 utf8_toUtf16(const ENCODING *enc,

at line:

358 case BT_LEAD4:
359 {
360 unsigned long n;
361 if (to + 1 == toLim)
362 goto after;
...
377 after:
378 *fromP = from;
379 *toP = to;
380 }

So the resulting patch would then check both cases, from == fromLim - 1 || to == toLim - 1. I will attach it in the next comment - opinions appreciated.

----------------------------------------------------------------------

Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-08 06:49

Message:
Here is the valgrind output (proving it is a buffer over-read) at the moment of the crash:

==28534== Memcheck, a memory error detector.
==28534== Copyright (C) 2002-2006, and GNU GPL'd, by Julian Seward et al.
==28534== Using LibVEX rev 1658, a library for dynamic binary translation.
==28534== Copyright (C) 2004-2006, and GNU GPL'd, by OpenWorks LLP.
==28534== Using valgrind-3.2.1, a dynamic binary instrumentation framework.
==28534== Copyright (C) 2000-2006, and GNU GPL'd, by Julian Seward et al.
==28534== For more details, rerun with: -v
==28534==
==28534== Conditional jump or move depends on uninitialised value(s)
==28534== at 0x457077C: big2_toUtf8 (xmltok.c:626)
==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128)
==28534== by 0x456AF29: doProlog (xmlparse.c:4497)
==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551)
==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562)
==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so)
==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so)
==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so)
==28534== by 0x804921D: main (in /usr/bin/perl)
==28534==
==28534== Conditional jump or move depends on uninitialised value(s)
==28534== at 0x4570733: big2_toUtf8 (xmltok.c:626)
==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== ==28534== Conditional jump or move depends on uninitialised value(s) ==28534== at 0x4570740: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== ==28534== Conditional jump or move depends on uninitialised value(s) ==28534== at 0x660AAB5: Perl_utf8n_to_uvuni (in 
/usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x6600CB1: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x6604DB4: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x66094E0: Perl_regexec_flags (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65AB011: Perl_pp_match (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654703D: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654B79F: Perl_call_sv (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x4016E7E: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x4564AE8: reportDefault (xmlparse.c:5130) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== ==28534== Conditional jump or move depends on uninitialised value(s) ==28534== at 0x6600CB4: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x6604DB4: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x66094E0: Perl_regexec_flags (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65AB011: Perl_pp_match (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654703D: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654B79F: Perl_call_sv (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x4016E7E: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x4564AE8: 
reportDefault (xmlparse.c:5130) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== ==28534== Invalid read of size 1 ==28534== at 0x457076F: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== Address 0x4E676A0 is 0 bytes after a block of size 65,536 alloc'd ==28534== at 0x40053C0: malloc (vg_replace_malloc.c:149) ==28534== by 0x6595C1E: Perl_safesysmalloc (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x4015A3C: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x4566262: XML_GetBuffer (xmlparse.c:1634) ==28534== by 0x400EE5C: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in 
/usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== ==28534== Invalid read of size 1 ==28534== at 0x4570772: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== Address 0x4E676A1 is 1 bytes after a block of size 65,536 alloc'd ==28534== at 0x40053C0: malloc (vg_replace_malloc.c:149) ==28534== by 0x6595C1E: Perl_safesysmalloc (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x4015A3C: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x4566262: XML_GetBuffer (xmlparse.c:1634) ==28534== by 0x400EE5C: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: 
Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== ==28534== Invalid read of size 1 ==28534== at 0x4570891: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== Address 0x4E83005 is not stack'd, malloc'd or (recently) free'd ==28534== ==28534== Invalid read of size 1 ==28534== at 0x45708B0: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: 
Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== Address 0x4E83004 is not stack'd, malloc'd or (recently) free'd ==28534== ==28534== Process terminating with default action of signal 11 (SIGSEGV): dumping core ==28534== Access not within mapped region at address 0x5283000 ==28534== at 0x457076F: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== ==28534== ERROR SUMMARY: 4417417 errors from 9 contexts (suppressed: 30 from 1) ==28534== malloc/free: in use at exit: 4,192,790 bytes in 95,243 blocks. ==28534== malloc/free: 141,904 allocs, 46,661 frees, 12,100,734 bytes allocated. ==28534== For counts of detected errors, rerun with: -v ==28534== searching for pointers to 95,243 not-freed blocks. ==28534== checked 4,132,360 bytes. ==28534== ==28534== LEAK SUMMARY: ==28534== definitely lost: 1,415 bytes in 33 blocks. ==28534== possibly lost: 0 bytes in 0 blocks. ==28534== still reachable: 4,191,375 bytes in 95,210 blocks. 
==28534== suppressed: 0 bytes in 0 blocks.
==28534== Use --leak-check=full to see details of leaked memory.

----------------------------------------------------------------------

Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-08 06:40

Message:
To verify that the issue is not present in, and is not the fault of, XML-Parser-Expat, create the following XML-Parser-Expat.pl file:

use XML::Parser::Expat;
$parser = new XML::Parser::Expat;
$parser->setHandlers('Start' => \&sh, 'End' => \&eh, 'Char' => \&ch);
#open(FOO, 'pythontest1.xml') or die "Couldn't open";
#$parser->parse(*FOO);
$parser->parsefile('pythontest1.xml');
close(FOO);

and run it as:

perl XML-Parser-Expat.pl

This results in:

# perl XML-Parser-Expat.pl
no element found at line 2, column 1, byte 3 at XML-Parser-Expat.pl line 9

Further note:
-----------------
If you modify the mentioned 'pythontest1.xml' file by adding one more character to it, expat processes it properly (in this case the 'from' and 'fromLim' addresses are aligned, so the parsing ends 'in finite time'):

Added an "a" character at the end of pythontest1.xml (i.e. it looks like ^@a).
This returns: # perl XML-Twig.pl syntax error at line 1, column 0, byte 0 at /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/XML/Parser.pm line 187 ---------------------------------------------------------------------- Comment By: Jan Lieskovsky (iankko) Date: 2009-11-08 06:34 Message: Reproducer: ========= The invalid XML file, containing UTF-8 character, the crash occurs on, can be retrieved from: https://bugzilla.redhat.com/attachment.cgi?id=366572 To reproduce the crash, create XML-Twig.pl script in the form of: =============================================== use XML::Twig; my $twig=XML::Twig->new(); # create the twig $twig->parsefile('pythontest1.xml'); # build it #my_process( $twig); # my_process isn't valid XML::Twig routine, so let this commented out #$twig->print; # output the twig And run the reproducer as: =================== perl XML-Twig.pl -> Segmentation fault (core dumped) Investigating the crash in gdb leads to: # gdb /usr/bin/perl core.28422 ... Core was generated by `perl XML-Twig.pl'. Program terminated with signal 11, Segmentation fault. [New process 28422] #0 0x00fcd76f in big2_toUtf8 (enc=0xfdf860, fromP=0xbf8c57ac, fromLim=0x9c8bb4b "", toP=0xbf8c57bc, toLim=0x9868a28 "\005") at lib/xmltok.c:626 626 DEFINE_UTF16_TO_UTF8(big2_) (gdb) bt #0 0x00fcd76f in big2_toUtf8 (enc=0xfdf860, fromP=0xbf8c57ac, fromLim=0x9c8bb4b "", toP=0xbf8c57bc, toLim=0x9868a28 "\005") at lib/xmltok.c:626 #1 0x00fc1ac8 in reportDefault (parser=0x982cac8, enc=0xfdf860, s=0x9cabb3e "", end=0x9c8bb4b "") at lib/xmlparse.c:5128 #2 0x00fc7f2a in doProlog (parser=0x982cac8, enc=0xfdf860, s=0x9c8bb48 "", end=0x9c8bb4b "", tok=-15, next=0x9c8bb4b "", nextPtr=0x982cae0, haveMore=0 '\0') at lib/xmlparse.c:4497 #3 0x00fc9d05 in prologProcessor (parser=0x982cac8, s=0x9c8bb48 "", end=0x9c8bb4b "", nextPtr=0x982cae0) at lib/xmlparse.c:3551 #4 0x00fc150b in XML_ParseBuffer (parser=0x982cac8, len=0, isFinal=1) at lib/xmlparse.c:1562 #5 0x007d1f35 in ?? 
() from /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so #6 0x007d2ab4 in XS_XML__Parser__Expat_ParseStream () from /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so #7 0x065ad51d in Perl_pp_entersub () from /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so #8 0x065a698f in Perl_runops_standard () from /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so #9 0x0654c20e in perl_run () from /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so #10 0x0804921e in main () ---------------------------------------------------------------------- Comment By: Jan Lieskovsky (iankko) Date: 2009-11-08 06:09 Message: Here is my further issue analysis (some of the information might be duplicate, but there is also additional one): While running "perl XML-Parser-Expat.pl" reports error on fixed CVE-2009-3720 expat packages, running "perl XML-Twig.pl" still crashes: $ perl XML-Twig.pl Segmentation fault (core dumped) gdb output: ... Core was generated by `perl XML-Twig.pl'. Program terminated with signal 11, Segmentation fault. 
[New process 23957]
#0 0x009e9cb9 in big2_toUtf8 (enc=0xa00900, fromP=0xbffa17b0, fromLim=0x8ceca2f "", toP=0xbffa179c, toLim=0x88115f4 "\201") at lib/xmltok.c:634
634 DEFINE_UTF16_TO_UTF8(big2_)

The problem is present in expat-2.0.1/lib/xmltok.c, in the toUtf8() macro:

538 #define DEFINE_UTF16_TO_UTF8(E) \
539 static void PTRCALL \
540 E ## toUtf8(const ENCODING *enc, \
541             const char **fromP, const char *fromLim, \
542             char **toP, const char *toLim) \
543 { \
544   const char *from; \
545   for (from = *fromP; from != fromLim; from += 2) { \
546     int plane; \
547     unsigned char lo2; \
548     unsigned char lo = GET_LO(from); \
549     unsigned char hi = GET_HI(from); \
550     switch (hi) { \
551     case 0: \
552       if (lo < 0x80) { \
553         if (*toP == toLim) { \
554           *fromP = from; \
555           return; \
556         } \
557         *(*toP)++ = lo; \
558         break; \
559       } \
560       /* fall through */ \
561     case 0x1: case 0x2: case 0x3: \
562     case 0x4: case 0x5: case 0x6: case 0x7: \
563       if (toLim - *toP < 2) { \
564         *fromP = from; \
565         return; \
566       } \
567       *(*toP)++ = ((lo >> 6) | (hi << 2) | UTF8_cval2); \
568       *(*toP)++ = ((lo & 0x3f) | 0x80); \
569       break; \
570     default: \
571       if (toLim - *toP < 3) { \
572         *fromP = from; \
573         return; \
574       } \

"from" should point to the start of the data, and "fromLim" is the upper bound at which the for loop above should stop. In each pass of the for loop we increment "from" by 2, because we have already consumed both of its bytes:

548     unsigned char lo = GET_LO(from); \
549     unsigned char hi = GET_HI(from); \

and can move on. The problem arises when the address of "fromLim" is not aligned with the address of "from", i.e. the distance between them is not a multiple of two. In that case (assume from == fromLim - 1) we increment "from" (because it != fromLim), step over "fromLim" without ever being equal to it, and end up in an infinite loop until the OS detects the buffer over-read and kills the process. Running "perl XML-Twig.pl" demonstrates this issue.
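The termination problem described above can be reproduced in isolation. What follows is only a minimal sketch, not expat code: the function names steps_not_equal and steps_even_limit and the cap parameter are made up for illustration, and offsets are used in place of pointers so that the demonstration itself never reads out of bounds. Rounding the limit down to an even length is one possible way to make the != test safe; it is not necessarily the fix the maintainers would choose.

```c
#include <stddef.h>

/* Stride-2 walk with the same `!=` termination test as the loop quoted
 * above. `pos` and `len` stand in for `from` and `fromLim`. `cap` is a
 * safety limit (an assumption of this sketch) so the demonstration cannot
 * run away. Returns the number of steps taken. */
size_t steps_not_equal(size_t len, size_t cap)
{
    size_t pos, steps = 0;
    for (pos = 0; pos != len && steps < cap; pos += 2)
        steps++;
    return steps;
}

/* Same walk, but the limit is first rounded down to a whole number of
 * 2-byte units, so `pos` always lands exactly on it. */
size_t steps_even_limit(size_t len)
{
    size_t pos, steps = 0;
    size_t lim = len & ~(size_t)1;   /* drop a trailing odd byte */
    for (pos = 0; pos != lim; pos += 2)
        steps++;
    return steps;
}
```

With len = 3 the first function stops only because it hits the cap, while the second stops after one step and leaves the odd trailing byte unconsumed. With len = 4 both take two steps.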
I patched expat-2.0.1 to report which branch the code went through. After finding out that while processing "pythontest1.xml" it loops in "case 0:" for "hi", I added functions to print out the values of the "from" and "fromLim" variables. Here is the output:

fromLim (end) has value = 165218551
from has value = 165218548
Went by default branch
fromLim (end) has value = 165218551
from has value = 165218552
fromLim (end) has value = 165218551
from has value = 165218554
...
from has value = 165416942
fromLim (end) has value = 165218551
from has value = 165416944
seg fault

So at startup from < fromLim. We increment "from" by 2, so the distance is < 3; we go through the "default:" branch and break ("Went by default branch"), find that "from" still isn't equal to "fromLim", and increment "from" by two again. From then on we are in an endless loop until the OS kills the process.

Further note:
-------------
When you add one more character (even a space) to 'pythontest1.xml', save it and try to process it again, a syntax error is reported while processing the XML file:

$ perl XML-Twig.pl
syntax error at line 1, column 1, byte 2 at /usr/lib/perl5/vendor_perl/5.10.0/i386-linux-thread-multi/XML/Parser.pm line 187
at XML-Twig.pl line 4
at XML-Twig.pl line 4

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=110127&aid=2894085&group_id=10127

From noreply at sourceforge.net Fri Nov 13 13:49:33 2009
From: noreply at sourceforge.net (SourceForge.net)
Date: Fri, 13 Nov 2009 12:49:33 +0000
Subject: [Expat-bugs] [ expat-Bugs-2894085 ] expat: buffer over-read and crash in big2_toUtf8()
Message-ID:

Bugs item #2894085, was opened at 2009-11-08 12:06
Message generated for change (Comment added) made by iankko
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=110127&aid=2894085&group_id=10127

Please note that this message will contain a full copy of the comment thread, including the initial issue
submission, for this request, not just the latest update. Category: XML::Parser (inactive) Group: None Status: Open Resolution: None Priority: 5 Private: Yes Submitted By: Jan Lieskovsky (iankko) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: expat: buffer over-read and crash in big2_toUtf8() Initial Comment: Hello SourceForge expat maintainers, originally CVE-2009-3720 was reported in expat: [1] http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2009-3720 Non-public, original bug report for CVE-2009-3720: [2] http://sourceforge.net/tracker/?func=detail&aid=1990430&group_id=10127&atid=110127 And relevant patch for CVE-2009-3720: [3] http://expat.cvs.sourceforge.net/viewvc/expat/expat/lib/xmltok_impl.c?r1=1.13&r2=1.15&view=patch While the above patch [3] solves the issue in expat itself and in various other packages (PyXML, 4Suite), which embed expat, or when called via perl-XML-Parser-Expat, it does not help,when using the same reproducer via perl-XML-Twig module. In this case the crash (buffer overread) occurs in expat's big2_toUtf8 () routine - more exactly in DEFINE_UTF16_TO_UTF8(big2_) macro in lib/xmltok.c:626. Have investigated the issue in more detail, and assuming the crash occurs in 540 E ## toUtf8(const ENCODING *enc, \...) routine, as present in expat-2.0.1/lib/xmltok.c (at line 540). Assuming the problematic line of the code is this one (lib/xmltok.c): 545 for (from = *fromP; from != fromLim; from += 2) { \ 'from' represents pointer to the start of XML data, we are about to parse, 'fromLim' represents upper bound - point, where parsing should end. In each pass of the for loop we increment 'from' value by two (because on lines: 548 unsigned char lo = GET_LO(from); \ 549 unsigned char hi = GET_HI(from); \ we consumed both parts of from). This works perfect, when addresses of 'from' and 'fromLim' are aligned, i.e. both are multiple of '2'. 
But the problem arises when the value of 'fromLim' is not divisible by two (for example 165218551). In that case 'from' can never equal 'fromLim': in the last round from == fromLim - 1, so we increment it by two, and now we have already 'skipped' the limit (from == fromLim + 1). We keep incrementing it (in the effort to reach the from == fromLim condition) in an infinite loop, until the operating system recognizes that we tried to access a memory location which doesn't belong to us and kills the process.

----------------------------------------------------------------------

>Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-13 13:49

Message:
Karl, Fred,

could you please add Joe Orton (nickname jorton) to the Cc list of this ticket? I would do so myself, but I don't know how; I didn't find an explicit Cc field. Otherwise the only way to make this visible to him is to clear the Private checkbox, which I wouldn't like to do, since the ticket would then be visible to everyone :(. I would rather find the patch for it first.

Thanks && Regards, Jan.
--
Jan iankko Lieskovsky

----------------------------------------------------------------------

Comment By: Karl Waclawek (kwaclaw)
Date: 2009-11-12 19:37

Message:
I have a hard time reproducing this directly with Expat. It simply works for me using the two files attached to the Bugzilla issue. I don't use Perl at all, and I am doing my work on Windows.

Can you tell me if you use a specially compiled version of Expat, and how Perl configures Expat when calling it?

Maybe Fred is better at debugging this, as he is more of a Unix guy.

Btw, my feeling is that this is more related to a bug in the parsing initialization, as these macros have no safeguards at all and rely on the calling code to prevent anomalous situations.
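One way to read the remark about the calling code is that the span handed to a 2-byte-unit converter should already be a whole number of units. The following is a minimal sketch of such a caller-side check and assumes nothing about expat's internals: struct span_split and split_utf16_span are hypothetical names, not expat API. In the gdb trace from this report, doProlog was called with s=0x9c8bb48 and end=0x9c8bb4b, i.e. a 3-byte span.

```c
#include <stddef.h>

/* Hypothetical helper, not part of expat: describes how a byte span
 * divides into complete 2-byte units plus a possible leftover byte. */
struct span_split {
    size_t whole;     /* bytes that form complete 2-byte units */
    size_t trailing;  /* 0 or 1 leftover byte at the end */
};

/* Split the span [s, end) before it is handed to a converter that
 * advances two bytes at a time. Assumes s <= end. */
struct span_split split_utf16_span(const char *s, const char *end)
{
    struct span_split r;
    size_t len = (size_t)(end - s);
    r.trailing = len % 2;
    r.whole = len - r.trailing;
    return r;
}
```

A caller could pass only s + whole as the limit and treat a non-zero trailing count as a partial character. Whether that belongs in the caller or in the macro itself is exactly the question raised in this thread.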
----------------------------------------------------------------------

Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-09 20:41

Message:
The malformed XML file, pythontest1.xml, can be downloaded here:
https://bugzilla.redhat.com/attachment.cgi?id=366572

Don't be surprised: it really contains only "^@" characters.

It comes from https://bugzilla.redhat.com/show_bug.cgi?id=CVE-2009-3720.

The XML-Twig.pl script code is as follows:

--- Script start ---
use XML::Twig;
my $twig=XML::Twig->new(); # create the twig
$twig->parsefile('pythontest1.xml'); # build it
#my_process( $twig); # my_process isn't a valid XML::Twig routine, so leave this commented out
#$twig->print; # output the twig
--- Script end ---

Run it as:

perl ./XML-Twig.pl

I apologize for asking you to download the PoC. I tried three times to attach it here but was unsuccessful (due to the *.txt attachment format and the attachment size < 256 K requirement). I changed the suffix of both files to *.txt, and both of them are smaller than 256 K, but it still didn't work; maybe I am just doing something wrong.

Thanks, Jan.

----------------------------------------------------------------------

Comment By: Karl Waclawek (kwaclaw)
Date: 2009-11-09 20:09

Message:
This is not a showstopper issue that happens all over the place. Priority reset to default.

----------------------------------------------------------------------

Comment By: Karl Waclawek (kwaclaw)
Date: 2009-11-09 18:38

Message:
Can you attach the file that allows us to reproduce this?
----------------------------------------------------------------------

Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-09 18:10

Message:
Just to make my report complete: this issue is present in all versions of expat from 1.95.5 up to the latest stable one, 2.0.1.

----------------------------------------------------------------------

Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-08 16:21

Message:
Grrr, when changing the content of pythontest1.xml to contain:

^@space or ^@spacea

(substitute ' ' for space), the crash is back (the pointers are mangled again in the same function :(). I'll stop fuzzing this now, because otherwise we will never get it fixed.

----------------------------------------------------------------------

Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-08 15:23

Message:
The following patch seems to fix the issue for me (under the assumption that the patch for CVE-2009-3720 is also applied):

$ cat expat-toUtf8.patch
--- expat-2.0.1/lib/xmltok.c.orig 2006-11-26 18:34:46.000000000 +0100
+++ expat-2.0.1/lib/xmltok.c 2009-11-08 15:12:27.000000000 +0100
@@ -543,6 +543,9 @@ E ## toUtf8(const ENCODING *enc, \
 { \
   const char *from; \
   for (from = *fromP; from != fromLim; from += 2) { \
+    /* Stop parsing if from && fromLim addresses aren't aligned */ \
+    if (from == fromLim - 1) \
+      goto after; \
     int plane; \
     unsigned char lo2; \
     unsigned char lo = GET_LO(from); \
@@ -596,6 +599,8 @@ E ## toUtf8(const ENCODING *enc, \
     } \
   } \
   *fromP = from; \
+after: \
+  *fromP = from + 1; \
 }
 
 #define DEFINE_UTF16_TO_UTF16(E) \

The output is then:

# perl XML-Twig.pl
no element found at line 2, column 1, byte 3 at /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/XML/Parser.pm line 187

But I am not sure whether we should also check for the case when the addresses of 'to' and 'toLim' aren't aligned (we do so in the utf8_toUtf16() routine:

340 static void PTRCALL
341 utf8_toUtf16(const ENCODING *enc,

at line:

358 case BT_LEAD4:
359 {
360 unsigned long n;
361 if (to + 1 == toLim)
362 goto after;
...
377 after: 378 *fromP = from; 379 *toP = to; 380 } So the resulting patch would then check both cases from == fromLim -1 || to == toLim - 1, will attach it in next comment - opinions appreciated. ---------------------------------------------------------------------- Comment By: Jan Lieskovsky (iankko) Date: 2009-11-08 12:49 Message: Here is the valgrind output (proving it's buffer over-read) in the moment of crash: ==28534== Memcheck, a memory error detector. ==28534== Copyright (C) 2002-2006, and GNU GPL'd, by Julian Seward et al. ==28534== Using LibVEX rev 1658, a library for dynamic binary translation. ==28534== Copyright (C) 2004-2006, and GNU GPL'd, by OpenWorks LLP. ==28534== Using valgrind-3.2.1, a dynamic binary instrumentation framework. ==28534== Copyright (C) 2000-2006, and GNU GPL'd, by Julian Seward et al. ==28534== For more details, rerun with: -v ==28534== ==28534== Conditional jump or move depends on uninitialised value(s) ==28534== at 0x457077C: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== ==28534== Conditional jump or move depends on uninitialised value(s) ==28534== at 0x4570733: big2_toUtf8 (xmltok.c:626) 
==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== ==28534== Conditional jump or move depends on uninitialised value(s) ==28534== at 0x4570740: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== ==28534== Conditional jump or move depends on uninitialised value(s) ==28534== at 0x660AAB5: Perl_utf8n_to_uvuni (in 
/usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x6600CB1: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x6604DB4: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x66094E0: Perl_regexec_flags (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65AB011: Perl_pp_match (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654703D: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654B79F: Perl_call_sv (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x4016E7E: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x4564AE8: reportDefault (xmlparse.c:5130) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== ==28534== Conditional jump or move depends on uninitialised value(s) ==28534== at 0x6600CB4: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x6604DB4: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x66094E0: Perl_regexec_flags (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65AB011: Perl_pp_match (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654703D: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654B79F: Perl_call_sv (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x4016E7E: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x4564AE8: 
reportDefault (xmlparse.c:5130) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== ==28534== Invalid read of size 1 ==28534== at 0x457076F: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== Address 0x4E676A0 is 0 bytes after a block of size 65,536 alloc'd ==28534== at 0x40053C0: malloc (vg_replace_malloc.c:149) ==28534== by 0x6595C1E: Perl_safesysmalloc (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x4015A3C: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x4566262: XML_GetBuffer (xmlparse.c:1634) ==28534== by 0x400EE5C: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in 
/usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== ==28534== Invalid read of size 1 ==28534== at 0x4570772: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== Address 0x4E676A1 is 1 bytes after a block of size 65,536 alloc'd ==28534== at 0x40053C0: malloc (vg_replace_malloc.c:149) ==28534== by 0x6595C1E: Perl_safesysmalloc (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x4015A3C: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x4566262: XML_GetBuffer (xmlparse.c:1634) ==28534== by 0x400EE5C: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: 
Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== ==28534== Invalid read of size 1 ==28534== at 0x4570891: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== Address 0x4E83005 is not stack'd, malloc'd or (recently) free'd ==28534== ==28534== Invalid read of size 1 ==28534== at 0x45708B0: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: 
Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== Address 0x4E83004 is not stack'd, malloc'd or (recently) free'd ==28534== ==28534== Process terminating with default action of signal 11 (SIGSEGV): dumping core ==28534== Access not within mapped region at address 0x5283000 ==28534== at 0x457076F: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== ==28534== ERROR SUMMARY: 4417417 errors from 9 contexts (suppressed: 30 from 1) ==28534== malloc/free: in use at exit: 4,192,790 bytes in 95,243 blocks. ==28534== malloc/free: 141,904 allocs, 46,661 frees, 12,100,734 bytes allocated. ==28534== For counts of detected errors, rerun with: -v ==28534== searching for pointers to 95,243 not-freed blocks. ==28534== checked 4,132,360 bytes. ==28534== ==28534== LEAK SUMMARY: ==28534== definitely lost: 1,415 bytes in 33 blocks. ==28534== possibly lost: 0 bytes in 0 blocks. ==28534== still reachable: 4,191,375 bytes in 95,210 blocks. 
==28534== suppressed: 0 bytes in 0 blocks.
==28534== Use --leak-check=full to see details of leaked memory.

----------------------------------------------------------------------

Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-08 12:40

Message:
To verify that the issue is not present in, and is not the fault of, XML-Parser-Expat, create the following XML-Parser-Expat.pl file:

use XML::Parser::Expat;
$parser = new XML::Parser::Expat;
$parser->setHandlers('Start' => \&sh, 'End' => \&eh, 'Char' => \&ch);
#open(FOO, 'pythontest1.xml') or die "Couldn't open";
#$parser->parse(*FOO);
$parser->parsefile('pythontest1.xml');
close(FOO);

and run it as:

perl XML-Parser-Expat.pl

This results in:

# perl XML-Parser-Expat.pl
no element found at line 2, column 1, byte 3 at XML-Parser-Expat.pl line 9

Further note:
-----------------
If you modify the mentioned 'pythontest1.xml' file by adding one more character to it, expat processes it properly (in this case the 'from' and 'fromLim' addresses are aligned, so the parsing ends 'in finite time'):

Added an "a" character at the end of pythontest1.xml (i.e. it looks like ^@a).
This returns:

# perl XML-Twig.pl
syntax error at line 1, column 0, byte 0 at /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/XML/Parser.pm line 187

----------------------------------------------------------------------

Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-08 12:34

Message:
Reproducer:
=========
The invalid XML file on which the crash occurs (it contains a UTF-8 character) can be retrieved from:

https://bugzilla.redhat.com/attachment.cgi?id=366572

To reproduce the crash, create an XML-Twig.pl script in the form of:
===============================================
use XML::Twig;
my $twig=XML::Twig->new(); # create the twig
$twig->parsefile('pythontest1.xml'); # build it
#my_process( $twig); # my_process isn't a valid XML::Twig routine, so leave this commented out
#$twig->print; # output the twig

And run the reproducer as:
===================
perl XML-Twig.pl -> Segmentation fault (core dumped)

Investigating the crash in gdb leads to:

# gdb /usr/bin/perl core.28422
...
Core was generated by `perl XML-Twig.pl'.
Program terminated with signal 11, Segmentation fault.
[New process 28422]
#0 0x00fcd76f in big2_toUtf8 (enc=0xfdf860, fromP=0xbf8c57ac, fromLim=0x9c8bb4b "", toP=0xbf8c57bc, toLim=0x9868a28 "\005") at lib/xmltok.c:626
626 DEFINE_UTF16_TO_UTF8(big2_)
(gdb) bt
#0 0x00fcd76f in big2_toUtf8 (enc=0xfdf860, fromP=0xbf8c57ac, fromLim=0x9c8bb4b "", toP=0xbf8c57bc, toLim=0x9868a28 "\005") at lib/xmltok.c:626
#1 0x00fc1ac8 in reportDefault (parser=0x982cac8, enc=0xfdf860, s=0x9cabb3e "", end=0x9c8bb4b "") at lib/xmlparse.c:5128
#2 0x00fc7f2a in doProlog (parser=0x982cac8, enc=0xfdf860, s=0x9c8bb48 "", end=0x9c8bb4b "", tok=-15, next=0x9c8bb4b "", nextPtr=0x982cae0, haveMore=0 '\0') at lib/xmlparse.c:4497
#3 0x00fc9d05 in prologProcessor (parser=0x982cac8, s=0x9c8bb48 "", end=0x9c8bb4b "", nextPtr=0x982cae0) at lib/xmlparse.c:3551
#4 0x00fc150b in XML_ParseBuffer (parser=0x982cac8, len=0, isFinal=1) at lib/xmlparse.c:1562
#5 0x007d1f35 in ?? () from /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so
#6 0x007d2ab4 in XS_XML__Parser__Expat_ParseStream () from /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so
#7 0x065ad51d in Perl_pp_entersub () from /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so
#8 0x065a698f in Perl_runops_standard () from /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so
#9 0x0654c20e in perl_run () from /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so
#10 0x0804921e in main ()

----------------------------------------------------------------------

Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-08 12:09

Message:
Here is my further analysis of the issue (some of the information may be duplicated, but there is also some that is new):

While running "perl XML-Parser-Expat.pl" reports an error on expat packages with the CVE-2009-3720 fix, running "perl XML-Twig.pl" still crashes:

$ perl XML-Twig.pl
Segmentation fault (core dumped)

gdb output:
...
Core was generated by `perl XML-Twig.pl'.
Program terminated with signal 11, Segmentation fault.
[New process 23957]
#0 0x009e9cb9 in big2_toUtf8 (enc=0xa00900, fromP=0xbffa17b0, fromLim=0x8ceca2f "", toP=0xbffa179c, toLim=0x88115f4 "\201") at lib/xmltok.c:634
634 DEFINE_UTF16_TO_UTF8(big2_)

The problem is present in expat-2.0.1/lib/xmltok.c in the toUtf8() macro:

538 #define DEFINE_UTF16_TO_UTF8(E) \
539 static void PTRCALL \
540 E ## toUtf8(const ENCODING *enc, \
541             const char **fromP, const char *fromLim, \
542             char **toP, const char *toLim) \
543 { \
544   const char *from; \
545   for (from = *fromP; from != fromLim; from += 2) { \
546     int plane; \
547     unsigned char lo2; \
548     unsigned char lo = GET_LO(from); \
549     unsigned char hi = GET_HI(from); \
550     switch (hi) { \
551     case 0: \
552       if (lo < 0x80) { \
553         if (*toP == toLim) { \
554           *fromP = from; \
555           return; \
556         } \
557         *(*toP)++ = lo; \
558         break; \
559       } \
560       /* fall through */ \
561     case 0x1: case 0x2: case 0x3: \
562     case 0x4: case 0x5: case 0x6: case 0x7: \
563       if (toLim - *toP < 2) { \
564         *fromP = from; \
565         return; \
566       } \
567       *(*toP)++ = ((lo >> 6) | (hi << 2) | UTF8_cval2); \
568       *(*toP)++ = ((lo & 0x3f) | 0x80); \
569       break; \
570     default: \
571       if (toLim - *toP < 3) { \
572         *fromP = from; \
573         return; \
574       } \

"from" should point to the start of the data and "fromLim" is the upper bound at which the for loop above should stop. In each pass of the for loop, we increment "from" by 2 because we have already consumed both of its bytes:

548     unsigned char lo = GET_LO(from); \
549     unsigned char hi = GET_HI(from); \

and can move on. But the problem arises when the address of "fromLim" is not aligned with the address of "from", i.e. when the distance between them is not a multiple of two. In that case (assume from == fromLim - 1) we increment "from" (because it != fromLim), step past "fromLim", and the loop keeps running until the OS detects the out-of-bounds read and kills the process. Running "perl XML-Twig.pl" demonstrates this issue.
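The loop-control problem described above can be looked at in isolation. What follows is a minimal sketch, not Expat code: it replaces the pointers with byte offsets, and the names reaches_limit_exactly, stop_offset_guarded and max_steps are made up for illustration. The first function uses the same "!=" test as line 545 (with a step cap so the demonstration cannot run away); the second shows one possible alternative test that only continues while a whole 2-byte unit remains.

```c
#include <assert.h>
#include <stddef.h>

/* Mimics the loop control of line 545 on offsets instead of pointers.
   Returns 1 if the cursor lands exactly on `len`, and 0 if it is still
   stepping after `max_steps` iterations (it jumped over the limit). */
static int reaches_limit_exactly(size_t len, size_t max_steps)
{
    size_t from = 0, steps = 0;
    assert(max_steps > 0);
    for (; from != len; from += 2) {   /* same test as the macro */
        if (++steps > max_steps)
            return 0;                  /* never became equal to len */
    }
    return 1;
}

/* Hypothetical alternative test: continue only while at least one whole
   2-byte unit is left. Returns the offset at which the loop stops;
   `from` never exceeds `len`, so the subtraction cannot wrap. */
static size_t stop_offset_guarded(size_t len)
{
    size_t from = 0;
    while (len - from >= 2)
        from += 2;
    return from;
}
```

With an even length such as 4 the first function finds the limit; with an odd length such as 3 the cursor visits 0, 2, 4, ... and never equals 3, which matches the "from"/"fromLim" values reported in this ticket. The guarded variant stops at offset 2 for a length of 3 and leaves the trailing byte unconsumed. Whether a converter or its caller should carry such a guard is a design question this sketch does not settle.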
Patched expat-2.0.1 to report which branch the code goes through and, after finding out that while processing "pythontest1.xml" it loops in "case 0:" for "hi", added functions to print the values of the "from" and "fromLim" variables. Here is the output:

fromLim (end) has value = 165218551
from has value = 165218548
Went by default branch
fromLim (end) has value = 165218551
from has value = 165218552
fromLim (end) has value = 165218551
from has value = 165218554
...
from has value = 165416942
fromLim (end) has value = 165218551
from has value = 165416944
seg fault

So at startup from < fromLim, we increment "from" by 2, so the distance is < 3 -> we go to the "default:" break part ("Went by default branch"), detect that "from" still isn't equal to "fromLim" and increment "from" again by two. From now on we are in an endless loop, which is killed by the OS.

Further note:
-------------
When you add one more character (even a space) to 'pythontest1.xml', save it and try to process it again, a syntax error is reported while processing the XML file:

$ perl XML-Twig.pl
syntax error at line 1, column 1, byte 2 at /usr/lib/perl5/vendor_perl/5.10.0/i386-linux-thread-multi/XML/Parser.pm line 187
at XML-Twig.pl line 4
at XML-Twig.pl line 4

----------------------------------------------------------------------

From noreply at sourceforge.net Fri Nov 27 14:36:01 2009
From: noreply at sourceforge.net (SourceForge.net)
Date: Fri, 27 Nov 2009 13:36:01 +0000
Subject: [Expat-bugs] [ expat-Bugs-2894085 ] expat: buffer over-read and crash in big2_toUtf8()
Message-ID:

Bugs item #2894085, was opened at 2009-11-08 06:06
Message generated for change (Settings changed) made by kwaclaw

Category: XML::Parser (inactive)
>Group: Test Required
Status: Open
Resolution: None
Priority: 5
Private: Yes
Submitted By: Jan Lieskovsky (iankko)
Assigned to: Fred L. Drake, Jr. (fdrake)
Summary: expat: buffer over-read and crash in big2_toUtf8()

----------------------------------------------------------------------

>Comment By: Karl Waclawek (kwaclaw)
Date: 2009-11-27 08:35

Message:
Fixed in xmlparse.c rev. 1.165. Needs regression testing.

----------------------------------------------------------------------

Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-13 07:49

Message:
Karl, Fred,

could you please add Joe Orton (nickname jorton) to the Cc-list of this ticket? I would do so myself, but I don't know how - I didn't find an explicit Cc field for it. Or is the only way to make this visible to him to clear the Private checkbox? I wouldn't like to do that, as the ticket would then be visible to everyone :(. I would rather find the patch for it first.

Thanks && Regards, Jan.
--
Jan iankko Lieskovsky

----------------------------------------------------------------------

Comment By: Karl Waclawek (kwaclaw)
Date: 2009-11-12 13:37

Message:
I have a hard time reproducing this directly with Expat. It simply works for me using the two files attached to the Bugzilla issue. I don't use Perl at all, and I am doing my work on Windows.

Can you tell me if you use a specially compiled version of Expat, and how Perl configures Expat when calling it?

Maybe Fred is better at debugging this, as he is more of a Unix guy.

Btw, my feeling is that this is more related to a bug in the parsing initialization, as these macros have no safeguards at all and rely on the calling code to prevent anomalous situations.
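The point about the macros relying on the calling code can be illustrated with a small sketch. It is only an illustration under stated assumptions: trim_to_whole_units and count_units_trusting are hypothetical names, not Expat functions, and nothing here is meant to describe what xmlparse.c rev. 1.165 actually changed. The idea shown is that the caller normalizes the byte range to a whole number of code units before handing it to a converter that stops only on exact pointer equality.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical caller-side helper: returns an end pointer such that
   (result - s) is a multiple of `unit` bytes, dropping a partial
   trailing unit. */
static const char *trim_to_whole_units(const char *s, const char *end,
                                       size_t unit)
{
    size_t len;
    assert(unit > 0 && s <= end);
    len = (size_t)(end - s);
    return s + (len - len % unit);
}

/* Stand-in for a converter that, like the macro, trusts its caller and
   terminates only when `from` equals `fromLim`. It just counts 2-byte
   units; the assert spells out the precondition it depends on. */
static size_t count_units_trusting(const char *from, const char *fromLim)
{
    size_t n = 0;
    assert((fromLim - from) % 2 == 0);
    for (; from != fromLim; from += 2)
        n++;
    return n;
}
```

For a 3-byte buffer, trim_to_whole_units(buf, buf + 3, 2) yields buf + 2, and the trusting loop then terminates after one unit. Keeping the check in the caller leaves the inner loop free of extra tests, at the price that every call site has to honor the precondition.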
----------------------------------------------------------------------

Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-09 14:41

Message:
The malformed XML file, pythontest1.xml, can be downloaded here:

https://bugzilla.redhat.com/attachment.cgi?id=366572

Don't be surprised, it really contains only "^@" characters.

From https://bugzilla.redhat.com/show_bug.cgi?id=CVE-2009-3720.

The XML-Twig.pl script code is as follows:

--- Script start ---
use XML::Twig;
my $twig=XML::Twig->new(); # create the twig
$twig->parsefile('pythontest1.xml'); # build it
#my_process( $twig); # my_process isn't a valid XML::Twig routine, so leave this commented out
#$twig->print; # output the twig
--- Script end ---

Run it as:

perl ./XML-Twig.pl

Apologies for asking you to download the PoC. I tried three times to attach it here without success (because of the *.txt attachment format and the attachment size < 256 K requirement). I changed the suffix of both files to *.txt and both of them are smaller than 256 K, but it still didn't work - maybe I am just doing something wrong.

Thanks, Jan.

----------------------------------------------------------------------

Comment By: Karl Waclawek (kwaclaw)
Date: 2009-11-09 14:09

Message:
This is not a showstopper issue that happens all over the place. Priority reset to default.

----------------------------------------------------------------------

Comment By: Karl Waclawek (kwaclaw)
Date: 2009-11-09 12:38

Message:
Can you attach the file that allows us to reproduce this?
----------------------------------------------------------------------

Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-09 12:10

Message:
Just to make my report complete - this issue is present in all versions of expat from 1.95.5 up to the latest stable one, 2.0.1.

----------------------------------------------------------------------

Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-08 10:21

Message:
Grrr, when changing the content of pythontest1.xml to contain:

^@space or ^@spacea

(substitute ' ' for space), the crash is back (the pointers are mangled again in the same function :(). Stopping the fuzzing now, because this way we will never fix it.

----------------------------------------------------------------------

Comment By: Jan Lieskovsky (iankko)
Date: 2009-11-08 09:23

Message:
The following patch seems to fix the issue for me (under the assumption that the patch for CVE-2009-3720 is also applied):

$ cat expat-toUtf8.patch
--- expat-2.0.1/lib/xmltok.c.orig 2006-11-26 18:34:46.000000000 +0100
+++ expat-2.0.1/lib/xmltok.c 2009-11-08 15:12:27.000000000 +0100
@@ -543,6 +543,9 @@ E ## toUtf8(const ENCODING *enc, \
 { \
   const char *from; \
   for (from = *fromP; from != fromLim; from += 2) { \
+    /* Stop parsing if from && fromLim addresses aren't aligned */ \
+    if (from == fromLim - 1) \
+      goto after; \
     int plane; \
     unsigned char lo2; \
     unsigned char lo = GET_LO(from); \
@@ -596,6 +599,8 @@ E ## toUtf8(const ENCODING *enc, \
     } \
   } \
   *fromP = from; \
+after: \
+  *fromP = from + 1; \
 }

 #define DEFINE_UTF16_TO_UTF16(E) \

The output is then:

# perl XML-Twig.pl
no element found at line 2, column 1, byte 3 at /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/XML/Parser.pm line 187

But I am not sure whether we should also check for the case when the addresses of 'to' and 'toLim' aren't aligned (we are doing so in the utf8_toUtf16() routine:

340 static void PTRCALL
341 utf8_toUtf16(const ENCODING *enc,

at line:

358 case BT_LEAD4:
359 {
360 unsigned long n;
361 if (to + 1 == toLim)
362 goto after;
...
377 after: 378 *fromP = from; 379 *toP = to; 380 } So the resulting patch would then check both cases from == fromLim -1 || to == toLim - 1, will attach it in next comment - opinions appreciated. ---------------------------------------------------------------------- Comment By: Jan Lieskovsky (iankko) Date: 2009-11-08 06:49 Message: Here is the valgrind output (proving it's buffer over-read) in the moment of crash: ==28534== Memcheck, a memory error detector. ==28534== Copyright (C) 2002-2006, and GNU GPL'd, by Julian Seward et al. ==28534== Using LibVEX rev 1658, a library for dynamic binary translation. ==28534== Copyright (C) 2004-2006, and GNU GPL'd, by OpenWorks LLP. ==28534== Using valgrind-3.2.1, a dynamic binary instrumentation framework. ==28534== Copyright (C) 2000-2006, and GNU GPL'd, by Julian Seward et al. ==28534== For more details, rerun with: -v ==28534== ==28534== Conditional jump or move depends on uninitialised value(s) ==28534== at 0x457077C: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== ==28534== Conditional jump or move depends on uninitialised value(s) ==28534== at 0x4570733: big2_toUtf8 (xmltok.c:626) 
==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== ==28534== Conditional jump or move depends on uninitialised value(s) ==28534== at 0x4570740: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== ==28534== Conditional jump or move depends on uninitialised value(s) ==28534== at 0x660AAB5: Perl_utf8n_to_uvuni (in 
/usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x6600CB1: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x6604DB4: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x66094E0: Perl_regexec_flags (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65AB011: Perl_pp_match (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654703D: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654B79F: Perl_call_sv (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x4016E7E: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x4564AE8: reportDefault (xmlparse.c:5130) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== ==28534== Conditional jump or move depends on uninitialised value(s) ==28534== at 0x6600CB4: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x6604DB4: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x66094E0: Perl_regexec_flags (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65AB011: Perl_pp_match (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654703D: (within /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654B79F: Perl_call_sv (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x4016E7E: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x4564AE8: 
reportDefault (xmlparse.c:5130) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== ==28534== Invalid read of size 1 ==28534== at 0x457076F: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== Address 0x4E676A0 is 0 bytes after a block of size 65,536 alloc'd ==28534== at 0x40053C0: malloc (vg_replace_malloc.c:149) ==28534== by 0x6595C1E: Perl_safesysmalloc (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x4015A3C: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x4566262: XML_GetBuffer (xmlparse.c:1634) ==28534== by 0x400EE5C: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in 
/usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== ==28534== Invalid read of size 1 ==28534== at 0x4570772: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== Address 0x4E676A1 is 1 bytes after a block of size 65,536 alloc'd ==28534== at 0x40053C0: malloc (vg_replace_malloc.c:149) ==28534== by 0x6595C1E: Perl_safesysmalloc (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x4015A3C: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x4566262: XML_GetBuffer (xmlparse.c:1634) ==28534== by 0x400EE5C: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: 
Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== ==28534== Invalid read of size 1 ==28534== at 0x4570891: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== Address 0x4E83005 is not stack'd, malloc'd or (recently) free'd ==28534== ==28534== Invalid read of size 1 ==28534== at 0x45708B0: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: 
Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== Address 0x4E83004 is not stack'd, malloc'd or (recently) free'd ==28534== ==28534== Process terminating with default action of signal 11 (SIGSEGV): dumping core ==28534== Access not within mapped region at address 0x5283000 ==28534== at 0x457076F: big2_toUtf8 (xmltok.c:626) ==28534== by 0x4564AC7: reportDefault (xmlparse.c:5128) ==28534== by 0x456AF29: doProlog (xmlparse.c:4497) ==28534== by 0x456CD04: prologProcessor (xmlparse.c:3551) ==28534== by 0x456450A: XML_ParseBuffer (xmlparse.c:1562) ==28534== by 0x400EF34: (within /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x400FAB3: XS_XML__Parser__Expat_ParseStream (in /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so) ==28534== by 0x65AD51C: Perl_pp_entersub (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x65A698E: Perl_runops_standard (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x654C20D: perl_run (in /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so) ==28534== by 0x804921D: main (in /usr/bin/perl) ==28534== ==28534== ERROR SUMMARY: 4417417 errors from 9 contexts (suppressed: 30 from 1) ==28534== malloc/free: in use at exit: 4,192,790 bytes in 95,243 blocks. ==28534== malloc/free: 141,904 allocs, 46,661 frees, 12,100,734 bytes allocated. ==28534== For counts of detected errors, rerun with: -v ==28534== searching for pointers to 95,243 not-freed blocks. ==28534== checked 4,132,360 bytes. ==28534== ==28534== LEAK SUMMARY: ==28534== definitely lost: 1,415 bytes in 33 blocks. ==28534== possibly lost: 0 bytes in 0 blocks. ==28534== still reachable: 4,191,375 bytes in 95,210 blocks. 
==28534== suppressed: 0 bytes in 0 blocks. ==28534== Use --leak-check=full to see details of leaked memory. ---------------------------------------------------------------------- Comment By: Jan Lieskovsky (iankko) Date: 2009-11-08 06:40 Message: To verify, the issue isn't present in / it isn't fault ot XML-Parser-Expat create following XML-Parser-Expat.pl file: use XML::Parser::Expat; $parser = new XML::Parser::Expat; $parser->setHandlers('Start' => \&sh, 'End' => \&eh, 'Char' => \&ch); #open(FOO, 'pythontest1.xml') or die "Couldn't open"; #$parser->parse(*FOO); $parser->parsefile('pythontest1.xml'); close(FOO); and run it as: perl XML-Parser-Expat.pl This results in: # perl XML-Parser-Expat.pl no element found at line 2, column 1, byte 3 at XML-Parser-Expat.pl line 9 Further note: ----------------- Even when you modify mentioned 'pythontest1.xml' file, i.e. add one more character to it, it's properly parsed by expat (in this case 'from' and 'fromLim' addresses are aligned so the parsing ends 'in finite time'): Added "a" characted at the end of pythontest1.xml (i.e. it looks like ^@a). 
This returns: # perl XML-Twig.pl syntax error at line 1, column 0, byte 0 at /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/XML/Parser.pm line 187 ---------------------------------------------------------------------- Comment By: Jan Lieskovsky (iankko) Date: 2009-11-08 06:34 Message: Reproducer: ========= The invalid XML file, containing UTF-8 character, the crash occurs on, can be retrieved from: https://bugzilla.redhat.com/attachment.cgi?id=366572 To reproduce the crash, create XML-Twig.pl script in the form of: =============================================== use XML::Twig; my $twig=XML::Twig->new(); # create the twig $twig->parsefile('pythontest1.xml'); # build it #my_process( $twig); # my_process isn't valid XML::Twig routine, so let this commented out #$twig->print; # output the twig And run the reproducer as: =================== perl XML-Twig.pl -> Segmentation fault (core dumped) Investigating the crash in gdb leads to: # gdb /usr/bin/perl core.28422 ... Core was generated by `perl XML-Twig.pl'. Program terminated with signal 11, Segmentation fault. [New process 28422] #0 0x00fcd76f in big2_toUtf8 (enc=0xfdf860, fromP=0xbf8c57ac, fromLim=0x9c8bb4b "", toP=0xbf8c57bc, toLim=0x9868a28 "\005") at lib/xmltok.c:626 626 DEFINE_UTF16_TO_UTF8(big2_) (gdb) bt #0 0x00fcd76f in big2_toUtf8 (enc=0xfdf860, fromP=0xbf8c57ac, fromLim=0x9c8bb4b "", toP=0xbf8c57bc, toLim=0x9868a28 "\005") at lib/xmltok.c:626 #1 0x00fc1ac8 in reportDefault (parser=0x982cac8, enc=0xfdf860, s=0x9cabb3e "", end=0x9c8bb4b "") at lib/xmlparse.c:5128 #2 0x00fc7f2a in doProlog (parser=0x982cac8, enc=0xfdf860, s=0x9c8bb48 "", end=0x9c8bb4b "", tok=-15, next=0x9c8bb4b "", nextPtr=0x982cae0, haveMore=0 '\0') at lib/xmlparse.c:4497 #3 0x00fc9d05 in prologProcessor (parser=0x982cac8, s=0x9c8bb48 "", end=0x9c8bb4b "", nextPtr=0x982cae0) at lib/xmlparse.c:3551 #4 0x00fc150b in XML_ParseBuffer (parser=0x982cac8, len=0, isFinal=1) at lib/xmlparse.c:1562 #5 0x007d1f35 in ?? 
() from /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so #6 0x007d2ab4 in XS_XML__Parser__Expat_ParseStream () from /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi/auto/XML/Parser/Expat/Expat.so #7 0x065ad51d in Perl_pp_entersub () from /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so #8 0x065a698f in Perl_runops_standard () from /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so #9 0x0654c20e in perl_run () from /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so #10 0x0804921e in main () ---------------------------------------------------------------------- Comment By: Jan Lieskovsky (iankko) Date: 2009-11-08 06:09 Message: Here is my further issue analysis (some of the information might be duplicate, but there is also additional one): While running "perl XML-Parser-Expat.pl" reports error on fixed CVE-2009-3720 expat packages, running "perl XML-Twig.pl" still crashes: $ perl XML-Twig.pl Segmentation fault (core dumped) gdb output: ... Core was generated by `perl XML-Twig.pl'. Program terminated with signal 11, Segmentation fault. 
[New process 23957] #0 0x009e9cb9 in big2_toUtf8 (enc=0xa00900, fromP=0xbffa17b0, fromLim=0x8ceca2f "", toP=0xbffa179c, toLim=0x88115f4 "\201") at lib/xmltok.c:634 634 DEFINE_UTF16_TO_UTF8(big2_) The problem is present in expat-2.0.1/lib/xmltok.c in toUtf8() macro: 538 #define DEFINE_UTF16_TO_UTF8(E) \ 539 static void PTRCALL \ 540 E ## toUtf8(const ENCODING *enc, \ 541 const char **fromP, const char *fromLim, \ 542 char **toP, const char *toLim) \ 543 { \ 544 const char *from; \ 545 for (from = *fromP; from != fromLim; from += 2) { \ 546 int plane; \ 547 unsigned char lo2; \ 548 unsigned char lo = GET_LO(from); \ 549 unsigned char hi = GET_HI(from); \ 550 switch (hi) { \ 551 case 0: \ 552 if (lo < 0x80) { \ 553 if (*toP == toLim) { \ 554 *fromP = from; \ 555 return; \ 556 } \ 557 *(*toP)++ = lo; \ 558 break; \ 559 } \ 560 /* fall through */ \ 561 case 0x1: case 0x2: case 0x3: \ 562 case 0x4: case 0x5: case 0x6: case 0x7: \ 563 if (toLim - *toP < 2) { \ 564 *fromP = from; \ 565 return; \ 566 } \ 567 *(*toP)++ = ((lo >> 6) | (hi << 2) | UTF8_cval2); \ 568 *(*toP)++ = ((lo & 0x3f) | 0x80); \ 569 break; \ 570 default: \ 571 if (toLim - *toP < 3) { \ 572 *fromP = from; \ 573 return; \ 574 } \ "from" should point to start of the data and "fromLim" represents upper bound till above for cycle should loop. In each pass of the for loop, we increment the "from" value by 2 because we have already eaten its both parts: 548 unsigned char lo = GET_LO(from); \ 549 unsigned char hi = GET_HI(from); \ and can move further. But the problem arises, when the address of "fromLim" is not aligned with the address of "from", i.e. it's not multiple of two. In that case (assume from == fromLim -1) we will increment from value (because it != fromLim) but cross the limit value for the "fromLim" and end up in an infinite loop till the OS recognizes buffer over read and kills the process. Running "perl XML-Twig.pl" demonstrates this issue. 
I patched expat-2.0.1 to report which branch the code takes. After finding that processing "pythontest1.xml" loops in "case 0:" for "hi", I added calls that print the values of the "from" and "fromLim" variables. Here is the output:

fromLim (end) has value = 165218551
from has value = 165218548
Went by default branch
fromLim (end) has value = 165218551
from has value = 165218552
fromLim (end) has value = 165218551
from has value = 165218554
...
from has value = 165416942
fromLim (end) has value = 165218551
from has value = 165416944
seg fault

So at the start from < fromLim. We increment "from" by 2, so the distance is < 3 -> we go to the "default:" branch ("Went by default branch"), detect that "from" still isn't equal to "fromLim", and increment "from" by two again. From then on we are in an endless loop until the process is killed by the OS.

Further note:
-------------
When you add one more character (even a space) to 'pythontest1.xml', save it and process it again, a syntax error is reported instead of a crash:

$ perl XML-Twig.pl

syntax error at line 1, column 1, byte 2 at /usr/lib/perl5/vendor_perl/5.10.0/i386-linux-thread-multi/XML/Parser.pm line 187
 at XML-Twig.pl line 4
 at XML-Twig.pl line 4

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=110127&aid=2894085&group_id=10127

From gaurav.madhav at wipro.com Thu Nov 19 10:13:13 2009
From: gaurav.madhav at wipro.com (Gaurav Madhav)
Date: Thu, 19 Nov 2009 09:13:13 -0000
Subject: [Expat-bugs] Cross compilation of expat
Message-ID: <384A750E719B4B2AA327DF16D51D972D@D138380>

Hi,

I am trying to compile the Expat library with a cross compiler (arm-wrs-linux-gnueabi-armv6jel-glibc_small-gcc). I downloaded expat 2.0.0 and 2.0.1 and ran the following command after untarring the package:
./configure --host=/home/crdev/spark/library/bcom/x86-linux2/arm-wrs-linux-gnueabi-armv6jel-glibc_small-gcc --prefix=/home/crdev/expat/2.0.0/compiled/

After running this command, I get the following error:

configure: WARNING: If you wanted to set the --build type, don't use --host.
    If a cross compiler is detected then cross compile mode will be used.
checking build system type... i686-pc-linux-gnu
checking host system type... Invalid configuration `arm-wrs-linux-gnueabi-armv6jel-glibc_small-gcc': machine `arm-wrs-linux-gnueabi-armv6jel-glibc_small' not recognized
configure: error: /bin/sh conftools/config.sub arm-wrs-linux-gnueabi-armv6jel-glibc_small-gcc failed

However, when I compile it using gcc, the library (libexpat.a) is generated properly.

Can you please suggest what the problem could be?

Thanks
Gaurav