From noreply at sourceforge.net Sat Jul 1 06:00:15 2006 From: noreply at sourceforge.net (SourceForge.net) Date: Fri, 30 Jun 2006 21:00:15 -0700 Subject: [Expat-bugs] [ expat-Bugs-1515266 ] missing check of stopped parser in doContext() 'for' loop Message-ID: Bugs item #1515266, was opened at 2006-06-30 14:04 Message generated for change (Settings changed) made by fdrake You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1515266&group_id=10127 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: None >Group: Test Required Status: Open >Resolution: Accepted Priority: 6 Submitted By: Brett Cannon (bcannon) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: missing check of stopped parser in doContext() 'for' loop Initial Comment: In Expat 2.0.0, in expat.c:doConvert() there is a 'for' loop for the XML_TOK_DATA_CHARS case. There is unfortunately no check in that loop whether the parser was stopped during that call because of an error. This was discovered in Python (Lib/test/crashers/xml_parsers.py) because pyexpat, upon error where there is no error return code like with characterDataHandlers, sets all handlers to 0, sets parsingStatus to XML_FINISHED, and sets errorCode. This leads to a segfault if the 'for' loop goes around again because parser->m_characterDataHandler is set to 0. A simple check if the parser is stopped fixes the problem. I have attached a simple patch that just breaks out of the loop and lets execution fall through to the bottom of the 'switch' statement. I don't know if returning errorCode directly would be better or if checking for XML_SUSPENDED is also desirable. ---------------------------------------------------------------------- >Comment By: Fred L. Drake, Jr. 
(fdrake) Date: 2006-07-01 00:00 Message: Logged In: YES user_id=3066 That seems fine, but can be done faster within the Expat implementation. I've committed the simplified patch as lib/xmlparse.c revision 1.154. I'll have a test case committed tomorrow as well. Leaving this report open for now since I need to finish up the test case. ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2006-06-30 14:40 Message: Logged In: YES user_id=3066 The Python folks need this dealt with before Python 2.5, so I'll try and take a look at it this weekend if no one beats me to it. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1515266&group_id=10127 From noreply at sourceforge.net Sat Jul 1 17:02:41 2006 From: noreply at sourceforge.net (SourceForge.net) Date: Sat, 01 Jul 2006 08:02:41 -0700 Subject: [Expat-bugs] [ expat-Bugs-1515266 ] missing check of stopped parser in doContext() 'for' loop Message-ID: Bugs item #1515266, was opened at 2006-06-30 14:04 Message generated for change (Comment added) made by fdrake You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1515266&group_id=10127 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: None Group: Test Required >Status: Closed Resolution: Accepted Priority: 6 Submitted By: Brett Cannon (bcannon) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: missing check of stopped parser in doContext() 'for' loop Initial Comment: In Expat 2.0.0, in expat.c:doConvert() there is a 'for' loop for the XML_TOK_DATA_CHARS case. There is unfortunately no check in that loop whether the parser was stopped during that call because of an error. 
This was discovered in Python (Lib/test/crashers/xml_parsers.py) because pyexpat, upon error where there is no error return code like with characterDataHandlers, sets all handlers to 0, sets parsingStatus to XML_FINISHED, and sets errorCode. This leads to a segfault if the 'for' loop goes around again because parser->m_characterDataHandler is set to 0. A simple check if the parser is stopped fixes the problem. I have attached a simple patch that just breaks out of the loop and lets execution fall through to the bottom of the 'switch' statement. I don't know if returning errorCode directly would be better or if checking for XML_SUSPENDED is also desirable. ---------------------------------------------------------------------- >Comment By: Fred L. Drake, Jr. (fdrake) Date: 2006-07-01 11:02 Message: Logged In: YES user_id=3066 Added a regression test in tests/runtests.c revision 1.65. Closing this report. ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2006-07-01 00:00 Message: Logged In: YES user_id=3066 That seems fine, but can be done faster within the Expat implementation. I've committed the simplified patch as lib/xmlparse.c revision 1.154. I'll have a test case committed tomorrow as well. Leaving this report open for now since I need to finish up the test case. ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2006-06-30 14:40 Message: Logged In: YES user_id=3066 The Python folks need this dealt with before Python 2.5, so I'll try and take a look at it this weekend if no one beats me to it. 
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1515266&group_id=10127 From noreply at sourceforge.net Sat Jul 1 17:32:18 2006 From: noreply at sourceforge.net (SourceForge.net) Date: Sat, 01 Jul 2006 08:32:18 -0700 Subject: [Expat-bugs] [ expat-Bugs-1515266 ] missing check of stopped parser in doContext() 'for' loop Message-ID: Bugs item #1515266, was opened at 2006-06-30 14:04 Message generated for change (Comment added) made by fdrake You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1515266&group_id=10127 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: None Group: Test Required Status: Closed Resolution: Accepted Priority: 6 Submitted By: Brett Cannon (bcannon) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: missing check of stopped parser in doContext() 'for' loop Initial Comment: In Expat 2.0.0, in expat.c:doConvert() there is a 'for' loop for the XML_TOK_DATA_CHARS case. There is unfortunately no check in that loop whether the parser was stopped during that call because of an error. This was discovered in Python (Lib/test/crashers/xml_parsers.py) because pyexpat, upon error where there is no error return code like with characterDataHandlers, sets all handlers to 0, sets parsingStatus to XML_FINISHED, and sets errorCode. This leads to a segfault if the 'for' loop goes around again because parser->m_characterDataHandler is set to 0. A simple check if the parser is stopped fixes the problem. I have attached a simple patch that just breaks out of the loop and lets execution fall through to the bottom of the 'switch' statement. I don't know if returning errorCode directly would be better or if checking for XML_SUSPENDED is also desirable. 
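The check described in the initial comment above can be sketched in isolation. This is a toy model, not Expat's code: `ToyParser`, `ToyStatus`, `toy_report_chunks` and `failing_handler` are hypothetical names invented for illustration, and the handler only imitates what the report says pyexpat does on error (clear the handler, mark the parser finished).

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for illustration only; these are NOT Expat's internal types. */
typedef enum { TOY_PARSING, TOY_FINISHED, TOY_SUSPENDED } ToyStatus;

typedef struct ToyParser ToyParser;
typedef void (*ToyCharHandler)(ToyParser *p, const char *chunk);

struct ToyParser {
  ToyStatus status;
  ToyCharHandler char_handler;
};

/* Deliver up to n chunks to the character handler and return how many
   were delivered. After every call-back the status is checked, so a
   handler that stopped the parser is never called a second time. */
int toy_report_chunks(ToyParser *p, const char *const chunks[], int n)
{
  int delivered = 0;
  int i;
  assert(p != NULL && p->char_handler != NULL);
  for (i = 0; i < n; i++) {
    p->char_handler(p, chunks[i]);
    delivered++;
    if (p->status != TOY_PARSING)
      break; /* stopped (finished or suspended) during the call-back */
  }
  return delivered;
}

/* Imitates the error path described in the report: the handler clears
   itself and marks the parser as finished. */
void failing_handler(ToyParser *p, const char *chunk)
{
  (void)chunk;
  p->char_handler = NULL;
  p->status = TOY_FINISHED;
}
```

Without the status check, the second iteration would call through the cleared handler pointer, which is the crash the report describes. Whether breaking out is also the right behaviour for a suspended parser is a separate question that the sketch does not settle.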
---------------------------------------------------------------------- >Comment By: Fred L. Drake, Jr. (fdrake) Date: 2006-07-01 11:32 Message: Logged In: YES user_id=3066 Confirmed that the suspend behavior parallels the abort behavior Brett's patch fixed; fixed and added a regression test in lib/xmlparse.c 1.155 and tests/runtests.c 1.66. ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2006-07-01 11:02 Message: Logged In: YES user_id=3066 Added a regression test in tests/runtests.c revision 1.65. Closing this report. ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2006-07-01 00:00 Message: Logged In: YES user_id=3066 That seems fine, but can be done faster within the Expat implementation. I've committed the simplified patch as lib/xmlparse.c revision 1.154. I'll have a test case committed tomorrow as well. Leaving this report open for now since I need to finish up the test case. ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2006-06-30 14:40 Message: Logged In: YES user_id=3066 The Python folks need this dealt with before Python 2.5, so I'll try and take a look at it this weekend if no one beats me to it. 
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1515266&group_id=10127 From noreply at sourceforge.net Sat Jul 1 17:40:15 2006 From: noreply at sourceforge.net (SourceForge.net) Date: Sat, 01 Jul 2006 08:40:15 -0700 Subject: [Expat-bugs] [ expat-Bugs-448234 ] Features added since older version Message-ID: Bugs item #448234, was opened at 2001-08-05 16:30 Message generated for change (Comment added) made by fdrake You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=448234&group_id=10127 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: Documentation Group: Not a Bug Status: Open Resolution: None >Priority: 4 Submitted By: Nobody/Anonymous (nobody) >Assigned to: Nobody/Anonymous (nobody) >Summary: Features added since older version Initial Comment: >From comparing Expat 1.2 from jclark.com and a current CVS snapshot, it is clear that some significant changes have been made. Can you please provide a list of features that have been added and changes that have been made since 1.2? Thanks, Mike ---------------------------------------------------------------------- >Comment By: Fred L. Drake, Jr. (fdrake) Date: 2006-07-01 11:40 Message: Logged In: YES user_id=3066 This should be done, but I've (clearly) not had time, and it doesn't seem to be a high priority or someone would have contributed a patch. Un-assigning since I'm not getting to this, and revised the summary since this isn't really specific to changes from 1.2 (though that's an important part of this, since there were so many additions between 1.2 and 1.95). 
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=448234&group_id=10127 From noreply at sourceforge.net Sat Jul 1 18:00:56 2006 From: noreply at sourceforge.net (SourceForge.net) Date: Sat, 01 Jul 2006 09:00:56 -0700 Subject: [Expat-bugs] [ expat-Bugs-1490371 ] additional config for INSTALL_ROOT Message-ID: Bugs item #1490371, was opened at 2006-05-17 12:36 Message generated for change (Settings changed) made by fdrake You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1490371&group_id=10127 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: www.libexpat.org Group: None Status: Open Resolution: None Priority: 5 Submitted By: Nobody/Anonymous (nobody) Assigned to: Fred L. Drake, Jr. (fdrake) >Summary: additional config for INSTALL_ROOT Initial Comment: When I install expat 2.0.0, it shows me the following error always. but expat 1.9.5 is fine. camelot# make install make: Fatal error in reader: Makefile, line 48: Unexpected end of line seen the line 48 is as following: 47:ifndef INSTALL_ROOT 48:INSTALL_ROOT=$(DESTDIR) 49:if ---------------------------------------------------------------------- Comment By: Karl Waclawek (kwaclaw) Date: 2006-06-01 17:01 Message: Logged In: YES user_id=290026 Could you please try a checkout from CVS. If you still have a problem, then maybe "make" on your system is too old, or otherwise different. ---------------------------------------------------------------------- Comment By: Nobody/Anonymous (nobody) Date: 2006-06-01 16:33 Message: Logged In: NO I'm having the same problem building in a Solaris 10 on Sparc environment. I'm using 2.0.0 from a .gz tarball. 
---------------------------------------------------------------------- Comment By: Karl Waclawek (kwaclaw) Date: 2006-05-17 13:19 Message: Logged In: YES user_id=290026 In which environment are you trying to build expat? Is this a checkout from CVS or did you download the .gz archive? ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1490371&group_id=10127 From noreply at sourceforge.net Sat Jul 1 18:03:10 2006 From: noreply at sourceforge.net (SourceForge.net) Date: Sat, 01 Jul 2006 09:03:10 -0700 Subject: [Expat-bugs] [ expat-Bugs-676131 ] Need documentation for migration path. Message-ID: Bugs item #676131, was opened at 2003-01-28 10:24 Message generated for change (Comment added) made by fdrake You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=676131&group_id=10127 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: Documentation Group: None Status: Open Resolution: None Priority: 5 Submitted By: Fred L. Drake, Jr. (fdrake) >Assigned to: Nobody/Anonymous (nobody) Summary: Need documentation for migration path. Initial Comment: There needs to be more documentation for people upgrading from older versions of Expat. This includes (but is not limited to!) the introduction of the XML_Status enumeration and annotations in the documentation for when each API construct was added to the library. ---------------------------------------------------------------------- >Comment By: Fred L. Drake, Jr. (fdrake) Date: 2006-07-01 12:03 Message: Logged In: YES user_id=3066 Indeed, there should be such documentation for this. Un-assigning from myself due to time availability.
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=676131&group_id=10127 From noreply at sourceforge.net Sat Jul 1 19:21:55 2006 From: noreply at sourceforge.net (SourceForge.net) Date: Sat, 01 Jul 2006 10:21:55 -0700 Subject: [Expat-bugs] [ expat-Bugs-1515600 ] Segfault after removing character data handler Message-ID: Bugs item #1515600, was opened at 2006-07-01 13:21 Message generated for change (Tracker Item Submitted) made by Item Submitter You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1515600&group_id=10127 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: None Group: None Status: Open Resolution: None Priority: 5 Submitted By: Fred L. Drake, Jr. (fdrake) Assigned to: Nobody/Anonymous (nobody) Summary: Segfault after removing character data handler Initial Comment: Removing the character data handler from within the character data handler while character data remains to be reported causes a call to a NULL pointer (generally followed by a memory access violation of your platform's favorite flavor). If the XML_StopParser() API has been called, this is not a problem with the version in CVS. This is admittedly an odd use case. The recent fixes to make the XML_StopParser() calls supported makes the parser behave well when accessed from languages that support exceptions (the host language API can call XML_StopParser to abort further work from Expat when an exception occurs). The case of a character data handler removing itself is unusual (in context, there can be no calls to anything else other than a decoding handler). I think there are two possible solutions: 1) Document that the character data handler cannot remove itself without calling XML_StopParser(). 
This avoids introducing a performance penalty for this really odd case, but I don't know how bad testing for a NULL value would really be at this point, since there are a few other checks and an indirect assignment. 2) Add a check that the character data handler is still set before the loop goes around again, and fall back to the defaultHandler for the remaining data. This would introduce a single check for a NULL pointer in the loop in the XML_TOK_DATA_CHARS case in doContent(). I've attached a patch with a test case that demonstrates this bug; the test generates a segfault on Unix. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1515600&group_id=10127 From noreply at sourceforge.net Sat Jul 1 19:39:55 2006 From: noreply at sourceforge.net (SourceForge.net) Date: Sat, 01 Jul 2006 10:39:55 -0700 Subject: [Expat-bugs] [ expat-Bugs-1506891 ] XML_SetCharacterDataHandler callback function not parsing Message-ID: Bugs item #1506891, was opened at 2006-06-15 16:08 Message generated for change (Comment added) made by fdrake You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1506891&group_id=10127 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: None Group: Test Required >Status: Closed >Resolution: Works For Me Priority: 5 Submitted By: Nobody/Anonymous (nobody) Assigned to: Nobody/Anonymous (nobody) Summary: XML_SetCharacterDataHandler callback function not parsing Initial Comment: Hello, I have an XML_SetCharacterDataHandler callback function that uses the text to build a directory path. I have noticed that at times the 1st node or last node will result in a partial capture of the text. For example, my-57/actual image/ will return with "my-57/actual" only.
I've attached my callback functions, startelement, endelement and dataelement. Thank you, Satyajit ---------------------------------------------------------------------- >Comment By: Fred L. Drake, Jr. (fdrake) Date: 2006-07-01 13:39 Message: Logged In: YES user_id=3066 There's no clear indication of a bug here; general Q&A should occur on the mailing lists, not in the bug tracker. ---------------------------------------------------------------------- Comment By: Karl Waclawek (kwaclaw) Date: 2006-06-16 14:42 Message: Logged In: YES user_id=290026 Expat comes with a few demo apps, just look at them. This is taken from the "elements" demo:

    main(int argc, char *argv[])
    {
      char buf[BUFSIZ];
      XML_Parser parser = XML_ParserCreate(NULL);
      int done;
      int depth = 0;
      XML_SetUserData(parser, &depth);
      XML_SetElementHandler(parser, startElement, endElement);
      do {
        int len = (int)fread(buf, 1, sizeof(buf), stdin);
        done = len < sizeof(buf);
        if (XML_Parse(parser, buf, len, done) == XML_STATUS_ERROR) {
          fprintf(stderr,
                  "%s at line %" XML_FMT_INT_MOD "u\n",
                  XML_ErrorString(XML_GetErrorCode(parser)),
                  XML_GetCurrentLineNumber(parser));
          return 1;
        }
      } while (!done);
      XML_ParserFree(parser);
      return 0;
    }

---------------------------------------------------------------------- Comment By: sssketkar man! (sketkar) Date: 2006-06-16 14:28 Message: Logged In: YES user_id=944435 Could you point me to an example I can look at to compare. I'm probably missing something obvious. Thanks for all your help. Satyajit. ---------------------------------------------------------------------- Comment By: Karl Waclawek (kwaclaw) Date: 2006-06-16 14:25 Message: Logged In: YES user_id=290026 I have no problem with your file. Maybe your parsing loop treats the last buffer incorrectly? ---------------------------------------------------------------------- Comment By: sssketkar man! (sketkar) Date: 2006-06-16 14:17 Message: Logged In: YES user_id=944435 Thought it might be useful, here is the XML file.
test500115021118813-Jun-06 15:06:2810A/Actual Image/70L/Actual Image/71D/Acoustic Image/10R-5/Actual Image/70F/Actual Image/60P/Actual Image/80P-F/Actual Image/110P-M/Actual Image/ ---------------------------------------------------------------------- Comment By: sssketkar man! (sketkar) Date: 2006-06-16 14:09 Message: Logged In: YES user_id=944435 Also, out of curiosity, is the CharacterHandler called on a timed basis, i.e. time-sliced? Maybe the last node directory isn't getting buffered correctly because of a timing issue between the StartElementHandler and EndElementHandler. Satyajit ---------------------------------------------------------------------- Comment By: sssketkar man! (sketkar) Date: 2006-06-16 14:07 Message: Logged In: YES user_id=944435 Okay, I changed my approach, so that the StartElementHandler sets a flag that is used by the CharacterHandler to collect user text until the EndElementHandler is called, at which point the buffered text is retrieved. This approach seems to work except for the last "iteration", i.e. if there are 5 nodes, the first 4 are parsed properly. The last one is still not getting buffered all the way or correctly. Here is the debug output from each Handler...

    START: tool
    START: tooladdress
    END: tooladdress
    START: imageid
    END: imageid
    START: directory
    END: directory XXY-WER/Actual Image/ <-- correct
    END: tool
    START: tool
    START: tooladdress
    END: tooladdress
    START: imageid
    END: imageid
    START: directory
    END: directory XYZ-W3/Actualage//e/ <-- not so correct
    END: tool

As you can see the last tool node directory is incorrect, it should have been XYZ-W3/Actual Image/ Thank you, Satyajit ---------------------------------------------------------------------- Comment By: sssketkar man! (sketkar) Date: 2006-06-15 18:42 Message: Logged In: YES user_id=944435 Thank you very much. I was expecting that the CharacterHandler was collecting all non-XML data in 1-shot. I wasn't using the EndElement as a check. I'll try that.
Satyajit ---------------------------------------------------------------------- Comment By: Karl Waclawek (kwaclaw) Date: 2006-06-15 17:52 Message: Logged In: YES user_id=290026 Do not expect Expat to return all character data within an element in one call-back. You have to accumulate the text in a buffer until the end-tag is reported. Are you doing this? ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1506891&group_id=10127 From noreply at sourceforge.net Mon Jul 3 22:40:37 2006 From: noreply at sourceforge.net (SourceForge.net) Date: Mon, 03 Jul 2006 13:40:37 -0700 Subject: [Expat-bugs] [ expat-Bugs-1515600 ] Segfault after removing character data handler Message-ID: Bugs item #1515600, was opened at 2006-07-01 13:21 Message generated for change (Comment added) made by kwaclaw You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1515600&group_id=10127 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: None Group: None Status: Open Resolution: None Priority: 5 Submitted By: Fred L. Drake, Jr. (fdrake) Assigned to: Nobody/Anonymous (nobody) Summary: Segfault after removing character data handler Initial Comment: Removing the character data handler from within the character data handler while character data remains to be reported causes a call to a NULL pointer (generally followed by a memory access violation of your platform's favorite flavor). If the XML_StopParser() API has been called, this is not a problem with the version in CVS. This is admittedly an odd use case. 
The recent fixes to make the XML_StopParser() calls supported makes the parser behave well when accessed from languages that support exceptions (the host language API can call XML_StopParser to abort further work from Expat when an exception occurs). The case of a character data handler removing itself is unusual (in context, there can be no calls to anything else other than a decoding handler). I think there are two possible solutions: 1) Document that the character data handler cannot remove itself without calling XML_StopParser(). This avoids introducing a performance penalty for this really odd case, but I don't know how bad testing for a NULL value would really be at this point, since there are a few other checks and an indirect assignment. 2) Add a check that the character data handler is still set before the loop goes around again, and fall back to the defaultHandler for the remaining data. This would introduce a single check for a NULL pointer in the loop in the XML_TOK_DATA_CHARS case in doContent(). I've attached a patch with a test case that demonstrates this bug; the test generates a segfault on Unix. ---------------------------------------------------------------------- >Comment By: Karl Waclawek (kwaclaw) Date: 2006-07-03 16:40 Message: Logged In: YES user_id=290026 I think in most cases this is not a problem. The general parsing loop in doContent() always checks if the characterDataHandler is set first. In the specific case you mentioned, there is a loop within the general loop, and in that internal loop there is no check for NULL.
We could, for instance, pull the NULL check inside the loop, like your 2nd case, and the result would look like this:

    case XML_TOK_DATA_CHARS:
      if (MUST_CONVERT(enc, s)) {
        for (;;) {
          if (characterDataHandler) {
            ICHAR *dataPtr = (ICHAR *)dataBuf;
            XmlConvert(enc, &s, next, &dataPtr, (ICHAR *)dataBufEnd);
            *eventEndPP = s;
            characterDataHandler(handlerArg, dataBuf,
                                 (int)(dataPtr - (ICHAR *)dataBuf));
            if (s == next)
              break;
            *eventPP = s;
          }
        }
      }
      else if (characterDataHandler) {
        characterDataHandler(handlerArg, (XML_Char *)s,
                             (int)((XML_Char *)next - (XML_Char *)s));
      }
      else if (defaultHandler)
        reportDefault(parser, enc, s, next);
      break;

I am not sure if the performance penalty is that high. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1515600&group_id=10127 From noreply at sourceforge.net Tue Jul 4 15:37:59 2006 From: noreply at sourceforge.net (SourceForge.net) Date: Tue, 04 Jul 2006 06:37:59 -0700 Subject: [Expat-bugs] [ expat-Bugs-1515266 ] missing check of stopped parser in doContext() 'for' loop Message-ID: Bugs item #1515266, was opened at 2006-06-30 14:04 Message generated for change (Comment added) made by kwaclaw You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1515266&group_id=10127 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: None Group: Test Required >Status: Open >Resolution: None Priority: 6 Submitted By: Brett Cannon (bcannon) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: missing check of stopped parser in doContext() 'for' loop Initial Comment: In Expat 2.0.0, in expat.c:doConvert() there is a 'for' loop for the XML_TOK_DATA_CHARS case. There is unfortunately no check in that loop whether the parser was stopped during that call because of an error.
This was discovered in Python (Lib/test/crashers/xml_parsers.py) because pyexpat, upon error where there is no error return code like with characterDataHandlers, sets all handlers to 0, sets parsingStatus to XML_FINISHED, and sets errorCode. This leads to a segfault if the 'for' loop goes around again because parser->m_characterDataHandler is set to 0. A simple check if the parser is stopped fixes the problem. I have attached a simple patch that just breaks out of the loop and lets execution fall through to the bottom of the 'switch' statement. I don't know if returning errorCode directly would be better or if checking for XML_SUSPENDED is also desirable. ---------------------------------------------------------------------- >Comment By: Karl Waclawek (kwaclaw) Date: 2006-07-04 09:37 Message: Logged In: YES user_id=290026 I am re-opening this issue because in the case of a suspended parser, breaking out of the inner loop in XML_TOK_DATA_CHARS means that character call-backs are missed when resuming the parser. We should let the inner loop finish reporting all characters. The documentation already states that after calling XML_StopParser() there may still be a few call-backs that would otherwise be missed, so this would not be new behaviour, but consistent with existing behaviour. The solution to the problem described is the same as suggested for bug # 1515600 (Segfault after removing character data handler). Just put the NULL check for the character data handler inside the internal loop. Btw, the same problem exists in the doCdataSection() function. I'll attach a patch suggestion to bug # 1515600. We might decide to treat XML_FINISHED different from XML_SUSPENDED such that no other call-backs will happen, but in that case we need to review all the other places where this would need to be done as well (and update the documentation, of course). ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. 
(fdrake) Date: 2006-07-01 11:32 Message: Logged In: YES user_id=3066 Confirmed that the suspend behavior parallels the abort behavior Brett's patch fixed; fixed and added a regression test in lib/xmlparse.c 1.155 and tests/runtests.c 1.66. ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2006-07-01 11:02 Message: Logged In: YES user_id=3066 Added a regression test in tests/runtests.c revision 1.65. Closing this report. ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2006-07-01 00:00 Message: Logged In: YES user_id=3066 That seems fine, but can be done faster within the Expat implementation. I've committed the simplified patch as lib/xmlparse.c revision 1.154. I'll have a test case committed tomorrow as well. Leaving this report open for now since I need to finish up the test case. ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2006-06-30 14:40 Message: Logged In: YES user_id=3066 The Python folks need this dealt with before Python 2.5, so I'll try and take a look at it this weekend if no one beats me to it. 
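The alternative raised in the re-opening comment above (check the handler inside the inner loop and keep reporting the remaining data) can be sketched with a toy model. All names here (`Sketch`, `sketch_report_all`, `remove_self_handler`, `counting_default_handler`) are hypothetical and do not come from Expat; the fallback to a default handler follows option 2 from bug # 1515600 and is an assumption about how the two ideas would combine, not the committed patch.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model for illustration only; not Expat's parser structure. */
typedef struct Sketch Sketch;
typedef void (*SketchHandler)(Sketch *ctx, const char *chunk);

struct Sketch {
  SketchHandler char_handler;    /* may be cleared by the handler itself */
  SketchHandler default_handler; /* optional fallback */
  int char_calls;
  int default_calls;
};

/* Report every chunk. The handler pointer is re-read on each iteration,
   so a handler that removed itself is never called through a NULL
   pointer, and the loop still runs to the end of the data. */
void sketch_report_all(Sketch *ctx, const char *const chunks[], int n)
{
  int i;
  assert(ctx != NULL);
  for (i = 0; i < n; i++) {
    if (ctx->char_handler != NULL)
      ctx->char_handler(ctx, chunks[i]);
    else if (ctx->default_handler != NULL)
      ctx->default_handler(ctx, chunks[i]);
    /* With neither handler set the chunk is skipped, but the loop
       still advances and terminates. */
  }
}

/* A character handler that removes itself on its first call. */
void remove_self_handler(Sketch *ctx, const char *chunk)
{
  (void)chunk;
  ctx->char_calls++;
  ctx->char_handler = NULL;
}

/* A default handler that only counts how often it was called. */
void counting_default_handler(Sketch *ctx, const char *chunk)
{
  (void)chunk;
  ctx->default_calls++;
}
```

The point of the design is that the loop's progress does not depend on a handler being present: the position advances on every iteration, so no data is lost for a later resume and a missing handler cannot stall the loop.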
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1515266&group_id=10127 From noreply at sourceforge.net Tue Jul 4 15:55:19 2006 From: noreply at sourceforge.net (SourceForge.net) Date: Tue, 04 Jul 2006 06:55:19 -0700 Subject: [Expat-bugs] [ expat-Bugs-1515600 ] Segfault after removing character data handler Message-ID: Bugs item #1515600, was opened at 2006-07-01 13:21 Message generated for change (Comment added) made by kwaclaw You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1515600&group_id=10127 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: None Group: None Status: Open Resolution: None Priority: 5 Submitted By: Fred L. Drake, Jr. (fdrake) Assigned to: Nobody/Anonymous (nobody) Summary: Segfault after removing character data handler Initial Comment: Removing the character data handler from within the character data handler while character data remains to be reported causes a call to a NULL pointer (generally followed by a memory access violation of your platform's favorite flavor). If the XML_StopParser() API has been called, this is not a problem with the version in CVS. This is admittedly an odd use case. The recent fixes to make the XML_StopParser() calls supported makes the parser behave well when accessed from languages that support exceptions (the host language API can call XML_StopParser to abort further work from Expat when an exception occurs). The case of a character data handler removing itself is unusual (in context, there can be no calls to anything else other than a decoding handler). I think there are two possible solutions: 1) Document that the character data handler cannot remove itself without calling XML_StopParser(). 
This avoids introducing a performance penalty for this really odd case, but I don't know how bad testing for a NULL value would really be at this point, since there are a few other checks and an indirect assignment. 2) Add a check that the character data handler is still set before the loop goes around again, and fall back to the defaultHandler for the remaining data. This would introduce a single check for a NULL pointer in the loop in the XML_TOK_DATA_CHARS case in doContent(). I've attached a patch with a test case that demonstrates this bug; the test generates a segfault on Unix. ---------------------------------------------------------------------- >Comment By: Karl Waclawek (kwaclaw) Date: 2006-07-04 09:55 Message: Logged In: YES user_id=290026 The same issue also exists in the doCdataSection() function, and I think the solution I suggested (putting the check if the character data handler is set into the internal loop) also solves bug # 1515266, as I described there. For the case where there is only one call-back, this should not be a performance penalty at all, as there still would be only one check if the handler is set. Attached as xmlparse.c.diff (Internal loop solution) - this also fixes the doCdataSection() function. ---------------------------------------------------------------------- Comment By: Karl Waclawek (kwaclaw) Date: 2006-07-03 16:40 Message: Logged In: YES user_id=290026 I think in most cases this is not a problem. The general parsing loop in doContent() always checks if the characterDataHandler is set first. In the specific case you mentioned, there is a loop within the general loop, and in that internal loop there is no check for NULL.
We could, for instance, pull the NULL check inside the loop, like your 2nd case, and the result would look like this:

    case XML_TOK_DATA_CHARS:
      if (MUST_CONVERT(enc, s)) {
        for (;;) {
          if (characterDataHandler) {
            ICHAR *dataPtr = (ICHAR *)dataBuf;
            XmlConvert(enc, &s, next, &dataPtr, (ICHAR *)dataBufEnd);
            *eventEndPP = s;
            characterDataHandler(handlerArg, dataBuf,
                                 (int)(dataPtr - (ICHAR *)dataBuf));
            if (s == next)
              break;
            *eventPP = s;
          }
        }
      }
      else if (characterDataHandler) {
        characterDataHandler(handlerArg, (XML_Char *)s,
                             (int)((XML_Char *)next - (XML_Char *)s));
      }
      else if (defaultHandler)
        reportDefault(parser, enc, s, next);
      break;

I am not sure if the performance penalty is that high. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1515600&group_id=10127 From noreply at sourceforge.net Tue Jul 4 19:42:36 2006 From: noreply at sourceforge.net (SourceForge.net) Date: Tue, 04 Jul 2006 10:42:36 -0700 Subject: [Expat-bugs] [ expat-Bugs-1515600 ] Segfault after removing character data handler Message-ID: Bugs item #1515600, was opened at 2006-07-01 13:21 Message generated for change (Comment added) made by kwaclaw You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1515600&group_id=10127 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: None Group: None Status: Open Resolution: None Priority: 5 Submitted By: Fred L. Drake, Jr. (fdrake) Assigned to: Nobody/Anonymous (nobody) Summary: Segfault after removing character data handler Initial Comment: Removing the character data handler from within the character data handler while character data remains to be reported causes a call to a NULL pointer (generally followed by a memory access violation of your platform's favorite flavor).
If the XML_StopParser() API has been called, this is not a problem with the version in CVS. This is admittedly an odd use case. The recent fixes to make the XML_StopParser() calls supported makes the parser behave well when accessed from languages that support exceptions (the host language API can call XML_StopParser to abort further work from Expat when an exception occurs). The case of a character data handler removing itself is unusual (in context, there can be no calls to anything else other than a decoding handler). I think there are two possible solutions: 1) Document that the character data handler cannot remove itself without calling XML_StopParser(). This avoids introducing a performance penalty for this really odd case, but I don't know how bad testing for a NULL value would really be at this point, since there are a few other checks and an indirect assignment. 2) Add a check that the character data handler is still set before the loop goes around again, and fall back to the defaultHandler for the remaining data. This would introduce a single check for a NULL pointer in the loop in the XML_TOK_DATA_CHARS case in doContent(). I've attached a patch with a test case that demonstrates this bug; the test generates a segfault on Unix. ---------------------------------------------------------------------- >Comment By: Karl Waclawek (kwaclaw) Date: 2006-07-04 13:42 Message: Logged In: YES user_id=290026 I replaced my last attachment with one that includes an update to the docs (reference.html). This solution should fix issue # 1515266 as well. I intend to commit this soon, if no objections are made. Note to Fred: I took out your test for XML_FINISHED and XML_SUSPENDED, as it currently introduces an issue for XML_SUSPENDED, and inconsistent behaviour for XML_FINISHED. We can discuss special treatment of aborting vs. suspending (i.e.
ensure no more call-backs when aborting) later, but even as it is, subsequent call-backs can be suppressed by setting the affected handlers to NULL. ---------------------------------------------------------------------- Comment By: Karl Waclawek (kwaclaw) Date: 2006-07-04 09:55 Message: Logged In: YES user_id=290026 The same issue also exists in the doCdataSection() function, and I think the solution I suggested (putting the check if the character data handler is set into the internal loop) also solves bug # 1515266, as I described there. For the case where there is only one call-back, this should not be a performance penalty at all, as there still would be only one check if the handler is set. Attached as xmlparse.c.diff (Internal loop solution) - this also fixes the doCdataSection() function. ---------------------------------------------------------------------- Comment By: Karl Waclawek (kwaclaw) Date: 2006-07-03 16:40 Message: Logged In: YES user_id=290026 I think in most cases this is not a problem. The general parsing loop in doContent() always checks if the characterDataHandler is set first. In the specific case you mentioned, there is a loop within the general loop, and in that internal loop there is no check for NULL. We could, for instance, pull the NULL check inside the loop, like your 2nd case, and the result would look like this:

    case XML_TOK_DATA_CHARS:
      if (MUST_CONVERT(enc, s)) {
        for (;;) {
          if (characterDataHandler) {
            ICHAR *dataPtr = (ICHAR *)dataBuf;
            XmlConvert(enc, &s, next, &dataPtr, (ICHAR *)dataBufEnd);
            *eventEndPP = s;
            characterDataHandler(handlerArg, dataBuf,
                                 (int)(dataPtr - (ICHAR *)dataBuf));
            if (s == next)
              break;
            *eventPP = s;
          }
        }
      }
      else if (characterDataHandler) {
        characterDataHandler(handlerArg, (XML_Char *)s,
                             (int)((XML_Char *)next - (XML_Char *)s));
      }
      else if (defaultHandler)
        reportDefault(parser, enc, s, next);
      break;

I am not sure if the performance penalty is that high.
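
Fred's second option from the initial comment (check that the handler is still set before the loop goes around again, and hand the remaining data to the default handler) can be sketched outside Expat. This is only an illustration under assumptions: `mini_parser`, `report_chars`, `CHUNK` and the demo handlers are hypothetical names invented here, and the chunking merely stands in for the conversion buffer. Note that in this sketch the exit test sits outside the handler check, so a cleared handler ends the loop instead of leaving it with nothing to do.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for Expat internals; none of these names are Expat's. */
typedef void (*char_handler)(void *user, const char *s, int len);

struct mini_parser {
  char_handler char_data; /* may be cleared by the handler itself */
  char_handler fallback;  /* plays the role of the default handler */
  void *user;
};

enum { CHUNK = 4 }; /* stands in for the size of the conversion buffer */

/* Report [s, end) in CHUNK-sized pieces, re-checking the handler every time. */
static void report_chars(struct mini_parser *p, const char *s, const char *end)
{
  assert(p != NULL && s != NULL && end >= s);
  while (s != end) {
    int len = (end - s) < CHUNK ? (int)(end - s) : CHUNK;
    if (p->char_data) {
      p->char_data(p->user, s, len);
    } else {
      /* Handler was removed mid-run: pass the rest to the fallback and stop. */
      if (p->fallback)
        p->fallback(p->user, s, (int)(end - s));
      break;
    }
    s += len;
  }
}

/* Demo state: counts how many bytes each handler saw. */
struct demo_state {
  struct mini_parser *parser;
  int handled;
  int defaulted;
};

/* A character data handler that removes itself on its first call. */
static void self_removing_handler(void *user, const char *s, int len)
{
  struct demo_state *st = user;
  (void)s;
  st->handled += len;
  st->parser->char_data = NULL;
}

static void count_default(void *user, const char *s, int len)
{
  struct demo_state *st = user;
  (void)s;
  st->defaulted += len;
}

/* Feed ten bytes through the loop with the self-removing handler installed. */
static void run_demo(struct demo_state *st)
{
  static const char text[] = "abcdefghij";
  struct mini_parser p;
  p.char_data = self_removing_handler;
  p.fallback = count_default;
  p.user = st;
  st->parser = &p;
  st->handled = 0;
  st->defaulted = 0;
  report_chars(&p, text, text + 10);
  st->parser = NULL; /* p goes out of scope; do not keep the pointer */
}
```

Calling `run_demo` on a `struct demo_state` leaves the first chunk counted under `handled` and the rest under `defaulted`. Whether falling back to the default handler is the desired behaviour is a design question the thread leaves open.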
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1515600&group_id=10127 From noreply at sourceforge.net Wed Jul 5 15:09:14 2006 From: noreply at sourceforge.net (SourceForge.net) Date: Wed, 05 Jul 2006 06:09:14 -0700 Subject: [Expat-bugs] [ expat-Bugs-1515600 ] Segfault after removing character data handler Message-ID: Bugs item #1515600, was opened at 2006-07-01 13:21 Message generated for change (Comment added) made by kwaclaw You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1515600&group_id=10127 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: None Group: None Status: Open >Resolution: Fixed Priority: 5 Submitted By: Fred L. Drake, Jr. (fdrake) Assigned to: Nobody/Anonymous (nobody) Summary: Segfault after removing character data handler Initial Comment: Removing the character data handler from within the character data handler while character data remains to be reported causes a call to a NULL pointer (generally followed by a memory access violation of your platform's favorite flavor). If the XML_StopParser() API has been called, this is not a problem with the version in CVS. This is admittedly an odd use case. The recent fixes to make the XML_StopParser() calls supported makes the parser behave well when accessed from languages that support exceptions (the host language API can call XML_StopParser to abort further work from Expat when an exception occurs). The case of a character data handler removing itself is unusual (in context, there can be no calls to anything else other than a decoding handler). I think there are two possible solutions: 1) Document that the character data handler cannot remove itself without calling XML_StopParser(). 
This avoids introducing a performance penalty for this really odd case, but I don't know how bad testing for a NULL value would really be at this point, since there are a few other checks and an indirect assignment. 2) Add a check that the character data handler is still set before the loop goes around again, and fall back to the defaultHandler for the remaining data. This would introduce a single check for a NULL pointer in the loop in the XML_TOK_DATA_CHARS case in doContent(). I've attached a patch with a test case that demonstrates this bug; the test generates a segfault on Unix. ---------------------------------------------------------------------- >Comment By: Karl Waclawek (kwaclaw) Date: 2006-07-05 09:09 Message: Logged In: YES user_id=290026 Applied patch in xmlparse.c rev. 1.156 and reference.html rev. 1.71. Please let me know if we should discuss special treatment of aborting vs. suspending. ---------------------------------------------------------------------- Comment By: Karl Waclawek (kwaclaw) Date: 2006-07-04 13:42 Message: Logged In: YES user_id=290026 I replaced my last attachment with one that includes an update to the docs (reference.html). This solution should fix issue # 1515266 as well. I intend to commit this soon, if no objections are made. Note to Fred: I took out your test for XML_FINISHED and XML_SUSPENDED, as it currently introduces an issue for XML_SUSPENDED, and inconsistent behaviour for XML_FINISHED. We can discuss special treatment of aborting vs. suspending (i.e. ensure no more call-backs when aborting) later, but even as it is, subsequent call-backs can be suppressed by setting the affected handlers to NULL.
---------------------------------------------------------------------- Comment By: Karl Waclawek (kwaclaw) Date: 2006-07-04 09:55 Message: Logged In: YES user_id=290026 The same issue also exists in the doCdataSection() function, and I think the solution I suggested (putting the check if the character data handler is set into the internal loop) also solves bug # 1515266, as I described there. For the case where there is only one call-back, this should not be a performance penalty at all, as there still would be only one check if the handler is set. Attached as xmlparse.c.diff (Internal loop solution) - this also fixes the doCdataSection() function. ---------------------------------------------------------------------- Comment By: Karl Waclawek (kwaclaw) Date: 2006-07-03 16:40 Message: Logged In: YES user_id=290026 I think in most cases this is not a problem. The general parsing loop in doContent() always checks if the characterDataHandler is set first. In the specific case you mentioned, there is a loop within the general loop, and in that internal loop there is no check for NULL. We could, for instance, pull the NULL check inside the loop, like your 2nd case, and the result would look like this:

    case XML_TOK_DATA_CHARS:
      if (MUST_CONVERT(enc, s)) {
        for (;;) {
          if (characterDataHandler) {
            ICHAR *dataPtr = (ICHAR *)dataBuf;
            XmlConvert(enc, &s, next, &dataPtr, (ICHAR *)dataBufEnd);
            *eventEndPP = s;
            characterDataHandler(handlerArg, dataBuf,
                                 (int)(dataPtr - (ICHAR *)dataBuf));
            if (s == next)
              break;
            *eventPP = s;
          }
        }
      }
      else if (characterDataHandler) {
        characterDataHandler(handlerArg, (XML_Char *)s,
                             (int)((XML_Char *)next - (XML_Char *)s));
      }
      else if (defaultHandler)
        reportDefault(parser, enc, s, next);
      break;

I am not sure if the performance penalty is that high.
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1515600&group_id=10127 From noreply at sourceforge.net Wed Jul 5 15:14:20 2006 From: noreply at sourceforge.net (SourceForge.net) Date: Wed, 05 Jul 2006 06:14:20 -0700 Subject: [Expat-bugs] [ expat-Bugs-1515266 ] missing check of stopped parser in doContext() 'for' loop Message-ID: Bugs item #1515266, was opened at 2006-06-30 14:04 Message generated for change (Comment added) made by kwaclaw You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1515266&group_id=10127 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: None Group: Test Required Status: Open >Resolution: Fixed Priority: 6 Submitted By: Brett Cannon (bcannon) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: missing check of stopped parser in doContext() 'for' loop Initial Comment: In Expat 2.0.0, in expat.c:doConvert() there is a 'for' loop for the XML_TOK_DATA_CHARS case. There is unfortunately no check in that loop whether the parser was stopped during that call because of an error. This was discovered in Python (Lib/test/crashers/xml_parsers.py) because pyexpat, upon error where there is no error return code like with characterDataHandlers, sets all handlers to 0, sets parsingStatus to XML_FINISHED, and sets errorCode. This leads to a segfault if the 'for' loop goes around again because parser->m_characterDataHandler is set to 0. A simple check if the parser is stopped fixes the problem. I have attached a simple patch that just breaks out of the loop and lets execution fall through to the bottom of the 'switch' statement. I don't know if returning errorCode directly would be better or if checking for XML_SUSPENDED is also desirable. 
---------------------------------------------------------------------- >Comment By: Karl Waclawek (kwaclaw) Date: 2006-07-05 09:14 Message: Logged In: YES user_id=290026 Applied the patch for bug # 1515600 which solves this issue as well. Removed the check for XML_FINISHED/XML_SUSPENDED. We could discuss special treatment of XML_FINISHED, but if one is clearing all handlers anyway, then special treatment of XML_FINISHED is not necessary. For Fred: I have not re-run the test cases. Please do so and close the issue if successful. ---------------------------------------------------------------------- Comment By: Karl Waclawek (kwaclaw) Date: 2006-07-04 09:37 Message: Logged In: YES user_id=290026 I am re-opening this issue because in the case of a suspended parser, breaking out of the inner loop in XML_TOK_DATA_CHARS means that character call-backs are missed when resuming the parser. We should let the inner loop finish reporting all characters. The documentation already states that after calling XML_StopParser() there may still be a few call-backs that would otherwise be missed, so this would not be new behaviour, but consistent with existing behaviour. The solution to the problem described is the same as suggested for bug # 1515600 (Segfault after removing character data handler). Just put the NULL check for the character data handler inside the internal loop. Btw, the same problem exists in the doCdataSection() function. I'll attach a patch suggestion to bug # 1515600. We might decide to treat XML_FINISHED different from XML_SUSPENDED such that no other call-backs will happen, but in that case we need to review all the other places where this would need to be done as well (and update the documentation, of course). ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. 
(fdrake) Date: 2006-07-01 11:32 Message: Logged In: YES user_id=3066 Confirmed that the suspend behavior parallels the abort behavior Brett's patch fixed; fixed and added a regression test in lib/xmlparse.c 1.155 and tests/runtests.c 1.66. ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2006-07-01 11:02 Message: Logged In: YES user_id=3066 Added a regression test in tests/runtests.c revision 1.65. Closing this report. ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2006-07-01 00:00 Message: Logged In: YES user_id=3066 That seems fine, but can be done faster within the Expat implementation. I've committed the simplified patch as lib/xmlparse.c revision 1.154. I'll have a test case committed tomorrow as well. Leaving this report open for now since I need to finish up the test case. ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2006-06-30 14:40 Message: Logged In: YES user_id=3066 The Python folks need this dealt with before Python 2.5, so I'll try and take a look at it this weekend if no one beats me to it. 
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1515266&group_id=10127 From noreply at sourceforge.net Wed Jul 5 15:18:56 2006 From: noreply at sourceforge.net (SourceForge.net) Date: Wed, 05 Jul 2006 06:18:56 -0700 Subject: [Expat-bugs] [ expat-Bugs-1513208 ] memory leak Message-ID: Bugs item #1513208, was opened at 2006-06-27 05:21 Message generated for change (Comment added) made by kwaclaw You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1513208&group_id=10127 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: None Group: None Status: Open Resolution: None Priority: 5 Submitted By: Nobody/Anonymous (nobody) Assigned to: Nobody/Anonymous (nobody) Summary: memory leak Initial Comment: Version: 1.95.8. Two buffers: one is "GL-002- 0012", the other is "GL- 002-0012"; the latter leaks 80 bytes every time. The difference between the two buffers is whether the PROLOG exists. The tested function: XML_Parse. ---------------------------------------------------------------------- >Comment By: Karl Waclawek (kwaclaw) Date: 2006-07-05 09:18 Message: Logged In: YES user_id=290026 Have you tested with CVS version? How are you determining the leak?
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1513208&group_id=10127 From noreply at sourceforge.net Wed Jul 5 15:20:33 2006 From: noreply at sourceforge.net (SourceForge.net) Date: Wed, 05 Jul 2006 06:20:33 -0700 Subject: [Expat-bugs] [ expat-Bugs-1505207 ] Problem in Parsing using expat on Windows Message-ID: Bugs item #1505207, was opened at 2006-06-13 02:44 Message generated for change (Comment added) made by kwaclaw You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1505207&group_id=10127 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: www.libexpat.org Group: Platform Specific >Status: Closed >Resolution: Works For Me Priority: 5 Submitted By: Nobody/Anonymous (nobody) Assigned to: Karl Waclawek (kwaclaw) Summary: Problem in Parsing using expat on Windows Initial Comment: Hi, I am using expat to parse xml file. The parsing works fine in Unix but in windows I am facing problems. The start tag in case of Windows is only returning the first character. e.g. if the tag is like in the Handler for start tag, only the first character p from "property" is returned. I am linking with static library libexpatwMT.lib and have also defined XML_STATIC. Please let me know if i am doing anything wrong. Expat Version: 1.95.8 Thanks in advance, Asif Iqbal ---------------------------------------------------------------------- >Comment By: Karl Waclawek (kwaclaw) Date: 2006-07-05 09:20 Message: Logged In: YES user_id=290026 As Fred already said on another issue, this is an FAQ, not a bug report. Closing this issue. ---------------------------------------------------------------------- Comment By: Karl Waclawek (kwaclaw) Date: 2006-06-13 09:11 Message: Logged In: YES user_id=290026 On Windows, libexpatw... 
returns characters in UTF-16 encoding, libexpat... returns them in UTF-8 encoding. On Unix, UTF-8 is the standard encoding, on Windows it is UTF-16. Just treat the text as UTF-16, and you should be fine. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1505207&group_id=10127 From noreply at sourceforge.net Wed Jul 5 15:25:43 2006 From: noreply at sourceforge.net (SourceForge.net) Date: Wed, 05 Jul 2006 06:25:43 -0700 Subject: [Expat-bugs] [ expat-Bugs-1506892 ] I forgot to all the files, sorry Message-ID: Bugs item #1506892, was opened at 2006-06-15 16:10 Message generated for change (Settings changed) made by kwaclaw You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1506892&group_id=10127 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: None Group: None >Status: Closed >Resolution: Rejected Priority: 5 Submitted By: Nobody/Anonymous (nobody) Assigned to: Nobody/Anonymous (nobody) Summary: I forgot to all the files, sorry Initial Comment: Hello, I have a XML_SetCharacterDataHandler callback function that uses the text to build a directory path. I have noticed that at times the 1st node or last node will result in a partial capture of the text. example. my-57/actual image/ will return with "my-57/actual" only. I've attached my callback functions, startelement, endelement and dataelement. Thank you, Satyajit ---------------------------------------------------------------------- >Comment By: Karl Waclawek (kwaclaw) Date: 2006-07-05 09:25 Message: Logged In: YES user_id=290026 This was determined not to be a bug. 
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1506892&group_id=10127 From noreply at sourceforge.net Thu Jul 6 04:38:05 2006 From: noreply at sourceforge.net (SourceForge.net) Date: Wed, 05 Jul 2006 19:38:05 -0700 Subject: [Expat-bugs] [ expat-Bugs-1515266 ] missing check of stopped parser in doContent() 'for' loop Message-ID: Bugs item #1515266, was opened at 2006-06-30 14:04 Message generated for change (Comment added) made by fdrake You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1515266&group_id=10127 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: None Group: Test Required Status: Open Resolution: Fixed Priority: 6 Submitted By: Brett Cannon (bcannon) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: missing check of stopped parser in doContent() 'for' loop Initial Comment: In Expat 2.0.0, in expat.c:doConvert() there is a 'for' loop for the XML_TOK_DATA_CHARS case. There is unfortunately no check in that loop whether the parser was stopped during that call because of an error. This was discovered in Python (Lib/test/crashers/xml_parsers.py) because pyexpat, upon error where there is no error return code like with characterDataHandlers, sets all handlers to 0, sets parsingStatus to XML_FINISHED, and sets errorCode. This leads to a segfault if the 'for' loop goes around again because parser->m_characterDataHandler is set to 0. A simple check if the parser is stopped fixes the problem. I have attached a simple patch that just breaks out of the loop and lets execution fall through to the bottom of the 'switch' statement. I don't know if returning errorCode directly would be better or if checking for XML_SUSPENDED is also desirable. 
---------------------------------------------------------------------- >Comment By: Fred L. Drake, Jr. (fdrake) Date: 2006-07-05 22:38 Message: Logged In: YES user_id=3066 The tests no longer complete, but take up all the CPU the system will let them have. Hallmarks of an infinite loop, if you ask me. :-) Are you able to run the tests on Windows? I don't know if a MSVC++ target was ever built for them, and don't have access to a Windows development machine most of the time. One thing that can be done is to document that the character data handler can't be removed (though it can be replaced), during parsing, except from some non-character data (and non-decoding-related) handler. Then the Python bindings can use an alternate approach, replacing the character handler with a completely no-op handler until it can be safely removed completely. Brett, are you still paying attention? I can make the needed changes to the Python bindings to isolate those from some of the changes in Expat, hopefully no later than sometime this weekend. Not sure what the release schedule is, though. Karl, I'm generally inclined to make Expat as safe from segfaults as possible, so I'd like things to "just work" in even some of the oddball scenarios that exception-handling wrappers built to support scripting languages might present, though I don't object to making them go through a bit of extra work. I know our main audience is very performance-sensitive, so I don't want to pay too high a cost on that front. It might be worth taking the discussion of alternatives to the mailing list, but I vaguely recall that we've done that before. ---------------------------------------------------------------------- Comment By: Karl Waclawek (kwaclaw) Date: 2006-07-05 09:26 Message: Logged In: YES user_id=290026 Corrected Summary. 
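
Fred's suggestion above, replacing the character data handler with a no-op until it can be removed safely, can be sketched without Expat. Everything here is hypothetical: `binding`, `deliver_chunks` and the handlers are invented for illustration and do not reflect pyexpat's actual layout. With the real library the swap would presumably go through XML_SetCharacterDataHandler().

```c
#include <assert.h>
#include <stddef.h>

typedef void (*char_handler)(void *user, const char *s, int len);

/* Hypothetical binding-side state; not pyexpat's real structure. */
struct binding {
  char_handler active; /* what the parser currently calls */
  int pending_removal; /* set when the real handler wants to go away */
  int calls;           /* how often the real handler ran */
};

/* Does nothing, but is always safe to call. */
static void noop_chars(void *user, const char *s, int len)
{
  (void)user;
  (void)s;
  (void)len;
}

/* The real handler hits an error and wants no further call-backs.
   Instead of clearing the pointer it installs the no-op. */
static void real_chars(void *user, const char *s, int len)
{
  struct binding *b = user;
  (void)s;
  (void)len;
  b->calls++;
  b->active = noop_chars;
  b->pending_removal = 1;
}

/* Run from a safe point, e.g. after the parse call has returned. */
static void finish_removal(struct binding *b)
{
  assert(b != NULL);
  if (b->pending_removal) {
    b->active = NULL;
    b->pending_removal = 0;
  }
}

/* Mimics an inner loop that never re-checks the handler for NULL. */
static void deliver_chunks(struct binding *b, int chunks)
{
  int i;
  assert(b != NULL && b->active != NULL);
  for (i = 0; i < chunks; i++)
    b->active(b, "x", 1);
}
```

The trade-off is a few wasted calls to the no-op in exchange for never exposing a NULL pointer to a loop that does not test for one.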
---------------------------------------------------------------------- Comment By: Karl Waclawek (kwaclaw) Date: 2006-07-05 09:14 Message: Logged In: YES user_id=290026 Applied the patch for bug # 1515600 which solves this issue as well. Removed the check for XML_FINISHED/XML_SUSPENDED. We could discuss special treatment of XML_FINISHED, but if one is clearing all handlers anyway, then special treatment of XML_FINISHED is not necessary. For Fred: I have not re-run the test cases. Please do so and close the issue if successful. ---------------------------------------------------------------------- Comment By: Karl Waclawek (kwaclaw) Date: 2006-07-04 09:37 Message: Logged In: YES user_id=290026 I am re-opening this issue because in the case of a suspended parser, breaking out of the inner loop in XML_TOK_DATA_CHARS means that character call-backs are missed when resuming the parser. We should let the inner loop finish reporting all characters. The documentation already states that after calling XML_StopParser() there may still be a few call-backs that would otherwise be missed, so this would not be new behaviour, but consistent with existing behaviour. The solution to the problem described is the same as suggested for bug # 1515600 (Segfault after removing character data handler). Just put the NULL check for the character data handler inside the internal loop. Btw, the same problem exists in the doCdataSection() function. I'll attach a patch suggestion to bug # 1515600. We might decide to treat XML_FINISHED different from XML_SUSPENDED such that no other call-backs will happen, but in that case we need to review all the other places where this would need to be done as well (and update the documentation, of course). ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. 
(fdrake) Date: 2006-07-01 11:32 Message: Logged In: YES user_id=3066 Confirmed that the suspend behavior parallels the abort behavior Brett's patch fixed; fixed and added a regression test in lib/xmlparse.c 1.155 and tests/runtests.c 1.66. ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2006-07-01 11:02 Message: Logged In: YES user_id=3066 Added a regression test in tests/runtests.c revision 1.65. Closing this report. ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2006-07-01 00:00 Message: Logged In: YES user_id=3066 That seems fine, but can be done faster within the Expat implementation. I've committed the simplified patch as lib/xmlparse.c revision 1.154. I'll have a test case committed tomorrow as well. Leaving this report open for now since I need to finish up the test case. ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2006-06-30 14:40 Message: Logged In: YES user_id=3066 The Python folks need this dealt with before Python 2.5, so I'll try and take a look at it this weekend if no one beats me to it. 
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1515266&group_id=10127 From noreply at sourceforge.net Thu Jul 6 04:55:51 2006 From: noreply at sourceforge.net (SourceForge.net) Date: Wed, 05 Jul 2006 19:55:51 -0700 Subject: [Expat-bugs] [ expat-Bugs-1515266 ] missing check of stopped parser in doContent() 'for' loop Message-ID: Bugs item #1515266, was opened at 2006-06-30 14:04 Message generated for change (Comment added) made by kwaclaw You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1515266&group_id=10127 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: None Group: Test Required Status: Open Resolution: Fixed Priority: 6 Submitted By: Brett Cannon (bcannon) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: missing check of stopped parser in doContent() 'for' loop Initial Comment: In Expat 2.0.0, in expat.c:doConvert() there is a 'for' loop for the XML_TOK_DATA_CHARS case. There is unfortunately no check in that loop whether the parser was stopped during that call because of an error. This was discovered in Python (Lib/test/crashers/xml_parsers.py) because pyexpat, upon error where there is no error return code like with characterDataHandlers, sets all handlers to 0, sets parsingStatus to XML_FINISHED, and sets errorCode. This leads to a segfault if the 'for' loop goes around again because parser->m_characterDataHandler is set to 0. A simple check if the parser is stopped fixes the problem. I have attached a simple patch that just breaks out of the loop and lets execution fall through to the bottom of the 'switch' statement. I don't know if returning errorCode directly would be better or if checking for XML_SUSPENDED is also desirable. 
---------------------------------------------------------------------- >Comment By: Karl Waclawek (kwaclaw) Date: 2006-07-05 22:55 Message: Logged In: YES user_id=290026 The bug is quite obvious when you look at it. When the character data handler is cleared, the for loop will do nothing forever. Please check again with xmlparse.c rev. 1.157. However this quick fix is not quite satisfying. There is one piece of logic that becomes ill-fitted now: the "fail-over" to the default handler does not work as expected anymore, so I'll have some more thinking to do. ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2006-07-05 22:38 Message: Logged In: YES user_id=3066 The tests no longer complete, but take up all the CPU the system will let them have. Hallmarks of an infinite loop, if you ask me. :-) Are you able to run the tests on Windows? I don't know if a MSVC++ target was ever built for them, and don't have access to a Windows development machine most of the time. One thing that can be done is to document that the character data handler can't be removed (though it can be replaced), during parsing, except from some non-character data (and non-decoding-related) handler. Then the Python bindings can use an alternate approach, replacing the character handler with a completely no-op handler until it can be safely removed completely. Brett, are you still paying attention? I can make the needed changes to the Python bindings to isolate those from some of the changes in Expat, hopefully no later than sometime this weekend. Not sure what the release schedule is, though. Karl, I'm generally inclined to make Expat as safe from segfaults as possible, so I'd like things to "just work" in even some of the oddball scenarios that exception-handling wrappers built to support scripting languages might present, though I don't object to making them go through a bit of extra work. 
I know our main audience is very performance-sensitive, so I don't want to pay too high a cost on that front. It might be worth taking the discussion of alternatives to the mailing list, but I vaguely recall that we've done that before. ---------------------------------------------------------------------- Comment By: Karl Waclawek (kwaclaw) Date: 2006-07-05 09:26 Message: Logged In: YES user_id=290026 Corrected Summary. ---------------------------------------------------------------------- Comment By: Karl Waclawek (kwaclaw) Date: 2006-07-05 09:14 Message: Logged In: YES user_id=290026 Applied the patch for bug # 1515600 which solves this issue as well. Removed the check for XML_FINISHED/XML_SUSPENDED. We could discuss special treatment of XML_FINISHED, but if one is clearing all handlers anyway, then special treatment of XML_FINISHED is not necessary. For Fred: I have not re-run the test cases. Please do so and close the issue if successful. ---------------------------------------------------------------------- Comment By: Karl Waclawek (kwaclaw) Date: 2006-07-04 09:37 Message: Logged In: YES user_id=290026 I am re-opening this issue because in the case of a suspended parser, breaking out of the inner loop in XML_TOK_DATA_CHARS means that character call-backs are missed when resuming the parser. We should let the inner loop finish reporting all characters. The documentation already states that after calling XML_StopParser() there may still be a few call-backs that would otherwise be missed, so this would not be new behaviour, but consistent with existing behaviour. The solution to the problem described is the same as suggested for bug # 1515600 (Segfault after removing character data handler). Just put the NULL check for the character data handler inside the internal loop. Btw, the same problem exists in the doCdataSection() function. I'll attach a patch suggestion to bug # 1515600. 
We might decide to treat XML_FINISHED different from XML_SUSPENDED such that no other call-backs will happen, but in that case we need to review all the other places where this would need to be done as well (and update the documentation, of course). ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2006-07-01 11:32 Message: Logged In: YES user_id=3066 Confirmed that the suspend behavior parallels the abort behavior Brett's patch fixed; fixed and added a regression test in lib/xmlparse.c 1.155 and tests/runtests.c 1.66. ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2006-07-01 11:02 Message: Logged In: YES user_id=3066 Added a regression test in tests/runtests.c revision 1.65. Closing this report. ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2006-07-01 00:00 Message: Logged In: YES user_id=3066 That seems fine, but can be done faster within the Expat implementation. I've committed the simplified patch as lib/xmlparse.c revision 1.154. I'll have a test case committed tomorrow as well. Leaving this report open for now since I need to finish up the test case. ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2006-06-30 14:40 Message: Logged In: YES user_id=3066 The Python folks need this dealt with before Python 2.5, so I'll try and take a look at it this weekend if no one beats me to it. 
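Karl's suggestion of 2006-07-04 (put the NULL check for the character data handler inside the inner loop and let the loop run to completion) can be sketched with a toy model. This is a minimal sketch under stated assumptions, not the patch attached to bug # 1515600: `ToyParser`, `self_clearing_handler`, and `report_chars_checked` are hypothetical names. The sketch also shows one plausible way such a loop could spin forever, as in the hang reported on 2006-07-05: if the position only advanced when a handler was called, a cleared handler would leave the loop with nothing to do and no way to finish. The thread does not show the actual code that caused the hang.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for illustration only; these are not Expat's types. */
typedef struct ToyParser ToyParser;
typedef void (*ToyCharHandler)(ToyParser *p, const char *s, int len);

struct ToyParser {
    ToyCharHandler char_handler; /* models parser->m_characterDataHandler */
    int calls;                   /* how many times the handler ran */
};

/* A handler that removes itself during its first call-back. */
void self_clearing_handler(ToyParser *p, const char *s, int len)
{
    (void)s;
    (void)len;
    p->calls++;
    p->char_handler = NULL;
}

/* Walks all `len` bytes in chunks of at most `chunk` bytes, re-reading the
   handler pointer on every iteration. Returns the number of bytes consumed. */
int report_chars_checked(ToyParser *p, const char *data, int len, int chunk)
{
    int done = 0;
    assert(chunk > 0);
    while (done < len) {
        int n = (len - done < chunk) ? (len - done) : chunk;
        if (p->char_handler != NULL)
            p->char_handler(p, data + done, n);
        /* Advance unconditionally. If this line sat inside the 'if'
           above, a cleared handler would make the loop run forever. */
        done += n;
    }
    return done;
}
```

In this model the remaining chunks are consumed silently once the handler is gone, so nothing is routed to a default handler for them. That is the kind of lost fallback the later comments describe as unsatisfying.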
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1515266&group_id=10127
From noreply at sourceforge.net Thu Jul 6 05:23:40 2006 From: noreply at sourceforge.net (SourceForge.net) Date: Wed, 05 Jul 2006 20:23:40 -0700 Subject: [Expat-bugs] [ expat-Bugs-1515266 ] missing check of stopped parser in doContent() 'for' loop Message-ID: Bugs item #1515266, was opened at 2006-06-30 14:04 Message generated for change (Comment added) made by fdrake You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1515266&group_id=10127 Category: None Group: Test Required Status: Open Resolution: Fixed Priority: 6 Submitted By: Brett Cannon (bcannon) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: missing check of stopped parser in doContent() 'for' loop ---------------------------------------------------------------------- >Comment By: Fred L. Drake, Jr. (fdrake) Date: 2006-07-05 23:23 Message: Logged In: YES user_id=3066 The tests now pass, but agree that the lack of falling back to the default handler is undesirable. As noted, I'm not sure how much we want to worry about this in code, though, rather than through documentation.
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1515266&group_id=10127
From noreply at sourceforge.net Thu Jul 6 07:15:40 2006 From: noreply at sourceforge.net (SourceForge.net) Date: Wed, 05 Jul 2006 22:15:40 -0700 Subject: [Expat-bugs] [ expat-Bugs-1515266 ] missing check of stopped parser in doContent() 'for' loop Message-ID: Bugs item #1515266, was opened at 2006-06-30 14:04 Message generated for change (Comment added) made by fdrake You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1515266&group_id=10127 Category: None Group: Test Required Status: Open Resolution: Fixed >Priority: 5 Submitted By: Brett Cannon (bcannon) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: missing check of stopped parser in doContent() 'for' loop ---------------------------------------------------------------------- >Comment By: Fred L. Drake, Jr. (fdrake) Date: 2006-07-06 01:15 Message: Logged In: YES user_id=3066 Python (on the trunk) is no longer quite as sensitive to the Expat implementation for this, so that's not a source of time pressure to come up with the final fix for this. Reducing priority back to "Medium" ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1515266&group_id=10127
From noreply at sourceforge.net Thu Jul 6 19:03:32 2006 From: noreply at sourceforge.net (SourceForge.net) Date: Thu, 06 Jul 2006 10:03:32 -0700 Subject: [Expat-bugs] [ expat-Bugs-1515266 ] missing check of stopped parser in doContent() 'for' loop Message-ID: Bugs item #1515266, was opened at 2006-06-30 11:04 Message generated for change (Comment added) made by bcannon You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1515266&group_id=10127 Category: None Group: Test Required Status: Open Resolution: Fixed Priority: 5 Submitted By: Brett Cannon (bcannon) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: missing check of stopped parser in doContent() 'for' loop ---------------------------------------------------------------------- >Comment By: Brett Cannon (bcannon) Date: 2006-07-06 10:03 Message: Logged In: YES user_id=357491 Yes, I'm listening, Fred. =) If you look at PEP 356 (http://www.python.org/dev/peps/pep-0356/) it seems like b2 is due on July 12 and rc1 August 1. So there is still time to get whatever change/fix needed to Python's wrapper before we hit final release. ---------------------------------------------------------------------- Comment By: Karl Waclawek (kwaclaw) Date: 2006-07-06 05:56 Message: Logged In: YES user_id=290026 One way to preserve the old default handler logic would be this: Revert back to the original code, but save the character data handler into a local variable for the duration of the inner for loop. This would prevent the segfault, but would enforce the call-backs in the loop to go on until the loop terminates, even if the character data handler was cleared. I personally like this solution, but the question is how Python could handle it if there were more call-backs even after the handlers were cleared. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1515266&group_id=10127
From noreply at sourceforge.net Thu Jul 6 19:19:40 2006 From: noreply at sourceforge.net (SourceForge.net) Date: Thu, 06 Jul 2006 10:19:40 -0700 Subject: [Expat-bugs] [ expat-Bugs-1515266 ] missing check of stopped parser in doContent() 'for' loop Message-ID: Bugs item #1515266, was opened at 2006-06-30 14:04 Message generated for change (Comment added) made by kwaclaw You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1515266&group_id=10127 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: None Group: Test Required Status: Open Resolution: Fixed Priority: 5 Submitted By: Brett Cannon (bcannon) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: missing check of stopped parser in doContent() 'for' loop Initial Comment: In Expat 2.0.0, in xmlparse.c:doContent() there is a 'for' loop for the XML_TOK_DATA_CHARS case. There is unfortunately no check in that loop whether the parser was stopped during that call because of an error.
This was discovered in Python (Lib/test/crashers/xml_parsers.py) because pyexpat, upon error where there is no error return code like with characterDataHandlers, sets all handlers to 0, sets parsingStatus to XML_FINISHED, and sets errorCode. This leads to a segfault if the 'for' loop goes around again because parser->m_characterDataHandler is set to 0. A simple check if the parser is stopped fixes the problem. I have attached a simple patch that just breaks out of the loop and lets execution fall through to the bottom of the 'switch' statement. I don't know if returning errorCode directly would be better or if checking for XML_SUSPENDED is also desirable. ---------------------------------------------------------------------- >Comment By: Karl Waclawek (kwaclaw) Date: 2006-07-06 13:19 Message: Logged In: YES user_id=290026 I am attaching a patch to current CVS that preserves the default handler failover logic by saving the character data handler to a local variable instead of moving the NULL check into the inner for loop (file "localCharDataHandlerPatch.diff"). The drawback: Even if the handler is cleared, it will be called back on as long as the inner for loop is active. Could be a problem for Python, if it cannot deal with a few more call-backs despite clearing the handlers. ---------------------------------------------------------------------- Comment By: Brett Cannon (bcannon) Date: 2006-07-06 13:03 Message: Logged In: YES user_id=357491 Yes, I'm listening, Fred. =) If you look at PEP 356 (http://www.python.org/dev/peps/pep-0356/) it seems like b2 is due on July 12 and rc1 August 1. So there is still time to get whatever change/fix needed to Python's wrapper before we hit final release. 
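The local-variable approach in Karl's 2006-07-06 13:19 comment is described in this thread only in prose, and the attached file "localCharDataHandlerPatch.diff" is not reproduced here. A minimal sketch of the idea, using a toy model rather than Expat's actual code, might look like the following. `ToyParser`, `self_clearing_handler`, and `report_chars_snapshot` are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for illustration only; these are not Expat's types. */
typedef struct ToyParser ToyParser;
typedef void (*ToyCharHandler)(ToyParser *p, const char *s, int len);

struct ToyParser {
    ToyCharHandler char_handler; /* models parser->m_characterDataHandler */
    int calls;                   /* how many times the handler ran */
};

/* A handler that clears the parser's handler slot on every call. */
void self_clearing_handler(ToyParser *p, const char *s, int len)
{
    (void)s;
    (void)len;
    p->calls++;
    p->char_handler = NULL;
}

/* Saves the handler once, before the loop, and calls the saved copy for
   every chunk. Returns the number of bytes handed to the handler. */
int report_chars_snapshot(ToyParser *p, const char *data, int len, int chunk)
{
    ToyCharHandler handler = p->char_handler; /* snapshot for this loop */
    int done = 0;
    assert(chunk > 0);
    if (handler == NULL)
        return 0; /* assumption: a default-handler fallback would go here */
    while (done < len) {
        int n = (len - done < chunk) ? (len - done) : chunk;
        handler(p, data + done, n); /* never NULL, even if the slot was cleared */
        done += n;
    }
    return done;
}
```

The sketch makes the stated drawback concrete: the handler clears itself on the first call, yet it is still invoked for every remaining chunk of the same run, which is the behaviour a wrapper such as pyexpat would have to tolerate.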
---------------------------------------------------------------------- Comment By: Karl Waclawek (kwaclaw) Date: 2006-07-06 08:56 Message: Logged In: YES user_id=290026 One way to preserve the old default handler logic would be this: Revert back to the original code, but save the character data handler into a local variable for the duration of the inner for loop. This would prevent the segfault, but would enforce the call-backs in the loop to go on until the loop terminates, even if the character data handler was cleared. I personally like this solution, but the question is how Python could handle it if there were more call-backs even after the handlers were cleared. ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2006-07-06 01:15 Message: Logged In: YES user_id=3066 Python (on the trunk) is no longer quite as sensitive to the Expat implementation for this, so that's not a source of time pressure to come up with the final fix for this. Reducing priority back to "Medium" ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2006-07-05 23:23 Message: Logged In: YES user_id=3066 The tests now pass, but agree that the lack of falling back to the default handler is undesirable. As noted, I'm not sure how much we want to worry about this in code, though, rather than through documentation. ---------------------------------------------------------------------- Comment By: Karl Waclawek (kwaclaw) Date: 2006-07-05 22:55 Message: Logged In: YES user_id=290026 The bug is quite obvious when you look at it. When the character data handler is cleared, the for loop will do nothing forever. Please check again with xmlparse.c rev. 1.157. However this quick fix is not quite satisfying. There is one piece of logic that becomes ill-fitted now: the "fail-over" to the default handler does not work as expected anymore, so I'll have some more thinking to do.
---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2006-07-05 22:38 Message: Logged In: YES user_id=3066 The tests no longer complete, but take up all the CPU the system will let them have. Hallmarks of an infinite loop, if you ask me. :-) Are you able to run the tests on Windows? I don't know if a MSVC++ target was ever built for them, and don't have access to a Windows development machine most of the time. One thing that can be done is to document that the character data handler can't be removed (though it can be replaced), during parsing, except from some non-character data (and non-decoding-related) handler. Then the Python bindings can use an alternate approach, replacing the character handler with a completely no-op handler until it can be safely removed completely. Brett, are you still paying attention? I can make the needed changes to the Python bindings to isolate those from some of the changes in Expat, hopefully no later than sometime this weekend. Not sure what the release schedule is, though. Karl, I'm generally inclined to make Expat as safe from segfaults as possible, so I'd like things to "just work" in even some of the oddball scenarios that exception-handling wrappers built to support scripting languages might present, though I don't object to making them go through a bit of extra work. I know our main audience is very performance-sensitive, so I don't want to pay too high a cost on that front. It might be worth taking the discussion of alternatives to the mailing list, but I vaguely recall that we've done that before. ---------------------------------------------------------------------- Comment By: Karl Waclawek (kwaclaw) Date: 2006-07-05 09:26 Message: Logged In: YES user_id=290026 Corrected Summary. 
---------------------------------------------------------------------- Comment By: Karl Waclawek (kwaclaw) Date: 2006-07-05 09:14 Message: Logged In: YES user_id=290026 Applied the patch for bug # 1515600 which solves this issue as well. Removed the check for XML_FINISHED/XML_SUSPENDED. We could discuss special treatment of XML_FINISHED, but if one is clearing all handlers anyway, then special treatment of XML_FINISHED is not necessary. For Fred: I have not re-run the test cases. Please do so and close the issue if successful. ---------------------------------------------------------------------- Comment By: Karl Waclawek (kwaclaw) Date: 2006-07-04 09:37 Message: Logged In: YES user_id=290026 I am re-opening this issue because in the case of a suspended parser, breaking out of the inner loop in XML_TOK_DATA_CHARS means that character call-backs are missed when resuming the parser. We should let the inner loop finish reporting all characters. The documentation already states that after calling XML_StopParser() there may still be a few call-backs that would otherwise be missed, so this would not be new behaviour, but consistent with existing behaviour. The solution to the problem described is the same as suggested for bug # 1515600 (Segfault after removing character data handler). Just put the NULL check for the character data handler inside the internal loop. Btw, the same problem exists in the doCdataSection() function. I'll attach a patch suggestion to bug # 1515600. We might decide to treat XML_FINISHED different from XML_SUSPENDED such that no other call-backs will happen, but in that case we need to review all the other places where this would need to be done as well (and update the documentation, of course). ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. 
(fdrake) Date: 2006-07-01 11:32 Message: Logged In: YES user_id=3066 Confirmed that the suspend behavior parallels the abort behavior Brett's patch fixed; fixed and added a regression test in lib/xmlparse.c 1.155 and tests/runtests.c 1.66. ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2006-07-01 11:02 Message: Logged In: YES user_id=3066 Added a regression test in tests/runtests.c revision 1.65. Closing this report. ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2006-07-01 00:00 Message: Logged In: YES user_id=3066 That seems fine, but can be done faster within the Expat implementation. I've committed the simplified patch as lib/xmlparse.c revision 1.154. I'll have a test case committed tomorrow as well. Leaving this report open for now since I need to finish up the test case. ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2006-06-30 14:40 Message: Logged In: YES user_id=3066 The Python folks need this dealt with before Python 2.5, so I'll try and take a look at it this weekend if no one beats me to it. 
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1515266&group_id=10127 From noreply at sourceforge.net Mon Jul 10 21:02:30 2006 From: noreply at sourceforge.net (SourceForge.net) Date: Mon, 10 Jul 2006 12:02:30 -0700 Subject: [Expat-bugs] [ expat-Bugs-1515266 ] missing check of stopped parser in doContent() 'for' loop Message-ID: Bugs item #1515266, was opened at 2006-06-30 14:04 Message generated for change (Comment added) made by kwaclaw You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1515266&group_id=10127 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: None Group: Test Required Status: Open Resolution: Fixed Priority: 5 Submitted By: Brett Cannon (bcannon) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: missing check of stopped parser in doContent() 'for' loop Initial Comment: In Expat 2.0.0, in expat.c:doConvert() there is a 'for' loop for the XML_TOK_DATA_CHARS case. There is unfortunately no check in that loop whether the parser was stopped during that call because of an error. This was discovered in Python (Lib/test/crashers/xml_parsers.py) because pyexpat, upon error where there is no error return code like with characterDataHandlers, sets all handlers to 0, sets parsingStatus to XML_FINISHED, and sets errorCode. This leads to a segfault if the 'for' loop goes around again because parser->m_characterDataHandler is set to 0. A simple check if the parser is stopped fixes the problem. I have attached a simple patch that just breaks out of the loop and lets execution fall through to the bottom of the 'switch' statement. I don't know if returning errorCode directly would be better or if checking for XML_SUSPENDED is also desirable. 
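To make the failure mode concrete, here is a minimal, self-contained sketch of a chunked reporting loop with the kind of stop check described above. It is a toy model and not Expat's source: `toy_parser`, `toy_status`, `failing_handler` and `report_chunks` are hypothetical names invented for illustration, and the handler only imitates what the report says pyexpat does on error (clear the handler, mark the parser as finished).

```c
#include <stddef.h>

/* Toy stand-ins for illustration only; these are NOT Expat's real types. */
enum toy_status { TOY_PARSING, TOY_FINISHED };

struct toy_parser;
typedef void (*toy_char_handler)(struct toy_parser *p, const char *s, int len);

struct toy_parser {
    toy_char_handler char_handler; /* may be cleared from inside a call-back */
    enum toy_status status;
    int calls;                     /* number of call-backs delivered */
};

/* Imitates the behaviour described in the report: on "error" the binding
 * clears its handler and marks the parser as finished. */
static void failing_handler(struct toy_parser *p, const char *s, int len)
{
    (void)s;
    (void)len;
    p->calls++;
    p->char_handler = NULL;
    p->status = TOY_FINISHED;
}

/* Reports `data` in pieces of at most `chunk` bytes and returns the number
 * of pieces reported. Assumes p->char_handler is non-NULL on entry. */
static int report_chunks(struct toy_parser *p, const char *data, int len,
                         int chunk)
{
    int reported = 0;
    while (len > 0) {
        int n = len < chunk ? len : chunk;
        p->char_handler(p, data, n);
        reported++;
        data += n;
        len -= n;
        /* The check under discussion: leave the loop once the parser has
         * been stopped, before the (now NULL) handler is called again. */
        if (p->status == TOY_FINISHED)
            break;
    }
    return reported;
}
```

Without the status check, the second pass through the loop would call through a NULL function pointer, which is the segfault described in the report.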
---------------------------------------------------------------------- >Comment By: Karl Waclawek (kwaclaw) Date: 2006-07-10 15:02 Message: Logged In: YES user_id=290026 Applied the patch preserving default handler failover. See xmlparse.c rev. 1.158. Docs updated as well. Python compatibility still needs testing. ---------------------------------------------------------------------- Comment By: Karl Waclawek (kwaclaw) Date: 2006-07-06 13:19 Message: Logged In: YES user_id=290026 I am attaching a patch to current CVS that preserves the default handler failover logic by saving the character data handler to a local variable instead of moving the NULL check into the inner for loop (file "localCharDataHandlerPatch.diff"). The drawback: Even if the handler is cleared, it will be called back on as long as the inner for loop is active. Could be a problem for Python, if it cannot deal with a few more call-backs despite clearing the handlers. ---------------------------------------------------------------------- Comment By: Brett Cannon (bcannon) Date: 2006-07-06 13:03 Message: Logged In: YES user_id=357491 Yes, I'm listening, Fred. =) If you look at PEP 356 (http://www.python.org/dev/peps/pep-0356/) it seems like b2 is due on July 12 and rc1 August 1. So there is still time to get whatever change/fix needed to Python's wrapper before we hit final release. ---------------------------------------------------------------------- Comment By: Karl Waclawek (kwaclaw) Date: 2006-07-06 08:56 Message: Logged In: YES user_id=290026 One way to preserve the old default handler logic would be this: Revert back to the original code, but save the character data handler into a local variable for the duration of the inner for loop. This would prevent the segfault, but would enforce the call-backs in the loop to go on until the loop terminates, even if the character data handler was cleared.
I personally like this solution, but the question is how Python could handle it if there were more call-backs even after the handlers were cleared. ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2006-07-06 01:15 Message: Logged In: YES user_id=3066 Python (on the trunk) is no longer quite as sensitive to the Expat implementation for this, so that's not a source of time pressure to come up with the final fix for this. Reducing priority back to "Medium" ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2006-07-05 23:23 Message: Logged In: YES user_id=3066 The tests now pass, but agree that the lack of falling back to the default handler is undesirable. As noted, I'm not sure how much we want to worry about this in code, though, rather than through documentation. ---------------------------------------------------------------------- Comment By: Karl Waclawek (kwaclaw) Date: 2006-07-05 22:55 Message: Logged In: YES user_id=290026 The bug is quite obvious when you look at it. When the character data handler is cleared, the for loop will do nothing forever. Please check again with xmlparse.c rev. 1.157. However this quick fix is not quite satisfying. There is one piece of logic that becomes ill-fitted now: the "fail-over" to the default handler does not work as expected anymore, so I'll have some more thinking to do. ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2006-07-05 22:38 Message: Logged In: YES user_id=3066 The tests no longer complete, but take up all the CPU the system will let them have. Hallmarks of an infinite loop, if you ask me. :-) Are you able to run the tests on Windows? I don't know if a MSVC++ target was ever built for them, and don't have access to a Windows development machine most of the time. 
One thing that can be done is to document that the character data handler can't be removed (though it can be replaced), during parsing, except from some non-character data (and non-decoding-related) handler. Then the Python bindings can use an alternate approach, replacing the character handler with a completely no-op handler until it can be safely removed completely. Brett, are you still paying attention? I can make the needed changes to the Python bindings to isolate those from some of the changes in Expat, hopefully no later than sometime this weekend. Not sure what the release schedule is, though. Karl, I'm generally inclined to make Expat as safe from segfaults as possible, so I'd like things to "just work" in even some of the oddball scenarios that exception-handling wrappers built to support scripting languages might present, though I don't object to making them go through a bit of extra work. I know our main audience is very performance-sensitive, so I don't want to pay too high a cost on that front. It might be worth taking the discussion of alternatives to the mailing list, but I vaguely recall that we've done that before. ---------------------------------------------------------------------- Comment By: Karl Waclawek (kwaclaw) Date: 2006-07-05 09:26 Message: Logged In: YES user_id=290026 Corrected Summary. ---------------------------------------------------------------------- Comment By: Karl Waclawek (kwaclaw) Date: 2006-07-05 09:14 Message: Logged In: YES user_id=290026 Applied the patch for bug # 1515600 which solves this issue as well. Removed the check for XML_FINISHED/XML_SUSPENDED. We could discuss special treatment of XML_FINISHED, but if one is clearing all handlers anyway, then special treatment of XML_FINISHED is not necessary. For Fred: I have not re-run the test cases. Please do so and close the issue if successful. 
---------------------------------------------------------------------- Comment By: Karl Waclawek (kwaclaw) Date: 2006-07-04 09:37 Message: Logged In: YES user_id=290026 I am re-opening this issue because in the case of a suspended parser, breaking out of the inner loop in XML_TOK_DATA_CHARS means that character call-backs are missed when resuming the parser. We should let the inner loop finish reporting all characters. The documentation already states that after calling XML_StopParser() there may still be a few call-backs that would otherwise be missed, so this would not be new behaviour, but consistent with existing behaviour. The solution to the problem described is the same as suggested for bug # 1515600 (Segfault after removing character data handler). Just put the NULL check for the character data handler inside the internal loop. Btw, the same problem exists in the doCdataSection() function. I'll attach a patch suggestion to bug # 1515600. We might decide to treat XML_FINISHED different from XML_SUSPENDED such that no other call-backs will happen, but in that case we need to review all the other places where this would need to be done as well (and update the documentation, of course). ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2006-07-01 11:32 Message: Logged In: YES user_id=3066 Confirmed that the suspend behavior parallels the abort behavior Brett's patch fixed; fixed and added a regression test in lib/xmlparse.c 1.155 and tests/runtests.c 1.66. ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2006-07-01 11:02 Message: Logged In: YES user_id=3066 Added a regression test in tests/runtests.c revision 1.65. Closing this report. ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. 
(fdrake) Date: 2006-07-01 00:00 Message: Logged In: YES user_id=3066 That seems fine, but can be done faster within the Expat implementation. I've committed the simplified patch as lib/xmlparse.c revision 1.154. I'll have a test case committed tomorrow as well. Leaving this report open for now since I need to finish up the test case. ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2006-06-30 14:40 Message: Logged In: YES user_id=3066 The Python folks need this dealt with before Python 2.5, so I'll try and take a look at it this weekend if no one beats me to it. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1515266&group_id=10127 From noreply at sourceforge.net Mon Jul 10 21:03:33 2006 From: noreply at sourceforge.net (SourceForge.net) Date: Mon, 10 Jul 2006 12:03:33 -0700 Subject: [Expat-bugs] [ expat-Bugs-1515600 ] Segfault after removing character data handler Message-ID: Bugs item #1515600, was opened at 2006-07-01 13:21 Message generated for change (Comment added) made by kwaclaw You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1515600&group_id=10127 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: None Group: None Status: Open Resolution: Fixed Priority: 5 Submitted By: Fred L. Drake, Jr. (fdrake) Assigned to: Nobody/Anonymous (nobody) Summary: Segfault after removing character data handler Initial Comment: Removing the character data handler from within the character data handler while character data remains to be reported causes a call to a NULL pointer (generally followed by a memory access violation of your platform's favorite flavor). 
If the XML_StopParser() API has been called, this is not a problem with the version in CVS. This is admittedly an odd use case. The recent fixes to make the XML_StopParser() calls supported make the parser behave well when accessed from languages that support exceptions (the host language API can call XML_StopParser to abort further work from Expat when an exception occurs). The case of a character data handler removing itself is unusual (in context, there can be no calls to anything else other than a decoding handler). I think there are two possible solutions: 1) Document that the character data handler cannot remove itself without calling XML_StopParser(). This avoids introducing a performance penalty for this really odd case, but I don't know how bad testing for a NULL value would really be at this point, since there are a few other checks and an indirect assignment. 2) Add a check that the character data handler is still set before the loop goes around again, and fall back to the defaultHandler for the remaining data. This would introduce a single check for a NULL pointer in the loop in the XML_TOK_DATA_CHARS case in doContent(). I've attached a patch with a test case that demonstrates this bug; the test generates a segfault on Unix. ---------------------------------------------------------------------- >Comment By: Karl Waclawek (kwaclaw) Date: 2006-07-10 15:03 Message: Logged In: YES user_id=290026 Applied an improved patch that preserves default handler failover logic. See xmlparse.c rev. 1.158. Docs updated as well. Python compatibility still needs testing. ---------------------------------------------------------------------- Comment By: Karl Waclawek (kwaclaw) Date: 2006-07-05 09:09 Message: Logged In: YES user_id=290026 Applied patch in xmlparse.c rev. 1.156 and reference.html rev. 1.71. Please let me know if we should discuss special treatment of aborting vs. suspending.
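The alternative discussed for bug # 1515266, saving the character data handler in a local variable for the duration of the inner loop, can be illustrated with a small toy model. This is only a sketch under stated assumptions and not the committed patch: `toy_parser`, `self_removing_handler`, `counting_default` and `report_chunks` are hypothetical names, and whether xmlparse.c rev. 1.158 looks like this is not something the thread shows.

```c
#include <stddef.h>

/* Toy stand-ins for illustration only; these are NOT Expat's real types. */
struct toy_parser;
typedef void (*toy_char_handler)(struct toy_parser *p, const char *s, int len);

struct toy_parser {
    toy_char_handler char_handler;    /* may be cleared from inside a call-back */
    toy_char_handler default_handler; /* used when no character handler is set */
    int char_calls;
    int default_calls;
};

/* A character data handler that removes itself on its first call. */
static void self_removing_handler(struct toy_parser *p, const char *s, int len)
{
    (void)s;
    (void)len;
    p->char_calls++;
    p->char_handler = NULL;
}

static void counting_default(struct toy_parser *p, const char *s, int len)
{
    (void)s;
    (void)len;
    p->default_calls++;
}

/* Reports `data` in pieces of at most `chunk` bytes. The handler is read
 * once, before the loop, so clearing p->char_handler during a call-back
 * cannot lead to a call through a NULL pointer. */
static void report_chunks(struct toy_parser *p, const char *data, int len,
                          int chunk)
{
    toy_char_handler handler = p->char_handler; /* local copy */

    if (handler == NULL) {
        /* Fail-over to the default handler is decided once, up front. */
        if (p->default_handler != NULL)
            p->default_handler(p, data, len);
        return;
    }
    while (len > 0) {
        int n = len < chunk ? len : chunk;
        handler(p, data, n); /* keeps being called until the loop ends */
        data += n;
        len -= n;
    }
}
```

The trade-off named in the thread is visible here: the segfault is avoided and the fail-over decision stays outside the loop, but a handler that cleared itself still receives the remaining call-backs of the current loop. Only the next call to `report_chunks` goes to the default handler.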
---------------------------------------------------------------------- Comment By: Karl Waclawek (kwaclaw) Date: 2006-07-04 13:42 Message: Logged In: YES user_id=290026 I replaced my last attachment with one that includes an update to the docs (reference.html). This solution should fix issue # 1515266 as well. I intend to commit this soon, if no objections are made. Note to Fred: I took out your test for XML_FINISHED and XML_SUSPENDED, as it currently introduces an issue for XML_SUSPENDED, and inconsistent behaviour for XML_FINISHED. We can discuss special treatment of aborting vs. suspending (i.e. ensure no more call-backs when aborting) later, but even as it is, subsequent call-backs can be suppressed by setting the affected handlers to NULL. ---------------------------------------------------------------------- Comment By: Karl Waclawek (kwaclaw) Date: 2006-07-04 09:55 Message: Logged In: YES user_id=290026 The same issue also exists in the doCdataSection() function, and I think the solution I suggested (putting the check if the character data handler is set into the internal loop) also solves bug # 1515266, as I described there. For the case where there is only one call-back, this should not be a performance penalty at all, as there still would be only one check if the handler is set. Attached as xmlparse.c.diff (Internal loop solution) - this also fixes the doCdataSection() function. ---------------------------------------------------------------------- Comment By: Karl Waclawek (kwaclaw) Date: 2006-07-03 16:40 Message: Logged In: YES user_id=290026 I think in most cases this is not a problem. The general parsing loop in doContent() always checks if the characterDataHandler is set first. In the specific case you mentioned, there is a loop within the general loop, and in that internal loop there is no check for NULL. 
We could, for instance, pull the NULL check inside the loop, like your 2nd case, and the result would look like this:

    case XML_TOK_DATA_CHARS:
      if (MUST_CONVERT(enc, s)) {
        for (;;) {
          if (characterDataHandler) {
            ICHAR *dataPtr = (ICHAR *)dataBuf;
            XmlConvert(enc, &s, next, &dataPtr, (ICHAR *)dataBufEnd);
            *eventEndPP = s;
            characterDataHandler(handlerArg, dataBuf,
                                 (int)(dataPtr - (ICHAR *)dataBuf));
            if (s == next)
              break;
            *eventPP = s;
          }
        }
      }
      else if (characterDataHandler) {
        characterDataHandler(handlerArg, (XML_Char *)s,
                             (int)((XML_Char *)next - (XML_Char *)s));
      }
      else if (defaultHandler)
        reportDefault(parser, enc, s, next);
      break;

I am not sure if the performance penalty is that high. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1515600&group_id=10127 From mturner at cc.umanitoba.ca Thu Jul 13 17:16:18 2006 From: mturner at cc.umanitoba.ca (Myron Turner) Date: Thu, 13 Jul 2006 10:16:18 -0500 Subject: [Expat-bugs] Junk after document element Message-ID: <44B663C2.4090506@cc.umanitoba.ca> I'm not a subscriber to the expat bugs list, but I've been developing an XML Pull Parser for PHP, using PHP 5 and expat 1.95.8. When I run tests on earlier versions of expat, for the same XML I will get the "junk after document element" error. I see from google that the xml mailing lists are full of questions concerning this error. I've looked briefly at the expat code and see that tokens are passed to an error checking function which defaults to a return value of XML_ERROR_JUNK_AFTER_DOC_ELEMENT, which could explain why this error appears so often. I've looked through the bug reports and checked the subject headers to the expat bugs mailing list archives, but I can't find any reference to this problem. Has this been noticed and silently fixed?
If in fact this is a bug in earlier versions of expat I'd like to make an accommodation in my own code, in so far as I'm able to determine which version of expat or PHP is being used. Thanks, Myron Turner http://www.mturner.org/XML_PullParser/ http://freshmeat.net/projects/xml_pullparser/ From karl at waclawek.net Thu Jul 13 19:38:51 2006 From: karl at waclawek.net (Karl Waclawek) Date: Thu, 13 Jul 2006 13:38:51 -0400 Subject: [Expat-bugs] Junk after document element In-Reply-To: <44B663C2.4090506@cc.umanitoba.ca> References: <44B663C2.4090506@cc.umanitoba.ca> Message-ID: <44B6852B.8020804@waclawek.net>
Myron Turner wrote:
> I'm not a subscriber to the expat bugs list, but I've been developing an
> XML Pull Parser for PHP, using PHP 5 and expat 1.95.8. When I run
> tests on earlier versions of expat, for the same XML I will get the
> "junk after document element" error. I see from google that the xml
> mailing lists are full of questions concerning this error. I've looked
> briefly at the expat code and see that tokens are passed to an error
> checking function which defaults to a return value of
> XML_ERROR_JUNK_AFTER_DOC_ELEMENT, which could explain why this error
> appears so often.
>
> I've looked through the bug reports and checked the subject headers to
> the expat bugs mailing list archives, but I can't find any reference to
> this problem. Has this been noticed and silently fixed? If in fact
> this is a bug in earlier versions of expat I'd like to make an
> accommodation in my own code, in so far as I'm able to determine which
> version of expat or PHP is being used.
>
>
There is no known bug that I am aware of. XML_ERROR_JUNK_AFTER_DOC_ELEMENT simply means that the buffer passed to the parser contained extra characters after the end of the document. It is a common error not to tell the parser exactly where the last buffer ends. Maybe you should review your parser loop or post it here.
Karl From mturner at cc.umanitoba.ca Fri Jul 14 17:36:14 2006 From: mturner at cc.umanitoba.ca (Myron Turner) Date: Fri, 14 Jul 2006 10:36:14 -0500 Subject: [Expat-bugs] Junk after document element References: 44B663C2.4090506@cc.umanitoba.ca Message-ID: <44B7B9EE.3080208@cc.umanitoba.ca> > > XML_ERROR_JUNK_AFTER_DOC_ELEMENT simply means that the buffer passed to > the parser contained extra characters after the end of the document. > It is a common error not to tell the parser exactly where the last buffer ends. > Maybe you should review your parser loop or post it here. > > Karl Thanks very much. An extra pass through the loop was in fact the problem. Myron -- _____________________ Myron Turner http://www.mturner.org/XML_PullParser/ From noreply at sourceforge.net Sun Jul 16 03:41:41 2006 From: noreply at sourceforge.net (SourceForge.net) Date: Sat, 15 Jul 2006 18:41:41 -0700 Subject: [Expat-bugs] [ expat-Patches-1523242 ] Patch to compile EXPAT with Open Watcom 1.5 Message-ID: Patches item #1523242, was opened at 2006-07-15 21:41 Message generated for change (Tracker Item Submitted) made by Item Submitter You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=310127&aid=1523242&group_id=10127 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: None Group: Feature Request Status: Open Resolution: None Priority: 5 Submitted By: MikeG (greenemk) Assigned to: Nobody/Anonymous (nobody) Summary: Patch to compile EXPAT with Open Watcom 1.5 Initial Comment: The attached archive contains the following: watcomconfig.h Open Watcom header file, copy to lib directory expat.diff Source diffs against 7-10-06 cvs watcom.zip Open Watcom makefiles, unzips to root expat directory (expat-2.0.0\watcom) Allows expat library and files to be compiled with Open Watcom 1.5 (www.openwatcom.org) for OS/2-ECS, NT/Win2000, and Linux. 
All tests pass on all 3 OSs. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=310127&aid=1523242&group_id=10127 From noreply at sourceforge.net Sun Jul 16 21:19:53 2006 From: noreply at sourceforge.net (SourceForge.net) Date: Sun, 16 Jul 2006 12:19:53 -0700 Subject: [Expat-bugs] [ expat-Patches-1523242 ] Patch to compile EXPAT with Open Watcom 1.5 Message-ID: Patches item #1523242, was opened at 2006-07-15 21:41 Message generated for change (Comment added) made by greenemk You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=310127&aid=1523242&group_id=10127 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: None Group: Feature Request Status: Open Resolution: None Priority: 5 Submitted By: MikeG (greenemk) Assigned to: Nobody/Anonymous (nobody) Summary: Patch to compile EXPAT with Open Watcom 1.5 Initial Comment: The attached archive contains the following: watcomconfig.h Open Watcom header file, copy to lib directory expat.diff Source diffs against 7-10-06 cvs watcom.zip Open Watcom makefiles, unzips to root expat directory (expat-2.0.0\watcom) Allows expat library and files to be compiled with Open Watcom 1.5 (www.openwatcom.org) for OS/2-ECS, NT/Win2000, and Linux. All tests pass on all 3 OSs. ---------------------------------------------------------------------- >Comment By: MikeG (greenemk) Date: 2006-07-16 15:19 Message: Logged In: YES user_id=1390255 A quick change to makefile.mif. When building, makefile.mif includes buildopts.inc and watopts.tmp. I moved watopts.tmp to be included after buildopts.inc so compile options can be overridden if expat is being compiled as part of a larger project.
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=310127&aid=1523242&group_id=10127 From noreply at sourceforge.net Thu Jul 20 19:56:00 2006 From: noreply at sourceforge.net (SourceForge.net) Date: Thu, 20 Jul 2006 10:56:00 -0700 Subject: [Expat-bugs] [ expat-Bugs-1526052 ] '<' char in argument string? Message-ID: Bugs item #1526052, was opened at 2006-07-20 17:56 Message generated for change (Tracker Item Submitted) made by Item Submitter You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1526052&group_id=10127 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: None Group: None Status: Open Resolution: None Priority: 5 Submitted By: Davar Learn (davarlearn) Assigned to: Nobody/Anonymous (nobody) Summary: '<' char in argument string? Initial Comment: This may be a problem with the parser, or it may be me trying to go further than the XML standards allow. I'm trying to store keyboard mapping data in arguments. Here is some sample XML code using test data: # problem here ... ... ... In this line the parser takes the '<' char to be the start of the next tag; all further tags are misaligned and my handler ignores them as an invalid data format. This may be how the program is meant to operate. If it is, are there any other ways I can store a string containing '<' characters? I need to be able to use all ASCII (and possibly some non-ASCII) chars to store font conversion mappings. I'm working on a SourceForge project and would be grateful for some feedback.
Thanks for your help, James ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1526052&group_id=10127 From noreply at sourceforge.net Thu Jul 20 19:57:51 2006 From: noreply at sourceforge.net (SourceForge.net) Date: Thu, 20 Jul 2006 10:57:51 -0700 Subject: [Expat-bugs] [ expat-Bugs-1526052 ] '<' char in argument string? Message-ID: Bugs item #1526052, was opened at 2006-07-20 17:56 Message generated for change (Settings changed) made by davarlearn You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1526052&group_id=10127 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: None Group: None Status: Open Resolution: None >Priority: 7 Submitted By: Davar Learn (davarlearn) Assigned to: Nobody/Anonymous (nobody) Summary: '<' char in argument string? Initial Comment: This may be a problem with the parser, or it may be me trying to go further than the XML standards allow. I'm trying to store keyboard mapping data in arguments. Here is some sample XML code using test data: # problem here ... ... ... In this line the parser takes the '<' char to be the start of the next tag; all further tags are misaligned and my handler ignores them as an invalid data format. This may be how the program is meant to operate. If it is, are there any other ways I can store a string containing '<' characters? I need to be able to use all ASCII (and possibly some non-ASCII) chars to store font conversion mappings. I'm working on a SourceForge project and would be grateful for some feedback.
Thanks for your help, James ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1526052&group_id=10127 From noreply at sourceforge.net Fri Jul 21 04:59:48 2006 From: noreply at sourceforge.net (SourceForge.net) Date: Thu, 20 Jul 2006 19:59:48 -0700 Subject: [Expat-bugs] [ expat-Bugs-1526052 ] '<' char in argument string? Message-ID: Bugs item #1526052, was opened at 2006-07-20 12:56 Message generated for change (Comment added) made by turnermm You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1526052&group_id=10127 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: None Group: None Status: Open Resolution: None Priority: 7 Submitted By: Davar Learn (davarlearn) Assigned to: Nobody/Anonymous (nobody) Summary: '<' char in argument string? Initial Comment: This may be a problem with the parser, or it may be me trying to go further than the XML standards allow. I'm trying to store keyboard mapping data in arguments. Here is some sample XML code using test data: # problem here ... ... ... In this line the parser takes the '<' char to be the start of the next tag; all further tags are misaligned and my handler ignores them as an invalid data format. This may be how the program is meant to operate. If it is, are there any other ways I can store a string containing '<' characters? I need to be able to use all ASCII (and possibly some non-ASCII) chars to store font conversion mappings. I'm working on a SourceForge project and would be grateful for some feedback.
Thanks for your help,
James

----------------------------------------------------------------------

Comment By: Myron Turner (turnermm)
Date: 2006-07-20 21:59

Message:
Logged In: YES
user_id=771029

The parser expects src="&lt;"

This will be converted back to '<' by the parser in the output.

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1526052&group_id=10127

From giulio.rossato at infocamere.it  Fri Jul 21 13:52:39 2006
From: giulio.rossato at infocamere.it (Giulio Rossato)
Date: Fri, 21 Jul 2006 13:52:39 +0200
Subject: [Expat-bugs] using '<' and '>' in the attributes
Message-ID: <44C0C007.3030806@infocamere.it>

Hi all,

I have a problem using expat. When I use the characters '<' and '>'
inside an attribute value, I get the error 'not well-formed (invalid
token)'. I get the error with both version 2.0.0 and version 1.95.8 of
expat. The tests are run on the Solaris operating system.

Here is my test code:

fpx10:[sirius]>/home/sirius/development/src_test/sqlexecTest $ cat expattest.c
/*
 * Test of expat lib.
 */
#include <stdio.h>
#include <expat.h>

#define XML_INPUT_FILE "expattest.xml"

static void XMLCALL
startElement(void *userData, const char *name, const char **atts)
{
    int i;

    printf("start tag: <%s>\n", name);
    /* atts is a NULL-terminated list of name/value pairs */
    for (i = 0; atts[i] != NULL; i += 2)
        printf("\tattribute %s=%s\n", atts[i], atts[i+1]);
}

static void XMLCALL
endElement(void *userData, const char *name)
{
    printf("end tag: <%s>\n", name);
}

static int
parseInputXml(FILE *fpIn)
{
    XML_Parser parser;
    char buf[BUFSIZ];
    int done;

    parser = XML_ParserCreate(NULL);
    if (parser == NULL) {
        printf("error creating parser\n");
        return 1;
    }
    XML_SetUserData(parser, NULL);
    XML_SetElementHandler(parser, startElement, endElement);
    do {
        int len = (int)fread(buf, 1, sizeof(buf), fpIn);
        done = feof(fpIn);
        if (XML_Parse(parser, buf, len, done) == XML_STATUS_ERROR) {
            if (XML_GetErrorCode(parser) != XML_ERROR_ABORTED)
                /* cast so the format matches whatever integer type
                   this Expat version returns for the line number */
                printf("parse error at line %lu: %s\n",
                       (unsigned long)XML_GetCurrentLineNumber(parser),
                       XML_ErrorString(XML_GetErrorCode(parser)));
            break;
        }
    } while (!done);
    XML_ParserFree(parser);
    return 0;
}

static void
testExpat(void)
{
    FILE *fpIn;
    int ret;

    fpIn = fopen(XML_INPUT_FILE, "r");
    if (fpIn == NULL) {
        printf("error opening file %s\n", XML_INPUT_FILE);
        return;
    }
    ret = parseInputXml(fpIn);
    fclose(fpIn);
}

int
main(void)
{
    testExpat();
    return 0;
}

If I run this program with various inputs, the outputs that I get are:

fpx10:[sirius]>/home/sirius/development/src_test/sqlexecTest $ cat expattest.xml; ./expattest
start tag:
        attribute name=test 1
end tag:

fpx10:[sirius]>/home/sirius/development/src_test/sqlexecTest $ cat expattest.xml; ./expattest
parse error at line 2: not well-formed (invalid token)

fpx10:[sirius]>/home/sirius/development/src_test/sqlexecTest $ cat expattest.xml; ./expattest
start tag:
        attribute name=test <> 1
end tag:

In the first test the attribute 'name' doesn't contain the characters
'<' and '>' and the test program works. In the second test the
attribute 'name' contains the characters '<' and '>' and the test
program gets an error.
In the third test the characters '<' and '>' have been replaced with
&lt; and &gt; respectively, and the test program runs correctly.

Is there an error in my code? Can the characters < and > be used in
attributes?

Thanks for your feedback.

From noreply at sourceforge.net  Thu Jul 27 20:46:38 2006
From: noreply at sourceforge.net (SourceForge.net)
Date: Thu, 27 Jul 2006 11:46:38 -0700
Subject: [Expat-bugs] [ expat-Bugs-1490371 ] additional config for INSTALL_ROOT
Message-ID:

Bugs item #1490371, was opened at 2006-05-17 09:36
Message generated for change (Comment added) made by nobody
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1490371&group_id=10127

Please note that this message will contain a full copy of the comment
thread, including the initial issue submission, for this request,
not just the latest update.

Category: www.libexpat.org
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Fred L. Drake, Jr. (fdrake)
Summary: additional config for INSTALL_ROOT

Initial Comment:
When I install expat 2.0.0, it always shows me the following error, but
expat 1.9.5 is fine.

camelot# make install
make: Fatal error in reader: Makefile, line 48: Unexpected end of line seen

Line 48 is as follows:

47:ifndef INSTALL_ROOT
48:INSTALL_ROOT=$(DESTDIR)
49:if

----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2006-07-27 11:46

Message:
Logged In: NO

I'm getting the same error, expat-2.0.0.tar.gz, Solaris 8 on Sparc,
using Sun Forte 7 cc.

zeus:/tmp/expat-2.0.0# which make
/usr/ccs/bin/make
zeus:/tmp/expat-2.0.0# which cc
/opt/forte7/SUNWspro/bin/cc
zeus:/tmp/expat-2.0.0# ./configure
checking build system type... sparc-sun-solaris2.9
checking host system type... sparc-sun-solaris2.9
checking for gcc... no
checking for cc... cc
checking for C compiler default output file name... a.out
checking whether the C compiler works... yes
checking whether we are cross compiling... no
checking for suffix of executables...
checking for suffix of object files... o
checking whether we are using the GNU C compiler... no
checking whether cc accepts -g... yes
checking for cc option to accept ANSI C... none needed
checking for a sed that does not truncate output... /usr/bin/sed
checking for egrep... egrep
checking for non-GNU ld... /usr/ucb/ld
checking if the linker (/usr/ucb/ld) is GNU ld... no
checking for /usr/ucb/ld option to reload object files... -r
checking for BSD-compatible nm... /usr/ccs/bin/nm -p
checking whether ln -s works... yes
checking how to recognise dependent libraries... pass_all
checking how to run the C preprocessor... cc -E
checking for ANSI C header files... yes
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for memory.h... yes
checking for strings.h... yes
checking for inttypes.h... yes
checking for stdint.h... no
checking for unistd.h... yes
checking dlfcn.h usability... yes
checking dlfcn.h presence... yes
checking for dlfcn.h... yes
checking for g++... no
checking for c++... no
checking for gpp... no
checking for aCC... no
checking for CC... CC
checking whether we are using the GNU C++ compiler... no
checking whether CC accepts -g... yes
checking how to run the C++ preprocessor... CC -E
checking for g77... no
checking for f77... f77
checking whether we are using the GNU Fortran 77 compiler... no
checking whether f77 accepts -g... yes
checking the maximum length of command line arguments... 262144
checking command to parse /usr/ccs/bin/nm -p output from cc object... ok
checking for objdir... .libs
checking for ar... ar
checking for ranlib... ranlib
checking for strip... strip
checking for cc option to produce PIC... -KPIC
checking if cc PIC flag -KPIC works... yes
checking if cc static flag -Bstatic works... yes
checking if cc supports -c -o file.o... yes
checking whether the cc linker (/usr/ucb/ld) supports shared libraries... yes
checking dynamic linker characteristics... solaris2.9 ld.so
checking how to hardcode library paths into programs... immediate
checking whether stripping libraries is possible... no
checking if libtool supports shared libraries... yes
checking whether to build shared libraries... yes
checking whether to build static libraries... yes
configure: creating libtool
appending configuration tag "CXX" to libtool
checking whether the CC linker (/usr/ucb/ld) supports shared libraries... yes
checking for CC option to produce PIC... -KPIC
checking if CC PIC flag -KPIC works... yes
checking if CC static flag -Bstatic works... yes
checking if CC supports -c -o file.o... yes
checking whether the CC linker (/usr/ucb/ld) supports shared libraries... yes
checking dynamic linker characteristics... solaris2.9 ld.so
checking how to hardcode library paths into programs... immediate
appending configuration tag "F77" to libtool
checking if libtool supports shared libraries... yes
checking whether to build shared libraries... yes
checking whether to build static libraries... yes
checking for f77 option to produce PIC... -KPIC
checking if f77 PIC flag -KPIC works... yes
checking if f77 static flag -Bstatic works... yes
checking if f77 supports -c -o file.o... yes
checking whether the f77 linker (/usr/ucb/ld) supports shared libraries... yes
checking dynamic linker characteristics... solaris2.9 ld.so
checking how to hardcode library paths into programs... immediate
checking for gcc... (cached) cc
checking whether we are using the GNU C compiler... (cached) no
checking whether cc accepts -g... (cached) yes
checking for cc option to accept ANSI C... (cached) none needed
checking for a BSD-compatible install... conftools/install-sh -c
checking for ANSI C header files... (cached) yes
checking whether byte ordering is bigendian... yes
checking for an ANSI C-conforming const... yes
checking for size_t... yes
checking for memmove... yes
checking for bcopy... yes
checking fcntl.h usability... yes
checking fcntl.h presence... yes
checking for fcntl.h... yes
checking for unistd.h... (cached) yes
checking for off_t... yes
checking for stdlib.h... (cached) yes
checking for unistd.h... (cached) yes
checking for getpagesize... yes
checking for working mmap... yes
checking for an ANSI C99-conforming __func__... yes
configure: creating ./config.status
config.status: creating Makefile
config.status: creating expat_config.h
zeus:/tmp/expat-2.0.0# make
make: Fatal error in reader: Makefile, line 48: Unexpected end of line seen

INSTALL_ROOT=$(DESTDIR)

----------------------------------------------------------------------

Comment By: Karl Waclawek (kwaclaw)
Date: 2006-06-01 14:01

Message:
Logged In: YES
user_id=290026

Could you please try a checkout from CVS? If you still have a problem,
then maybe "make" on your system is too old, or otherwise different.

----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2006-06-01 13:33

Message:
Logged In: NO

I'm having the same problem building in a Solaris 10 on Sparc
environment. I'm using 2.0.0 from a .gz tarball.

----------------------------------------------------------------------

Comment By: Karl Waclawek (kwaclaw)
Date: 2006-05-17 10:19

Message:
Logged In: YES
user_id=290026

In which environment do you try to build expat? Is this a checkout from
CVS or did you download the .gz archive?
----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1490371&group_id=10127

From noreply at sourceforge.net  Thu Jul 27 21:32:43 2006
From: noreply at sourceforge.net (SourceForge.net)
Date: Thu, 27 Jul 2006 12:32:43 -0700
Subject: [Expat-bugs] [ expat-Bugs-1490371 ] additional config for INSTALL_ROOT
Message-ID:

Bugs item #1490371, was opened at 2006-05-17 09:36
Message generated for change (Comment added) made by nobody
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1490371&group_id=10127

Please note that this message will contain a full copy of the comment
thread, including the initial issue submission, for this request,
not just the latest update.

Category: www.libexpat.org
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Fred L. Drake, Jr. (fdrake)
Summary: additional config for INSTALL_ROOT

Initial Comment:
When I install expat 2.0.0, it always shows me the following error, but
expat 1.9.5 is fine.

camelot# make install
make: Fatal error in reader: Makefile, line 48: Unexpected end of line seen

Line 48 is as follows:

47:ifndef INSTALL_ROOT
48:INSTALL_ROOT=$(DESTDIR)
49:if

----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2006-07-27 12:32

Message:
Logged In: NO

Sorry, it's Solaris 9 :-). But you get the idea - same error as other
people. I also tried './configure --prefix =/usr/local', no difference.

Changed

ifndef INSTALL_ROOT
INSTALL_ROOT=$(DESTDIR)
endif

to

INSTALL_ROOT=$(prefix)

and it put everything in /usr/local/usr/local. Perhaps a Solaris/GNU
make syntax problem. Tried GNU make, no luck.
----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1490371&group_id=10127
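
On the two reports above about '<' and '>' in attribute values (bug
#1526052 and Giulio Rossato's message): the XML grammar does not allow a
literal '<' or '&' inside an attribute value, so the 'not well-formed
(invalid token)' error is expected, and the value has to be escaped when
the document is written. A literal '>' is permitted, but escaping it too
is harmless. What follows is only a minimal sketch of such escaping in C.
The name escape_attr and its buffer-based interface are assumptions made
for this illustration; it is not part of the Expat API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of Expat): copy src into dst, replacing
 * the characters that are unsafe in a double-quoted XML attribute value
 * with the predefined entity references.
 *
 * Returns the length of the escaped string (without the terminating
 * NUL), or (size_t)-1 if dst is too small to hold the result.
 */
static size_t
escape_attr(const char *src, char *dst, size_t dstsize)
{
    size_t used = 0;

    assert(src != NULL && dst != NULL);

    for (; *src != '\0'; src++) {
        const char *rep;
        char single[2];
        size_t n;

        switch (*src) {
        case '&':  rep = "&amp;";  break;
        case '<':  rep = "&lt;";   break;
        case '>':  rep = "&gt;";   break;
        case '"':  rep = "&quot;"; break;
        default:
            /* ordinary character: copy it through unchanged */
            single[0] = *src;
            single[1] = '\0';
            rep = single;
            break;
        }

        n = strlen(rep);
        /* keep one byte free for the terminating NUL */
        if (used + n + 1 > dstsize)
            return (size_t)-1;
        memcpy(dst + used, rep, n);
        used += n;
    }

    if (used + 1 > dstsize)
        return (size_t)-1;
    dst[used] = '\0';
    return used;
}
```

For example, escape_attr("test <> 1", out, sizeof out) stores
test &lt;&gt; 1 in out. When a document containing that value is parsed,
Expat hands the attribute to the start-element handler as test <> 1
again, which matches the third test run reported above.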