[XML-SIG] Eureka! (Core dumps in 4Suite and PyXML)

Uche Ogbuji uche.ogbuji@fourthought.com
18 Sep 2002 14:45:43 -0600


Well, I found the source of the core dumps, I think.

First of all, I used valgrind.  After ignoring the well-known spurious
error reports for Python, the only remaining reports were for invalid
memory writes in expat/lib/xmlparse.c.

For example:

==6975== Invalid write of size 2
==6975==    at 0x43A98077: doContent (Ft/Xml/src/expat/lib/xmlparse.c:2110)
==6975==    by 0x43A9E59E: contentProcessor (Ft/Xml/src/expat/lib/xmlparse.c:1691)
==6975==    by 0x43A9E1E6: XML_ParseBuffer (Ft/Xml/src/expat/lib/xmlparse.c:1394)
==6975==    by 0x43A9E18C: XML_Parse (Ft/Xml/src/expat/lib/xmlparse.c:1382)
==6975==    Address 0x40B114DE is 10 bytes before a block of size 28 free'd
==6975==    at 0x40044946: free (vg_clientfuncs.c:180)
==6975==    by 0x80579CD: _PyObject_Del (Objects/object.c:143)
==6975==    by 0x805C9EC: string_dealloc (Objects/stringobject.c:504)
==6975==    by 0x805C31B: PyString_InternInPlace (Objects/stringobject.c:3628)
==6975== 

So, since I recently upgraded 4Suite to use expat 1.95.5, I backed that
out and restored the 1.95.4 files.  The core dumps went away in 4Suite.
I also tried the PyXML releases from before the move to the newest expat,
both 0.7.1 and 0.8.0.  No core dumps in either case.

So it seems to me it's something in expat 1.95.5.

Following the pointer from valgrind gives the following block of lines
as the suspect:

  2103            if (ns && localPart) {
  2104              /* localPart and prefix may have been overwritten in
  2105                 tag->name.str, since this points to the binding->uri
  2106                 buffer which gets re-used; so we have to add them again
  2107              */
  2108              uri = (XML_Char *)tag->name.str + tag->name.uriLen;
  2109              /* don't need to check for space - already done in storeAtts() */
  2110              while (*localPart) *uri++ = *localPart++;
  2111              prefix = (XML_Char *)tag->name.prefix;
  2112              if (ns_triplets && prefix) {
  2113                *uri++ = namespaceSeparator;
  2114                while (*prefix) *uri++ = *prefix++;
  2115               }
  2116              *uri = XML_T('\0');
  2117            }

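Line 2110 is the one valgrind singles out.  It reports that the address
being written lies just before a block that has already been freed.  It
doesn't name the variable, but since this is an invalid write it should
be the destination, uri, rather than localPart, which is only read there.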
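
To make the data flow in that block easier to reason about, here is a
minimal standalone sketch of what lines 2108-2116 appear to do.  It is
not expat code: the function name append_name, the cap parameter, and
the use of plain char instead of XML_Char are all my own assumptions,
and unlike the original it checks the buffer capacity explicitly rather
than relying on an earlier check in storeAtts().  I also assume uri_len
already counts the separator stored after the URI.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the loop at xmlparse.c:2108-2116.
 * buf holds the namespace URI (plus separator) in its first uri_len
 * characters.  The local part and, if prefix is non-NULL, the separator
 * and prefix are re-appended after it.  local_part and prefix must not
 * point into buf.  Returns 0 on success, or -1 (leaving buf untouched)
 * if the result would not fit in cap characters including the '\0'. */
int append_name(char *buf, size_t cap, size_t uri_len,
                const char *local_part, const char *prefix, char sep)
{
    char *uri;
    size_t need;

    assert(buf != NULL && local_part != NULL);

    need = uri_len + strlen(local_part) + 1;      /* +1 for '\0' */
    if (prefix != NULL)
        need += 1 + strlen(prefix);               /* separator + prefix */
    if (need > cap)
        return -1;

    uri = buf + uri_len;                          /* like line 2108 */
    while (*local_part)
        *uri++ = *local_part++;                   /* like line 2110 */
    if (prefix != NULL) {
        *uri++ = sep;
        while (*prefix)
            *uri++ = *prefix++;
    }
    *uri = '\0';
    return 0;
}
```

Note that a capacity check like this only guards against running off
the end of a live buffer.  If the buffer behind tag->name.str has been
freed or reallocated in the meantime, which is what the valgrind report
suggests, the write is invalid regardless of how much space was reserved.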

I've backed out of the latest expat on my own machine so I can continue
working.  Since Jeremy is also seeing core dumps, I'll probably check in
that reversion.  But I'd like any assistance making sure I'm not off my
skull.  Anyone have any other ideas for verification?


-- 
Uche Ogbuji                                    Fourthought, Inc.
http://uche.ogbuji.net    http://4Suite.org    http://fourthought.com
Apache 2.0 API -
http://www-106.ibm.com/developerworks/linux/library/l-apache/
Basic XML and RDF techniques for knowledge management, Part 7 -
http://www-106.ibm.com/developerworks/xml/library/x-think12.html
Keeping pace with James Clark -
http://www-106.ibm.com/developerworks/xml/library/x-jclark.html