[XML-SIG] [ pyxml-Bugs-1519384 ] bug in xmlparse_GetInputContext

SourceForge.net noreply at sourceforge.net
Sun Jul 9 01:51:44 CEST 2006


Bugs item #1519384, was opened at 2006-07-08 16:51
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=106473&aid=1519384&group_id=6473

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Nelson Arzola (narzola72)
Assigned to: Nobody/Anonymous (nobody)
Summary: bug in xmlparse_GetInputContext

Initial Comment:
SHORT VERSION:
I think the call in extensions/pyexpat.c:1089

result = PyString_FromStringAndSize(buffer + offset, size) 

should be:

result = PyString_FromStringAndSize(buffer + offset,
size - offset)


With this change, my application no longer core
dumps.
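
A minimal sketch of the length arithmetic behind this change,
assuming that buffer, offset and size have the meaning given by
Expat's XML_GetInputContext (size is the length of the whole
buffer, offset is the current position inside it). The helper
names context_tail_length and copy_context_tail are hypothetical,
and plain malloc/memcpy stand in for PyString_FromStringAndSize:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: number of bytes that may safely be read
 * starting at buffer + offset, for a buffer of `size` bytes.
 * Returns -1 if the offset lies outside the buffer. */
static int context_tail_length(int offset, int size)
{
    if (offset < 0 || size < 0 || offset > size)
        return -1;
    return size - offset;
}

/* Hypothetical stand-in for the PyString_FromStringAndSize call:
 * copies only the tail of the buffer into a freshly allocated,
 * NUL-terminated string. The caller frees the result.
 * Returns NULL on a bad argument or allocation failure. */
static char *copy_context_tail(const char *buffer, int offset, int size)
{
    int len = context_tail_length(offset, size);
    char *out;

    if (buffer == NULL || len < 0)
        return NULL;
    assert(len <= size);  /* the tail can never be longer than the buffer */

    out = malloc((size_t)len + 1);
    if (out == NULL)
        return NULL;
    /* Reads buffer[offset] .. buffer[size - 1] and nothing beyond. */
    memcpy(out, buffer + offset, (size_t)len);
    out[len] = '\0';
    return out;
}
```

Passing size instead of size - offset as the length would make
the copy end at buffer + offset + size, which is offset bytes
past the end of the buffer.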




LONG VERSION:
I have an Apache + mod_python + Expat (2.0.0) application.

Under Linux (Gentoo), everything works as expected.

Under Mac OS X, everything works as expected until I
increase the size of a particular XML template file. 
The entire application will segfault.  It does not
matter what I add to this file.  It can be whitespace,
comments, or additional XML markup.

I've included some of the output I gathered from gdb. 
Here is what I am sure of:

I put a breakpoint on extensions/pyexpat.c:1089.  The
31st call from this function to
PyString_FromStringAndSize(buffer + offset, size)
causes the segfault.

I then followed this 31st call to
PyString_FromStringAndSize.  As I stepped through the
execution, the call to memcpy(op->ob_sval, str, size)
in Objects/stringobject.c:80 is the culprit.

Just before this call, I printed out the values of the
arguments:

(gdb) print *op
$14 = {
  ob_refcnt = 1, 
  ob_type = 0x508f88, 
  ob_size = 11070, 
  ob_shash = -1, 
  ob_sstate = 0, 
  ob_sval = "<"
}
(gdb) 


(gdb) print op->ob_sval + 0
$23 = 0x1aa8e14 "<li><a tal:href=\"string:
/public/justforstudents.xhtml\">Just for
Students</a></li>\n\t<li>&nbsp;</li>\n\t<li><a
tal:href=\"string:
/public/departments.xhtml\">Departments</a></li>\n\t<li><a
tal:href=\"string: "...
(gdb) 


(gdb) print str
$15 = 0x16b44c3 "<a tal:href=\"string:
/public/justforstudents.xhtml\">Just for
Students</a></li>\n\t<li>&nbsp;</li>\n\t<li><a
tal:href=\"string:
/public/departments.xhtml\">Departments</a></li>\n\t<li><a
tal:href=\"string: /pub"...
(gdb) 


(gdb) print size
$16 = 11070
(gdb)


When I allow the execution to continue to the memcpy, I
get the following exception:

(gdb) s

Program received signal EXC_BAD_ACCESS, Could not
access memory.
Reason: KERN_INVALID_ADDRESS at address: 0x016b7000
0xffff8824 in ___memcpy () at
/System/Library/Frameworks/System.framework/PrivateHeaders/ppc/cpu_capabilities.h:189
189    
/System/Library/Frameworks/System.framework/PrivateHeaders/ppc/cpu_capabilities.h:
No such file or directory.
        in
/System/Library/Frameworks/System.framework/PrivateHeaders/ppc/cpu_capabilities.h
(gdb) 


Backing up to frame 1 (Objects/stringobject.c:80), I
can see that "str" is not "size" (11070) bytes long,
but actually less:

(gdb) print (long) strlen(str)
$31 = 5755


When I back up to frame 2 (extensions/pyexpat.c:1089), I
can see that:

(gdb) print size - offset
$34 = 5755


I think this works under Linux because Linux actually
allocates memory in larger chunks than requested, or
because the overrun area is never actually accessed.
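
The figures from the gdb session can be cross-checked with a
little arithmetic. The sketch below uses hypothetical helper
names and takes the printed values at face value. It computes
how many bytes the original call asks for beyond what is
available, and how far the faulting address lies from str. With
the values above, the over-read is 5315 bytes, and the faulting
address 0x016b7000 is byte index 11069 from str, that is, the
last of the 11070 bytes requested. This is consistent with an
over-read that only faults once it crosses into an unmapped
page, although the actual memory layout on either platform is
an assumption here.

```c
#include <assert.h>

/* Bytes requested beyond what is actually available
 * (0 if the request fits). */
static long overread_bytes(long requested, long available)
{
    return requested > available ? requested - available : 0;
}

/* Zero-based index of the byte at address `addr`, counted from
 * `start`. Addresses are passed as plain integers because they
 * are only used for arithmetic, never dereferenced. */
static long byte_index_of(unsigned long start, unsigned long addr)
{
    assert(addr >= start);
    return (long)(addr - start);
}
```

For example, overread_bytes(11070, 5755) gives 5315, and
byte_index_of(0x16b44c3UL, 0x16b7000UL) gives 11069.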

----------------------------------------------------------------------


