A QFB agent: how to catch C-level crashes and last Python stack?

robert no-spam at no-spam-no-spam.com
Sat Apr 29 15:11:54 EDT 2006


Thomas Heller wrote:
> robert wrote:
> 
>> When employing complex UI libs (wx, win32ui, ..) and other extension 
>> libs, nice "only Python stack traces" remain a myth.
>>
>> Currently I'm again hunting a rare C-level crash bug in a Python-based 
>> Windows app, with only rare user reports, and am still in the dark (I get 
>> snippets of machine stack traces / screenshots with random "mem. 
>> access error" / "python caused the runtime to terminate in an unusual 
>> way" / ..)
>>
>> I'd like to hook in a kind of quality feedback agent at C-level to enable 
>> the user to transfer a report (similar to what Mozilla/Netscape has), 
>> e.g. in a changed python.exe stub. Next to the machine stack/regs it 
>> should grab the relevant last Python thread stack(s), if any, and 
>> maybe other useful status and Python global vars.
>> (There are also Python threads going on.)
>>
>> Is that possible?
>>
>> -robert
> 
> 
> It looks like this may be what you want.  Quoting from
> 
> http://www.usenix.org/events/usenix01/full_papers/beazley/beazley.pdf
> 
> """
> In recent years, scripting languages such as Perl,
> Python, and Tcl have become popular development tools
> for the creation of sophisticated application software.
> One of the most useful features of these languages is
> their ability to easily interact with compiled languages
> such as C and C++. Although this mixed language approach
> has many benefits, one of the greatest drawbacks
> is the complexity of debugging that results from using
> interpreted and compiled code in the same application.
> In part, this is due to the fact that scripting language
> interpreters are unable to recover from catastrophic errors
> in compiled extension code. Moreover, traditional
> C/C++ debuggers do not provide a satisfactory degree
> of integration with interpreted languages. This paper
> describes an experimental system in which fatal extension
> errors such as segmentation faults, bus errors, and
> failed assertions are handled as scripting language exceptions.
> This system, which has been implemented as
> a general purpose shared library, requires no modifications
> to the target scripting language, introduces no performance
> penalty, and simplifies the debugging of mixed
> interpreted-compiled application software.
> """
> 
> It may be an interesting project to port this to Windows.
> 
> Hope that helps,
> 

Thanks, exactly. That would be quite luxurious, as WAD even manages to 
continue normal Python execution on the current stack (only in a UNIX 
main thread?) by raising a regular Python exception. That would make pets 
out of all bugs: users would no longer face nameless app freezes, and 
detailed reports would help eliminate bugs in distributed programs.

Simply catching signal.SIGSEGV with Python's signal module just causes 
looping at 100% CPU load (the GIL is not forced?).
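
The looping is what one would expect whenever a SIGSEGV handler simply 
returns: the faulting instruction is restarted and faults again. As far as 
I understand it, Python's signal module adds to this because its C-level 
handler only sets a flag and runs the Python-level handler later, between 
bytecodes. To get out, the handler has to leave through a non-local jump 
instead of returning. Below is a minimal POSIX-only sketch of that idea. It 
is not WAD's actual mechanism, and install_fault_handler, guarded_call, 
on_fatal_signal, simulated_crash and well_behaved are names made up for 
illustration. The crash is simulated with raise() to keep the example 
well-defined.

```cpp
// Sketch only (POSIX, not Windows). All names here are hypothetical and
// are not part of WAD, ObjectStore PSE, or the Python C API.
#include <cassert>
#include <setjmp.h>
#include <signal.h>

// Per-thread state, so every thread recovers onto its own stack.
static thread_local sigjmp_buf g_recover_point;
static thread_local volatile sig_atomic_t g_guard_active = 0;
static thread_local volatile sig_atomic_t g_caught_signal = 0;

extern "C" void on_fatal_signal(int signo)
{
    if (g_guard_active) {
        g_caught_signal = signo;
        // Jump back into guarded_call. Returning instead would re-execute
        // the faulting instruction and fault again, forever.
        siglongjmp(g_recover_point, 1);
    }
    // No guard on this thread: restore the default action and re-raise.
    // The signal is delivered once this handler returns.
    signal(signo, SIG_DFL);
    raise(signo);
}

// Installs the handler process-wide for SIGSEGV and SIGBUS.
bool install_fault_handler()
{
    struct sigaction sa = {};
    sa.sa_handler = on_fatal_signal;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, nullptr) == 0
        && sigaction(SIGBUS, &sa, nullptr) == 0;
}

// Runs fn(arg). Returns 0 on normal completion, or the number of the
// signal that was caught. fn must be C-like: destructors of C++ objects
// living between here and the fault are skipped by siglongjmp.
int guarded_call(void (*fn)(void*), void* arg)
{
    assert(!g_guard_active && "nested guarded_call is not supported");
    g_caught_signal = 0;
    if (sigsetjmp(g_recover_point, 1) == 0) {  // 1: also save signal mask
        g_guard_active = 1;
        fn(arg);
    }
    g_guard_active = 0;
    return g_caught_signal;
}

// Stand-in for a crashing extension function: raise() delivers SIGSEGV
// synchronously to the calling thread without undefined behavior.
void simulated_crash(void*) { raise(SIGSEGV); }

// Stand-in for a healthy function.
void well_behaved(void* p) { *static_cast<int*>(p) = 42; }
```

After one call to install_fault_handler(), guarded_call(simulated_crash, 
nullptr) returns SIGSEGV instead of killing the process, and a following 
guarded_call on a healthy function returns 0. Jumping out of a real fault 
may leave locks or heap state inconsistent, so this is only good for 
writing a report and making a controlled exit, not for carrying on blindly.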

WAD apparently has not evolved any further since its beginnings.

The author says that a Windows port would be quite complex, because 
advanced execution context manipulation is apparently not possible with 
signals on Windows.
Yet he mentions the Windows "structured exception handling" (SEH), which 
may enable that.
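
For what it's worth, here is a rough idea of how SEH could wrap a call: a 
minimal sketch, assuming the Microsoft compiler (__try/__except are MSVC 
extensions). seh_guarded_call and store_answer are hypothetical names. On 
other compilers the sketch degrades to a plain call so that it still 
builds. It only catches the fault and returns the exception code; 
collecting machine registers or Python stacks would need a real filter 
function and is not shown.

```cpp
// Sketch only. seh_guarded_call is a made-up name, not a Python or
// Windows API.
#if defined(_MSC_VER)
#include <windows.h>

// Runs fn(arg) under an SEH frame. Returns 0 on normal completion,
// otherwise the SEH exception code (e.g. EXCEPTION_ACCESS_VIOLATION).
// Note: the function containing __try must not hold C++ objects that
// need unwinding, so keep it this small.
unsigned long seh_guarded_call(void (*fn)(void*), void* arg)
{
    __try {
        fn(arg);
    } __except (EXCEPTION_EXECUTE_HANDLER) {
        // Catches every structured exception, which is too broad for
        // production use (stack overflows, for example, need extra care).
        return GetExceptionCode();
    }
    return 0;
}
#else
// No SEH on this compiler: plain call, faults are NOT caught here.
unsigned long seh_guarded_call(void (*fn)(void*), void* arg)
{
    fn(arg);
    return 0;
}
#endif

// Harmless callee used to exercise the wrapper.
void store_answer(void* p) { *static_cast<int*>(p) = 42; }
```

Since SEH frames are per thread, every thread start function would need 
such a wrapper, the same way as with the PSE handler.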

I know that the fast ObjectStore OO database uses exactly that method 
to catch seg faults and recover execution, while updating a kind of 
memory-mapped database in the system.
I once integrated that OO-DBMS exception handler with Python some years 
back, by making a C++ file out of python.c as shown below, or even by 
using PSECall alone, which is needed anyway to wrap every other Python 
thread (start function) in the seg fault handler. Thread control is 
probably even better than on Linux. Yet so far I don't know whether it's 
possible to find the valid Python exit functions on the stack (for 
returning with exceptions) in the same style as in Unix ELF 
binaries ...

--

#define OS_PSE_ESTABLISH_FAULT_HANDLER \
{ try { _PSE_NS_ _ODI_fault_handler _ODI_handler; try {
..
#define OS_PSE_END_FAULT_HANDLER \
   } catch (_PSE_NS_ os_err& e) { e.print(); } \
} catch (_PSE_NS_ _ODI_fault_handler&) {} }
..


#include <os_pse/ostore.hh>
#include "Python.h"

extern "C" { DL_EXPORT(int) Py_Main(int, char**);}

int main(int argc, char **argv)
{
    int result = 0;

    // The top of the stack of every thread of every program
    // must be wrapped in a fault handler
    // This macro begins the handler's scope
    OS_PSE_ESTABLISH_FAULT_HANDLER

    result = Py_Main(argc, argv);

    // This macro ends the fault handler's scope
    OS_PSE_END_FAULT_HANDLER

    return result;
}

---
#ifdef SWIG

//# for wrapping python threads with the PSE fault handler
%native(PSECall) _wrap_PSECall;
%{
//#include <os_pse/ostore.hh>
//#include "Python.h"
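static PyObject *_wrap_PSECall(PyObject *self, PyObject *args) {
    PyObject *func = NULL;
    PyObject *funcargs = NULL;
    // Stays NULL if a fault is caught before the call returns
    PyObject *result = NULL;

    if (!PyArg_ParseTuple(args, "OO:PSECall", &func, &funcargs)) return NULL;

    // The top of the stack of every thread of every program
    // must be wrapped in a fault handler
    // This macro begins the handler's scope
    OS_PSE_ESTABLISH_FAULT_HANDLER

    result = PyEval_CallObject(func, funcargs); // Call Python

    // This macro ends the fault handler's scope
    OS_PSE_END_FAULT_HANDLER

    // A caught fault leaves result NULL with no Python error set;
    // report it as an exception instead of returning an undefined value
    if (result == NULL && !PyErr_Occurred())
        PyErr_SetString(PyExc_RuntimeError, "fault caught in PSECall");

    return result;
}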
%}
#endif // SWIG

---


-robert



More information about the Python-list mailing list