SIGSEGV and SIGILL inside PyCFunction_Call

Anders Wegge Keller wegge at wegge.dk
Thu Jul 20 08:03:23 EDT 2017


On Thu, 20 Jul 2017 07:44:26 +0200
dieter <dieter at handshake.de> wrote:
> Anders Wegge Keller <wegge at wegge.dk> writes:

...

>>  Weird observation #1: Sometimes the reason is SIGSEGV, sometimes it's
>> SIGILL.   
 
> Python tends to be sensitive to the stack size. In previous times,
> there have often been problems because the stack size for threads
> has not been large enough. Not sure whether "nnrpd" is multi-threaded
> and provides a sufficiently large stack for its threads.

 Luckily, the "threading model" of nnrpd is fork().
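
 Since a fork()ed child inherits its parent's resource limits, the stack limit the interpreter actually runs under can be checked from inside the process. A minimal sketch, assuming a Unix host where the standard `resource` module is available; the helper name `describe_stack_limits` is made up for illustration:

```python
import resource
import threading


def describe_stack_limits():
    """Report the process stack rlimit and the stack size used for new threads."""
    soft, hard = resource.getrlimit(resource.RLIMIT_STACK)

    def fmt(value):
        # RLIM_INFINITY means no limit is enforced
        return "unlimited" if value == resource.RLIM_INFINITY else value

    return {
        "soft": fmt(soft),
        "hard": fmt(hard),
        # 0 means the platform default is used for threads created by Python
        "thread_default": threading.stack_size(),
    }


if __name__ == "__main__":
    print(describe_stack_limits())
```

 If the soft limit turns out to be unusually small, a too-small stack stays on the list of suspects; otherwise it can probably be ruled out.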

> A "SIGILL" often occurs because a function call has destroyed part
> of the stack content and the return is erroneous (returning in the midst
> of an instruction).

 I think you're right. That also explains why gdb has trouble with the last
stack frame.


>>  I'm not ready to give up yet, but I need some help proceeding from here.
>> What does the C_TRACE macro really do,
 
> The casing (all upper case letters) indicates a C preprocessor macro.
> Search the "*.h" files for its definition.

 I know where it is. I just don't feel like deciphering a 60-line
monstrosity before at least asking if someone has an intimate enough
relationship with it to give a TL;DR.

> I suppose that with a normal Python build (no debug build), the
> macro will just call "PyCFunction_Call".
> Alternatively, it might provide support for debugging, tracing
> (activated by e.g. "pdb.set_trace()").

 Probably. I can see I have to dig into it.
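
 As far as I can tell from ceval.c, the macro checks whether a profile function is installed. If one is, it reports a C call event before the call and a C return (or C exception) event after it; if not, it just makes the call. The Python-level view of that hook is `sys.setprofile`, which sees `c_call`, `c_return` and `c_exception` events. A small sketch of that behaviour, not of the macro itself; `record_c_events` and `call_builtin` are names made up here:

```python
import sys


def call_builtin():
    # len() is implemented in C, so it is dispatched as a C function call
    return len("nnrpd")


def record_c_events(func):
    """Run func() under a profile hook and collect the C-call events it emits."""
    events = []

    def profiler(frame, event, arg):
        # For the c_* events, arg is the C function object being called
        if event in ("c_call", "c_return", "c_exception"):
            events.append((event, getattr(arg, "__name__", repr(arg))))

    sys.setprofile(profiler)
    try:
        func()
    finally:
        sys.setprofile(None)
    return events


if __name__ == "__main__":
    for event in record_c_events(call_builtin):
        print(event)
```

 With no profile function installed, none of this bookkeeping runs, which matches the guess that a normal run goes more or less straight to PyCFunction_Call.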

>> and is there some way of getting a level
>> deeper, to see what causes the SEGV. Also, how can the C code end up with
>> an illegal instruction?

...

> Unfortunately, stack corruption is a non-local problem (the point
> where the problem is caused is usually far away from the point
> where it is observed).
> 
> If the problem is not "too small stack size", you might need
> a tool to analyse memory overwrites.

 The trouble with that is that nnrpd is a system daemon, and as such is a
bit difficult to trace in place. That's why I am asking for help
reasoning about the cause, before I have to resort to running a debugger as a
privileged user.
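
 One option short of a debugger is the `faulthandler` module, which dumps the Python-level traceback when the process receives SIGSEGV, SIGILL, SIGBUS, SIGFPE or SIGABRT. A minimal sketch, assuming the embedded interpreter is Python 3.3 or later (or has the faulthandler backport installed) and that some startup hook can run this code; the log path is hypothetical:

```python
import faulthandler


def enable_crash_log(path):
    """Send fatal-signal tracebacks to `path`; the file must stay open."""
    log = open(path, "a")
    # all_threads=True dumps every thread, not just the one that crashed
    faulthandler.enable(file=log, all_threads=True)
    return log


if __name__ == "__main__":
    # Hypothetical location; pick one the daemon's user can write to
    crash_log = enable_crash_log("/tmp/nnrpd-python-crash.log")
```

 This only shows which Python call was active when the signal arrived, not where the stack was corrupted, but it narrows down which hook to look at.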


-- 
//Wegge


