[Tutor] UPDATE: Is there a 'hook' to capture all exits from a python program?

Martin A. Brown martin at linux-ip.net
Sat Mar 21 03:20:33 CET 2015


Hi,

This is mostly a distant footnote to Doug Basberg's original 
question, which I believe is largely answered at this point.

Albert-Jan Roskum, Alan Gauld and Steven D'Aprano were asking about 
signals and how they are handled (in Un*xen).  I am trying to 
address that.

>> Yeah, I know you can catch a signal and add your own handler, but I
>> meant what is the default Python suspend behaviour? Does it execute any
>> outstanding exception blocks? What about finally blocks? Or, if about to
>> exit a context manager, the __exit__ method?

Suspension of the process has nothing to do with Python.  This 
happens in the Un*x scheduler--long before Python is involved.

> Depends on the signal.

Correct.  It all depends on the signal.

Short version:

(Apologies, Jack Nicholson, in demonic form or otherwise):

   * You can't 'handle':  STOP, CONT, KILL, SEGV, BUS.

   * You can handle: HUP, INT, QUIT, USR1, USR2, PIPE, ALRM, TERM and others.

Short advice, with the caveat that I have no experience dealing with 
signals in a realtime environment.  I suggest the following guidelines:

   1. Do not catch a signal you cannot handle (or do not intend to
      handle).

   2. Do everything you can at startup to make sure that the
      environment in which you are operating is as you expect.

   3. Catch all the signals you want to catch and, in response to
      receiving such a signal, do what you need in order to shut down
      cleanly.  This coexists peacefully with the handy
      atexit handlers suggested earlier.

(I am normally not at all a fan of an unspecific try--finally, but I 
get what Peter Otten is suggesting and might make the same choice, 
were I faced with Doug Basberg's situation.)
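Guideline 3 might be sketched as follows.  This is only a minimal 
illustration, assuming Python 3 on Linux; the names cleanup, 
request_shutdown, install_handlers and demo are made up for this 
example, and cleanup_log stands in for whatever real cleanup work the 
program needs.

```python
import atexit
import os
import signal
import sys
import time

# Signals that conventionally ask a process to terminate.
TERMINATION_SIGNALS = (signal.SIGINT, signal.SIGQUIT, signal.SIGTERM)

cleanup_log = []  # hypothetical stand-in for real cleanup work


def cleanup():
    """Registered with atexit, so it runs on any normal interpreter exit."""
    cleanup_log.append("cleaned up")


def request_shutdown(signum, frame):
    """Turn a termination signal into an ordinary SystemExit.

    Raising SystemExit unwinds the stack, so finally blocks, context
    managers and atexit handlers all get their chance to run.
    """
    sys.exit(128 + signum)  # shell convention: 128 + signal number


def install_handlers():
    """Register the atexit hook and one handler per termination signal."""
    atexit.register(cleanup)
    for sig in TERMINATION_SIGNALS:
        signal.signal(sig, request_shutdown)


def demo():
    """Send ourselves SIGTERM and return the exit code the handler requested."""
    install_handlers()
    try:
        os.kill(os.getpid(), signal.SIGTERM)
        time.sleep(5)  # the pending SystemExit surfaces here at the latest
    except SystemExit as exc:
        return exc.code
    return None
```

Because the handler only raises SystemExit, the signal path and the 
normal exit path share the same cleanup code, which is why this sits 
comfortably next to atexit and try--finally.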

> I'm not an expert on how signals work in Linux, or other Unixes, 
> but I expect that, in the absence of a specific signal handler to 
> catch it, the "sleep" (pause?) signal will cause the interpreter 
> to just stop and wait. I think that's signal 19 on Linux, and 18 
> to wake.

Longer version:

I have experience with handling Linux signals and Python.  There may 
be subtle differences on other Un*xen.  If you wish to know more, I 
would suggest reading the chapter on Signals in _Advanced 
Programming in the Unix Environment_ (Chapter 10, in my second 
edition by Stevens & Rago).

You cannot catch nor handle:

   * SIGSTOP (19), because that tells Un*x, "Please remove this
     process from the scheduler, i.e. freeze it!"

   * SIGKILL (9), because that tells Un*x, "Terminate this thing,
     with prejudice!  Do not tell it what happened."

SIGCONT (18) is a special case.  It tells Un*x, "Please restore this 
process to normal scheduling, i.e. unfreeze it."  A handler can be 
installed for it, but the kernel resumes the process regardless, so 
the handler cannot block or delay the unfreezing.

This means your Un*x will never actually deliver SIGSTOP or SIGKILL 
to Python and your program.  (The signal numbers given here are the 
Linux values on x86; they differ on some other platforms.)
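The claim about SIGKILL and SIGSTOP can be checked from Python itself. 
The sketch below assumes Python 3 on a Un*x system; try_to_catch is a 
hypothetical helper name, not part of the signal module.

```python
import signal


def try_to_catch(sig):
    """Return True if a Python handler could be installed for sig."""
    try:
        previous = signal.signal(sig, lambda signum, frame: None)
    except (OSError, ValueError, RuntimeError):
        # The kernel refused: this signal cannot be caught.
        return False
    if previous is not None:
        signal.signal(sig, previous)  # put the original disposition back
    return True


def demo():
    """Compare an ordinary signal with the two uncatchable ones."""
    return {
        "SIGUSR1": try_to_catch(signal.SIGUSR1),
        "SIGKILL": try_to_catch(signal.SIGKILL),
        "SIGSTOP": try_to_catch(signal.SIGSTOP),
    }
```

On Linux I would expect only the SIGUSR1 entry to come back True.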

I believe that you cannot do anything with the following signals:

   * SIGSEGV (11), because this means that there has been a memory
     fault.  Python is a sufficiently high-level language that, if
     this happens, it should not be your code doing it.  (Unless
     you are writing C extensions for Python, and then, of course,
     you know what you are doing....)

   * SIGBUS (7), because this is extraordinarily rare (today), but
     would be a case of trying to access memory that does not exist.

In practice, I have seen SIGSEGV often over the last 20 years 
(perhaps I have worked with flaky software, or perhaps that is just 
something that happens in this line of work).  I have seen SIGBUS 
very rarely (usually a precursor to a machine eating itself for 
lunch).

The signals STOP and CONT are so rarely exhibited that they are 
perceived as exotic specimens when demonstrated.

The KILL signal is the big hammer that everybody learns in their 
first month using any Un*x (which is unfortunate because of the 
power it commands).

> Signal 9 doesn't give the interpreter a chance to do anything. The OS just 
> yanks the carpet out from under its feet and terminates the 
> process with extreme prejudice. Signal 9 cannot be caught, no 
> signal handlers will detect it, no try...finally blocks will run. 
> The process just stops.

Correct.  When the (Linux | Un*x) kernel has a signal 9 for a 
process, that process does not get any chance to clean up.  It 
simply disappears.  It is never given an opportunity to run 
again--i.e. it will never be scheduled again.

> Don't use kill -9 unless you need to. I always try three steps to 
> kill a rogue process:
>
> First use "kill <processid>" with no other arguments. Give it 30 
> seconds or so to let the process tidy up after itself, and if it 
> still hasn't quit, try "kill -HUP <processid>". Again, give it 30 
> seconds or so. Then, if and only if necessary, "kill -9 
> <processid>".

Agreed.  In my experience, most mature sysadmins do this, too.
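The three-step escalation quoted above could be scripted roughly like 
this.  It is a sketch only, assuming Python 3 on Linux and a child 
started with subprocess; stop_process, STUBBORN_CHILD and demo are 
names invented for the illustration, and the short grace period in 
demo is just to keep the example quick.

```python
import signal
import subprocess
import sys

# Escalation order from the quoted advice: plain kill (TERM), HUP, then KILL.
ESCALATION = (signal.SIGTERM, signal.SIGHUP, signal.SIGKILL)

# A deliberately stubborn child: it ignores TERM and HUP.
STUBBORN_CHILD = (
    "import signal, time\n"
    "signal.signal(signal.SIGTERM, signal.SIG_IGN)\n"
    "signal.signal(signal.SIGHUP, signal.SIG_IGN)\n"
    "print('ready', flush=True)\n"
    "time.sleep(60)\n"
)


def stop_process(proc, grace=30.0):
    """Send each signal in turn, waiting up to `grace` seconds after each.

    Returns the last signal that was sent before the process ended.
    """
    for sig in ESCALATION:
        proc.send_signal(sig)
        try:
            proc.wait(timeout=grace)
            return sig
        except subprocess.TimeoutExpired:
            continue  # still running; escalate to the next signal
    raise RuntimeError("process is still running after SIGKILL")


def demo():
    """Run the escalation against the stubborn child."""
    proc = subprocess.Popen(
        [sys.executable, "-c", STUBBORN_CHILD], stdout=subprocess.PIPE
    )
    proc.stdout.readline()  # wait until the child has set up its signals
    last_signal = stop_process(proc, grace=0.5)
    proc.stdout.close()
    return last_signal, proc.returncode
```

A well-behaved child would exit at the first step; only one that 
ignores both TERM and HUP should ever see the KILL.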

>> Or does it just stop and wait till its resumed? Kind of like
>> an implicit yield statement?

Sending a SIGSTOP to a process is equivalent to freezing it in 
memory/process space.  I sometimes think of this as suspend (because 
of ctrl-Z in bash, Alan).  The process does not know that time is 
passing.  It does not get scheduled.  (Its memory may be swapped 
out, but this doesn't matter.)  Sending SIGSTOP to a STOPped process 
is a noop.

Sending a SIGCONT to a process is equivalent to unfreezing it in 
memory/process space.  The process will now be scheduled (along with 
all other processes).  It will resume, not knowing how much time has 
passed since it was last running.  (If it checks gettimeofday() or 
checks the clock in some other fashion, of course, it can determine 
that it was suspended.)  Sending a SIGCONT to a running process is a 
noop.
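The freeze and unfreeze can be watched from a parent process.  This 
is a minimal sketch, assuming Python 3 on Linux (os.WCONTINUED and 
os.WIFCONTINUED are not available everywhere); freeze_and_thaw is a 
made-up name for the example.

```python
import os
import signal
import subprocess
import sys


def freeze_and_thaw():
    """Stop a child with SIGSTOP, resume it with SIGCONT.

    Returns (stopped, continued) as reported by the kernel via waitpid().
    """
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    try:
        os.kill(child.pid, signal.SIGSTOP)
        # WUNTRACED makes waitpid() report a stopped (not exited) child.
        _, status = os.waitpid(child.pid, os.WUNTRACED)
        stopped = os.WIFSTOPPED(status)

        os.kill(child.pid, signal.SIGCONT)
        # WCONTINUED makes waitpid() report that the child was resumed.
        _, status = os.waitpid(child.pid, os.WCONTINUED)
        continued = os.WIFCONTINUED(status)
    finally:
        child.kill()  # the demo is over; do not leave the sleeper around
        child.wait()
    return stopped, continued
```

Note that the child contains no signal-handling code at all; the 
stop and the resume are done entirely by the kernel.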

If you are going to list explicitly the signals you wish to catch, 
the following are the most common; they represent the various ways 
people attempt to tell a process to terminate:  INT, QUIT, TERM and 
maybe HUP.

HUP is special, as it is commonly understood (in the sysadmin world) 
to mean things like 're-read your config file' or 'restart a certain 
routine activity'.  So, it may surprise some folk if a process died 
after receiving a HUP.  This may be desirable--it depends entirely 
on what your software does.
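For the 're-read your config file' convention, one common pattern is 
to have the handler set a flag and let the main loop do the reload. 
A rough sketch, assuming Python 3 on a Un*x system; on_hup, 
load_config and run are hypothetical names, and load_config is only a 
stand-in for reading a real file.

```python
import os
import signal
import time

reload_requested = False


def on_hup(signum, frame):
    """Only set a flag; the main loop does the real work at a safe point."""
    global reload_requested
    reload_requested = True


def load_config():
    """Hypothetical stand-in for re-reading a config file."""
    return {"loaded_at": time.time()}


def run(iterations, send_hup_at=None):
    """Main loop that reloads its config whenever SIGHUP has arrived.

    `send_hup_at` lets the example signal itself on a given iteration,
    simulating `kill -HUP <pid>` from a shell.  Returns the reload count.
    """
    global reload_requested
    signal.signal(signal.SIGHUP, on_hup)
    config = load_config()
    reloads = 0
    for i in range(iterations):
        if i == send_hup_at:
            os.kill(os.getpid(), signal.SIGHUP)
        if reload_requested:
            reload_requested = False
            config = load_config()
            reloads += 1
        time.sleep(0.01)  # placeholder for the program's real work
    return reloads
```

Keeping the handler this small avoids doing file I/O at whatever 
arbitrary point the signal happens to arrive.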

Stopping and starting are hard!

-Martin

-- 
Martin A. Brown
http://linux-ip.net/

