Repost: Can't sys.exit() from SIGTERM handler?

Jeff Epler jepler at unpythonic.net
Mon Jan 5 22:54:30 EST 2004


Python's signal handling is "strange at best".  As I understand it,
when you register a Python function via signal.signal, Python installs a
C signal handler that only sets a flag, and the interpreter periodically
checks whether that flag is set.  If it is, the Python signal handler is
invoked.  This was an expedient design, because the alternative
(running the Python code inside the C signal handler) would mean making
sure that signal arrival at any moment was safe for the state of Python
objects---a virtually impossible task.
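
Here is a minimal sketch of that behavior, written in present-day Python 3 syntax and assuming a POSIX system where SIGUSR1 exists.  The names "events" and "on_usr1" are made up for the illustration.  The process signals itself, and the Python-level handler runs as ordinary Python code in the main flow of the program, some time after the signal was recorded:

```python
import os
import signal
import time

events = []

def on_usr1(signum, frame):
    # Runs as ordinary Python code in the main thread,
    # not inside the C-level signal handler.
    events.append(("handler", signum))

previous = signal.signal(signal.SIGUSR1, on_usr1)
try:
    # The C-level handler only records that the signal arrived.
    os.kill(os.getpid(), signal.SIGUSR1)
    # Give the interpreter a chance to notice the flag (5 s safety limit).
    deadline = time.time() + 5
    while not events and time.time() < deadline:
        time.sleep(0.01)
    events.append(("main", None))
finally:
    # signal.signal() returns None if the old handler was not set from Python.
    signal.signal(signal.SIGUSR1,
                  previous if previous is not None else signal.SIG_DFL)

print(events)
```

The polling loop is only there so the sketch does not depend on exactly when the interpreter gets around to checking the flag.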

#1.  The idiom for "exit" is either
	sys.exit(<arg>)
or
	raise SystemExit, arg
Everywhere you are writing a "blanket except statement" like
	except:
or
	except Exception:
you will have to write something like
	except (SystemExit, KeyboardInterrupt): raise
as an earlier except clause to get the proper behavior.  This will,
ahem, encourage you to avoid overly broad exception handlers.  Most
Python programs I write don't have any such clause, and Python programs
that need to run in a "server" kind of role probably have exactly one
of them, in the right place to kill a particular connection but leave
the server running.
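
A small sketch of that pattern, in present-day Python 3 syntax rather than the syntax used elsewhere in this message.  The function names are invented for the example.  Note that since Python 2.5, SystemExit and KeyboardInterrupt derive from BaseException rather than Exception, so "except Exception:" no longer catches them; the explicit re-raise still matters in front of a bare "except:".

```python
import sys

def run_guarded(task):
    """Run task(); swallow ordinary errors but let exit requests through."""
    try:
        task()
    except (SystemExit, KeyboardInterrupt):
        raise                      # earlier clause: pass exit requests along
    except:                        # the blanket handler
        return "swallowed"
    return "ok"

def failing():
    raise ValueError("boom")

def exiting():
    sys.exit(3)                    # same effect as: raise SystemExit(3)

result = run_guarded(failing)      # the ValueError is swallowed
try:
    run_guarded(exiting)
    exit_code = None
except SystemExit as exc:
    exit_code = exc.code           # the exit request got through
print(result, exit_code)
```

Without the first except clause, the blanket handler would swallow the SystemExit and the program would carry on as if nothing had happened.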

#2.  You're getting into territory where I've never been.  Thinking that
the problem might be with multiple SIGTERMs delivered to the Python
process, I changed my earlier program and got results I don't
understand.  


term2.py:
import os, signal, time, sys, traceback

x = 0
def sig(a, b):
	print "Entering exception handler"
	global x
	x = x + 1
	print "sigterm", x
	print "sleeping in exception handler"
	time.sleep(2)
	print "Exiting exception handler"
	x = x - 1
	sys.exit()

signal.signal(signal.SIGTERM, sig)

os.system("(while true; do sleep 1; kill %s || exit; done) &" % os.getpid())

print "Sleeping (should be killed)"

try:
	time.sleep(2)
except SystemExit:
	traceback.print_exc()
	raise

print "sleep finished (!?)"

$ term2.py 
Sleeping (should be killed)
Entering exception handler
sigterm 1
sleeping in exception handler
 Entering exception handler
sigterm 2
sleeping in exception handler
 Entering exception handler
sigterm 3
sleeping in exception handler
 Entering exception handler
sigterm 4
sleeping in exception handler
					# hit ctrl-c here
Traceback (most recent call last):
  File "term2.py", line 22, in ?
    time.sleep(2)
  File "term2.py", line 12, in sig
    print "Exiting exception handler"
  File "term2.py", line 12, in sig
    print "Exiting exception handler"
  File "term2.py", line 12, in sig
    print "Exiting exception handler"
  File "term2.py", line 12, in sig
    print "Exiting exception handler"
KeyboardInterrupt

Commenting out the line 'print "Exiting exception handler"' (which
never seemed to be reached !?) gives this behavior:

$ python term2.py 
Sleeping (should be killed)
Entering exception handler
sigterm 1
sleeping in exception handler
Entering exception handler
sigterm 1
sleeping in exception handler
$ sh: line 1: kill: (13961) - No such process

If I had to guess, I'd say that in the first example something about the
'print' statement causes pending Python signal handlers to be invoked,
and in the second example that the Python signal handler is invoked
after the 'except' statement's body is executed but before anything is
printed.

Back to your question, remember that Python exceptions raised in a
signal handler can be handled in the main program (that's how sys.exit
works, after all).  So what if the flow of code looks like this:
	sigterm delivered
		waitpid
			sigterm delivered
				waitpid
when the "inner" waitpid finishes, the outer will wait for the same
process and get the ECHILD exception.  If that's the case, then you
could either wait for your processes in a way that is safe in the
presence of a recursive invocation, such as:
	while kids:
		pid = kids.pop()	# take the pid off the list before touching it
		os.kill(pid, signal.SIGTERM)
		os.waitpid(pid, 0)
... another way to make this safe would be a targeted try: except: block
around the waitpid line, if that's the site where you've seen the
exception raised:
	# needs: import errno, os
	try:
		os.waitpid(pid, 0)
	except OSError, detail:
		if detail.errno != errno.ECHILD:
			raise
		# ECHILD: the child is already dead and reaped;
		# react to that fact here
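
Putting the two suggestions together, here is a rough sketch in present-day Python 3 syntax, assuming a POSIX system with os.fork.  The helpers spawn_sleeper, reap and stop_children are names I made up, and the sleeping child is only a stand-in for whatever your real children do.  I have not tried it against your program, so treat it as an illustration rather than a fix:

```python
import errno
import os
import signal
import time

def spawn_sleeper():
    """Fork a child that just sleeps (hypothetical stand-in for real work)."""
    pid = os.fork()
    if pid == 0:
        signal.signal(signal.SIGTERM, signal.SIG_DFL)  # die on SIGTERM
        try:
            time.sleep(60)
        finally:
            os._exit(0)
    return pid

def reap(pid):
    """Wait for pid; return False if it was already reaped (ECHILD)."""
    try:
        os.waitpid(pid, 0)
    except OSError as detail:
        if detail.errno != errno.ECHILD:
            raise
        return False
    return True

def stop_children(kids):
    """Pop each pid before touching it, so a nested call never sees it."""
    while kids:
        pid = kids.pop()
        os.kill(pid, signal.SIGTERM)
        reap(pid)

kids = [spawn_sleeper() for _ in range(2)]
first = kids[0]
stop_children(kids)
print(kids, reap(first))
```

The second reap(first) call stands in for the "outer" waitpid from the scenario above: the child is already gone, and the ECHILD error is absorbed instead of propagating.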

Handling child processes is hopelessly subtle (especially when signals are
in the mix), and I'm sure I've gotten some things here terribly wrong too.
I wish there was somebody else offering you better advice.

Jeff



