Python, threads, and signals (oh my!)

Donn Cave donn at u.washington.edu
Wed Mar 14 14:32:35 EST 2001


Quoth jlowe at mentos.urbana.css.mot.com (Jason Lowe):
[
  I'm quoting this lengthy post intact partly just because it was
  interesting and well researched and could be missed in the heavy
  traffic here.
]
| I've been playing around with Python and threads, and I've noticed some
| odd and often unstable behavior.  In particular, on my Solaris 8 box I
| can get Python 1.5.2, 1.6, or 2.0 to core dump every time with the
| following sequence.  I've also seen this happen on Solaris 6.
|
|
| 1. Enter the following code into the interactive interpreter:
| --
| import threading
|
| def loopingfunc():
|   while 1: pass
|
| threading.Thread(target=loopingfunc).start()
| --
|
| 2. Send a SIGINT signal (usually Ctrl-C, your terminal settings may
|    vary).  "Keyboard Interrupt" is displayed and so far everything looks 
|    fine.
|
| 3. Now simply press the <Enter> key to enter a blank line in the
|    interpreter.  For my Solaris 8 box with the GNU readline 2.2 module
|    present, this always ends up in a core dump.  It may take a while,
|    since at this point the readline signal handler is being re-entered
|    recursively until the stack overflows.
|
|
| Looking more into this, it appears that on Solaris, more than one
| thread is processing the signal handler installed by the readline
| module (according to truss output of the process).  Unfortunately, the
| readline signal handlers don't support being re-entered, as they have
| global data that's not protected.  
|
| Now granted, signals and threads are a dangerous business.  However, it
| would be nice if the user sending SIGINT to a script wouldn't cause
| instability.  Looking at the Python module sources, I noticed that the
| signal module is somewhat "thread aware" -- it allows signal
| handlers to be installed only by the main thread.  However, I found it a
| little odd that the thread support, both in the core interpreter and in
| the thread module, had no support code for signals -- even in the
| specific thread cases like POSIX (pthreads).
|
| According to various pthread documentation, it appears the "right" way
| to handle the often volatile signal/thread mix is to mask all signals
| except in one thread (usually the main thread) so that only one thread
| will receive the signal.  Unfortunately, I couldn't find any Python
| module that would allow one to change the signal mask of a thread
| (pthread_sigmask) or even a process (sigmask, sigprocmask, etc.).  I'm
| assuming the lack of this interface has to do with portability across
| platforms.
|
| I was able to solve the problem by modifying Python/thread_pthread.h's
| PyThread_start_new_thread() to block all signals before creating the new 
| thread and then restoring the signal mask after the new thread was off
| and running.  Therefore, all threads created by Python except the
| initial thread will have all signals masked with this change.  Masking
| all signals in new Python threads makes sense to me, given that the
| signal module doesn't like other threads installing handlers anyway.
|
| As a side note, I think part of the problem with sending SIGINT at the
| interactive prompt while threading is aggravated by the Python readline
| module's signal handler.  It setjmp()'s and longjmp()'s to do its dirty
| work, and many thread implementations require the longjmp() be performed 
| by the thread that did the setjmp().  If signals aren't masked in all
| threads except the one doing the readline() call, then this isn't
| guaranteed.
|
| So what's everyone else's take on this?  Has anyone else experienced
| problems in Python when using threads and receiving signals (like
| SIGINT)?  Should the thread code be at least a little more "signal
| aware", so that new Python threads have all signals masked on platforms
| that support this?
|
| Thanks in advance.
|
| Jason Lowe
| --
| Jason Lowe                                    Urbana Design Center
| Motorola Personal Communications Sector       1800 South Oak Street
| jlowe at urbana.css.mot.com                      Champaign, IL  61820-6947
|                                               (217) 384-8513

I ran some experiments on other UNIX platforms, and Solaris is not
alone in handling signals poorly when there are multiple concurrent
threads.  In general, signals were trapped but Python handler functions
weren't executed.

My take is that it's complicated.  Does it make sense to split out
readline's special problems as an issue that could be addressed
separately?  In my experiments the problems were still there without
readline, and even without readline, signal handling at the
interpreter prompt differed from signal handling when running a
script from a disk file.

I can't think of any reason why it would hurt anything to block
signals by default in every non-main thread.  I don't think it will
work on many platforms, but if it works on anyone's I guess
that's better than nothing.  (And the one I would be most concerned
about, BeOS, doesn't seem to need it, as signals seem to work more
or less as intended in multithreaded programs there.)
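
To make the block/create/restore idea concrete, here is a minimal
sketch at the Python level.  It is not the thread_pthread.h change
described in the quoted post, only the same pattern.  It assumes a
much later Python (3.8 or newer, on a POSIX platform) where the signal
module provides pthread_sigmask() and valid_signals(); nothing like
this exists in 1.5.2 through 2.0.  The helper name
start_thread_signals_blocked is made up for illustration.

```python
import signal
import threading


def start_thread_signals_blocked(target, *args):
    """Start a thread whose initial signal mask blocks every signal.

    Hypothetical helper: block all signals in the calling thread,
    create the new thread (which inherits that mask), then restore
    the caller's original mask.
    """
    # SIG_BLOCK adds to the current mask and returns the previous one.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, signal.valid_signals())
    try:
        thread = threading.Thread(target=target, args=args)
        thread.start()
    finally:
        # Put the caller's mask back so it still receives signals.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
    return thread


# Small demonstration: record the mask the worker actually runs with.
worker_masks = []


def record_mask():
    # SIG_BLOCK with an empty set changes nothing; it just reports the mask.
    worker_masks.append(signal.pthread_sigmask(signal.SIG_BLOCK, []))


mask_before = signal.pthread_sigmask(signal.SIG_BLOCK, [])
start_thread_signals_blocked(record_mask).join()
mask_after = signal.pthread_sigmask(signal.SIG_BLOCK, [])
```

With this pattern only the creating thread (normally the main thread)
is left with signals unblocked, so it is the only candidate for
delivery of something like SIGINT.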

I think the ability to block signals in general should be added,
in the form of the POSIX 1003.1 sigprocmask function.  It would
be easy enough to do, if anyone cared enough.  Is that how you
masked off the signal from the new thread - do new pthreads copy
their signal handling from the parent at creation?
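
On the inheritance question: under POSIX threads a new thread starts
with a copy of the signal mask of the thread that created it, while
handlers (signal dispositions) are shared by the whole process.  The
sketch below checks the mask half of that.  It assumes a later Python
(3.3 or newer, on a POSIX platform) with signal.pthread_sigmask(),
which is the per-thread counterpart of sigprocmask; the function name
mask_seen_by_new_thread is made up for illustration.

```python
import signal
import threading


def mask_seen_by_new_thread():
    """Return the signal mask a freshly created thread starts with."""
    seen = []

    def worker():
        # SIG_BLOCK with an empty set leaves the mask alone and returns it.
        seen.append(signal.pthread_sigmask(signal.SIG_BLOCK, []))

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()
    return seen[0]


# Block SIGUSR1 in the creating thread, then see what a child inherits.
previous = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGUSR1})
try:
    child_mask = mask_seen_by_new_thread()
finally:
    # Restore the creating thread's original mask.
    signal.pthread_sigmask(signal.SIG_SETMASK, previous)

print(signal.SIGUSR1 in child_mask)
```

If the mask is inherited as POSIX describes, SIGUSR1 shows up in the
child's mask even though the child never blocked it itself.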

	Donn Cave, donn at u.washington.edu


