[Python-Dev] Strange segfault in Python threads and linux kernel 2.6

Donovan Baarda abo at minkirri.apana.org.au
Thu Jan 27 00:51:43 CET 2005


On Wed, 2005-01-26 at 01:53 +1100, Anthony Baxter wrote:
> On Wednesday 26 January 2005 01:01, Donovan Baarda wrote:
> > In this case it turns out to be "don't do exec() in a thread, because what
> > you exec can have all its signals masked". That turns out to be a hell of
> > a lot of things; popen, os.command, etc. They all only work OK in a
> > threaded application if what you are exec'ing doesn't use any signals.
> 
> Yep. You just have to be aware of it. We do a bit of this at work, and we
> either spool via a database table, or a directory full of spool files. 
> 
> > Actually, I've noticed that zope often has a sorta zombie "which" process
> > which it spawns. I wonder if this is a stuck thread waiting for some
> > signal...
> 
> Quite likely.

For the record, it seems that the Java version also contributes. This
problem only occurs when you have the following combination:

Linux >=2.6
Python <=2.3
j2re1.4 =1.4.2.01-1 | kaffe 2:1.1.4xxx

If you use Linux 2.4, it goes away. If you use Python 2.4, it goes away.
If you use j2re1.4 1.4.1.01-1, it goes away.

For the problem to occur, all of the following conditions must hold:

1) Linux uses the thread's sigmask instead of the main thread/process
sigmask for the exec'ed process (i.e., 2.6 does this, 2.4 doesn't).

2) Python needs to screw with the sigmask in threads (python 2.3 does,
python 2.4 doesn't).

3) The exec'ed process needs to rely on threads (j2re1.4 1.4.2.01-1
does, j2re1.4 1.4.1.01-1 doesn't).
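
Condition 1 can be checked with a rough sketch like the one below. It is
written for a modern Python 3 on Linux, not for Python 2.3 (which has no
signal.pthread_sigmask), so it only illustrates the kernel behaviour and
is not a reproduction of the original crash. The helper name
child_sees_sigusr1_blocked and the choice of SIGUSR1 are made up for the
example.

```python
import signal
import subprocess
import sys
import threading

# Child program: report whether SIGUSR1 is in the signal mask it inherited.
# pthread_sigmask(SIG_BLOCK, []) changes nothing and returns the current mask.
CHILD_CODE = (
    "import signal;"
    "print(signal.SIGUSR1 in signal.pthread_sigmask(signal.SIG_BLOCK, []))"
)


def child_sees_sigusr1_blocked(block_in_thread):
    """Spawn a child from a worker thread; return the child's view of SIGUSR1.

    Hypothetical helper for illustration only.
    """
    result = {}

    def worker():
        # Change only this thread's mask, then fork/exec from this thread.
        how = signal.SIG_BLOCK if block_in_thread else signal.SIG_UNBLOCK
        signal.pthread_sigmask(how, [signal.SIGUSR1])
        proc = subprocess.run(
            [sys.executable, "-c", CHILD_CODE],
            capture_output=True,
            text=True,
            check=True,
        )
        result["blocked"] = proc.stdout.strip() == "True"

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()
    return result["blocked"]


if __name__ == "__main__":
    print("blocked in thread  ->", child_sees_sigusr1_blocked(True))
    print("unblocked in thread ->", child_sees_sigusr1_blocked(False))
```

If the child reports the signal as blocked only when the spawning thread
blocked it, then the exec'ed process is getting the thread's sigmask, as
described in condition 1.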

It is hard to find old Debian debs of j2re1.4 (1.4.1.01-1), and when
you do, you will also need the now non-existent j2se-common 1.1 package.
I don't know if this qualifies as a potential bug against j2re1.4
1.4.2.01-1.

For now my solution is to roll back to the older j2re1.4.


-- 
Donovan Baarda <abo at minkirri.apana.org.au>
http://minkirri.apana.org.au/~abo/


