[Python-Dev] Strange segfault in Python threads and linux kernel 2.6

Michael Hudson mwh at python.net
Fri Jan 21 13:46:41 CET 2005


Donovan Baarda <abo at minkirri.apana.org.au> writes:

> On Thu, 2005-01-20 at 14:12 +0000, Michael Hudson wrote:
>> Donovan Baarda <abo at minkirri.apana.org.au> writes:
>> 
>> > On Wed, 2005-01-19 at 13:37 +0000, Michael Hudson wrote:
>> >> Donovan Baarda <abo at minkirri.apana.org.au> writes:
> [...]
>> >> The main oddness about python threads (before 2.4) is that they run
>> >> with all signals masked.  You could play with a C wrapper (call
>> >> sigprocmask, then exec fop) to see if this is what is causing the
>> >> problem.  But please try 2.4.
>> >
>> > Python 2.4 does indeed fix the problem. 
>> 
>> That's good to hear.
> [...]
>
> I still don't understand what Linux 2.4 vs Linux 2.6 had to do with
> it.

I have to admit I'm not that surprised that the behaviour appears
somewhat inexplicable.

As you probably know, linux 2.6 has a more-or-less entirely different
threads implementation (NPTL) than 2.4 (LinuxThreads) -- so changes in
behaviour aren't exactly surprising.  Whether they were intentional, a
good thing, etc, I have a careful lack of opinion :)

> Reading the man pages for execve(), pthread_sigmask() and sigprocmask(),
> I can see some ambiguities, but mostly only if you do things they warn
> against (ie, use sigprocmask() instead of pthread_sigmask() in a
> multi-threaded app).

Uh, I don't know how much I'd trust documentation in this situation.
Really.

Threads and signals are almost inherently incompatible, unfortunately.

> The man page for execve() says that the new process will inherit the
> "Process signal mask (see sigprocmask() )". This implies to me it will
> inherit the mask from the main process, not the thread's signal mask.

Um.  Maybe.  But this is the sort of thing I meant above -- if signals
are delivered to threads, not processes, what does the "Process signal
mask" mean?  The signal mask of the thread that executed main()?  I
guess you could argue that, but I don't know how much I'd bet on it.
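
One way to find out what a given kernel actually does, rather than
trusting the man pages, is to try it.  The following is a minimal
sketch in present-day Python 3.  It assumes signal.pthread_sigmask,
which is Unix-only and was added in Python 3.3, so it is not available
in the 2.3/2.4 interpreters discussed here.  The helper name
child_sees_blocked and the choice of SIGUSR1 are illustrative only.

```python
import signal
import subprocess
import sys
import threading

# Child program: report whether SIGUSR1 is blocked in the mask it inherited.
CHILD = (
    "import signal;"
    "print(signal.SIGUSR1 in signal.pthread_sigmask(signal.SIG_BLOCK, []))"
)


def child_sees_blocked(how):
    """Change SIGUSR1 in a worker thread's mask, spawn a child from that
    thread, and return True if the child starts with SIGUSR1 blocked.

    `how` is signal.SIG_BLOCK or signal.SIG_UNBLOCK (hypothetical helper).
    """
    result = []

    def worker():
        # This changes only the calling thread's mask, not the main thread's.
        signal.pthread_sigmask(how, {signal.SIGUSR1})
        out = subprocess.check_output([sys.executable, "-c", CHILD])
        result.append(out.strip() == b"True")

    t = threading.Thread(target=worker)
    t.start()
    t.join()
    return result[0]


if __name__ == "__main__":
    print("SIGUSR1 blocked in thread:  ", child_sees_blocked(signal.SIG_BLOCK))
    print("SIGUSR1 unblocked in thread:", child_sees_blocked(signal.SIG_UNBLOCK))
```

If the two lines differ, the child is picking up the mask of the thread
that spawned it rather than that of the thread that ran main().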

> It looks like Linux 2.4 uses the signal mask of the main thread or
> process for the execve(), whereas Linux 2.6 uses the thread's signal
> mask.

I'm not sure that this is the case -- I'm reasonably sure I saw
problems caused by the signal masks before 2.6 was ever released.  But
I could be wrong.

> Given that execve() replaces the whole process, including all
> threads, I dunno if using the thread's mask is right. Could this be
> a Linux 2.6 kernel bug?

You could ask, certainly...

Although I've done a certain amount of battle with these problems, I
don't know what any published standards have to say about these things,
which is the only real criterion by which it could be called "a bug".

>> > I'm not sure what the correct behaviour should be. The fact that it
>> > works in python2.4 feels more like a byproduct of the thread mask change
>> > than correct behaviour. 
>> 
>> Well, getting rid of the thread mask changes was one of the goals of
>> the change.
>
> I gathered that... which kinda means the fact that it fixed execvp in
> threads is a side effect...(though I also guess it fixed a lot of other
> things like this too).

Um.  I meant "getting rid of the thread mask" was one of the goals
*because* it would fix the problems with execve and system() and
friends.

>> > To me it seems like execvp() should be setting the signal mask back
>> > to defaults or at least the mask of the main process before doing
>> > the exec.
>> 
>> Possibly.  I think the 2.4 change -- not fiddling the process mask at
>> all -- is the Right Thing, but that doesn't help 2.3 users.  This has
>> all been discussed before at some length, on python-dev and in various
>> bug reports on SF.
>
> Would a simple bug-fix for 2.3 be to have os.execvp() set the mask to
> something sane before executing C execvp()?

Perhaps.  I'm not sure I want to go fiddling there.  Maybe someone
else does.  system(3) presents a problem too, though, which in
practice is harder to work around unless we want to implement it
ourselves.
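
For the system() case, one possible workaround is to empty the calling
thread's mask just for the duration of the call and restore it
afterwards.  This is only a sketch, written for present-day Python 3
and assuming signal.pthread_sigmask (Unix-only, Python 3.3 and later);
it is not what any Python release does internally.  The names
signals_unblocked and system_with_clean_mask are made up for the
example.

```python
import contextlib
import os
import signal


@contextlib.contextmanager
def signals_unblocked():
    """Temporarily empty the calling thread's signal mask (hypothetical helper)."""
    # SIG_SETMASK with an empty set unblocks everything; the old mask is returned.
    old_mask = signal.pthread_sigmask(signal.SIG_SETMASK, set())
    try:
        yield
    finally:
        # Put the thread's original mask back, even if the body raised.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)


def system_with_clean_mask(command):
    """Run os.system() so that the shell starts with no signals blocked."""
    with signals_unblocked():
        return os.system(command)
```

The caveat is that while the mask is empty, signals can be delivered to
the calling thread, so this trades one oddity for another.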

> Given that Python does not have any visibility of the procmask...
>
> This might be a good idea regardless as it will protect against this bug
> resurfacing in the future if someone decides fiddling with the mask for
> threads is a good idea again.

In the long run, everyone will use 2.4.  There are some other details
to the changes in 2.4 that have a slight chance of breaking programs
which is why I'm uneasy about putting them in 2.3.5 -- for a bug fix
release it's much much worse to break a program that was working than
to fail to fix one that wasn't.

>> In your situation, I think the simplest thing you can do is dig out an
>> old patch of mine that exposes sigprocmask + co to Python and either
>> make a custom Python incorporating the patch and use that, or put the
>> code from the patch into an extension module.  Then before execing
>> fop, use the new code to set the signal mask to something sane.  Not
>> pretty, particularly, but it should work.
>
> The extension module that exposes sigprocmask() is probably best for
> now...

I hope it helps!
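
For reference, the "set the mask to something sane, then exec" idea can
be sketched as follows.  This uses present-day Python 3, where
signal.pthread_sigmask (Unix-only, Python 3.3 and later) plays the role
of the patch or extension module mentioned above; that patch's actual
interface is not shown here.  The names execvp_with_clean_mask and
probe_exit_status are hypothetical.

```python
import os
import signal
import sys


def execvp_with_clean_mask(file, args):
    """Empty the signal mask, then replace the process image.

    Does not return on success; os.execvp raises OSError on failure.
    """
    signal.pthread_sigmask(signal.SIG_SETMASK, set())
    os.execvp(file, args)


def probe_exit_status():
    """Fork, exec a small probe through the wrapper, and return its exit
    status: 0 if the probe started with SIGUSR1 unblocked, 1 if blocked."""
    probe = (
        "import signal, sys; "
        "sys.exit(signal.SIGUSR1 in signal.pthread_sigmask(signal.SIG_BLOCK, []))"
    )
    pid = os.fork()
    if pid == 0:
        try:
            execvp_with_clean_mask(sys.executable, [sys.executable, "-c", probe])
        finally:
            # Only reached if the exec failed; never fall back into the parent's code.
            os._exit(127)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)
```

No restore step is needed in the wrapper, since a successful exec
replaces the process image that held the old mask.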

Cheers,
mwh

-- 
  <etrepum> Jokes around here tend to get followed by implementations.
                                                -- from Twisted.Quotes

