Obscure Threading Bug

Donn Cave donn at oz.net
Thu Jul 20 03:06:49 EDT 2000


Quoth Glyph Lefkowitz <glyph at twistedmatrix.com>:
| I apologize for the length and obscurity of the attachment.  However, the
| previous example where I encountered this bug was easily twenty times as
| long, and it took some effort to make it this short :-).
|
| Simply put, the interpreter usually segfaults when I run this file.  I
| don't know exactly why.  My copy of python doesn't have debugging
| symbols, so my backtraces aren't terribly meaningful, but the first few
| lines are usually something like:
|
| #0  __flockfile (stream=0x0) at lockfile.c:32
| #1  0x4009ac07 in _IO_ferror (fp=0x0) at ferror.c:35
| #2  0x8069178 in PyFile_SetBufSize ()
|
| the common theme being PyFile_SetBufSize ().
|
| "lots of threads, lots of sockets" is a common theme for a lot of apps
| that I would like to build, so I would be glad to help if there's any more
| information I can supply.
|
| This bug has been known to occur on: RedHat/Debian Linux default Python
| install, Digital Unix python 1.5.2 (Digital UNIX V4.0F  (Rev. 1229); Thu
| Jul 15 17:56:36 EET DST 1999) and on debian using CVS python.

I tried it on a couple of others - FreeBSD 4.0, BeOS 4.5.2 - and
actually didn't get any crashes, but did get some tracebacks.

With luck perhaps someone who is more familiar with SocketServer and
threads will have an easy fix.  I can only say, this looks like a
recipe for trouble to me.

The cure, in my opinion, is to quit using file objects and to confine
every single direct access to (one end of) a socket to the one thread
that owns that socket.  That includes recv() and send(): both should
happen in the same thread.

That's how I do it.  Maybe you can get away with some things that
I stay away from; it's up to you.

Now when another thread wants to do something with our socket, it
has to go through the socket thread, which is naturally blocked
waiting for input from the socket.  The mechanism is another socket,
or a pipe, plus the select() function: the socket thread selects on
both, so it wakes up for either one.  That could sound like a hack,
but that extra I/O is essentially your message-based dispatching
system, and in my opinion it's the most elegant way to work with
threads.  If the size of that I/O is an issue, it's also possible to
add a lock-protected store for passing function call parameters
between threads, so that only a short notification has to go over
the pipe.
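
Here is a minimal sketch of that arrangement, written in present-day
Python 3 rather than the 1.5.2 discussed in this thread.  It is not the
BeOS code mentioned below, only an illustration under assumptions: the
names SocketThread, post, outbox, wake_r and wake_w are made up for
the example, socket.socketpair() stands in for "another socket, or a
pipe", and a queue.Queue plays the part of the lock-protected store,
so only one wakeup byte per message crosses the control channel.

```python
import queue
import select
import socket
import threading


class SocketThread(threading.Thread):
    """Owns one socket; every recv() and send() on it happens in run()."""

    def __init__(self, sock):
        super().__init__(daemon=True)
        self.sock = sock
        # Control channel: other threads write to wake_w, run() selects on wake_r.
        self.wake_r, self.wake_w = socket.socketpair()
        self.outbox = queue.Queue()    # thread-safe hand-off for payloads
        self._post_lock = threading.Lock()
        self.received = []             # appended to only by run()

    def post(self, data):
        """Callable from any thread: hand data to the socket thread.

        Passing None asks the socket thread to shut down.
        """
        with self._post_lock:
            self.outbox.put(data)      # queue first, so run() never waits on get()
            self.wake_w.send(b"x")     # one byte per message, purely a wakeup

    def run(self):
        running = True
        while running:
            # Block until the owned socket or the control channel has input.
            readable, _, _ = select.select([self.sock, self.wake_r], [], [])
            if self.wake_r in readable:
                self.wake_r.recv(1)    # consume exactly one wakeup byte
                msg = self.outbox.get()
                if msg is None:
                    running = False
                else:
                    self.sock.sendall(msg)
            if running and self.sock in readable:
                data = self.sock.recv(4096)
                if data:
                    self.received.append(data)
                else:
                    running = False    # peer closed the connection
        self.sock.close()
        self.wake_r.close()
        self.wake_w.close()
```

Another thread would call worker.post(b"some bytes") to have the
socket thread send for it, and worker.post(None) to stop it.  No thread
other than the owner touches the socket itself; the only thing shared
is the control channel and the queue.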

I have code that does that kind of thing, and I'm thinking about
cleaning it up and making it more generic (currently very BeOS
specific) for other people to use.  Is there any other message
dispatch thread framework out there we should be looking at?

	Donn Cave, donn at oz.net


