Python C/API based multithread python program locks

kanji kanji_rama at yahoo.com
Tue Sep 28 20:21:21 EDT 2004


Hi ALL,

I have written a multithreaded Python program in which each thread calls
a C function (via a Python/C extension module) to execute some tasks on a
remote node. The number of threads equals the number of nodes specified by
the user.
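
A minimal sketch of this kind of structure, for illustration only and not
the actual program: `run_on_node` is a hypothetical stand-in for the
extension-module call, `run_on_all_nodes` is an assumed name, and the code
uses the current `threading` module rather than what Python 2.2.2 offers.

```python
import threading


def run_on_node(node):
    # Hypothetical stand-in for the C extension call that executes a task
    # on a remote node; the real function is not shown in this post.
    return "done:" + node


def run_on_all_nodes(nodes):
    """Start one thread per node and collect each node's output."""
    results = {}
    results_lock = threading.Lock()

    def worker(node):
        output = run_on_node(node)
        # Guard the shared dict so concurrent writers do not interleave.
        with results_lock:
            results[node] = output

    threads = [threading.Thread(target=worker, args=(n,)) for n in nodes]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for every node's thread to finish
    return results
```

For example, `run_on_all_nodes(["node1", "node2"])` starts two threads and
returns once both have finished.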


The issue is that it works most of the time, but occasionally (seemingly
at random) it hangs without generating any errors. While trying to debug
it, sometimes even gdb hangs, but I managed to get a backtrace of a hung
thread:

#0  0xb75ebc32 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1  0xb75d11ee in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/tls/libpthread.so.0
#2  0x0809bb3f in PyThread_acquire_lock ()
#3  0x0809e45c in _PyObject_GC_Del ()
#4  0x0807cad6 in PyEval_GetFuncDesc ()
#5  0x0807abc4 in PyEval_EvalCode ()
#6  0x0807b65e in PyEval_EvalCodeEx ()
#7  0x0807cbbb in PyEval_GetFuncDesc ()
#8  0x0807ab33 in PyEval_EvalCode ()
#9  0x0807b65e in PyEval_EvalCodeEx ()
#10 0x0807cbbb in PyEval_GetFuncDesc ()
#11 0x0807ab33 in PyEval_EvalCode ()
#12 0x0807b65e in PyEval_EvalCodeEx ()
#13 0x0807cbbb in PyEval_GetFuncDesc ()
#14 0x0807ab33 in PyEval_EvalCode ()
#15 0x0807b65e in PyEval_EvalCodeEx ()
#16 0x08078555 in PyEval_EvalCode ()
#17 0x08098569 in PyRun_FileExFlags ()
#18 0x080974d0 in PyRun_SimpleFileExFlags ()
#19 0x08096e1a in PyRun_AnyFileExFlags ()
#20 0x08053ac9 in Py_Main ()
#21 0x08053519 in main ()


To check whether the hang is caused by some error in the code, I called
the same function (which creates, say, 100 threads) in a for loop 500
times. I found that it tends to hang at different iterations, for example
at iteration #480 or #12, and sometimes it sails through all of them
without hanging.
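
One way to make such a stress loop report where it got stuck, instead of
blocking forever, is to join each thread with a timeout. The sketch below
is an assumption-laden illustration, not the code used above: `stress` and
its parameters are hypothetical names, `task` stands in for the extension
call, and it relies on the current `threading` API (`daemon`, `is_alive`).
It only helps if the main thread itself is still able to run.

```python
import threading


def stress(task, iterations=500, n_threads=100, join_timeout=30.0):
    """Run `task` in n_threads threads, `iterations` times over.

    Returns the index of the first iteration in which some thread was
    still alive after its join timed out, or None if none got stuck.
    """
    for i in range(iterations):
        threads = [
            # daemon=True so a stuck thread cannot keep the process alive
            threading.Thread(target=task, args=("node%d" % k,), daemon=True)
            for k in range(n_threads)
        ]
        for t in threads:
            t.start()
        for t in threads:
            # The timeout applies per thread, so the worst-case wait for
            # one iteration is n_threads * join_timeout seconds.
            t.join(join_timeout)
        if any(t.is_alive() for t in threads):
            return i
    return None
```

Calling `stress(some_task, iterations=500, n_threads=100)` then returns
the iteration number of the first hang, which can be logged before
attaching a debugger.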


In the Python program, the outputs from all threads are synchronized
via thread.join().

In the C extension sources, I have used the Py_BEGIN_ALLOW_THREADS and
Py_END_ALLOW_THREADS brackets to take care of the GIL. I have tested the
C functions separately and they seemed to work fine.

Any ideas what the problem could be? The test system is RHEL 3 and the
Python version is 2.2.2.

Please let me know if there are any useful pointers to solve this issue.

Thanks
kanji


