queue deadlock possibility

Michael Bayer mike_mp at zzzcomputing.com
Mon Jun 26 01:48:56 EDT 2006


Hi -

I was just going through this thread:
http://mail.python.org/pipermail/python-list/2006-April/336948.html
where it is suggested that the Lock instance used by Queue.Queue
should be publicly configurable.  I have identified another situation
where a Queue can be deadlocked, one which is also alleviated by
configuring the type of Lock used by the Queue (or just changing it
to an RLock).

The scenario arises when the Queue is operated upon within the
__del__ method of an object.  Since __del__ can be called at somewhat
unpredictable times, I have observed that it is in fact possible, in
extremely rare circumstances, for put() to be called within a get()
in the same thread, or possibly vice versa.  Both methods lock on the
same underlying mutex object, which is an instance of threading.Lock;
a threading.Lock is not reentrant, so the second acquire blocks
forever and a deadlock occurs.
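
The re-entry can be illustrated in isolation.  The following is only
a sketch, not code from the pool: the helper reenter_from_del and the
Wrapper class are hypothetical names, and it assumes CPython's
reference counting, where dropping the last reference runs __del__
immediately in the current thread.  The acquire inside __del__ is
non-blocking so that the demonstration reports the problem instead of
hanging.

```python
import threading

def reenter_from_del(lock_factory):
    """Drop the last reference to an object while holding a lock,
    and record whether its __del__ could take the same lock."""
    lock = lock_factory()
    outcome = []

    class Wrapper(object):
        def __del__(self):
            # Non-blocking acquire: a blocking one would hang forever
            # on a plain Lock, which is the deadlock described above.
            acquired = lock.acquire(False)
            outcome.append(acquired)
            if acquired:
                lock.release()

    w = Wrapper()
    lock.acquire()          # stands in for the lock held inside get()
    try:
        del w               # on CPython, __del__ runs right here
    finally:
        lock.release()
    return outcome

plain = reenter_from_del(threading.Lock)       # second acquire fails
reentrant = reenter_from_del(threading.RLock)  # same thread may re-enter
```

With a plain Lock the nested acquire is refused; in the real Queue it
would be a blocking acquire and the thread would wait on itself.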

The issue can be fixed by substituting a threading.RLock for the  
threading.Lock object that Queue instantiates by default.
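
One way such a substitution might look is sketched below.  It is
written against the Python 3 module name "queue" (the module is
called "Queue" in Python 2), and it relies on the attribute names
mutex, not_empty, not_full and all_tasks_done that CPython's Queue
sets in its constructor; those are implementation details rather than
a documented extension point, so treat this as an assumption.  The
class name ReentrantQueue is hypothetical.

```python
import queue
import threading

class ReentrantQueue(queue.Queue):
    """Queue whose internal mutex is an RLock (sketch only)."""

    def __init__(self, maxsize=0):
        super().__init__(maxsize)
        # Replace the plain Lock and rebuild the Condition objects so
        # that all of them share the new reentrant mutex.
        self.mutex = threading.RLock()
        self.not_empty = threading.Condition(self.mutex)
        self.not_full = threading.Condition(self.mutex)
        self.all_tasks_done = threading.Condition(self.mutex)

q = ReentrantQueue(maxsize=2)

# Simulate put() being entered while this thread already holds the
# queue's mutex, as happens when __del__ fires inside get().
with q.mutex:
    q.put_nowait("conn")

item = q.get_nowait()
```

This only removes the hang; whether a nested put() always sees the
queue in a consistent state is a separate question.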

The scenario this has arisen within is a database connection pool,  
which puts connections in a Queue, returns them via get() within a  
wrapper object, and the wrapper object automatically returns the  
connection to the Queue via put() within its __del__ method (an  
explicit close() method is available as well).  While I can't
reproduce it locally, one of my users experiences it regularly.  I
had him install the "threadframe" module to trace it out, and it
reveals that all threads are hung within Queue while acquiring the
"not_empty" and "not_full" Condition objects, and the offending stack
trace looks like this:

   File "build/bdist.linux-i686/egg/sqlalchemy/pool.py", line 84, in connect
   File "build/bdist.linux-i686/egg/sqlalchemy/pool.py", line 130, in __init__
   File "build/bdist.linux-i686/egg/sqlalchemy/pool.py", line 102, in get
   File "build/bdist.linux-i686/egg/sqlalchemy/pool.py", line 226, in do_get
   File "/usr/lib/python2.4/Queue.py", line 116, in get
     raise Empty
   File "build/bdist.linux-i686/egg/sqlalchemy/pool.py", line 157, in __del__
   File "build/bdist.linux-i686/egg/sqlalchemy/pool.py", line 163, in _close
   File "build/bdist.linux-i686/egg/sqlalchemy/pool.py", line 99, in return_conn
   File "build/bdist.linux-i686/egg/sqlalchemy/pool.py", line 216, in do_return_conn
   File "/usr/lib/python2.4/Queue.py", line 71, in put
     self.not_full.acquire()

This is a simplified version of the logic; the actual version is the
pool.py module in the SQLAlchemy package:

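import Queue  # Python 2 module name (renamed to "queue" in Python 3)

# "database" stands for any DB-API module that provides connect()

pool = Queue.Queue(maxsize=10)

class ConnectionWrapper(object):
    def __init__(self, connection):
        self.connection = connection
    def __del__(self):
        # return the underlying connection to the pool when the
        # wrapper is garbage collected
        pool.put_nowait(self.connection)

# fill up the pool with 10 connections
for x in range(10):
    pool.put_nowait(database.connect())

def connect():
    # check a connection out of the pool, wrapped so that it is
    # returned automatically
    return ConnectionWrapper(pool.get())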
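
The explicit close() path mentioned earlier is not shown in the
simplified code.  A rough sketch of how it could fit in follows; it
is not the actual pool.py logic.  It uses the Python 3 module name
"queue", a placeholder string in place of a real connection, and a
convention (setting self.connection to None) that is an assumption
made here so that the connection is returned only once.

```python
import queue

pool = queue.Queue(maxsize=2)

class ConnectionWrapper(object):
    def __init__(self, connection):
        self.connection = connection

    def close(self):
        # Hand the connection back once; later calls are no-ops.
        conn, self.connection = self.connection, None
        if conn is not None:
            pool.put_nowait(conn)

    def __del__(self):
        # Fallback for wrappers that were never closed explicitly.
        self.close()

pool.put_nowait("conn-1")   # placeholder for a real DB connection
wrapper = ConnectionWrapper(pool.get())

wrapper.close()             # put() happens at a predictable point
size_after_close = pool.qsize()

del wrapper                 # __del__ finds nothing left to return
size_after_del = pool.qsize()
```

Wrappers that are closed explicitly never call put() from __del__, so
the re-entry can only come from wrappers that are simply dropped.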

At the moment I am modifying the Queue's mutex to be a
threading.RLock to fix the problem.  What does the community think of
either making the Queue's Lock instance public or changing it to an
RLock?

- mike






