thread/queue bug

Bengt Richter bokr at oz.net
Sat Dec 11 14:56:04 EST 2004


On Fri, 10 Dec 2004 16:18:51 -0600, phil <phillip.watts at anvilcom.com> wrote:

>And sorry I got ticked, frustrating week
>
threading problems can do that ;-)

You are obviously deeper into this than I can get from a cursory scan,
but I'll make some general debugging comments ;-)

> >And I could help more, being fairly experienced with
> >threading issues and race conditions and such, but
> >as I tried to indicate in the first place, you've
> >provided next to no useful (IMHO) information to
> >let anyone help you more than this
>
>This is about 5% of the code.
>Uses no locks.
>I am mystified, I have written probably 100,000 lines
>of Python and never seen a thread just lock up and quit
>running.  It happens on a Queue() statement so my suspicion
>is a bug.  ??
Or a once-in-a-blue-moon resource access deadlock of some kind?

>
>I have kludged around it by putting all the thread/queue stuff
>in the main program and import the stuff I don't want to
>distribute.  But mysteries haunt your dreams, sooo...
>#!/usr/local/bin/python
>
># pnetx.py
>
>from threading import *
>from time import *
>from Queue import Queue
>from socket import *
Do you really need to import * ? The above should be safe import-wise, but
you are setting yourself up for name-clash problems by putting so many names
in the same namespace. E.g., if a misspelled name in your program happened to
match one of the imported names, you wouldn't get a NameError exception. Etc., etc.

One of the first rules of debugging is to eliminate the unnecessary ;-)
At least there don't seem to be any name clashes between the above (except
for names starting with '_', which shouldn't get imported with *):

 >>> d = {}
 >>> for imp in 'threading time Queue socket'.split():
 ...    m = __import__(imp)
 ...    for name in m.__dict__.keys():
 ...        d.setdefault(name, []).append(imp)
 ...
 >>> for k,v in d.items():
 ...     if len(v)!=1: print k,v
 ...
 _sleep ['threading', 'Queue']
 __file__ ['threading', 'Queue', 'socket']
 __all__ ['threading', 'Queue', 'socket']
 __builtins__ ['threading', 'Queue', 'socket']
 __name__ ['threading', 'time', 'Queue', 'socket']
 _time ['threading', 'Queue']
 __doc__ ['threading', 'time', 'Queue', 'socket']

Just verifying an assumption ;-)
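
The check above looks at everything in each module's __dict__. A rough
variant that only considers what a star import would actually bind, and
that also counts the builtins as a source of clashes, might look like the
sketch below. It assumes Python 3 (where the Queue module is named queue);
star_names and find_clashes are names made up for this illustration.

```python
import builtins
import importlib

def star_names(module_name):
    """Names that `from module_name import *` would bind."""
    mod = importlib.import_module(module_name)
    if hasattr(mod, "__all__"):
        return set(mod.__all__)
    # Without __all__, a star import takes every name not starting with '_'.
    return {n for n in vars(mod) if not n.startswith("_")}

def find_clashes(module_names):
    """Map each name bound by more than one source to the list of sources."""
    owners = {}
    for name in dir(builtins):
        if not name.startswith("_"):
            owners.setdefault(name, []).append("builtins")
    for module_name in module_names:
        for name in star_names(module_name):
            owners.setdefault(name, []).append(module_name)
    return {n: srcs for n, srcs in owners.items() if len(srcs) > 1}

# Assumption: Python 3 module names; the original post used Queue.
clashes = find_clashes(["threading", "time", "queue", "socket"])
for name in sorted(clashes):
    print(name, clashes[name])
```

On Python 3 this reports at least enumerate, since threading exports a
function of that name which hides the builtin after a star import.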
OTOH, do you really need (ignoring wid and name and the first content of dir()):

 >>> dir()
 ['__builtins__', '__doc__', '__name__']
 >>> from threading import *
 >>> from time import *
 >>> from Queue import *
 >>> from socket import *
 >>> wid = 0
 >>> for name in dir():
 ...     print repr(name),
 ...     wid += len(repr(name))+1
 ...     if wid>60:
 ...         print
 ...         wid = 0
 ...
 'AF_APPLETALK' 'AF_INET' 'AF_IPX' 'AF_UNSPEC' 'AI_ADDRCONFIG'
 'AI_ALL' 'AI_CANONNAME' 'AI_DEFAULT' 'AI_MASK' 'AI_NUMERICHOST'
 'AI_PASSIVE' 'AI_V4MAPPED' 'AI_V4MAPPED_CFG' 'BoundedSemaphore'
 'CAPI' 'Condition' 'EAI_ADDRFAMILY' 'EAI_AGAIN' 'EAI_BADFLAGS'
 'EAI_BADHINTS' 'EAI_FAIL' 'EAI_FAMILY' 'EAI_MAX' 'EAI_MEMORY'
 'EAI_NODATA' 'EAI_NONAME' 'EAI_PROTOCOL' 'EAI_SERVICE' 'EAI_SOCKTYPE'
 'EAI_SYSTEM' 'Empty' 'Event' 'Full' 'INADDR_ALLHOSTS_GROUP' 'INADDR_ANY'
 'INADDR_BROADCAST' 'INADDR_LOOPBACK' 'INADDR_MAX_LOCAL_GROUP'
 'INADDR_NONE' 'INADDR_UNSPEC_GROUP' 'IPPORT_RESERVED' 'IPPORT_USERRESERVED'
 'IPPROTO_GGP' 'IPPROTO_ICMP' 'IPPROTO_IDP' 'IPPROTO_IGMP' 'IPPROTO_IP'
 'IPPROTO_MAX' 'IPPROTO_ND' 'IPPROTO_PUP' 'IPPROTO_RAW' 'IPPROTO_TCP'
 'IPPROTO_UDP' 'IP_ADD_MEMBERSHIP' 'IP_DEFAULT_MULTICAST_LOOP'
 'IP_DEFAULT_MULTICAST_TTL' 'IP_DROP_MEMBERSHIP' 'IP_MAX_MEMBERSHIPS'
 'IP_MULTICAST_IF' 'IP_MULTICAST_LOOP' 'IP_MULTICAST_TTL' 'IP_OPTIONS'
 'IP_TOS' 'IP_TTL' 'Lock' 'MSG_DONTROUTE' 'MSG_OOB' 'MSG_PEEK'
 'NI_DGRAM' 'NI_MAXHOST' 'NI_MAXSERV' 'NI_NAMEREQD' 'NI_NOFQDN'
 'NI_NUMERICHOST' 'NI_NUMERICSERV' 'Queue' 'RAND_add' 'RAND_egd'
 'RAND_status' 'RLock' 'SOCK_DGRAM' 'SOCK_RAW' 'SOCK_RDM' 'SOCK_SEQPACKET'
 'SOCK_STREAM' 'SOL_IP' 'SOL_SOCKET' 'SOL_TCP' 'SOL_UDP' 'SOMAXCONN'
 'SO_ACCEPTCONN' 'SO_BROADCAST' 'SO_DEBUG' 'SO_DONTROUTE' 'SO_ERROR'
 'SO_KEEPALIVE' 'SO_LINGER' 'SO_OOBINLINE' 'SO_RCVBUF' 'SO_RCVLOWAT'
 'SO_RCVTIMEO' 'SO_REUSEADDR' 'SO_SNDBUF' 'SO_SNDLOWAT' 'SO_SNDTIMEO'
 'SO_TYPE' 'SO_USELOOPBACK' 'SSLType' 'SSL_ERROR_EOF' 'SSL_ERROR_INVALID_ERROR_CODE'
 'SSL_ERROR_SSL' 'SSL_ERROR_SYSCALL' 'SSL_ERROR_WANT_CONNECT'
 'SSL_ERROR_WANT_READ' 'SSL_ERROR_WANT_WRITE' 'SSL_ERROR_WANT_X509_LOOKUP'
 'SSL_ERROR_ZERO_RETURN' 'Semaphore' 'SocketType' 'TCP_NODELAY'
 'Thread' 'Timer' '__builtins__' '__doc__' '__name__' 'accept2dyear'
 'activeCount' 'altzone' 'asctime' 'clock' 'ctime' 'currentThread'
 'daylight' 'enumerate' 'error' 'errorTab' 'gaierror' 'getaddrinfo'
 'getdefaulttimeout' 'getfqdn' 'gethostbyaddr' 'gethostbyname'
 'gethostbyname_ex' 'gethostname' 'getnameinfo' 'getprotobyname'
 'getservbyname' 'gmtime' 'has_ipv6' 'herror' 'htonl' 'htons'
 'inet_aton' 'inet_ntoa' 'localtime' 'mktime' 'ntohl' 'ntohs'
 'setdefaulttimeout' 'setprofile' 'settrace' 'sleep' 'socket'
 'ssl' 'sslerror' 'strftime' 'strptime' 'struct_time' 'time' 'timeout'
 'timezone' 'tzname' 'wid'
 >>>

 (Hm, should have pre-tested wid+len(current) to limit width ;-) 
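
 For what it's worth, a sketch of that pre-test, written as a small helper
 (wrap_names is a made-up name, Python 3 syntax assumed). It checks the
 width before adding a name to the current line rather than after:

```python
def wrap_names(names, width=60):
    """Pack repr(name) tokens into lines no longer than width.

    A single token longer than width still gets a line of its own.
    """
    lines = []
    current = ""
    for name in names:
        token = repr(name)
        candidate = token if not current else current + " " + token
        if current and len(candidate) > width:
            # Adding the token would overflow, so start a new line with it.
            lines.append(current)
            current = token
        else:
            current = candidate
    if current:
        lines.append(current)
    return lines

for line in wrap_names(sorted(["Thread", "Queue", "Lock", "sleep", "socket"]), 30):
    print(line)
```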

>import sys
>import os
>
># glob is a DUMMY CLASS
where from? accidentally snipped?
>
>glob.listenq = Queue(1000)
>
>def listener():
>	while 1:
>	    msg,addrport = listenersocket.recvfrom(BUFSIZE)
>	    addr = addrport[0]
What is the usage level on glob.listenq? It is bounded at 1000 items, so
you could theoretically block on the put. What do you get
if you print glob.listenq.qsize() just before the put?

>	    glob.listenq.put (( msg,addr),1)
>	    if msg == 'DIE': break
>	
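
To illustrate the point about the put in listener(): a put on a full
bounded queue blocks until a consumer makes room. The sketch below assumes
Python 3 (module queue rather than Queue) and uses a bound of 2 instead of
1000 so that it fills immediately. A timeout turns the silent hang into a
visible queue.Full exception:

```python
import queue  # named Queue in Python 2

q = queue.Queue(2)  # tiny bound so the example fills quickly
q.put("first")
q.put("second")

try:
    # With block=True and no timeout this call would wait forever here,
    # because nothing is draining the queue.
    q.put("third", block=True, timeout=0.1)
    put_timed_out = False
except queue.Full:
    put_timed_out = True

print("qsize =", q.qsize(), "full =", q.full(), "timed out =", put_timed_out)
```

Logging qsize() periodically, or using a timeout while debugging, would
show whether the listener thread is stuck on a full queue.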
>def procmsgs():
>	while 1:
>	    msg,addr = glob.listenq.get(1)
>	    if msg == 'DIE': break
Might be interesting to get some output when 'DIE' is recognized. Above too.

>	    wds = msg.split(';')
            assert wds[0] in ['S', 'E', 'ACK', 'MONITOR'] # check assumptions. E.g.,
                                                          # split on ';' means spaces around
                                                          # the ';' are not eliminated
>	    if wds[0] == 'S': recvsucc(msg); continue
>	    if wds[0] == 'E': recvevent(msg); continue
>	    if wds[0] == 'ACK': recvack(msg[4:]); continue
>	    if wds[0] == 'MONITOR':
>	      if addr not in monitorlist:
>		print 'This line ALWAYS PRINTS'
>	        queuelist.append( Queue(0) )
Why are you apparently creating *new* queues in this thread that you are not using
and appending them to queuelist in the main thread? Are you testing to see how many
can be created, or for some Queue creation bug?

What do you get if you print 'queuelist length = %s'%len(queuelist) here?
How often are you getting 'MONITOR'?

>		## The above fails if this code is imported
Imported from where? Do you mean that this code works if it
is embedded in some larger program, but fails if that program
imports it instead? That would not only be moving the source,
but also changing the point of invocation. I.e., the import
*executes* the module's top-level code once, putting the resulting
names in that module's global name space (other than code that
explicitly accesses other modules etc.). Do you need to wrap part of
this in a function that you can call *after* importing the definitions?

If your real app has multiple threads accessing a main thread queuelist
without a lock, you might want to look closely and see if that needs
to be a queue mechanism itself.
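
One way to guard such shared state, as a sketch only: keep the monitor
bookkeeping behind a lock so that the check and the append happen as one
step. MonitorRegistry and its methods are hypothetical names, not anything
from the posted code, and the example assumes Python 3.

```python
import queue
import threading

class MonitorRegistry:
    """Hypothetical holder for per-monitor queues, safe to share."""

    def __init__(self):
        self._lock = threading.Lock()
        self._queues = {}

    def register(self, addr):
        """Create a queue for addr once; return True if it was new."""
        with self._lock:
            if addr in self._queues:
                return False
            self._queues[addr] = queue.Queue()
            return True

    def count(self):
        with self._lock:
            return len(self._queues)

registry = MonitorRegistry()

def announce():
    # Several threads race to register the same two (made-up) addresses.
    for addr in ("10.0.0.1", "10.0.0.2"):
        registry.register(addr)

workers = [threading.Thread(target=announce) for _ in range(8)]
for w in workers:
    w.start()
for w in workers:
    w.join()
print("registered monitors:", registry.count())
```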

>	        ## It doesn't matter whether it is imported
>                 ##    as .py or .pyc
>		## also mq = Queue(0); queuelist.append(mq) # fails
>	        print 'This line ALWAYS PRINTS if i execute this source'
>	        print 'but NEVER PRINTS if I import '
>		print 'this source from a 1 line program'
>	        print 'Thats weird'
>		monitoron.append( 1 )
>	        monitorlist.append( addr )
>
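
Back on the assert in procmsgs(): a small sketch of what split(';') does
with spaces, and one way to check the tag up front. The message text
"MONITOR ; 10.0.0.1" is an invented example, parse_tag and KNOWN_TAGS are
made-up names, and Python 3 syntax is assumed.

```python
KNOWN_TAGS = ("S", "E", "ACK", "MONITOR")

def parse_tag(msg):
    """Return the leading tag of msg, tolerating spaces around ';'."""
    tag = msg.split(";")[0].strip()
    if tag not in KNOWN_TAGS:
        raise ValueError("unexpected message tag: %r" % tag)
    return tag

raw = "MONITOR ; 10.0.0.1".split(";")
print(raw)  # the first field keeps its trailing space
print(parse_tag("MONITOR ; 10.0.0.1"))
```

Without the strip(), a tag followed by a space would fall through every
one of the `if wds[0] == ...` tests without any complaint.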
Do you want all the following to be executed once at the point of importing this?

>queuelist = [Queue(0)]
What is this for? (oops, I hope I restored the above line correctly)
>
>
>listenersocket = socket(AF_INET,SOCK_DGRAM)
>listenersocket.bind(ListenAddr)
>
>procm = Thread(target=procmsgs)
>procm.start()
>listenr = Thread(target=listener)
>listenr.start()
>
>## Then start a bunch of other threads and queues.
>
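
As a sketch of the "call it after importing" idea: the module below only
defines things at import time, and the queue and thread are created when
start() is called. This is an assumption about how the code could be
arranged, not a diagnosis of the hang; start(), inbox and handled are
illustrative names, and the example assumes Python 3.

```python
import queue
import threading

def procmsgs(inbox, handled):
    """Consume messages until the 'DIE' sentinel arrives."""
    while True:
        msg = inbox.get()
        if msg == "DIE":
            break
        handled.append(msg)

def start():
    """Create the queue and thread; nothing here runs at import time."""
    inbox = queue.Queue()
    handled = []
    worker = threading.Thread(target=procmsgs, args=(inbox, handled))
    worker.start()
    return inbox, handled, worker

if __name__ == "__main__":
    inbox, handled, worker = start()
    inbox.put("S;hello")
    inbox.put("DIE")
    worker.join()
    print(handled)
```

An importing program would then do the import first and call start()
afterwards, so the threads begin only once the import has finished.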
>
>Don't spend a lot of time on this, not worth it.
>I just thought someone might have experienced
>something similar.
>

Just some thoughts. HTH.


Regards,
Bengt Richter


