collecting results in threading app

Gerardo Herzig gherzig at fmed.uba.ar
Fri Apr 4 11:27:14 EDT 2008


John Nagle wrote:

>Gerardo Herzig wrote:
>  
>
>>Hi all. Newbie at threads over here. I'm missing some point here, but can't
>>figure out which one.
>>
>>This little piece of code executes a 'select count(*)' over every table
>>in a database, one thread per table:
>><code>
>>class TableCounter(threading.Thread):
>>   def __init__(self, conn, table):
>>       self.connection = connection.Connection(host=conn.host, 
>>port=conn.port, user=conn.user, password='', base=conn.base)
>>       threading.Thread.__init__(self)
>>       self.table = table
>>
>>   def run(self):
>>       result =  self.connection.doQuery("select count(*) from %s" % 
>>self.table, [])[0][0]
>>       print result
>>       return result
>>
>>
>>class DataChecker(metadata.Database):
>>
>>   def countAll(self):
>>       for table in self.tables:
>>           t = TableCounter(self.connection, table.name)
>>           t.start()
>>       return
>></code>
>>
>>It works fine, in the sense that every run() method prints the correct
>>value.
>>But...I would like to store the result of t.start() in, say, a list. The
>>thing is, t.start() returns None, so...what am I missing here?
>>Is the design wrong?
>>    
>>
>
>     1.  What interface to MySQL are you using?  That's not MySQLdb.
>     2.  If SELECT COUNT(*) is slow, check your table definitions.
>         For MyISAM, it's a fixed-time operation, and even for InnoDB,
>         it shouldn't take that long if you have an INDEX.
>     3.  Threads don't return "results" as such; they're not functions.
>
>
>As for the code, you need something like this:
>
>class TableCounter(threading.Thread):
>    def __init__(self, conn, table):
>        self.result = None   # filled in by run()
>        ...
>
>    def run(self):
>        # Store the count on the thread object instead of returning it
>        self.result = self.connection.doQuery(
>            "select count(*) from %s" % self.table, [])[0][0]
>
>
>class DataChecker(metadata.Database):
>    def countAll(self):
>        mythreads = []  # list of TableCounter objects
>        # Start all threads
>        for table in self.tables:
>            t = TableCounter(self.connection, table.name)
>            mythreads.append(t)  # list of counter threads
>            t.start()
>        # Wait for all threads to finish
>        totalcount = 0
>        for mythread in mythreads:         # for all threads
>            mythread.join()                # wait for thread to finish
>            totalcount += mythread.result  # add to result
>        print "Total size of all tables is:", totalcount
>
>
>
>					John Nagle
>  
>
Thanks John, that certainly works. Following George's suggestion, I
will take a look at the Queue module.
One question about

for mythread in mythreads:  # for all threads
    mythread.join()         # wait for thread to finish


Will that code wait for the first count(*) to finish before it continues
to the next count(*)? If so, it would be some kind of
'use threads, but execute one at a time'.
I mean, if mythreads[0] is a very long one, all the others will be
waiting...right?
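
On the join() question, a small experiment may help. In the code quoted above every thread is started before the first join() is called, so the counts run concurrently; join() only makes the *caller* wait, and it returns immediately for a thread that has already finished. The sketch below is an assumption-laden stand-in, not the real code: `Sleeper` and `run_all` are hypothetical names, and time.sleep() replaces the database query.

```python
import threading
import time


class Sleeper(threading.Thread):
    """Hypothetical stand-in for TableCounter: sleeps instead of querying."""

    def __init__(self, delay):
        threading.Thread.__init__(self)
        self.delay = delay
        self.result = None

    def run(self):
        time.sleep(self.delay)      # pretend this is the count(*) query
        self.result = self.delay


def run_all(delays):
    """Start one Sleeper per delay, join them in order, return results and wall time."""
    threads = [Sleeper(d) for d in delays]
    started = time.time()
    for t in threads:
        t.start()                   # from here on, all threads run at once
    for t in threads:
        t.join()                    # returns at once if t has already finished
    elapsed = time.time() - started
    return [t.result for t in threads], elapsed


if __name__ == "__main__":
    results, elapsed = run_all([0.4, 0.2, 0.2])
    print("results:", results)
    print("elapsed: %.2f s" % elapsed)
```

If the threads really ran one at a time, the elapsed time would be close to the sum of the delays; with concurrent threads it should be close to the longest single delay. The only cost of joining in list order is that the running total is updated in that order, not in completion order.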
Is there an approach in which I can 'sum' after *any* thread finishes?

Could a Queue help me there?
Thanks!
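
For the "sum after any thread finishes" idea, here is a minimal sketch of how a queue could be used, under stated assumptions: it is written for Python 3, where the module is called `queue` (it was `Queue` in Python 2); `count_rows` and `count_all` are hypothetical names; and the sleep plus a fake count stand in for the real 'select count(*)' call.

```python
import queue        # this module was named "Queue" in Python 2
import threading
import time


def count_rows(table, delay, fake_count, results):
    """Hypothetical worker: sleeps instead of running the real query."""
    time.sleep(delay)                    # pretend the query takes this long
    results.put((table, fake_count))     # hand the result to the collector


def count_all(tables):
    """tables is a list of (name, delay, fake_count) tuples."""
    results = queue.Queue()              # thread-safe, no extra locking needed
    for name, delay, fake_count in tables:
        worker = threading.Thread(target=count_rows,
                                  args=(name, delay, fake_count, results))
        worker.start()

    total = 0
    finished_order = []
    for _ in tables:                     # expect exactly one result per worker
        name, count = results.get()      # blocks until *any* worker is done
        finished_order.append(name)
        total += count                   # running sum, in completion order
    return total, finished_order


if __name__ == "__main__":
    total, order = count_all([("big", 0.4, 1000),
                              ("small", 0.05, 10),
                              ("medium", 0.2, 100)])
    print("total:", total)
    print("finished in this order:", order)
```

The collector loop picks up each result as soon as it is put on the queue, whichever thread produced it. In a real worker it would be wise to put something on the queue even when the query fails, since otherwise the matching get() would block forever.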

Gerardo



