collecting results in threading app
Gerardo Herzig
gherzig at fmed.uba.ar
Fri Apr 4 11:27:14 EDT 2008
John Nagle wrote:
>Gerardo Herzig wrote:
>
>
>>Hi all. Newbie at threads over here. I'm missing some point here, but can't
>>figure out which one.
>>
>>This little piece of code executes a 'select count(*)' over every table
>>in a database, one thread per table:
>><code>
>>class TableCounter(threading.Thread):
>>    def __init__(self, conn, table):
>>        self.connection = connection.Connection(host=conn.host,
>>            port=conn.port, user=conn.user, password='', base=conn.base)
>>        threading.Thread.__init__(self)
>>        self.table = table
>>
>>    def run(self):
>>        result = self.connection.doQuery("select count(*) from %s" %
>>            self.table, [])[0][0]
>>        print result
>>        return result
>>
>>
>>class DataChecker(metadata.Database):
>>
>>    def countAll(self):
>>        for table in self.tables:
>>            t = TableCounter(self.connection, table.name)
>>            t.start()
>>        return
>></code>
>>
>>It works fine, in the sense that every run() method prints the correct
>>value.
>>But... I would like to store the result of t.start() in, say, a list. The
>>thing is, t.start() returns None, so... what am I missing here?
>>Is the design wrong?
>>
>>
>
> 1. What interface to MySQL are you using? That's not MySQLdb.
> 2. If SELECT COUNT(*) is slow, check your table definitions.
> For MyISAM, it's a fixed-time operation, and even for InnoDB,
> it shouldn't take that long if you have an INDEX.
> 3. Threads don't return "results" as such; they're not functions.
>
>
>As for the code, you need something like this:
>
>class TableCounter(threading.Thread):
>    def __init__(self, conn, table):
>        self.result = None
>        ...
>
>    def run(self):
>        self.result = self.connection.doQuery("select count(*) from %s" %
>            self.table, [])[0][0]
>
>
>    def countAll(self):
>        mythreads = [] # list of TableCounter objects
>        # Start all threads
>        for table in self.tables:
>            t = TableCounter(self.connection, table.name)
>            mythreads.append(t) # list of counter threads
>            t.start()
>        # Wait for all threads to finish
>        totalcount = 0
>        for mythread in mythreads: # for all threads
>            mythread.join() # wait for thread to finish
>            totalcount += mythread.result # add to result
>        print "Total size of all tables is:", totalcount
>
>
>
> John Nagle
>
>
Thanks John, that certainly works. Following George's suggestion, I
will take a look at the Queue module.
One question about
    for mythread in mythreads: # for all threads
        mythread.join() # wait for thread to finish
Will that code wait for the first count(*) to finish and then continue
to the next count(*)? Because if that is so, it would be some kind of
'use threads, but execute one at a time'.
I mean, if mythreads[0] is a very long one, all the others will be
waiting... right?
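
On the join() question, here is a small sketch (not from the original thread) that may help check the behaviour. It assumes all threads are started before the first join(), as in John's countAll(). Sleeper and run_all are hypothetical names, and time.sleep stands in for the real count(*) query. If the threads really ran one at a time, the elapsed time would be close to the sum of the delays; if they overlap, it should be close to the longest single delay.

```python
import threading
import time


class Sleeper(threading.Thread):
    """Hypothetical stand-in for TableCounter: sleeps instead of querying."""

    def __init__(self, delay):
        threading.Thread.__init__(self)
        self.delay = delay
        self.result = None

    def run(self):
        time.sleep(self.delay)      # simulates a slow count(*)
        self.result = self.delay


def run_all(delays):
    """Start one thread per delay, join them in order, return (elapsed, total)."""
    threads = [Sleeper(d) for d in delays]
    started = time.time()
    for t in threads:               # every thread is running after this loop
        t.start()
    for t in threads:               # join() only blocks the calling thread
        t.join()
    elapsed = time.time() - started
    return elapsed, sum(t.result for t in threads)


if __name__ == "__main__":
    elapsed, total = run_all([0.5, 0.1, 0.1, 0.1])
    print("elapsed: %.2fs, sum of delays: %.1fs" % (elapsed, total))
```

Joining in list order only delays the moment the caller reads each result; the other threads keep working while the caller waits on the first one.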
Is there an approach in which I can 'sum' after *any* thread finishes?
Could a Queue help me there?
Thanks!
Gerardo
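
A minimal sketch of the Queue idea asked about above, under stated assumptions: fake_count, the delay argument, and count_all are hypothetical and only simulate the database call, since the connection module in the original post is not available here. Each thread puts its result on a shared queue, and the caller adds results in whatever order they arrive.

```python
import threading
import time

try:
    import queue                # Python 3
except ImportError:
    import Queue as queue       # Python 2, as used in the thread


def fake_count(table, delay):
    """Hypothetical stand-in for connection.doQuery(...)[0][0]."""
    time.sleep(delay)
    return len(table)           # arbitrary fake row count


class TableCounter(threading.Thread):
    def __init__(self, table, delay, results):
        threading.Thread.__init__(self)
        self.table = table
        self.delay = delay
        self.results = results  # queue shared by all counter threads

    def run(self):
        count = fake_count(self.table, self.delay)
        self.results.put((self.table, count))


def count_all(tables):
    """tables maps a table name to its simulated query time in seconds."""
    results = queue.Queue()
    for table, delay in tables.items():
        TableCounter(table, delay, results).start()
    total = 0
    order = []
    for _ in range(len(tables)):
        table, count = results.get()    # blocks until *any* thread reports
        order.append(table)
        total += count                  # running sum, in completion order
    return total, order
```

One caveat with this sketch: if a thread raises before calling put(), the final get() would block forever, so real code would probably catch exceptions in run() and put an error marker on the queue instead.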