Thread Question

Ritesh Raj Sarraf rrs at researchut.com
Fri Jul 28 13:54:43 EDT 2006


Simon Forman on Thursday 27 Jul 2006 22:47 wrote:

> def run(request, response, func=dummy_func):
>     '''
>     Get items from the request Queue, process them
>     with func(), put the results along with the
>     Thread's name into the response Queue.
> 
>     Stop running once an item is None.
>     '''
>     name = currentThread().getName()
>     while 1:
>         item = request.get()
>         if item is None:
>             break
>         response.put((name, func(item)))
> 

Meanwhile, instead of sitting idle while waiting for a reply, I tried to
understand the code (the example by Simon).
The good part is that I was able to use it. :-)

Here's the changed code:

        from Queue import Queue
        from threading import Thread, currentThread
        
        # Imports for the dummy testing func.
        from time import sleep
        from random import random
        
        NUMTHREADS = 3
        
        def run(request, response, func=download_from_web):
            '''
            Get items from the request Queue, process them
            with func(), put the results along with the
            Thread's name into the response Queue.
        
            Stop running once an item is None.
            '''
            name = currentThread().getName()
            while 1:
                item = request.get()
                # Check for the None sentinel first, so that it is never
                # passed to stripper().
                if item is None:
                    break
                (sUrl, sFile, download_size, checksum) = stripper(item)
                response.put((name, func(sUrl, sFile, sSourceDir, None)))
        
        
        # Create two Queues for the requests and responses
        requestQueue = Queue()
        responseQueue = Queue()
        
        
        # Pool of NUMTHREADS Threads that run run().
        thread_pool = [
                       Thread(
                              target=run,
                              args=(requestQueue, responseQueue)
                              )
                       for i in range(NUMTHREADS)
                       ]
        
        
        # Start the threads.
        for t in thread_pool: t.start()
        
        # Queue up the requests.
        for item in lRawData: requestQueue.put(item)
        
        
        # Shut down the threads after all requests end.
        # (Put one None "sentinel" for each thread.)
        for t in thread_pool: requestQueue.put(None)
        
        # Don't end the program prematurely.
        #
        # (Note that because Queue.get() is blocking by
        # default this isn't strictly necessary.  But if
        # you were, say, handling responses in another
        # thread, you'd want something like this in your
        # main thread.)
        for t in thread_pool: t.join()
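
For trying the pattern without the real download code, here is a minimal
self-contained sketch. It assumes Python 3 spelling (the Queue module is
named queue there, and currentThread() is current_thread()). fake_download,
process_all and the sample items are hypothetical stand-ins for
download_from_web and lRawData, not the real code.

```python
from queue import Queue
from threading import Thread, current_thread

NUMTHREADS = 3


def fake_download(item):
    # Hypothetical stand-in for download_from_web(); it only echoes the item.
    return "fetched " + item


def run(request, response, func=fake_download):
    # Worker loop: process items until the None sentinel arrives.
    name = current_thread().name
    while True:
        item = request.get()
        if item is None:
            break
        response.put((name, func(item)))


def process_all(items, numthreads=NUMTHREADS):
    request_queue = Queue()
    response_queue = Queue()
    pool = [Thread(target=run, args=(request_queue, response_queue))
            for _ in range(numthreads)]
    for t in pool:
        t.start()
    for item in items:
        request_queue.put(item)
    for _ in pool:
        request_queue.put(None)      # one sentinel per worker
    for t in pool:
        t.join()
    # Every worker has finished, so nothing more will be added here.
    results = []
    while not response_queue.empty():
        results.append(response_queue.get())
    return results
```

Calling process_all(["a.deb", "b.deb"]) returns a list of (thread name,
result) pairs. The order of the pairs depends on how the threads are
scheduled, so it should not be relied upon.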

I'd like to put my understanding over here and would be happy if people can
correct me at places.

So here it goes:
First, the code sets the number of threads. Then it moves on to
initializing requestQueue and responseQueue.
Then it moves on to thread_pool, where it learns that it has to execute the
function run().
From NUMTHREADS in the for loop, it knows how many threads it is supposed to
run in parallel.

So once the thread_pool is populated, it starts the threads. 
Actually, it doesn't start the threads. Instead, it puts the threads into the
queue.
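
One way to test that guess is a small check like the following. It is only
a sketch, in Python 3 spelling (queue rather than Queue), and worker and
check_start are hypothetical names for a stripped-down run(). It suggests
that t.start() really does start each thread, and that the threads then sit
blocked in request.get() until something is put into the queue.

```python
from queue import Queue
from threading import Thread


def worker(request):
    # Blocks in get() until an item (or the None sentinel) arrives.
    while True:
        item = request.get()
        if item is None:
            break


def check_start(numthreads=3):
    request_queue = Queue()
    pool = [Thread(target=worker, args=(request_queue,))
            for _ in range(numthreads)]
    before = [t.is_alive() for t in pool]    # created, not yet started
    for t in pool:
        t.start()
    after = [t.is_alive() for t in pool]     # running, waiting in get()
    for _ in pool:
        request_queue.put(None)              # let the workers exit
    for t in pool:
        t.join()
    finished = [t.is_alive() for t in pool]  # all workers have returned
    return before, after, finished
```

The Thread objects themselves are never placed in the queue here; only the
work items and the None sentinels are.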

Then the real iteration, which I was talking about in my earlier post, is done.
The iteration happens in one go, and requestQueue.put(item) puts all the items
from lRawData into the queue that run() reads from.
But there, run() already knows its limit on the number of threads.

No, I think the above statement is wrong. The actual limit on the number of
threads is held by thread_pool. Once its pool (3 at a time in this
example) is empty, it requests more threads again using requestQueue.

And in the function run(), when the item is None, the thread stops.
Then the cleanup and the check for any remaining threads are done.
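
To see whether more threads are requested as work comes in, one can record
which threads actually handle the items. The following is a sketch under
assumptions: Python 3 spelling, a hypothetical names_used helper, and plain
integers as items. However many items are queued, no more than NUMTHREADS
distinct thread names should show up, because the same threads keep looping
until each one takes a None.

```python
from queue import Queue
from threading import Thread, current_thread

NUMTHREADS = 3


def run(request, response):
    name = current_thread().name
    while True:
        item = request.get()
        if item is None:
            break            # this thread consumed its sentinel and stops
        response.put((name, item))


def names_used(num_items):
    request_queue, response_queue = Queue(), Queue()
    pool = [Thread(target=run, args=(request_queue, response_queue))
            for _ in range(NUMTHREADS)]
    for t in pool:
        t.start()
    for i in range(num_items):
        request_queue.put(i)
    for _ in pool:
        request_queue.put(None)   # one sentinel per thread
    for t in pool:
        t.join()                  # returns once every thread has stopped
    handled = [response_queue.get() for _ in range(num_items)]
    names = {name for name, _ in handled}
    return names, sorted(item for _, item in handled)
```

With names_used(20), all twenty items come back, handled by at most three
thread names. How the items are split among the threads varies from run to
run.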

Is this all correct?

I also have a couple more questions, related to locks, but
I'll post them once I am done with this part.

Thanks,
Ritesh
-- 
Ritesh Raj Sarraf
RESEARCHUT - http://www.researchut.com
"Necessity is the mother of invention."
"Stealing logic from one person is plagiarism, stealing from many is research."
"The great are those who achieve the impossible, the petty are those who
cannot - rrs"



