Threads and temporary files

aiwarrior zubeido at yahoo.com.br
Sat Mar 14 15:25:00 EDT 2009


On Mar 14, 3:01 am, "Gabriel Genellina" <gagsl-... at yahoo.com.ar>
wrote:
> On Fri, 13 Mar 2009 19:07:46 -0200, aiwarrior <zube... at yahoo.com.br>
> wrote:
>
> > I have recently been experimenting with threads and wanted to make a
> > threaded class that, instead of processing anything, just retrieves data
> > from a file and returns that data to a main thread, which takes all the
> > gathered data and concatenates it sequentially.
> > An example would be fetching various ranges of an HTTP resource in
> > parallel.
>
> The usual way to communicate between threads is using a Queue object.
> Instead of (create a thread, do some work, exit/destroy thread) you could  
> create the threads in advance (a "thread pool" of "worker threads") and  
> make them wait for some work to do from a queue (in a quasi-infinite  
> loop). When work is done, they put results in another queue. The main  
> thread just places work units on the first queue; another thread  
> reassembles the pieces from the result queue. For an I/O bound application  
> like yours, this should work smoothly.
> You should be able to find examples on the web - try the Python Cookbook.
>
> --
> Gabriel Genellina

I already tried a two-queue implementation as you suggest, with one
queue for the threads to get work from and another for the threads to
put their results in. My test implementation used a file with some
lines of random data.
Here it is:

import threading
import Queue  # Python 2 name; the module is called "queue" in Python 3

class DownloadUrl(threading.Thread):
    def __init__(self, queue_in, queue_out):
        threading.Thread.__init__(self)

        #self.url = url
        #self.starts = starts
        #self.ends = ends
        self.queue_in = queue_in
        self.queue_out = queue_out

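    def run(self):
        # Take one (file object, index) work unit and mark it as done.
        # Nothing reaches queue_out yet: the put below is commented out.
        (fp, i) = self.queue_in.get()
        self.queue_in.task_done()
        #print var
        #self.queue_out.put("i",False)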

worknr = 5
queue_in = Queue.Queue(worknr)
queue_out = Queue.Queue(worknr)
threads = []
fp = open("./xi","r")
#print fp.readlines()

# One thread per work unit; every thread receives the same file object.
for i in xrange(10):
    queue_in.put((fp, i))
    DownloadUrl(queue_in, queue_out).start()


queue_in.join()
while queue_out.qsize():
    print queue_out.get()
    queue_out.task_done()
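
For comparison, here is a minimal sketch of the worker-pool arrangement Gabriel describes: a fixed set of threads waits on a work queue, and each result goes onto a second queue tagged with its index so the main thread can reassemble the pieces in order. It is written for Python 3 (where the module is `queue` rather than `Queue`) and is not the code from this thread. The names `worker`, `read_in_parallel`, `demo` and `SENTINEL`, the chunk size, and the use of byte ranges of a temporary file as a stand-in for HTTP ranges are all assumptions made for illustration.

```python
import os
import queue      # named "Queue" in the Python 2 code above
import tempfile
import threading

SENTINEL = None   # hypothetical marker telling a worker to exit


def worker(path, work_q, result_q):
    """Process (index, start, end) work units until the sentinel arrives."""
    while True:
        item = work_q.get()
        if item is SENTINEL:
            work_q.task_done()
            break
        index, start, end = item
        # Each worker opens its own handle, so no file position is shared.
        with open(path, "rb") as fp:
            fp.seek(start)
            chunk = fp.read(end - start)
        result_q.put((index, chunk))
        work_q.task_done()


def read_in_parallel(path, chunk_size=16, num_workers=4):
    """Split a file into byte ranges, read them concurrently, rejoin in order."""
    size = os.path.getsize(path)
    work_q = queue.Queue()
    result_q = queue.Queue()

    workers = [threading.Thread(target=worker, args=(path, work_q, result_q))
               for _ in range(num_workers)]
    for t in workers:
        t.start()

    ranges = [(s, min(s + chunk_size, size)) for s in range(0, size, chunk_size)]
    for index, (start, end) in enumerate(ranges):
        work_q.put((index, start, end))
    for _ in workers:
        work_q.put(SENTINEL)          # one exit marker per worker

    work_q.join()                     # every work unit has been processed
    for t in workers:
        t.join()

    # Results arrive in completion order; sorting by index restores file order.
    pieces = [result_q.get() for _ in ranges]
    return b"".join(chunk for _, chunk in sorted(pieces))


def demo():
    """Write a small temporary file and read it back through the pool."""
    data = b"".join(b"line %d of random data\n" % i for i in range(20))
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(data)
    try:
        return read_in_parallel(tmp.name) == data
    finally:
        os.remove(tmp.name)
```

Calling `demo()` should return `True` when the reassembled bytes match what was written. Both queues are left unbounded here so that a worker never blocks on `put` while the main thread is waiting in `join`; a bounded result queue would need a separate consumer thread, as Gabriel suggests.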

> Any reason you're using threads instead of processes?
Perhaps because threads offer a more flexible way to share data than
processes do.
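
As a rough illustration of that point, threads in one process all see the same objects, so a plain dict guarded by a lock is enough to collect results. This is only a sketch in Python 3; `results`, `record` and `run_threads` are hypothetical names, and the squared value stands in for real work. With separate processes, the same data would have to travel through some explicit channel such as a pipe or a queue.

```python
import threading

results = {}                      # one dict visible to every thread
results_lock = threading.Lock()   # guards updates to the shared dict


def record(index):
    """Store a value in the shared dict while holding the lock."""
    with results_lock:
        results[index] = index * index


def run_threads(count=5):
    """Start `count` threads, wait for them, and return a copy of the dict."""
    threads = [threading.Thread(target=record, args=(i,)) for i in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return dict(results)
```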



More information about the Python-list mailing list