Code For Five Threads To Process Multiple Files?
tdahsu at gmail.com
Fri May 23 10:25:10 EDT 2008
On May 23, 12:20 am, Dennis Lee Bieber <wlfr... at ix.netcom.com> wrote:
> On Thu, 22 May 2008 11:03:48 -0700 (PDT), tda... at gmail.com declaimed the
> following in comp.lang.python:
>
> > Ah, well, I didn't get any other responses, but here's what I've done:
>
> Apparently the direct email from my work address did not get through
> (I don't have group posting ability from work).
>
> > loopCount = 0
> > for l in range(len(self.filesToProcess)):
> >     threads = []
> >     try:
>
> >         threads.append(threading.Thread(target=self.processFiles(self.filesToProcess[loopCount+l])))
>
> Python lists index from 0... So this will be 0+0, first entry in the
> file list
>
>
>
> >         threads.append(threading.Thread(target=self.processFiles(self.filesToProcess[loopCount+2])))
>
> This is 0+2, THIRD entry in the file list -- you've just skipped
> over the second entry...
>
> >         threads.append(threading.Thread(target=self.processFiles(self.filesToProcess[loopCount+3])))
>
> >         threads.append(threading.Thread(target=self.processFiles(self.filesToProcess[loopCount+4])))
>
> >         threads.append(threading.Thread(target=self.processFiles(self.filesToProcess[loopCount+5])))
>
> Very ugly... Also going to fail for other reasons... Consider:
>
> filestoprocess = [ 'file1', 'file2', 'file3' ]
> for jnk in range(len(filestoprocess)): #this will loop three times!
>     #jnk = 0, 1, 2
>
> You proceed to create FIVE threads (or try to) when there are only
> THREE files... It will fail as soon as it tries loopCount+3 (fourth
> entry in a three element list)
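
One way to avoid indexing past the end of the list is to slice it into batches, since a slice simply comes back short when fewer than five names remain. This is only a sketch: batches() is a hypothetical helper, not something from the original code, and the file names are placeholders.

```python
def batches(items, size):
    """Yield consecutive slices of `items`, each at most `size` long."""
    for start in range(0, len(items), size):
        # Slicing never raises IndexError; the last slice is just shorter.
        yield items[start:start + size]


files_to_process = ['file1', 'file2', 'file3']   # placeholder names

three = list(batches(files_to_process, 5))
print(three)    # [['file1', 'file2', 'file3']]

twelve = list(batches(list(range(12)), 5))
print(twelve)   # [[0, 1, 2, 3, 4], [5, 6, 7, 8, 9], [10, 11]]
```

Each batch could then get one thread per entry, so no file is skipped and none is handed out twice.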
>
> >         msg = "Processing file...\n"
> >         for thread in threads:
> >             wx.CallAfter(self.textctrl03.write(msg),
> >                          thread.start())
>
> Is this running as the main controller of some GUI? if so...
>
> >         for thread in threads:
> >             thread.join()
>
> Your GUI will essentially freeze since it can't process events
> (including screen updates) until the entire function you are in returns
> to the event handler... But .join() blocks until the specified thread
> really finishes...
>
> >         loopCount += 5
> >     except IndexError:
> >         pass
>
> BAD style -- if you are going to trap an exception, you should do
> something with it... But then, the only reason you would GET this
> exception is because the preceding code is looping too many times
> relative to the number of files...
>
> As shown, with three files, you will create the first thread (0) for
> first file, skip the second file creating the second thread (1) for the
> third file, and raise an exception on trying to create the third thread
> (2) when you try to access a fourth file in the list. The exception
> will be raised -- SKIPPING over the thread.start() calls, and skipping
> the thread.join() calls. You then ignore the error, and go back to the
> start of the loop where the index is now "1"... AND reset the thread
> list, so threads 0&1 are forgotten, never started, never joined, garbage
> collected...
>
> Again, you now create a thread (0) giving it the second file (since
> loopCount was never incremented, and the first thread is using loopCount
> + <loopindex>), create thread (1) giving it the third file, raise the
> exception... repeat
>
>
>
> > It works, and it works well. It starts five threads, and processes
> > five files at a time. (In the "self.processFiles" I read the whole
> > file into memory using readlines(), which works well.)
>
> It only works as long as loopCount+5 is less than the number of
> files in the list... AND at that, it skips one file and double processes
> another...
>
> > Of course, now the wx.CallAfter function doesn't work... I get
> > "TypeError: 'NoneType' object is not callable" for every time it is
> > run...
>
> Probably because it wants you to supply it with one or two
> /callable/ functions... but you are actually calling the functions and
> passing it the results of the called functions (and they aren't
> returning anything -- None).
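
The difference between passing a callable and passing the result of calling it can be shown with threading.Thread alone, without any GUI. In this sketch process_file is a hypothetical stand-in for self.processFiles, and the file names are made up.

```python
import threading

results = []


def process_file(name):
    """Stand-in for the real work; returns None like the original."""
    results.append(name)


# Called form: process_file("a.txt") runs right here in the calling thread,
# and the Thread object is handed its return value (None) as the target.
called = threading.Thread(target=process_file("a.txt"))
after_construction = list(results)

# Callable form: pass the function itself plus its arguments; nothing runs
# until start() is called.
deferred = threading.Thread(target=process_file, args=("b.txt",))
before_start = list(results)
deferred.start()
deferred.join()

print(after_construction)   # ['a.txt']
print(before_start)         # ['a.txt']
print(results)              # ['a.txt', 'b.txt']

# The same idea applies to wx.CallAfter, which takes the callable followed
# by its arguments, e.g. wx.CallAfter(self.textctrl03.write, msg)
# (not run here, since it needs a running wx application).
```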
>
> Ignoring GUI stuff... here is a simple one-job threadpool algorithm
> -- you have to plug in the file list and the actual processing work. It
> creates n threads, and those threads pull the work off of a common
> queue; the main program only has to fill the queue with the work to be
> done, and stuff a sentinel value onto the queue when it wants the
> threads to die -- which would be before shutdown of the program (create
> the pool at start-up, leave the threads blocked on the .get() until you
> need one to process)...
>
> -=-=-=-=-=-=-=-
> #
> # Example code for a pooled thread file processor
> # NOT EXECUTABLE as is -- there is no code to obtain
> # the list of files to be processed; and the processor
> # just sleeps...
>
> import threading
> import Queue
> import time #just for demo sleep
>
> NUMTHREADS = 5
> SENTINAL = object()
>
> workQueue = Queue.Queue()
>
> def fileProc(): #function that handles processing of the files
>     while True:
>         fname = workQueue.get()
>         if fname is SENTINAL:
>             workQueue.put(SENTINAL) #recycle sentinel for next thread
>             break
>         print "Processing %s" % fname
>         time.sleep(3) #replace with real file processing
>
> threadList = []
> for ti in range(NUMTHREADS): #create worker threads
>     t = threading.Thread(target=fileProc)
>     t.start()
>     threadList.append(t)
>
> for fn in listOfFiles: #queue up the file names to be worked
>     workQueue.put(fn) #need to expand to include how names are
>                       #obtained
>
> workQueue.put(SENTINAL) #signal that no more files are to be worked
>
> for t in threadList:
>     t.join() #wait for each thread to exit (ensures main
>              #doesn't exit before all threads finish
>              #processing)
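
For reference, a runnable variant of the same pattern might look like the following. It assumes Python 3 (where the module is named queue rather than Queue). The names run_pool and record, and the file names, are made up for illustration, and the handler only records each name instead of doing real file processing.

```python
import queue
import threading

NUM_THREADS = 5
SENTINEL = object()   # unique marker meaning "no more work"


def run_pool(file_names, handler, num_threads=NUM_THREADS):
    """Process every name in file_names with handler, using worker threads."""
    work_queue = queue.Queue()

    def worker():
        while True:
            name = work_queue.get()
            if name is SENTINEL:
                work_queue.put(SENTINEL)   # hand the marker to the next worker
                break
            handler(name)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for name in file_names:
        work_queue.put(name)
    work_queue.put(SENTINEL)               # queued after all real work
    for t in threads:
        t.join()                           # wait until every worker has exited


done = []
done_lock = threading.Lock()


def record(name):
    """Placeholder handler: remember which names were processed."""
    with done_lock:
        done.append(name)


names = ["file%d" % i for i in range(12)]
run_pool(names, record)
print(len(done))   # 12
```

Because the sentinel goes onto the queue after the last file name, each worker sees it only once the real work has been handed out, so every name is processed exactly once.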
>
> --
> Wulfraed Dennis Lee Bieber KD6MOG
> wlfr... at ix.netcom.com wulfr... at bestiaria.com
> HTTP://wlfraed.home.netcom.com/
> (Bestiaria Support Staff: web-a... at bestiaria.com)
> HTTP://www.bestiaria.com/
Thanks for the information! I can definitely see what you're talking
about, and the Exception is only "pass" right now while I am working
on the code.
However, it does process every file (it doesn't skip the second one),
and I'm guessing that this is because it loops so many times? I guess
that means I am successful in spite of myself! ;-) (This wouldn't be
the first time... ;-) )
I REALLY appreciate your insights!!
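
A quick way to check that guess is to simulate which list indexes the loop touches. This sketch assumes the loop behaves as the quoted code reads: processFiles runs immediately when each Thread(...) expression is evaluated, an IndexError abandons the rest of that pass, and loopCount only advances after a pass completes. simulate() is a hypothetical helper, and it ignores the GUI and the threads entirely.

```python
def simulate(num_files):
    """Return, in order, the indexes the original loop would process."""
    files = list(range(num_files))
    processed = []
    loop_count = 0
    for l in range(len(files)):
        try:
            # The five append lines use offsets l, 2, 3, 4 and 5.
            for offset in (l, 2, 3, 4, 5):
                processed.append(files[loop_count + offset])
            loop_count += 5
        except IndexError:
            pass   # rest of this pass is skipped; loop_count is unchanged
    return processed


small = simulate(3)
print(small)                        # [0, 2, 1, 2, 2, 2]

large = simulate(12)
missing = sorted(set(range(12)) - set(large))
print(missing)                      # [1, 11]
```

Under those assumptions a short list does get fully covered, because the loop variable walks through every index while loopCount stays at 0, though some files are handled more than once. A longer list can still lose entries.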