Threads and Progress Bar

Ritesh Raj Sarraf rrs at researchut.com
Fri Sep 1 15:50:20 EDT 2006


Dennis Lee Bieber on Friday 01 Sep 2006 23:04 wrote:

> Well... first off -- some minimal code would be of use...
>

I was afraid people might feel I'm asking for a ready-made solution. :-)

Here's the code.

This is the progress bar code:
progressbar.py
class progressBar:
    def __init__(self, minValue = 0, maxValue = 10, totalWidth=12):
        self.progBar = "[]"   # This holds the progress bar string
        self.min = minValue
        self.max = maxValue
        self.span = maxValue - minValue
        self.width = totalWidth
        self.amount = 0       # When amount == max, we are 100% done 
        self.updateAmount(0)  # Build progress bar string

    def updateAmount(self, newAmount = 0):
        if newAmount < self.min: newAmount = self.min
        if newAmount > self.max: newAmount = self.max
        self.amount = newAmount

        # Figure out the new percent done, round to an integer
        diffFromMin = float(self.amount - self.min)
        percentDone = (diffFromMin / float(self.span)) * 100.0
        percentDone = round(percentDone)
        percentDone = int(percentDone)

        # Figure out how many hash bars the percentage should be
        allFull = self.width - 2
        numHashes = (percentDone / 100.0) * allFull
        numHashes = int(round(numHashes))

        # build a progress bar with hashes and spaces
        self.progBar = "[" + '#'*numHashes + ' '*(allFull-numHashes) + "]"

        # figure out where to put the percentage, roughly centered
        percentPlace = (len(self.progBar) / 2) - len(str(percentDone)) 
        percentString = str(percentDone) + "%"
        
        # slice the percentage into the bar
        self.progBar = self.progBar[0:percentPlace] + percentString + \
                       self.progBar[percentPlace+len(percentString):] + \
                       " " + str(newAmount/1024) + "KB of " + str(self.max/1024) + "KB"

    def __str__(self):
        return str(self.progBar)
        
prog = None     # module-level handle; the bar is created on the first callback

def myReportHook(count, blockSize, totalSize):
    import sys
    global prog

    # Create the bar once, and again if a new download with a different size starts
    if prog is None or prog.max != totalSize:
        prog = progressBar(0, totalSize, 50)
    prog.updateAmount(count*blockSize)
    sys.stdout.write(str(prog))
    sys.stdout.write("\r")
    #print count * (blockSize/1024), "kb of", (totalSize/1024), "kb downloaded.\n"
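
Incidentally, myReportHook() has the same (count, blockSize, totalSize) signature that urllib.urlretrieve() passes to its reporthook, so in a plain single-threaded script the same bar could be driven like this (just a sketch with a dummy URL, not what main.py actually does):

import urllib
import progressbar

# Dummy URL and filename, for illustration only
urllib.urlretrieve("http://example.com/some_file.deb", "some_file.deb",
                   reporthook=progressbar.myReportHook)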


Here's the function, download_from_web(), which calls the progress bar:
main.py
def download_from_web(sUrl, sFile, sSourceDir, checksum):
    
    try:
        block_size = 4096
        i = 0
        counter = 0
        
        os.chdir(sSourceDir)
        temp = urllib2.urlopen(sUrl)
        headers = temp.info()
        size = int(headers['Content-Length'])
        data = open(sFile,'wb')
        
        log.msg("Downloading %s\n" % (sFile))
        while i < size:
            data.write (temp.read(block_size))
            i += block_size
            counter += 1
            progressbar.myReportHook(counter, block_size, size)
        print "\n"
        data.close()
        temp.close()
        return True     # callers compare the return value against True
    except IOError, (errno, strerror):
        # (the rest of the error handling is snipped from this post)
        log.err("%s %s\n" % (errno, strerror))
        return False


Since I later implemented threads, multiple threads call download_from_web()
concurrently, and each of them updates the same progress bar, so I end up with a
single bar that keeps getting overwritten. :-)

Here's the code where multiple threads execute:

        try:
            lRawData = open(uri, 'r').readlines()
        except IOError, (errno, strerror):
            log.err("%s %s\n" % (errno, strerror))
            errfunc(errno, '')
            
        
        #INFO: Mac OS is having issues with Python Threading.
        # Use the conventional model for Mac OS
        if sys.platform == 'darwin':
            log.verbose("Running on Mac OS. Python doesn't have proper support for Threads on Mac OS X.\n")
            log.verbose("Running in the conventional non-threaded way.\n")
            for each_single_item in lRawData:
                (sUrl, sFile, download_size, checksum) = stripper(each_single_item)

                if download_from_web(sUrl, sFile, sSourceDir, None) != True:
                    #sys.stderr.write("%s not downloaded from %s\n" % (sFile, sUrl))
                    #sys.stderr.write("%s failed\n\n" % (sFile))
                    variables.errlist.append(sFile)
                else:
                    if zip_bool:
                        compress_the_file(zip_type_file, sFile, sSourceDir)
                        os.remove(sFile) # Remove it because we don't need the file once it is zipped.
        else:
            #INFO: Thread Support
            if variables.options.num_of_threads > 1:
                log.msg("WARNING: Thread support is still in alpha stage. It's better to use just a single thread at the moment.\n")
                log.warn("Thread support is still in alpha stage. It's better to use just a single thread at the moment.\n")
                
            NUMTHREADS = variables.options.num_of_threads
            name = threading.currentThread().getName()
            ziplock = threading.Lock()
            
            def run(request, response, func=download_from_web):
                '''Get items from the request Queue, process them
                with func(), put the results along with the
                Thread's name into the response Queue.
                
                Stop running once an item is None.'''
            
                while 1:
                    item = request.get()
                    if item is None:
                        break
                    (sUrl, sFile, download_size, checksum) = stripper(item)
                    response.put((name, sUrl, sFile, func(sUrl, sFile, sSourceDir, None)))

                    # This will take care of making sure that if downloaded, they are zipped
                    (thread_name, Url, File, exit_status) = response.get()
                    if exit_status == True:
                        if zip_bool:
                            ziplock.acquire()
                            try:
                                compress_the_file(zip_type_file, File, sSourceDir)
                                os.remove(File) # Remove it because we don't need the file once it is zipped.
                            finally:
                                ziplock.release()
                    else:
                        variables.errlist.append(File)
            
            # Create two Queues for the requests and responses
            requestQueue = Queue.Queue()
            responseQueue = Queue.Queue()
            
            # Pool of NUMTHREADS Threads that run run().
            thread_pool = [
                           threading.Thread(
                                  target=run,
                                  args=(requestQueue, responseQueue)
                                  )
                           for i in range(NUMTHREADS)
                           ]
            
            # Start the threads.
            for t in thread_pool: t.start()
            
            # Queue up the requests.
            for item in lRawData: requestQueue.put(item)
            
            # Shut down the threads after all requests end.
            # (Put one None "sentinel" for each thread.)
            for t in thread_pool: requestQueue.put(None)
            
            # Don't end the program prematurely.
            #
            # (Note that because Queue.get() is blocking by
            # default this isn't strictly necessary. But if
            # you were, say, handling responses in another
            # thread, you'd want something like this in your
            # main thread.)
            for t in thread_pool: t.join()
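
As an aside, the "handling responses in another thread" case mentioned in the comment
above would look roughly like this. It's only a sketch: handle_responses is a name I
made up, and it reuses zip_bool, zip_type_file, sSourceDir and variables.errlist from
the surrounding function (ziplock would no longer be needed because only this one
thread zips):

            def handle_responses(response):
                '''Drain the response Queue in a single consumer thread
                and do the zipping there, instead of each worker reading
                back its own result. Stops on a None sentinel.'''
                while 1:
                    item = response.get()
                    if item is None:
                        break
                    (thread_name, Url, File, exit_status) = item
                    if exit_status == True:
                        if zip_bool:
                            compress_the_file(zip_type_file, File, sSourceDir)
                            os.remove(File)
                    else:
                        variables.errlist.append(File)

            #consumer = threading.Thread(target=handle_responses, args=(responseQueue,))
            #consumer.start()
            # ... and after joining the workers, put one None on responseQueue to stop it.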

 
> Second... It sounds like you only created one progress bar, and each
> thread is referencing that single bar. I'd suspect you need to create a
> bar for EACH thread you create, and tell the thread which bar to update.

Yes, you're correct. That's what I suspect as well. I tried making some minor
changes myself but couldn't get it to work.
If you reply with code, please add a little explanation so that I can
understand and learn from it.
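
What I was fumbling towards is roughly this: keep one bar per thread in a dictionary
keyed by the thread's name, and serialize the stdout writes with a lock. This is only
a rough sketch (the names thread_bars, print_lock and thread_report_hook are mine),
and all the bars still fight over the same terminal line, which is where I'm stuck:

import sys
import threading
from progressbar import progressBar

thread_bars = {}                 # one progressBar per worker thread
print_lock = threading.Lock()    # so the "\r" writes don't interleave

def thread_report_hook(count, blockSize, totalSize):
    name = threading.currentThread().getName()
    if name not in thread_bars:
        thread_bars[name] = progressBar(0, totalSize, 50)
    bar = thread_bars[name]
    bar.updateAmount(count * blockSize)
    print_lock.acquire()
    try:
        sys.stdout.write("%s %s\r" % (name, str(bar)))
        sys.stdout.flush()
    finally:
        print_lock.release()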

Thanks,
Ritesh
-- 
Ritesh Raj Sarraf
RESEARCHUT - http://www.researchut.com
"Necessity is the mother of invention."
"Stealing logic from one person is plagiarism, stealing from many is research."
"The great are those who achieve the impossible, the petty are those who
cannot - rrs"



