Best way to report progress at fixed intervals

rdmurray at bitdance.com
Tue Dec 9 13:35:14 EST 2008


On Tue, 9 Dec 2008 at 08:40, Slaunger wrote:
> I am a novice Python 2.5 programmer who writes command-line scripts
> for processing large amounts of data.
>
> I would like to be able to print out the progress made during the
> processing at regular intervals, say every second, and I am wondering
> what a proper generic way to do this is.
>
> I have created this test example to show the general problem. Running
> the script gives me the output:
>
> Work through all 20 steps reporting progress every 1.0 secs...
> Work step 0
> Work step 1
> Work step 2
> Work step 3
> Work step 4
> Processed 4 of 20
> Work step 5
[...]
> Work step 19
> Finished working through 20 steps
[...]
> Quite frankly, I do not like what I have made! It is a mess,
> responsibilities are mixed, and it seems overly complicated. But I
> can't figure out how to do this right.
>
> I would therefore like some feedback on this proposed generic "report
> progress at regular intervals" approach presented here. What could I
> do better?

I felt like a little lunchtime challenge, so I wrote something that
I think matches your spec, based on your sample code.  This is not
necessarily the best implementation, but I think it is simpler and
clearer than yours.  The biggest change is that the work is being
done in the subthread, while the main thread does the monitoring.

It would be fairly simple to enhance this so that you could pass
arbitrary arguments to the worker function, in addition to or
instead of the loop counter.
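
One way that enhancement might look, without touching the monitoring
code at all, is to bind the extra arguments before handing the function
over. This is only a sketch: the worker signature and the names 'label',
'scale' and 'bound_work' are made up for illustration, and it is written
to run on Python 3, whereas the code in this message is Python 2.

```python
import functools


def work(i, label, scale=1.0):
    """Hypothetical worker that takes extra arguments besides the loop counter."""
    return "%s step %d (scale %.1f)" % (label, i, scale)


# Bind the extra arguments up front.  The resulting callable still accepts
# just the loop counter, which is all a monitoring loop needs to pass in.
bound_work = functools.partial(work, label="Resample", scale=2.0)

# Stand-in for the monitoring loop: call the bound worker with the counter only.
results = [bound_work(i) for i in range(3)]
```

With the Monitor class in this message, the bound function would be
passed in place of the plain one, i.e. monitor(bound_work).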

-----------------------------------------------------------------------
"""
Test module for testing generic ways of displaying progress
information at regular intervals.
"""
import random
import threading
import time

def work(i):
     """
     Dummy process function, which takes a random time in the interval
     0.0-0.5 secs to execute
     """
     print "Work step %d" % i
     time.sleep(0.5 * random.random())


class Monitor(object):
     """
     This class creates an object that will execute a worker function
     in a loop and at regular intervals emit a progress report on
     how many times the function has been called.
     """

     def dowork(self):
         """
         Call the worker function in a loop, keeping track of how
         many times it was called in self.no
         """
         for self.no in xrange(self.max_iter):
             self.func(self.no)

     def __call__(self, func, verbose=True, max_iter=20, progress_interval=1.0):
         """
         Repeatedly call 'func', passing it the loop count, for max_iter
         iterations, and every progress_interval seconds report how
         many times the function has been called.
         """
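         # Not all of these need to be instance variables, but they might
         # as well be in case we want to reference them in an enhanced
         # dowork function.
         self.func = func
         self.verbose = verbose
         self.max_iter = max_iter
         self.progress_interval = progress_interval
         # Initialize the counter so that a report emitted before the
         # worker has set it (e.g. max_iter=0) cannot raise AttributeError.
         self.no = 0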

         if self.verbose:
             print ("Work through all %d steps reporting progress every "
                 "%3.1f secs...") % (self.max_iter, self.progress_interval)

         # Create a thread to run the loop, and start it going.
         worker = threading.Thread(target=self.dowork)
         worker.start()

         # Monitoring loop.
         loops = 0
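         # We're going to loop ten times per second using an integer count,
         # so multiply the seconds parameter by 10 to give it the same
         # magnitude.  Round to the nearest tick and use at least one tick,
         # so that intervals below 0.1 seconds do not give a zero divisor.
         intint = max(1, int(round(self.progress_interval*10)))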
         # isAlive will be false after dowork returns
         while worker.isAlive():
             loops += 1
             # Wait 0.1 seconds between checks so that we aren't chewing
             # CPU in a spin loop.
             time.sleep(0.1)
             # When loops is a multiple of intint, a full progress_interval
             # has elapsed, so emit the report.
             if loops % intint == 0:
                 print "Processed %d of %d" % (self.no, self.max_iter)

         if verbose:
             print "Finished working through %d steps" % max_iter

if __name__ == "__main__":
     #Create the monitor.
     monitor = Monitor()
     #Run the work function under monitoring.
     monitor(work)
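
A possible variation on the monitoring loop above is to let the main
thread wait in join() with a timeout instead of counting 0.1 second
ticks. The following is a minimal sketch of that idea, not a drop-in
replacement: it assumes Python 3 (print function, is_alive), and the
names 'run_with_progress', 'report' and 'state' are hypothetical.

```python
import threading
import time


def run_with_progress(func, max_iter=20, progress_interval=1.0, report=print):
    """Run func(i) for i in range(max_iter) in a worker thread while the
    calling thread reports the number of completed steps roughly every
    progress_interval seconds.  Returns the number of completed steps."""
    state = {"done": 0}

    def dowork():
        for i in range(max_iter):
            func(i)
            state["done"] = i + 1  # steps completed so far

    worker = threading.Thread(target=dowork)
    worker.start()
    while True:
        # join() returns when the worker finishes or the timeout elapses,
        # whichever comes first, so no tick counting is needed.
        worker.join(progress_interval)
        if not worker.is_alive():
            break
        report("Processed %d of %d" % (state["done"], max_iter))
    return state["done"]


if __name__ == "__main__":
    run_with_progress(lambda i: time.sleep(0.1), max_iter=20)
```

Each wait is measured from the moment join() is called, so the reports
can drift slightly later over a long run. For console progress output
that is usually acceptable.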


