How to cleanly pause/stop a long running function?

Steven D'Aprano steve at REMOVE.THIS.cybersource.com.au
Sat May 12 23:45:17 EDT 2007


On Sat, 12 May 2007 13:51:05 -0700, Basilisk96 wrote:

> Suppose I have a function that may run for a long time - perhaps from
> several minutes to several hours. An example would be this file
> processing function:
> 
> import os
> def processFiles(startDir):
>     for root, dirs, files in os.walk(startDir):
>         for fname in files:
>             if fname.lower().endswith(".zip"):
>                 # ... do interesting stuff with the file here ...
> 
> Imagine that there are thousands of files to process. This could take
> a while. How can I implement this so that the caller can pause or
> interrupt this function, and resume its program flow?

I don't think there really is what I would call a _clean_ way, although
people may disagree about what's clean and what isn't.

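Here's a way that uses global variables, with all the disadvantages that
entails. Pass the special restart object instead of a directory name to
start walking again from the last directory completed: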

import os

last_dir_completed = None  # updated as each directory is finished
restart = object()  # a unique object

def processFiles(startDir):
    global last_dir_completed
    if startDir is restart:
        startDir = last_dir_completed
    for root, dirs, files in os.walk(startDir):
        for fname in files:
            if fname.lower().endswith(".zip"):
                # ... do interesting stuff with the file here ...
                pass
        last_dir_completed = root  # remember the directory just finished
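
One wrinkle with restarting os.walk at the last directory completed is that
the walk then covers only that directory's subtree, not the directories that
were still waiting elsewhere in the tree. Here is a rough sketch of one way
around that, assuming it is acceptable to remember the original start
directory and a set of finished directories at module level. The names
completed_dirs and _last_start and the stop_after parameter are made up for
illustration; stop_after only simulates an interruption.

```python
import os

completed_dirs = set()  # module-level state: directories already processed
restart = object()      # unique sentinel meaning "resume the previous run"
_last_start = None      # hypothetical: remembers the original start directory


def processFiles(startDir, stop_after=None):
    """Walk startDir and return the directories handled in this call.

    stop_after is a hypothetical knob that pretends the caller
    interrupted us after that many directories.
    """
    global _last_start
    if startDir is restart:
        startDir = _last_start       # walk the same tree again
    else:
        _last_start = startDir       # fresh run: forget earlier progress
        completed_dirs.clear()
    handled = []
    for root, dirs, files in os.walk(startDir):
        if root in completed_dirs:
            continue                 # finished in an earlier call
        if stop_after is not None and len(handled) >= stop_after:
            return handled           # simulated interruption
        for fname in files:
            if fname.lower().endswith(".zip"):
                pass  # ... do interesting stuff with the file here ...
        completed_dirs.add(root)
        handled.append(root)
    return handled
```

Calling processFiles(somedir, stop_after=2) and then processFiles(restart)
should visit every directory exactly once between the two calls. The cost is
that the resumed call walks the whole tree again and merely skips what is
already done.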



Here's another way, using a class. Probably not the best way, but a way.

class DirLooper(object):
    def __init__(self, startdir):
        self.status = "new"
        self.startdir = startdir
        self.root = startdir
    def run(self):
        if self.status == 'new':
            self.loop(self.startdir)
        elif self.status == 'finished':
            print("nothing to do")
        else:
            self.loop(self.root)
    def loop(self, where):
        self.status = "started"
        for self.root, dirs, files in os.walk(where):
            # blah blah blah...
            pass
        self.status = "finished"  # only reached if the walk runs to completion
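
To make the class idea a bit more concrete, here is a minimal sketch of a
variation that keeps the os.walk generator itself on the instance, so a
second call to run() carries on from the exact point where the first one
stopped. It is not the only way to do it. The stop_requested flag, the
processed list and the process() hook are names invented for this example;
something else (another thread, or the per-directory work itself) is assumed
to set the flag.

```python
import os


class DirLooper(object):
    def __init__(self, startdir):
        self.status = "new"
        self.walker = os.walk(startdir)  # a generator; it remembers its position
        self.stop_requested = False      # hypothetical flag the caller may set
        self.processed = []              # directories handled so far

    def run(self):
        if self.status == "finished":
            return
        self.status = "started"
        self.stop_requested = False
        for root, dirs, files in self.walker:
            self.process(root, files)
            if self.stop_requested:
                # Leave the generator suspended; run() can pick it up later.
                self.status = "paused"
                return
        self.status = "finished"

    def process(self, root, files):
        # Stand-in for the real per-directory work.
        self.processed.append(root)
```

The pause only takes effect between directories, so whatever process() does
for one directory always runs to completion before run() returns.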


Here's another way, catching the interrupt:

def processFiles(startDir):
    try:
        for root, dirs, files in os.walk(startDir):
            # blah blah blah ...
            pass
    except KeyboardInterrupt:
        do_something_with_status()
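
As one possible way of filling in those details, the sketch below records
each directory as it is finished and, if Ctrl-C arrives, hands back what was
completed instead of dying with a traceback. The handle callback and the
(finished, dirs_done) return value are assumptions of mine, not anything the
original function had.

```python
import os


def processFiles(startDir, handle=None):
    """Return (finished, dirs_done).

    handle is a hypothetical callback invoked with the path of each
    .zip file found; dirs_done lists the directories fully processed.
    """
    dirs_done = []
    try:
        for root, dirs, files in os.walk(startDir):
            for fname in files:
                if fname.lower().endswith(".zip") and handle is not None:
                    handle(os.path.join(root, fname))
            dirs_done.append(root)
    except KeyboardInterrupt:
        # Interrupted: report progress so the caller can decide what to do.
        return False, dirs_done
    return True, dirs_done
```

Keep in mind that the interrupt can land in the middle of handling a file,
so the directory being worked on at that moment is not in dirs_done and may
be only partly processed.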


You can fill in the details :)


As for which is "better", I think the solution using a global variable is
the worst, although it has the advantage of being easy to implement. I
think you may need to try a few different implementations and judge for
yourself. 


-- 
Steven.



