[Tutor] Excluding branches while walking directory tree

Kent Johnson kent37 at tds.net
Wed Sep 13 17:34:25 CEST 2006


William O'Higgins Witteman wrote:
> Hello all,
> 
> I am looking for an approach for the following problem:
> 
> I have to walk a directory tree and examine files within it.  I have a
> set of directory names and filename patterns that I must skip while
> doing this walk.  How do I create a set of rules to skip files or
> directory branches?  I'm looking for something reasonably scalable, 
> 'cause I'm sure to need to update these rules in the future.

os.walk() lets you prune the directory list in place, so the walk never 
descends into the directories you remove (this works in the default 
top-down mode).

fnmatch.fnmatch() does simple wildcard pattern matching. So maybe 
something like this (*not* tested):

import os, fnmatch

def matchesAny(name, tests):
   for test in tests:
     if fnmatch.fnmatch(name, test):
       return True
   return False

dirsToSkip = [ ... ] # list of directory patterns to skip
filesToSkip = [ ... ] # list of file patterns to skip

# baseDir is the top of the tree to walk; define it before this loop
for dirpath, dirnames, filenames in os.walk(baseDir):
   # Note use of slice assignment - you have to modify the caller's list
   # in place so that os.walk() skips the pruned directories
   dirnames[:] = [name for name in dirnames
                  if not matchesAny(name, dirsToSkip)]

   filenames = [name for name in filenames
                if not matchesAny(name, filesToSkip)]

   for name in filenames:
     # whatever file processing you want to do goes here
     pass


You could get the list of patterns from another file that you import, or 
a config file, or command line args, depending on how you want to store 
them and change them.
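
As a rough sketch of the config file option, assuming a plain text file 
with one fnmatch-style pattern per line and '#' for comments (that file 
format and the names parse_patterns, load_patterns and matches_any are 
made up for illustration, not anything standard):

```python
import fnmatch


def parse_patterns(lines):
    """Return the patterns found in an iterable of text lines.

    Assumed (hypothetical) format: one fnmatch-style pattern per line;
    blank lines and lines starting with '#' are ignored.
    """
    patterns = []
    for line in lines:
        line = line.strip()
        if line and not line.startswith("#"):
            patterns.append(line)
    return patterns


def load_patterns(path):
    """Read patterns from the text file at path."""
    with open(path) as f:
        return parse_patterns(f)


def matches_any(name, patterns):
    """True if name matches at least one of the wildcard patterns."""
    return any(fnmatch.fnmatch(name, pattern) for pattern in patterns)
```

You would then build the skip lists with something like 
load_patterns("skip_files.txt") (a hypothetical file name) and pass them 
to the walk loop, so the rules can change without touching the code.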

For more flexibility in the patterns you could use a regular expression 
match instead of fnmatch(). You also might want to match on full paths 
rather than just the file or dir name.
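
Here is one way that might look, matching regular expressions against 
the full path instead of the bare name. This is only a sketch: the two 
example patterns and the names to_posix and walk_filtered are my own 
assumptions, not part of any library.

```python
import os
import re

# Hypothetical example rules; replace them with your own.
SKIP_DIR_RE = re.compile(r"(^|/)(\.git|build)$")   # any dir named .git or build
SKIP_FILE_RE = re.compile(r"\.(pyc|bak)$")         # files ending in .pyc or .bak


def to_posix(path):
    """Use forward slashes so one set of patterns works on every platform."""
    return path.replace(os.sep, "/")


def walk_filtered(base_dir, skip_dir_re, skip_file_re):
    """Yield full paths of files under base_dir that pass both filters."""
    for dirpath, dirnames, filenames in os.walk(base_dir):
        # Prune in place so os.walk() never descends into skipped dirs.
        dirnames[:] = [
            d for d in dirnames
            if not skip_dir_re.search(to_posix(os.path.join(dirpath, d)))
        ]
        for name in filenames:
            full = os.path.join(dirpath, name)
            if not skip_file_re.search(to_posix(full)):
                yield full
```

Because the patterns see the whole path, a rule can skip "build" only 
under a particular parent directory, which a name-only test can't do.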

HTH,
Kent


