os.walk()

Mike Meyer mwm at mired.org
Thu Feb 17 20:12:43 EST 2005


rbt <rbt at athop1.ath.vt.edu> writes:

> Could someone demonstrate the correct/proper way to use os.walk() to
> skip certain files and folders while walking a specified path? I've
> read the module docs and googled to no avail and posted here about
> other os.walk issues, but I think I need to back up to the basics or
> find another tool as this isn't going anywhere fast... I've tried this:
>
> for root, dirs, files in os.walk(path, topdown=True):
>
>      file_skip_list = ['file1', 'file2']
>      dir_skip_list = ['dir1', 'dir2']
>
>      for f in files:
>          if f in file_skip_list
>              files.remove(f)
>
>      for d in dirs:
>          if d in dir_skip_list:
>              dirs.remove(d)
>
>      NOW, ANALYZE THE FILES
>
> And This:
>
>      files = [f for f in files if f not in file_skip_list]
>      dirs = [d for d in dirs if dir not in dir_skip_list]
>
>      NOW, ANAYLZE THE FILES
>
> The problem I run into is that some of the files and dirs are not
> removed while others are. I can be more specific and give exact
> examples if needed. On WinXP, 'pagefile.sys' is always removed, while
> 'UsrClass.dat' is *never* removed, etc.

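As others have pointed out, the problem you are running into is that
you are modifying the list while looping over it, which makes the loop
skip the item after each one you remove. Your second attempt also tests
dir (the builtin) rather than d, so it never filters any directories.
You can fix the first problem by building filtered copies of the lists.
No one has presented the list comprehension (LC) version yet: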

         import os

         file_skip_list = ('file1', 'file2')  #*
         dir_skip_list = ('dir1', 'dir2')

         for rl, dl, fl in os.walk(path, topdown=True):
             # Filter into new lists rather than removing items mid-loop.
             files = [f for f in fl if f not in file_skip_list]
             dirs = [d for d in dl if d not in dir_skip_list]

             # Analyze files and dirs
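
To see why removing items inside the loop leaves some of them behind,
here is a small sketch that needs no file system at all. The function
names remove_in_loop and filter_copy are made up for this illustration,
and it is written for a current Python 3 rather than the 2.x of this
thread:

```python
def remove_in_loop(names, skip):
    """Buggy: mutates the list it is iterating over."""
    names = list(names)          # work on a private copy of the input
    for n in names:
        if n in skip:
            names.remove(n)      # later items shift left; the next one is skipped
    return names


def filter_copy(names, skip):
    """Safe: builds a new list and never mutates the one being iterated."""
    return [n for n in names if n not in skip]


skip = ('file1', 'file2')
names = ['file1', 'file2', 'keep.txt']

print(remove_in_loop(names, skip))   # ['file2', 'keep.txt'] (file2 survives)
print(filter_copy(names, skip))      # ['keep.txt']
```

Removing 'file1' moves 'file2' into the slot the loop has already
visited, so it is never tested. That would explain why some names are
always removed and others never are.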
         
If you're using 2.4, you might consider using generator expressions
instead of LCs to avoid creating the second copy of the list. Keep in
mind that a generator expression can only be iterated over once:

     files = (f for f in fl if f not in file_skip_list)
     dirs = (d for d in dl if d not in dir_skip_list)
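
One caveat applies to both variants: binding dirs to a new list or
generator filters what you analyze, but it does not stop os.walk() from
descending into the skipped directories. The os.walk() documentation
says that with topdown=True the directory list may be modified in place
to prune the walk. A minimal sketch of that, assuming a current
Python 3; the helper name walk_skipping and the temporary tree are
invented for the demonstration:

```python
import os
import tempfile


def walk_skipping(top, dir_skip, file_skip):
    """Yield (root, files) pairs, never entering directories named in dir_skip."""
    for root, dirs, files in os.walk(top, topdown=True):
        # Slice assignment changes the list object os.walk() holds,
        # so the skipped directories are not visited at all.
        dirs[:] = [d for d in dirs if d not in dir_skip]
        yield root, [f for f in files if f not in file_skip]


# Hypothetical throwaway tree, built only for this demonstration.
top = tempfile.mkdtemp()
os.makedirs(os.path.join(top, 'dir1', 'nested'))
os.makedirs(os.path.join(top, 'keep'))
for rel in ('file1', 'a.txt',
            os.path.join('dir1', 'b.txt'),
            os.path.join('keep', 'c.txt')):
    open(os.path.join(top, rel), 'w').close()

seen = sorted(
    os.path.relpath(os.path.join(root, f), top)
    for root, files in walk_skipping(top, ('dir1', 'dir2'), ('file1', 'file2'))
    for f in files
)
print(seen)   # only a.txt and keep/c.txt remain
```

Because dirs[:] rebuilds the list from a comprehension instead of
calling remove() inside a loop over it, this avoids the skipped-item
problem as well.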

     <mike

*) I changed the short lists to tuples, because I use tuples when
I'm not going to modify the sequence.
-- 
Mike Meyer <mwm at mired.org>			http://www.mired.org/home/mwm/
Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.


