Is there an alternative to os.walk?

Bruce epost2 at gmail.com
Sat Oct 7 13:34:10 EDT 2006


waylan wrote:
> Bruce wrote:
> > Hi all,
> > I have a question about traversing file systems, and could use some
> > help. Because of directories with many files in them, os.walk appears
> > to be rather slow. I'm thinking there is a potential for speed-up since
> > I don't need os.walk to report filenames of all the files in every
> > directory it visits. Is there some clever way to use os.walk or another
> > tool that would provide functionality like os.walk except for the
> > listing of the filenames?
>
> You might want to check out the path module [1] (not os.path). The
> following is from the docs:
>
> > The method path.walk() returns an iterator which steps recursively
> > through a whole directory tree. path.walkdirs() and path.walkfiles()
> > are the same, but they yield only the directories and only the files,
> > respectively.
>
> Oh, and you can thank Paul Bissex for pointing me to path [2].
>

> [1]: http://www.jorendorff.com/articles/python/path/
> [2]: http://e-scribe.com/news/289

A little late, but thanks for the replies; they were very useful. Here's
what I do in this case:

import os

def search(a_dir):
   # Collect every directory under a_dir that passes dirtest.
   valid_dirs = []
   for dirpath, dirnames, filenames in os.walk(a_dir):
       # dirtest takes only the directory path.
       if dirtest(dirpath):
           valid_dirs.append(dirpath)
   return valid_dirs

def dirtest(a_dir):
   # A directory is valid only if it contains all of the test files.
   testfiles = ['a', 'b', 'c']
   for f in testfiles:
       if not os.path.exists(os.path.join(a_dir, f)):
           return False
   return True

I think you're right - it's not os.walk that makes this slow, it's the
dirtest method, which takes much more time when there are many files
in a directory. Also, thanks for pointing me to the path module; it was
interesting.
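
For what it's worth, here is a minimal sketch of one way the extra
os.path.exists calls might be avoided: os.walk has already read each
directory's listing, so the test names can be checked against the
filenames it returns. The names search_listed and REQUIRED are made up
for illustration, and this assumes 'a', 'b' and 'c' are regular files
(entries that are directories show up in dirnames instead, so the result
can differ from the os.path.exists version). I have not timed it.

```python
import os

# Hypothetical constant: the same test names used in dirtest above.
REQUIRED = frozenset(['a', 'b', 'c'])

def search_listed(a_dir, required=REQUIRED):
    """Return directories under a_dir whose file listing has every required name."""
    valid_dirs = []
    for dirpath, dirnames, filenames in os.walk(a_dir):
        # filenames was already read by os.walk, so this check is done
        # in memory and makes no further file system calls.
        if required.issubset(filenames):
            valid_dirs.append(dirpath)
    return valid_dirs
```

Calling search_listed(a_dir) should give the same list as search(a_dir)
whenever the test names are plain files.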



