dirwalk.py generator version of os.path.walk
Jim Dennis
jimd at vega.starshine.org
Mon Feb 25 05:21:02 EST 2002
This function could probably use a bit of polishing,
and it certainly could use some enhancement (options to
control whether and how we follow symlinks, how to handle
exceptions on listdir(), whether to be depth first, and an
option to avoid crossing mount boundaries with os.path.ismount(),
etc).
However, it seems to work.
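A rough sketch of how those options might look, written for a current Python 3 interpreter rather than 2.2. The name dirwalk_opts and all of its keyword arguments (followlinks, depth_first, onerror, cross_mounts) are made up for illustration and are not part of the posted function; treat it as one possible design, not a finished one.

```python
import os
from collections import deque


def dirwalk_opts(startdir=".", followlinks=False, depth_first=False,
                 onerror=None, cross_mounts=True):
    """Yield every path below startdir (hypothetical variant of dirwalk)."""
    if not os.path.isdir(startdir):
        raise ValueError("not a directory: %r" % (startdir,))
    pending = deque([startdir])
    while pending:
        # pop() expands the most recently found directory first (depth
        # first); popleft() expands the oldest one first (breadth first).
        cwd = pending.pop() if depth_first else pending.popleft()
        try:
            names = os.listdir(cwd)
        except OSError as err:
            if onerror is not None:
                onerror(err)  # let the caller log or re-raise
            continue
        for name in names:
            path = os.path.join(cwd, name)
            descend = os.path.isdir(path)
            # Caution: with followlinks=True a symlink cycle makes this
            # loop forever; this sketch does not detect cycles.
            if descend and not followlinks and os.path.islink(path):
                descend = False
            if descend and not cross_mounts and os.path.ismount(path):
                descend = False
            if descend:
                pending.append(path)
            yield path
```

With the defaults this behaves like the posted dirwalk(): breadth first, symlinks reported but not followed, unreadable directories skipped silently. Passing something like onerror=errors.append (with errors a plain list) collects the OSError instances instead of discarding them.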
dirwalk() simply takes an optional top level directory/path name
as an argument and instantiates a generator which will walk down
that tree and return every filename that it can access.
It's late and I need sleep. So I'm just going to post this in
its rough (and probably buggy) form and let y'all thrash on it
a bit.
I guess there's some sort of statcache module that might let me
cache the stat() tuples. I guess I'm implicitly incurring a stat()
system call for each node by checking islink() and isdir() on it
so it seems like I ought to cache that and make it available to
my caller (without forcing them to make an additional stat system
call).
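A minimal sketch of the caching idea, assuming a current Python 3 interpreter and using a single os.lstat() per node instead of the statcache module. The name dirwalk_stat is hypothetical, and yielding (path, stat result) pairs is just one way to hand the information to the caller.

```python
import os
import stat


def dirwalk_stat(startdir="."):
    """Yield (path, lstat_result) pairs; hypothetical variant of dirwalk."""
    if not os.path.isdir(startdir):
        raise ValueError("not a directory: %r" % (startdir,))
    queue = [startdir]
    while queue:
        cwd = queue.pop(0)
        try:
            names = os.listdir(cwd)
        except OSError:
            continue  # skip directories we cannot read
        for name in names:
            path = os.path.join(cwd, name)
            try:
                st = os.lstat(path)  # the only stat call made for this node
            except OSError:
                continue  # entry vanished or is unreadable; skip it
            # lstat() does not follow symlinks, so a link to a directory
            # is not S_ISDIR here and is never descended into.
            if stat.S_ISDIR(st.st_mode):
                queue.append(path)
            yield path, st
```

A caller can then read st.st_size, st.st_mtime and so on from the second element without issuing another stat() on the path. Unlike the posted function, this variant drops an entry if the lstat() on it fails.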
I hope that something like this (a simple dirwalk() or other
greatly simplified alternative to os.path.walk()) makes it into
Python 2.3 or later.
#!/usr/bin/env python2.2
from __future__ import generators
import os

def dirwalk(startdir=None):
    if not startdir:
        startdir="."
    if not os.path.isdir(startdir):
        raise ValueError ## Is this the right exception?
    stack = [startdir]
    while stack:
        # pop(0) takes the oldest entry, so the walk is breadth first
        cwd = stack.pop(0)
        try:
            current = os.listdir(cwd)
        except (OSError):
            continue # Skip it if we don't have access
        for each in current:
            each = os.path.join(cwd,each)
            if os.path.islink(each):
                pass # report symlinks, but never descend into them
            elif os.path.isdir(each):
                stack.append(each)
            yield(each)

if __name__ == "__main__":
    # import unittest?
    # test suite should consist of:
    # dirwalk() vs. os.listdir()
    # dirwalk("/") vs. os.path.walk()
    # dirwalk("/etc/passwd") (should raise exception)
    import sys
    for i in sys.argv[1:]:
        for j in dirwalk(i):
            print j
    # should compare this to os.popen("find ....") and/or
    # to os.path.walk(...)
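
The test suite outlined in the comments above could be sketched roughly as follows. This assumes a current Python 3 interpreter, so dirwalk() is re-stated in Python 3 form to keep the example self-contained, and os.walk (added in Python 2.3) stands in for os.path.walk, which Python 3 no longer has. The helper walk_paths and the class name DirwalkTest are made up for illustration, and a temporary directory replaces "/" and "/etc/passwd" so the tests do not depend on the host system.

```python
import os
import tempfile
import unittest


def dirwalk(startdir="."):
    """Python 3 restatement of the posted generator."""
    if not os.path.isdir(startdir):
        raise ValueError("not a directory: %r" % (startdir,))
    stack = [startdir]
    while stack:
        cwd = stack.pop(0)
        try:
            current = os.listdir(cwd)
        except OSError:
            continue
        for each in current:
            each = os.path.join(cwd, each)
            if not os.path.islink(each) and os.path.isdir(each):
                stack.append(each)
            yield each


def walk_paths(top):
    """Collect every path below top using os.walk (hypothetical helper)."""
    found = []
    for dirpath, dirnames, filenames in os.walk(top):
        for name in dirnames + filenames:
            found.append(os.path.join(dirpath, name))
    return found


class DirwalkTest(unittest.TestCase):
    def setUp(self):
        # Build a small tree: top/x.txt, top/a/y.txt, top/a/b/z.txt
        self.tmp = tempfile.TemporaryDirectory()
        self.top = self.tmp.name
        os.makedirs(os.path.join(self.top, "a", "b"))
        for rel in ("x.txt", os.path.join("a", "y.txt"),
                    os.path.join("a", "b", "z.txt")):
            open(os.path.join(self.top, rel), "w").close()

    def tearDown(self):
        self.tmp.cleanup()

    def test_matches_os_walk(self):
        self.assertEqual(sorted(dirwalk(self.top)),
                         sorted(walk_paths(self.top)))

    def test_top_level_matches_listdir(self):
        top_level = [os.path.basename(p) for p in dirwalk(self.top)
                     if os.path.dirname(p) == self.top]
        self.assertEqual(sorted(top_level), sorted(os.listdir(self.top)))

    def test_non_directory_raises(self):
        # The generator body only runs once iteration starts, so the
        # ValueError surfaces when list() pulls the first item.
        path = os.path.join(self.top, "x.txt")
        self.assertRaises(ValueError, list, dirwalk(path))
```

Saved as a module, this can be run with python -m unittest. Results are compared after sorting because dirwalk() is breadth first while os.walk is top-down depth first, so only the set of paths is expected to agree, not the order.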