dirwalk.py generator version of os.path.walk

Jim Dennis jimd at vega.starshine.org
Mon Feb 25 05:21:02 EST 2002


 This function could probably use a bit of polishing,
 and it certainly could use some enhancement (options to
 control whether and how we follow symlinks, how to handle
 exceptions on listdir(), whether to be depth first, an
 option to avoid crossing mount boundaries with os.path.ismount(),
 etc).
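
 A minimal sketch of how those options might look, written for a
 current Python 3 rather than 2.2.  The name dirwalk_opts and the
 parameters followlinks, onerror, depthfirst and xdev are all
 hypothetical, and followlinks=True does nothing to guard against
 symlink loops.

```python
import os
from collections import deque


def dirwalk_opts(startdir=".", followlinks=False, onerror=None,
                 depthfirst=False, xdev=False):
    """Yield every path below startdir (all option names are hypothetical)."""
    pending = deque([startdir])
    while pending:
        # pop() takes the most recently found directory (depth first);
        # popleft() takes the oldest one (breadth first).
        cwd = pending.pop() if depthfirst else pending.popleft()
        try:
            names = os.listdir(cwd)
        except OSError as err:
            if onerror is not None:
                onerror(err)      # let the caller decide what to do
            continue
        for name in names:
            path = os.path.join(cwd, name)
            if os.path.isdir(path):
                is_link = os.path.islink(path)
                is_mount = os.path.ismount(path)
                # Queue the directory unless an option rules it out.
                if (followlinks or not is_link) and not (xdev and is_mount):
                    pending.append(path)
            yield path
```

 With the defaults this should behave like the plain dirwalk()
 further down: breadth first, symlinks reported but not followed,
 unreadable directories silently skipped.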

 However, it seems to work.  

 dirwalk() simply takes an optional top level directory/path name
 as an argument and instantiates a generator which will walk down
 that tree and yield every path (files and directories alike) that
 it can reach.  

 It's late and I need sleep.  So I'm just going to post this in
 its rough (and probably buggy) form and let y'all thrash on it
 a bit.

 I guess there's some sort of statcache module that might let me
 cache the stat() tuples.  I'm implicitly incurring stat-type
 system calls for each node by checking islink() (an lstat()) and
 isdir() (a stat()) on it, so it seems like I ought to cache the
 result and make it available to my caller (without forcing them
 to make an additional stat system call).
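
 One way that might look, as a rough sketch in current Python 3:
 call os.lstat() once per entry, use the result to decide whether
 to descend, and hand the same result to the caller.  The name
 dirwalk_stat and the (path, stat result) pair it yields are my
 own assumptions, not an existing interface.

```python
import os
import stat


def dirwalk_stat(startdir="."):
    """Yield (path, lstat_result) pairs, calling lstat() once per entry."""
    queue = [startdir]
    while queue:
        cwd = queue.pop(0)
        try:
            names = os.listdir(cwd)
        except OSError:
            continue              # unreadable directory: skip it
        for name in names:
            path = os.path.join(cwd, name)
            try:
                st = os.lstat(path)   # lstat() does not follow symlinks
            except OSError:
                st = None             # entry vanished or is inaccessible
            # S_ISDIR on an lstat() result is false for symlinks, so
            # links to directories are reported but not descended into.
            if st is not None and stat.S_ISDIR(st.st_mode):
                queue.append(path)
            yield path, st
```

 A caller can then read st.st_size, st.st_mtime and so on from the
 second item (after checking it is not None) without another
 system call.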

 I hope that something like this (a simple dirwalk() or other 
 greatly simplified alternative to os.path.walk()) makes it into 
 Python 2.3 or later.

#!/usr/bin/env python2.2
from __future__ import generators 
import os

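def dirwalk(startdir=None):
	"""Yield every path below startdir; symlinked directories are not followed."""
	if not startdir:
		startdir = "."
	if not os.path.isdir(startdir):
		# Raised on the first next() call, since this is a generator.
		raise ValueError("not a directory: %s" % startdir) ## Is this the right exception?
	stack = [startdir]
	while stack:
		cwd = stack.pop(0)	# pop(0) takes the oldest entry, so the walk is breadth first
		try:
			current = os.listdir(cwd)
		except OSError:
			continue	# Skip it if we don't have access
		for each in current:
			each = os.path.join(cwd, each)
			# Descend into real directories only; symlinks are yielded but not followed
			if os.path.isdir(each) and not os.path.islink(each):
				stack.append(each)
			yield each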

if __name__ == "__main__":
	# import unittest?
	# test suite should consist of:
	# 	dirwalk() vs. os.listdir()
	# 	dirwalk("/") vs. os.path.walk()
	# 	dirwalk("/etc/passwd") (should raise exception)
	import sys
	for i in sys.argv[1:]:
		for j in dirwalk(i):
			print j
	# should compare this to os.popen("find ....") and
	# or to os.path.walk(...)
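
 The test suite outlined in the comments above could start out
 something like this.  It is only a sketch for a current Python 3:
 dirwalk() is restated in Python 3 syntax so the block runs by
 itself, os.walk() stands in for os.path.walk() (which Python 3 no
 longer has), a temporary file stands in for /etc/passwd, and the
 class and test names are made up.

```python
import os
import tempfile
import unittest


def dirwalk(startdir=None):
    # The function from this post, restated in Python 3 syntax.
    if not startdir:
        startdir = "."
    if not os.path.isdir(startdir):
        raise ValueError("not a directory: %s" % startdir)
    stack = [startdir]
    while stack:
        cwd = stack.pop(0)
        try:
            current = os.listdir(cwd)
        except OSError:
            continue
        for each in current:
            each = os.path.join(cwd, each)
            if os.path.isdir(each) and not os.path.islink(each):
                stack.append(each)
            yield each


class DirwalkTest(unittest.TestCase):
    def setUp(self):
        # Build a small tree: a.txt, sub/b.txt, sub/deeper/c.txt
        self.tmp = tempfile.TemporaryDirectory()
        self.root = self.tmp.name
        os.makedirs(os.path.join(self.root, "sub", "deeper"))
        for rel in ("a.txt",
                    os.path.join("sub", "b.txt"),
                    os.path.join("sub", "deeper", "c.txt")):
            open(os.path.join(self.root, rel), "w").close()

    def tearDown(self):
        self.tmp.cleanup()

    def test_flat_directory_matches_listdir(self):
        flat = os.path.join(self.root, "sub", "deeper")  # no subdirectories
        expected = [os.path.join(flat, n) for n in os.listdir(flat)]
        self.assertEqual(sorted(dirwalk(flat)), sorted(expected))

    def test_tree_matches_os_walk(self):
        expected = set()
        for dirpath, dirnames, filenames in os.walk(self.root):
            for name in dirnames + filenames:
                expected.add(os.path.join(dirpath, name))
        self.assertEqual(set(dirwalk(self.root)), expected)

    def test_non_directory_raises(self):
        # A generator body runs only once iteration starts, so the
        # ValueError appears on the first next(), not on the call.
        with self.assertRaises(ValueError):
            next(dirwalk(os.path.join(self.root, "a.txt")))
```

 Saved to a file, this can be run with "python -m unittest" followed
 by the file name.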



