Hack with os.walk()

Michael Spencer mahs at telcopartners.com
Sat Feb 12 16:05:47 EST 2005


Frans Englich wrote:
> Hello,
> 
> Have a look at this recursive function:
> 
> def walkDirectory( directory, element ):
> 
>     element = element.newChild( None, "directory", None ) # automatically 
> appends to parent
>     element.setProp( "name", os.path.basename(directory))
> 
>     for root, dirs, files in os.walk( directory ):
> 
>         for fileName in files:
>             element.addChild( parseFile( os.path.join( root, fileName ))
> 
>         for dirName in filter( acceptDirectory, dirs):
>             walkDirectory( os.path.join( root, dirName ), element )
> 
>         return ### Note, this is inside for loop
> 
> What it does, is it recurses through all directories, and, with libxml2's 
> bindings, builds an XML document which maps directly to the file hierarchy. 
> For every file is parseFile() called, which returns a "file element" which is 
> appended; the resulting structure looks the way one expect -- like a GUI tree 
> view.
> 
> The current code works, but I find it hackish, and it probably is inefficient, 
> considering that os.walk() is not called once(as it usually is), but for 
> every directory level.
> 
> My problem, AFAICT, with using os.walk() the usual way, is that in order to 
> construct the /hierarchial/ XML document, I need to be aware of the directory 
> depth, and a recursive function handles that nicely; os.walk() simply 
> concentrates on figuring out paths to all files in a directory, AFAICT.
> 
> I guess I could solve it with using os.walk() in a traditional way, by somehow 
> pushing libxml2 nodes on a stack, after keeping track of the directory levels 
> etc(string parsing..). Or, one could write ones own recursive directory 
> parser..
> 
> My question is: what is the least ugly? What is the /proper/ solution for my 
> problem? How would you write it in the cleanest way?
> 
> 
> Cheers,
> 
> 		Frans
> 
The path module by Jorendorff: http://www.jorendorff.com/articles/python/path/

wraps various os functions into an interface that can make this sort of thing 
cleaner

Michael




More information about the Python-list mailing list