os.path.walk not pruning descent tree (and I'm not happy with that behavior?)

Peter Otten __peter__ at web.de
Mon May 28 01:16:41 EDT 2007


Joe Ardent wrote:

> Good day, everybody!  From what I can tell from the archives, this is
> everyone's favorite method from the standard lib, and everyone loves
> answering questions about it.  Right? :)

I don't know what to make of the smiley, so I'll be explicit: use os.walk()
instead of os.path.walk().

> Anyway, my question regards the way that the visit callback modifies
> the names list.  Basically, my simple example is:
> 
> ##############################
> def listUndottedDirs( d ):
>     dots = re.compile( '\.' )
> 
>     def visit( arg, dirname, names ):
>         for f in names:
>             if dots.match( f ):
>                 i = names.index( f )
>                 del names[i]
>             else:
>                 print "%s: %s" % ( dirname, f )
> 
>     os.path.walk( d, visit, None )
> ###############################
> 
> Basically, I don't want to visit any hidden subdirs (this is a unix
> system), nor am I interested in dot-files.  If I call the function
> like, "listUndottedDirs( '/usr/home/ardent' )", however, EVEN THOUGH
> IT IS REMOVING DOTTED DIRS AND FILES FROM names, it will recurse into
> the dotted directories; eg, if I have ".kde3/" in that directory, it
> will begin listing the contents of /usr/home/ardent/.kde3/ .  Here's
> what the documentation says about this method:
> 
> "The visit function may modify names to influence the set of
> directories visited below dirname, e.g. to avoid visiting certain
> parts of the tree. (The object referred to by names must be modified
> in place, using del or slice assignment.)"
> 
> So...  What am I missing?  Any help would be greatly appreciated.

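Your problem is that you are deleting items from a list while iterating over
it. Each deletion shifts the remaining items one position to the left, so the
loop skips whatever followed the deleted entry: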

# WRONG
>>> names = [".alpha", ".beta", "gamma"]
>>> for name in names:
...     if name.startswith("."):
...         del names[names.index(name)]
...
>>> names
['.beta', 'gamma']
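
To make the skipping visible, here is a minimal sketch that records which
items the loop actually looks at. The helper name visited_while_deleting is
made up for this illustration; it is not part of any library.

```python
def visited_while_deleting(names):
    """Delete dotted names during iteration; return the items the loop saw."""
    seen = []
    for name in names:
        seen.append(name)
        if name.startswith("."):
            # Removing the current item shifts the rest one slot left,
            # so the loop's internal index now points past the next item.
            names.remove(name)
    return seen


names = [".alpha", ".beta", "gamma"]
seen = visited_while_deleting(names)
# seen is ['.alpha', 'gamma']: '.beta' was never examined, so it stays in names.
```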

Here's one way to avoid that mess:

>>> names = [".alpha", ".beta", "gamma"]
>>> names[:] = [name for name in names if not name.startswith(".")]
>>> names
['gamma']

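The slice [:] on the left side is necessary to change the list in place. A
plain "names = [...]" would only rebind the local name to a new list, and
os.path.walk() would keep using the original, unmodified one.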
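
For completeness, the same pruning with os.walk() might look roughly like
this. It is only a sketch: the function name list_undotted and the choice to
return (dirpath, name) pairs instead of printing them are my assumptions, not
something from your code.

```python
import os


def list_undotted(top):
    """Return (dirpath, name) pairs under top, skipping dotted entries."""
    found = []
    for dirpath, dirnames, filenames in os.walk(top):
        # Slice assignment prunes the list in place, so os.walk() never
        # descends into the dotted directories (requires topdown=True,
        # which is the default).
        dirnames[:] = [d for d in dirnames if not d.startswith(".")]
        for name in dirnames + filenames:
            if not name.startswith("."):
                found.append((dirpath, name))
    return found
```

Calling list_undotted('/usr/home/ardent') should then leave out .kde3/ and
everything below it, as well as any dot-files in the directories that are
visited.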

Peter



