os.walk walks too much

Edward C. Jones edcjones at erols.com
Wed Feb 25 09:53:58 EST 2004


Marcello Pietrobon wrote:
> Hello,
> I am using Python 2.3
> I desire to walk a directory without recursion

I am not sure what this means. Do you want to iterate over the 
non-directory files in directory top? For this job I would use:

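import os

def walk_files(top):
     # Yield the names of the regular files directly inside top.
     for name in os.listdir(top):
         # isfile() needs the full path; a bare name would be resolved
         # against the current working directory, not against top.
         if os.path.isfile(os.path.join(top, name)):
             yield name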
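
Another way to get the top-level files, sketched here on the assumption 
of a current Python 3 interpreter: os.walk() is a generator, so taking 
only the first (root, dirs, files) tuple it yields lists the top 
directory without descending any further. The name top_level_files is 
made up for this illustration.

```python
import os

def top_level_files(top):
    # Hypothetical helper: take only the first tuple os.walk() yields.
    # The default covers the case where os.walk() yields nothing at all
    # (for example, when top cannot be listed).
    root, dirs, files = next(os.walk(top), (top, [], []))
    return [os.path.join(root, name) for name in files]
```

Unlike walk_files above, this returns a list of full paths rather than 
a generator of bare names.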

> this only partly works:
> def walk_files() :
>    for root, dirs, files in os.walk(top, topdown=True):
>        for filename in files:
>            print( "file:" + os.path.join(root, filename) )
>        for dirname in dirs:
>             dirs.remove( dirname )
> because it skips all the subdirectories but one.

Replace
         for dirname in dirs:
             dirs.remove( dirname )
with
         for i in range(len(dirs)-1, -1, -1):
             del dirs[i]
to make it work. Run

seq = [0,1,2,3,4,5]
for x in seq:
     seq.remove(x)
print(seq)    # prints [1, 3, 5], not []

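to see the problem. Each remove() shifts the later elements one position 
to the left while the loop's internal index still advances, so the 
element that slides into the vacated slot is never visited. If you are 
iterating through a list and selectively removing members, iterate in 
reverse. Never change the positions of elements that the iterator has 
not yet reached.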
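
The difference can be checked side by side. The sketch below assumes a 
current Python 3 interpreter, and the three function names are made up 
for illustration. Iterating over a copy of the list is a further 
alternative that is not part of the advice above.

```python
def remove_forward(seq):
    # Buggy pattern: mutate the list while a for loop walks it.
    for x in seq:
        seq.remove(x)
    return seq

def remove_over_copy(seq):
    # Iterate over a snapshot (seq[:]) so removals cannot shift
    # elements the loop has yet to visit.
    for x in seq[:]:
        seq.remove(x)
    return seq

def remove_in_reverse(seq):
    # Delete by index from the end; nothing unvisited ever shifts.
    for i in range(len(seq) - 1, -1, -1):
        del seq[i]
    return seq

print(remove_forward([0, 1, 2, 3, 4, 5]))     # prints [1, 3, 5]
print(remove_over_copy([0, 1, 2, 3, 4, 5]))   # prints []
print(remove_in_reverse([0, 1, 2, 3, 4, 5]))  # prints []
```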

> this *does not* work at all
> def walk_files() :
>    for root, dirs, files in os.walk(top, topdown=True):
>        for filename in files:
>            print( "file:" + os.path.join(root, filename) )
>        dirs = []

There is a subtle point in the documentation.

"When topdown is true, the caller can modify the dirnames list in-place 
(perhaps using del or slice assignment), and walk() will only recurse 
into the subdirectories whose names remain in dirnames; ..."

The key word is "in-place". "dirs = []" does not change the list in-place.
It rebinds the local name "dirs" to a new empty list, while walk() still 
holds the original list and recurses into every name left in it. Either 
use "del"
         for i in range(len(dirs)-1, -1, -1):
             del dirs[i]
as I did above or use "slice assignment"
         dirs[:] = []
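
To see rebinding and in-place modification behave differently, here is 
a small sketch that builds a throwaway directory tree and walks it both 
ways. It assumes a current Python 3 interpreter; the function names and 
the file names a.txt, sub and b.txt are invented for the demonstration.

```python
import os
import tempfile

def files_rebinding(top):
    # Rebinding the name: os.walk() never sees the new empty list.
    found = []
    for root, dirs, files in os.walk(top, topdown=True):
        found.extend(os.path.join(root, f) for f in files)
        dirs = []
    return found

def files_in_place(top):
    # Slice assignment empties the very list os.walk() will consult.
    found = []
    for root, dirs, files in os.walk(top, topdown=True):
        found.extend(os.path.join(root, f) for f in files)
        dirs[:] = []
    return found

# Throwaway tree: top/a.txt and top/sub/b.txt
with tempfile.TemporaryDirectory() as top:
    open(os.path.join(top, "a.txt"), "w").close()
    os.mkdir(os.path.join(top, "sub"))
    open(os.path.join(top, "sub", "b.txt"), "w").close()
    n_rebound = len(files_rebinding(top))
    n_pruned = len(files_in_place(top))

print(n_rebound)  # prints 2: sub/ was still visited
print(n_pruned)   # prints 1: only a.txt
```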


