walk directory & ignore all files/directories begin with '.'

albert kao albertkao3 at gmail.com
Thu May 13 16:10:30 EDT 2010


On May 13, 3:10 pm, MRAB <pyt... at mrabarnett.plus.com> wrote:
> albert kao wrote:
> > I want to walk a directory and ignore all the files or directories
> > whose names begin with '.' (e.g. '.svn').
> > Then I will process all the files.
> > My test program walknodot.py does not do the job yet.
> > Python version is 3.1 on Windows XP.
> > Please help.
>
> > [code]
> > #!c:/Python31/python.exe -u
> > import os
> > import re
>
> > path = "C:\\test\\com.comp.hw.prod.proj.war\\bin"
> > for dirpath, dirs, files in os.walk(path):
> >     print ("dirpath " + dirpath)
> >     p = re.compile('\\\.(\w)+$')
> >     if p.match(dirpath):
> >         continue
> >     print ("dirpath " + dirpath)
> >     for dir in dirs:
> >         print ("dir " + dir)
> >         if dir.startswith('.'):
> >             continue
>
> >         print (files)
> >         for filename in files:
> >             print ("filename " + filename)
> >             if filename.startswith('.'):
> >                 continue
> >             print ("dirpath filename " + dirpath + "\\" + filename)
> >                # process the files here
> > [/code]
>
> > C:\python>walknodot.py
> > dirpath C:\test\com.comp.hw.prod.proj.war\bin
> > dirpath C:\test\com.comp.hw.prod.proj.war\bin
> > dir .svn
> > dir com
> > []
> > dirpath C:\test\com.comp.hw.prod.proj.war\bin\.svn
> > dirpath C:\test\com.comp.hw.prod.proj.war\bin\.svn
> > ...
>
> > I do not expect C:\test\com.comp.hw.prod.proj.war\bin\.svn to appear
> > twice.
> > Please help.
>
> The problem is with your use of the 'match' method, which will look for
> a match only at the start of the string. You need to use the 'search'
> method instead.
>
> The regular expression is also incorrect. The string literal:
>
>      '\\\.(\w)+$'
>
> passes the characters:
>
>      \\.(\w)+$
>
> to the re module as the regular expression, which will match a
> backslash, then any character, then one or more word characters, then
> the end of the string.
> What you want is:
>
>      \\\.\w+$
>
> (you don't need the parentheses) which is best expressed as the 'raw'
> string literal:
>
>      r'\\\.\w+$'
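
The points above can be checked quickly in an interpreter. This is only a sketch: the sample paths are taken from the output in the original post, and the variable names (fixed, loose, hidden, plain) are made up here for illustration.

```python
import re

# Sample paths based on the output quoted in the original post.
hidden = "C:\\test\\com.comp.hw.prod.proj.war\\bin\\.svn"
plain = "C:\\test\\com.comp.hw.prod.proj.war\\bin\\com"

# The corrected pattern: backslash, literal dot, word characters, end of string.
fixed = re.compile(r'\\\.\w+$')

# The regex the original non-raw literal ends up passing to re, as explained
# above: backslash, ANY character, word characters, end of string.
loose = re.compile(r'\\.(\w)+$')

# match() only tries at the start of the string, so it finds nothing here.
print(fixed.match(hidden))            # None

# search() scans the whole string and finds the trailing component.
print(fixed.search(hidden).group())   # \.svn

# The corrected pattern leaves ordinary directories alone...
print(fixed.search(plain))            # None

# ...while the loose pattern would also flag a directory such as "com".
print(loose.search(plain).group())    # \com
```

So with search() the original regular expression would have skipped ordinary directories too, which is why both changes are needed.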

Following your advice, I also added a check for paths such as
C:\test\com.comp.hw.prod.proj.war\bin\.svn\tmp:
    p = re.compile(r'\\\.\w+$')
    if p.search(dirpath):
        continue
    p = re.compile(r'\\\.\w+\\')
    if p.search(dirpath):
        continue

Problem is solved.
Thanks.
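
A possible alternative that avoids regular expressions, offered as a sketch and not as what was used above: with the default topdown=True, os.walk lets the caller edit the dirs list in place, and it then never descends into the directories that were removed. That also covers nested paths such as .svn\tmp without a second check. The function name walk_visible is hypothetical.

```python
import os

def walk_visible(top):
    """Yield the paths of files under top, skipping dot-prefixed names."""
    for dirpath, dirs, files in os.walk(top):
        # Slice assignment changes the very list os.walk is holding, so the
        # dot-prefixed directories are pruned and never visited.
        dirs[:] = [d for d in dirs if not d.startswith('.')]
        for filename in files:
            if filename.startswith('.'):
                continue
            yield os.path.join(dirpath, filename)
```

It could be used like this, with path set as in the original program:

    for full_name in walk_visible(path):
        print("dirpath filename " + full_name)

Note that a plain "continue" inside "for dir in dirs" does not prune anything; only modifying dirs itself stops os.walk from entering a directory.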


