regex problem with re and fnmatch

John Machin sjmachin at lexicon.net
Tue Nov 20 16:40:55 EST 2007


On Nov 21, 8:05 am, Fabian Braennstroem <f.braennstr... at gmx.de> wrote:
> Hi,
>
> I would like to use re to search for lines in a file with
> the word "README_x.org", where x is any number.
> E.g. the structure would look like this:
> [[file:~/pfm_v99/README_1.org]]
>
> I tried to use this kind of matching:
> #        org_files='.*README\_1.org]]'
>         org_files='.*README\_*.org]]'
>         if re.match(org_files,line):

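First tip: drop the leading '.*' and use search() instead of match().
match() only matches at the start of the string, which is the sole
reason the '.*' was needed. Second tip: always use raw strings
(r'...') for your patterns, so that backslashes reach the re module
unchanged.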
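
A small sketch of both tips, using the sample line from your post (the
variable name is mine, and it is written for a current Python 3, so
print is a function):

```python
import re

line = '[[file:~/pfm_v99/README_1.org]]'

# match() is anchored at the start of the string, so this finds nothing ...
print(re.match(r'README', line))              # None

# ... whereas search() scans the whole string; no leading '.*' needed.
print(re.search(r'README', line).group(0))    # README

# A raw string hands the backslash to re unchanged.
print(r'\d' == '\\d')                         # True
print(len('\b'), len(r'\b'))                  # 1 2
```

The last line shows why raw strings matter: without the r prefix, '\b'
is a single backspace character rather than the two-character regex
escape.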

>
> Unfortunately, it matches all entries with "README.org", but
> not the wanted number!?

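\_* matches 0 or more occurrences of _ (the \ is redundant); unlike in
a shell glob, * is not a wildcard on its own. You need to specify one
or more digits: use \d+ or [0-9]+

The . in .org matches ANY character except a newline. You need to
escape it: \.

That is why README.org matched but README_1.org did not: after README_
the unescaped . consumed the 1, and the literal "org" then failed
against ".or".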

>>> import re
>>> pat = r'README_\d+\.org'
>>> re.search(pat, 'xxxxREADME.org')
>>> re.search(pat, 'xxxxREADME_.org')
>>> re.search(pat, 'xxxxREADME_1.org')
<_sre.SRE_Match object at 0x00B899C0>
>>> re.search(pat, 'xxxxREADME_9999.org')
<_sre.SRE_Match object at 0x00B899F8>
>>> re.search(pat, 'xxxxREADME_9999Zorg')
>>>
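
If you also want the file name itself rather than just a yes/no answer,
something along these lines should do. This is only a sketch: the
helper name readme_name is made up for illustration, and it assumes
Python 3.

```python
import re

# Same pattern as in the session above, compiled once for reuse.
pat = re.compile(r'README_\d+\.org')

def readme_name(line):
    """Return the README_<digits>.org name found in line, or None."""
    m = pat.search(line)
    return m.group(0) if m else None

print(readme_name('[[file:~/pfm_v99/README_1.org]]'))   # README_1.org
print(readme_name('[[file:~/pfm_v99/README.org]]'))     # None
```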

>
> After some splitting and replacing I am able to check, if
> the above file exists. If it does not, I start to search for
> it using the 'walk' procedure:

I presume that you mean something like: """.. check if the above file
exists in some directory. If it does not, I start to search for it
somewhere else ..."""

>
>                 for root, dirs, files in os.walk("/home/fab/org"):

>                     for name in dirs:
>                         dirs=os.path.join(root, name) + '/'

The above looks rather suspicious ...
for thing in container:
    container = something_else
????
What are you trying to do?
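
For what it's worth, here is what that idiom actually does, as a
stand-alone sketch with made-up names (Python 3 assumed). The loop
keeps iterating over the original list; only the name gets rebound.

```python
container = ['a', 'b', 'c']
seen = []

for thing in container:
    seen.append(thing)
    # Rebinds the name only; the loop still walks the original list.
    container = thing.upper()

print(seen)        # ['a', 'b', 'c']
print(container)   # C
```

So after the loop the name refers to whatever was assigned last, and if
the list was empty it still refers to the (empty) list.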


>                     for name in files:
>                          files=os.path.join(root, name)

and again ....

>                     if fnmatch.fnmatch(str(files), "README*"):

Why str(files)? os.path.join() already returns a string.

>                         print "File Found"
>                         print str(files)
>                         break


fnmatch is not as capable as re; in particular it can't express "one
or more digits". To search a directory tree for the first file whose
name matches a pattern, you need something like this:
import os
import re

def find_one(top, pat):
    anchored = re.compile('(?:%s)$' % pat)   # the whole name must match
    for root, dirs, files in os.walk(top):
        for fname in files:
            if anchored.match(fname):
                return os.path.join(root, fname)
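
To see the fnmatch limitation concretely, compare the closest glob I
can think of with the regular expression. The file names are invented
for the comparison, and Python 3 is assumed.

```python
import fnmatch
import re

names = ['README_1.org', 'README_12.org', 'README_1x.org', 'README_.org']

# Closest glob: exactly one digit, then '*', which matches anything at all.
glob_hits = [n for n in names if fnmatch.fnmatch(n, 'README_[0-9]*.org')]

# Regular expression: one or more digits and nothing else before '.org'.
re_hits = [n for n in names if re.match(r'README_\d+\.org$', n)]

print(glob_hits)   # ['README_1.org', 'README_12.org', 'README_1x.org']
print(re_hits)     # ['README_1.org', 'README_12.org']
```

The glob lets README_1x.org through because '*' is not restricted to
digits.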


> As soon as it finds the file,

"the" file or "a" file???

Ummm ... aren't you trying to locate a file whose EXACT name you found
in the first exercise??

import os

def find_it(top, required):
    for root, dirs, files in os.walk(top):
        if required in files:
            return os.path.join(root, required)
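
A quick way to try the exact-name search without touching your real
tree is to build a throwaway directory. This is a sketch only: it
assumes Python 3 (tempfile.TemporaryDirectory), repeats the function so
that it runs on its own, and the directory and file names are just
borrowed from your example.

```python
import os
import tempfile

def find_it(top, required):
    # Exact-name search; the first directory containing the file wins.
    for root, dirs, files in os.walk(top):
        if required in files:
            return os.path.join(root, required)
    return None

with tempfile.TemporaryDirectory() as top:
    sub = os.path.join(top, 'pfm_v99')
    os.mkdir(sub)
    # Create an empty file to search for.
    open(os.path.join(sub, 'README_1.org'), 'w').close()

    found = find_it(top, 'README_1.org')
    missing = find_it(top, 'README_2.org')

print(os.path.basename(found))   # README_1.org
print(missing)                   # None
```

Returning from inside the loop is what stops the walk as soon as the
file turns up.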


> it should stop the searching
> process; but there is the same matching problem like above.

HTH,
John


