regex problem with re and fnmatch

Fabian Braennstroem f.braennstroem at gmx.de
Wed Nov 21 16:05:16 EST 2007


Hi John,

John Machin wrote on 11/20/2007 09:40 PM:
> On Nov 21, 8:05 am, Fabian Braennstroem <f.braennstr... at gmx.de> wrote:
>> Hi,
>>
>> I would like to use re to search for lines in a file with
>> the word "README_x.org", where x is any number.
>> E.g. the structure would look like this:
>> [[file:~/pfm_v99/README_1.org]]
>>
>> I tried to use this kind of matching:
>> #        org_files='.*README\_1.org]]'
>>         org_files='.*README\_*.org]]'
>>         if re.match(org_files,line):
> 
> First tip is to drop the leading '.*' and use search() instead of
> match(). The second tip is to use raw strings always for your
> patterns.
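
A minimal sketch of both tips, assuming the sample line from the original question. The variable names are made up for illustration only:

```python
import re

line = "[[file:~/pfm_v99/README_1.org]]"
pat = r"README_\d+\.org"   # raw string: backslashes reach the regex engine untouched

# match() only succeeds at the start of the string, so it fails here ...
anchored = re.match(pat, line)
# ... while search() scans the whole line and finds the file name.
found = re.search(pat, line)

print(anchored)          # None
print(found.group(0))    # README_1.org
```

With search() there is no need for a leading '.*' just to skip over the start of the line.
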
> 
>> Unfortunately, it matches all entries with "README.org", but
>> not the wanted number!?
> 
> \_* matches 0 or more occurrences of _ (the \ is redundant). You need
> to specify one or more digits -- use \d+ or [0-9]+
> 
> The . in .org matches ANY character except a newline. You need to
> escape it with a \.
> 
>>>> pat = r'README_\d+\.org'
>>>> re.search(pat, 'xxxxREADME.org')
>>>> re.search(pat, 'xxxxREADME_.org')
>>>> re.search(pat, 'xxxxREADME_1.org')
> <_sre.SRE_Match object at 0x00B899C0>
>>>> re.search(pat, 'xxxxREADME_9999.org')
> <_sre.SRE_Match object at 0x00B899F8>
>>>> re.search(pat, 'xxxxREADME_9999Zorg')
>>>>

Thanks a lot, works really nice!
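
For the record, here is roughly how the pattern could be applied line by line. This is only a sketch: the names LINK and readme_links, the sample lines, and the assumption that every link has the form [[file:...]] are illustrative and not taken from the real files.

```python
import re

# Capture the path inside an org-mode link such as [[file:~/pfm_v99/README_1.org]].
LINK = re.compile(r"\[\[file:([^\]]*README_\d+\.org)\]\]")

def readme_links(lines):
    """Yield the linked path from every line that contains a README_<n>.org link."""
    for line in lines:
        m = LINK.search(line)
        if m:
            yield m.group(1)

# Hypothetical sample input; only the second entry comes from the original post.
sample = [
    "some text",
    "[[file:~/pfm_v99/README_1.org]]",
    "[[file:~/pfm_v99/README.org]]",
    "see [[file:~/other/README_42.org]] for details",
]
print(list(readme_links(sample)))
```

The plain README.org link is skipped because the pattern requires an underscore followed by at least one digit.
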

>> After some splitting and replacing I am able to check, if
>> the above file exists. If it does not, I start to search for
>> it using the 'walk' procedure:
> 
> I presume that you mean something like: """.. check if the above file
> exists in some directory. If it does not, I start to search for it
> somewhere else ..."""
> 
>>                 for root, dirs, files in os.walk("/home/fab/org"):
> 
>>                     for name in dirs:
>>                         dirs=os.path.join(root, name) + '/'
> 
> The above looks rather suspicious ...
> for thing in container:
>     container = something_else
> ????
> What are you trying to do?
> 
> 
>>                     for name in files:
>>                          files=os.path.join(root, name)
> 
> and again ....
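
One way to avoid the rebinding John points out is to store the joined paths under fresh names. A sketch, assuming the goal was simply to collect full paths; list_paths is a hypothetical helper, not something from the original script:

```python
import os

def list_paths(top):
    """Return (dir_paths, file_paths) found below top, as full paths."""
    dir_paths, file_paths = [], []
    for root, dirs, files in os.walk(top):
        # Append to separate lists; rebinding dirs or files here would
        # leave that name holding just the last joined path.
        for name in dirs:
            dir_paths.append(os.path.join(root, name))
        for name in files:
            file_paths.append(os.path.join(root, name))
    return dir_paths, file_paths
```

Any later test, such as a pattern match, can then run over every entry in file_paths rather than only the last one.
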
> 
>>                     if fnmatch.fnmatch(str(files), "README*"):
> 
> Why str(files)?
> 
>>                         print "File Found"
>>                         print str(files)
>>                         break
> 
> 
> fnmatch is not as capable as re; in particular it can't express "one
> or more digits". To search a directory tree for the first file whose
> name matches a pattern, you need something like this:
> def find_one(top, pat):
>    for root, dirs, files in os.walk(top):
>       for fname in files:
>          if re.match(pat + '$', fname):
>             return os.path.join(root, fname)
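
To illustrate the remark that fnmatch cannot express "one or more digits", here is a small comparison. The file names are invented for the example, and the glob shown is just the closest approximation I could come up with:

```python
import fnmatch
import re

names = ["README.org", "README_1.org", "README_12.org", "README_1x.org"]

# Closest fnmatch pattern: one digit, then anything. It lets README_1x.org through.
glob_hits = [n for n in names if fnmatch.fnmatch(n, "README_[0-9]*.org")]

# The regex demands digits only between the underscore and the extension.
regex_hits = [n for n in names if re.match(r"README_\d+\.org$", n)]

print(glob_hits)   # ['README_1.org', 'README_12.org', 'README_1x.org']
print(regex_hits)  # ['README_1.org', 'README_12.org']
```
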
> 
> 
>> As soon as it finds the file,
> 
> "the" file or "a" file???
> 
> Ummm ... aren't you trying to locate a file whose EXACT name you found
> in the first exercise??
> 
> def find_it(top, required):
>    for root, dirs, files in os.walk(top):
>       if required in files:
>          return os.path.join(root, required)
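
A self-contained version of find_it with the import it needs, run against a throwaway directory tree. The explicit return None and the temporary layout are additions for illustration, not part of John's code:

```python
import os
import tempfile

def find_it(top, required):
    """Return the full path of the first file named exactly required, else None."""
    for root, dirs, files in os.walk(top):
        if required in files:
            return os.path.join(root, required)
    return None

# Throwaway tree so the sketch can be run anywhere.
with tempfile.TemporaryDirectory() as top:
    nested = os.path.join(top, "pfm_v99")
    os.mkdir(nested)
    open(os.path.join(nested, "README_1.org"), "w").close()

    hit = find_it(top, "README_1.org")
    miss = find_it(top, "README_2.org")
    print(os.path.relpath(hit, top))  # pfm_v99/README_1.org on POSIX
    print(miss)                       # None
```

Since the exact name is already known from the regex step, a plain membership test on files is enough and no second pattern match is needed.
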

Great :-) Thanks a lot for your help... it can be so easy :-)
Fabian





More information about the Python-list mailing list