[Tutor] Regexp Not Matching on Numbers?

Gooch, John John.Gooch at echostar.com
Tue Dec 14 21:47:17 CET 2004


I am used to (in Perl) the entire string being searched for a match when
using regexps. I assumed Python would do it this way too, as
Java/Javascript/VbScript all behaved in this manner. However, I found that I
had to add ".*" in front of my regular expression before it would
search the entire string for a match. This seems a bit unusual from my past
experience, but it solved the issue I was experiencing.

Thank you for your help.

John Gooch 

-----Original Message-----
From: Kent Johnson [mailto:kent37 at tds.net] 
Sent: Tuesday, December 14, 2004 10:41 AM
To: Gooch, John
Cc: 'tutor at python.org'
Subject: Re: [Tutor] Regexp Not Matching on Numbers?


I'm not too sure what you are trying to do here, but the re in your code
matches the names in your example data:
 >>> import re
 >>> name = 'partners80_access_log.1102723200'
 >>> re.compile(r"([\w]+)").match( name ).groups()
('partners80_access_log',)

One thing that may be tripping you up is that re.match() only matches at the
*start* of a string, as if the regex starts with '^'. For matching anywhere
in the string use re.search() instead.
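
As a rough illustration of the difference, here is a small sketch using one
of the file names from the example data. The variable names are made up for
the example. It also shows why putting ".*" in front of a pattern makes
match() succeed, and why that is not quite the same as calling search().

```python
import re

name = 'partners80_access_log.1102723200'

# match() only tries at position 0; 'p' is not a digit, so this is None.
at_start = re.match(r'\d+', name)

# search() scans forward and stops at the first run of digits, the '80'.
anywhere = re.search(r'\d+', name)

# With a leading '.*', match() succeeds, but the greedy '.*' swallows as
# much as it can, so the group is left with only the final digit.
prefixed = re.match(r'.*(\d+)', name)

print(at_start)            # None
print(anywhere.group())    # 80
print(prefixed.group(1))   # 0
```

So the ".*" prefix does answer "is there a match somewhere?", but search()
is the more direct way to ask that, and it reports the match you would
expect.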

Another possible problem is that you are defining regexp but not using it -
have you been trying to change the value of regexp and wondering why the
program doesn't change?
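
A minimal sketch of what using the compiled object could look like. The
pattern and the function name check_names are assumptions for illustration,
not taken from your script:

```python
import re

# Compile the file mask once (this particular pattern is only an example)...
regexp = re.compile(r'\.\d+$')

def check_names(names):
    """Return the names accepted by the module-level regexp."""
    # ...and actually use it here, so that editing the pattern above
    # changes what the function accepts.
    return [name for name in names if regexp.search(name)]

print(check_names(['some_access_log.1102896000', 'readme.txt']))
```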

If you are looking for file names that end in log.ddd then try
re.search(r'log\.\d+$', name)
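
For instance, a quick check of that pattern against a few names might look
like this. The first two names come from your example output; the last two
are invented to show what gets rejected.

```python
import re

# The literal text 'log.', then one or more digits up to the end of the name.
log_suffix = re.compile(r'log\.\d+$')

names = [
    'partners80_access_log.1102723200',
    'some_access_log.1102896000',
    'access_log.old',            # made up: no digits after 'log.'
    'access_log.1102896000.gz',  # made up: digits are not at the end
]

matching = [n for n in names if log_suffix.search(n)]
print(matching)
```

Only the first two names should be kept, because the '$' requires the
digits to run to the end of the string.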

If this doesn't help please be more specific about the format of the names
you want to match and exclude.

Kent

Gooch, John wrote:
> This is weird. I have a script that walks through directories,
> checks to see if their name matches a certain format ( regular
> expression ), and then prints out what it finds. However, it refuses
> to ever match on numbers unless the regexp is ".*". So far I have
> tried the following regular expressions:
> "\d+"
> "\d*"
> "\W+"
> "\W*"
> "[1-9]+"
> and more...
> 
> 
> Here is an example of the output:
> No Match on partners80_access_log.1102723200
> Checking on type=<type 'str'> Name = some_access_log.1102896000
> 
> 
> Here is the code:
> import os,sys
> import re
> #define the file mask to identify web log files
> regexp = re.compile( r"\." )
> 
> startdir = "C:/Documents and Settings/John.Gooch/My Documents/chaser/"
> #define global functions
> def delFile(arg, dirname, names):
>     found = 0
>     for name in names:
>         print "Checking on type="+str(type( name ) )+" Name = "+str(name)
>         matches = re.compile(r"([\w]+)").match( name )
>         if matches:
>             print "Match on "+str(matches.groups()) 
>             found = 1
>         else:
>             print "No Match on "+name
>     if not found:
>         print "No matches found in "+dirname
>     else:
>         print "Match found in "+dirname
> os.path.walk( startdir, delFile, "" )
> 
> 
> 
> 
> Any thoughts?