re.match and non-alphanumeric characters

Sun Nov 16 12:44:10 EST 2008

The Web President wrote:

> Dear all,
> 
> this is really driving me nuts and any help would be extremely
> appreciated.
> 
> I have a string that contains some numeric data. I want to isolate
> these data using re.match, as follows.
> 
> bogus = "IFC(35m)"
> data = re.match(r'(\d+)',bogus)
> print data.group(1)
> 
> I would expect to have "35" printed out to screen, but instead I get
> an error that the regular expression did not match:
> 
> Traceback (most recent call last):
>   File "C:\Documents and Settings\Mattia\Desktop\Neeltje\read.py", line 20, in <module>
>     print data.group(1)
> AttributeError: 'NoneType' object has no attribute 'group'
> 
> Note that the same holds if I look for "35" straight, instead of
> "\d+". If instead I look for "IFC" it works fine. That is, apparently
> re.match will match only up to the first non-alphanumeric character
> and ignore anything after a "(", "_", "[" and god knows what else.
> 
> I am using Python 2.6 (r26:66721, latest stable version). Am I missing
> something very big and very important?

Yep - re.search. re.match only matches at the beginning of the string,
and yours starts with "IFC", not with a digit. You want searching.
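
For what it's worth, here is a minimal sketch of the difference, reusing
your string. It is written with print() calls so it should run on
Python 3 as well, and the None check is my own addition, not something
from your script:

```python
import re

bogus = "IFC(35m)"

# re.match only tries the pattern at position 0; "I" is not a digit,
# so there is no match and the result is None.
print(re.match(r'(\d+)', bogus))

# re.search scans the string for the first position where the pattern fits.
data = re.search(r'(\d+)', bogus)
if data is not None:  # guard against the AttributeError from the traceback
    print(data.group(1))
```

The same reasoning explains why looking for "IFC" worked with re.match:
that text happens to sit at the very start of the string.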

Diez