Awkward code using multiple regexp searches

Steven Bethard steven.bethard at gmail.com
Fri Sep 10 02:54:47 EDT 2004


Topher Cawlfield <cawlfiel <at> uiuc.edu> writes:
> Can anyone suggest a more elegant solution?

Does this do what you want?

>>> import re
>>> rexp1 = re.compile(r'blah(dee)blah')
>>> rexp2 = re.compile(r'hum(dum)')
>>> for s in ['blahdeeblah', 'blah blah', 'humdum humdum']:
...     result = rexp1.findall(s) or rexp2.findall(s) or [None]
...     print(repr(result[0]))
...
'dee'
None
'dum'

The findall method returns all matches of the re in the string, or an empty 
list if there were no matches.  An empty list is false, so if the first 
findall finds nothing, the or expression goes on to evaluate the second 
findall, and if that one also finds nothing, the default value [None] is 
used.  Because each pattern has exactly one group, findall returns a list of 
the group's text, which is why I extract the first element of the list at the 
end.
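
If you have more than two patterns, the same idea can be written as a loop. 
This is only a sketch: first_group is a name I made up for illustration, and 
it assumes every pattern has exactly one group, as the two above do.

```python
import re

def first_group(patterns, s, default=None):
    """Return the first findall result of the first pattern that matches s.

    first_group is a hypothetical helper, not part of the re module.
    """
    for pattern in patterns:
        matches = pattern.findall(s)  # [] when the pattern matches nothing
        if matches:                   # an empty list is false
            return matches[0]
    return default

rexp1 = re.compile(r'blah(dee)blah')
rexp2 = re.compile(r'hum(dum)')

strings = ['blahdeeblah', 'blah blah', 'humdum humdum']
results = [first_group([rexp1, rexp2], s) for s in strings]
print(results)
```

Like the or chain, the loop stops at the first pattern that matches, so later 
patterns are never searched once an earlier one succeeds.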

> I'm a little bit worried about doing the following in Python, since I'm 
> not sure if the compiler is smart enough to avoid doing each regexp 
> search twice:
> 
> for line in inFile:
>      if rexp1.search(line):
>          something = rexp1.search(line).group(1)
>      elif rexp2.search(line):
>          somethingElse = rexp2.search(line).group(1)

You're right here: Python will call the method twice (and therefore search 
the string twice).  It has no way of knowing that these two calls to the same 
method will actually return the same results.  (In general, there are no 
guarantees that calling a method with the same parameters will return the same 
result.  For example, file.read(100) returns different data on each call as 
the file position advances.)
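
If you would rather keep search than switch to findall, one way to avoid the 
double search is to bind the match object to a name and test that.  A minimal 
sketch, assuming the same two patterns as above; classify and in_file are 
names I invented for the example, with in_file standing in for your inFile:

```python
import re

rexp1 = re.compile(r'blah(dee)blah')
rexp2 = re.compile(r'hum(dum)')

def classify(line):
    """Search with each pattern at most once and reuse the match object."""
    m = rexp1.search(line)
    if m:
        return ('something', m.group(1))
    m = rexp2.search(line)  # only reached when rexp1 did not match
    if m:
        return ('somethingElse', m.group(1))
    return (None, None)

# A list of strings stands in for the open file; both iterate line by line.
in_file = ['blahdeeblah', 'blah blah', 'humdum humdum']
found = [classify(line) for line in in_file]
print(found)
```

Each pattern is searched once per line at most, and the early return keeps 
the if/elif structure of your original loop.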

Steve



