repeating regular expressions in one string

Wed Nov 16 15:21:51 EST 2005

"Shane" wrote

> I have a bunch of strings that looks like this:
>
> 'blahblah_sf1234-sf1238_blahblah'
>
> and I would like to use the re module to parse all the 'sfXXXX' parts
> of the string. Each 'sfXXXX' needs to be its own string when I am
> through. How do I compile a regular expression that looks for more
> than one instance? Currently my expression looks like this:
>
> myString = re.compile('sf[0-9][0-9][0-9][0-9]')
>
> This works great for finding the first instance of 'sfXXXX'.

if you want to extract all matches, you can either call the search method
again (with a start offset), or use a method that returns all matches:

>>> s = 'blahblah_sf1234-sf1238_blahblah'

>>> import re
>>> p = re.compile('sf[0-9][0-9][0-9][0-9]')

footnote: you can use \d instead of [0-9]:

    p = re.compile('sf\d\d\d\d')

and you can use {n} to specify a repeat count:

    p = re.compile('sf\d{4}')

no matter what form you prefer, you can use findall and finditer to locate all
matching substrings:

>>> print p.findall(s)
['sf1234', 'sf1238']

>>> for m in p.finditer(s):
...     print m, m.group()
...
<_sre.SRE_Match object at 0x00A29058> sf1234
<_sre.SRE_Match object at 0x00A29918> sf1238

</F>