Regexp and multiple groups (with repeats)

Mark Tolonen metolone+gmane at gmail.com
Fri Nov 20 11:03:51 EST 2009


"mk" <mrkafk at gmail.com> wrote in message news:he60ha$ivv$1 at ger.gmane.org...
> Hello,
>
> >>> r=re.compile(r'(?:[a-zA-Z]:)([\\/]\w+)+')
>
> >>> r.search(r'c:/tmp/spam/eggs').groups()
> ('/eggs',)
>
> Obviously, I would like to capture all groups:
> ('/tmp', '/spam', '/eggs')
>
> But it seems that re captures only the last group. Is there any way to 
> capture all groups with repeat following it, i.e. (...)+ or (...)* ?
>
> Even better would be:
>
> ('tmp', 'spam', 'eggs')
>
> Yes, I know about re.split:
>
> >>> re.split( r'(?:\w:)?[/\\]', r'c:/tmp/spam\\eggs/' )
> ['', 'tmp', 'spam', '', 'eggs', '']
>
> My interest is more general in this case: how to capture many groups with 
> a repeat?

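re.findall is what you're looking for.  A repeated group keeps only its 
last match, but findall returns every non-overlapping match of the 
pattern.  Here's all words not followed by a colon (this works because 
the drive letter is a single character):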

>>> import re
>>> re.findall(r'(\w+)(?!:)', r'c:\tmp\spam/eggs')
['tmp', 'spam', 'eggs']
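
For the more general question (capturing every repetition of a group), 
one possible approach is to do it in two steps: match the overall shape 
first, with the whole repeated part in a single group, then run findall 
over just that part.  This is only a sketch; the helper name 
path_components and the exact pattern are assumptions for illustration, 
not something from the re module.

```python
import re

# Hypothetical helper for illustration; not part of the re module.
def path_components(path):
    """Return the path components after an optional drive prefix.

    Returns an empty list if the string does not look like a path.
    """
    # Step 1: validate the overall shape and capture the whole repeated
    # part (all the separator+name pairs) as one group.
    m = re.match(r'(?:[a-zA-Z]:)?((?:[\\/]\w+)+)$', path)
    if m is None:
        return []
    # Step 2: findall returns every match of the inner pattern,
    # not just the last repetition.
    return re.findall(r'\w+', m.group(1))

print(path_components(r'c:/tmp/spam/eggs'))
print(path_components(r'c:\tmp\spam/eggs'))
```

Both calls should print ['tmp', 'spam', 'eggs'].  Unlike the lookahead 
version, this does not depend on the prefix being a single character, 
since the prefix is matched separately in step 1.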

-Mark




