regex question

MRAB python at mrabarnett.plus.com
Fri Jul 29 12:15:11 EDT 2011


On 29/07/2011 16:45, Thomas Jollans wrote:
> On 29/07/11 16:53, rusi wrote:
>> Can someone throw some light on this anomalous behavior?
>>
>> >>> import re
>> >>> r = re.search('a(b+)', 'ababbaaabbbbb')
>> >>> r.group(1)
>> 'b'
>> >>> r.group(0)
>> 'ab'
>> >>> r.group(2)
>> Traceback (most recent call last):
>>   File "<stdin>", line 1, in <module>
>> IndexError: no such group
>>
>> >>> re.findall('a(b+)', 'ababbaaabbbbb')
>> ['b', 'bb', 'bbbbb']
>>
>> So evidently group counts by the number of '()'s and not by the number
>> of matches (and this is the case whether one uses match or search). So
>> then what's the point of searching vs matching?
>>
>> Or, equivalently, how does one move on to the groups of the next match?
>>
>> [Side note: The docstrings for this really suck:
>>
>> >>> help(r.group)
>> Help on built-in function group:
>>
>> group(...)
>>
>
> Pretty standard regex behaviour: group 1 is the first pair of brackets,
> group 2 is the second, and so on. Group 0 is the whole match.
> The difference between matching and searching is that match only matches
> at the start of the string (this is documented in the library docs,
> IIRC). re.match(exp, s) is roughly equivalent to
> re.search('^'+exp, s), unless exp already starts with '^'.
>
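
A small caveat on that equivalence: prefixing '^' is only a rough
substitute for match. A minimal sketch of one place where the two differ,
assuming a pattern with a top-level alternation (the pattern and string
below are made up for illustration, not taken from the original post):

```python
import re

exp = 'a|b'   # illustrative pattern with a top-level alternation
s = 'xb'      # illustrative subject string

# match only tries at position 0, so nothing matches here.
m1 = re.match(exp, s)

# '^' + exp is '^a|b': the anchor binds only to the first alternative,
# so search can still find 'b' later in the string.
m2 = re.search('^' + exp, s)

# Anchoring a non-capturing group around the whole pattern behaves
# like match for this input.
m3 = re.search(r'\A(?:' + exp + ')', s)

print(m1, m2.group(0), m3)
```

For simple patterns like 'a(b+)' the plain '^' prefix gives the same
result as match.
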
> Apparently, findall() returns the content of the first group if there is
> one. I didn't check this, but I assume it is documented.
>
findall returns a list of tuples (what the groups captured) if there is
more than one group, a list of strings (what the group captured) if
there is exactly one group, or a list of strings (what the regex matched)
if there are no groups.
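
A short sketch of those three cases, using the string from the original
post. The no-group and two-group patterns are variations made up here for
illustration. The last line uses re.finditer, which yields one match
object per match and so is one way to get at the groups of each
successive match:

```python
import re

s = 'ababbaaabbbbb'

# No groups: a list of strings, each the whole match.
no_groups = re.findall('ab+', s)

# One group: a list of strings, each what the group captured.
one_group = re.findall('a(b+)', s)

# More than one group: a list of tuples, one tuple per match.
two_groups = re.findall('(a)(b+)', s)

# finditer yields a match object for every match, so group(0),
# group(1) and start() are available for each one in turn.
per_match = [(m.group(0), m.group(1), m.start())
             for m in re.finditer('a(b+)', s)]

print(no_groups)
print(one_group)
print(two_groups)
print(per_match)
```
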


