one more question on regex

Vlastimil Brom vlastimil.brom at gmail.com
Fri Jan 22 15:10:44 EST 2016


2016-01-22 16:50 GMT+01:00 mg <noOne at nowhere.com>:
> On Fri, 22 Jan 2016 15:32:57 +0000, mg wrote:
>
>> python 3.4.3
>>
>> >>> import re
>> >>> re.search('(ab){2}','abzzabab')
>> <_sre.SRE_Match object; span=(4, 8), match='abab'>
>>
>> >>> re.findall('(ab){2}','abzzabab')
>> ['ab']
>>
>> Why for search() the match is 'abab' and for findall the match is 'ab'?
>
> finditer seems to be consistent with search:
> regex = re.compile('(ab){2}')
>
> for match in regex.finditer('abzzababab'):
>   print ("%s: %s" % (match.start(), match.span() ))
> ...
> 4: (4, 8)

Hi,
as was already pointed out, findall "collects" the content of the
capturing groups (if present), rather than the whole matching text.

For a repeated capturing group, only the content of its last repetition
is kept and the previous ones are discarded; cf.:

>>> re.findall('(?i)(a)x(b)+','axbB')
[('a', 'B')]
>>>
(for multiple capturing groups in the pattern, a tuple of the captured
parts is collected for each match)

or with your example, with the parts of the string differentiated by
upper/lower case:
>>> re.findall('(?i)(ab){2}','aBzzAbAB')
['AB']
>>>
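
If the whole matching text is what you want from findall, one possible
approach (a minimal sketch, not the only way) is to make the group
non-capturing with (?:...), or to use finditer and read group(0) from
each match object. The variable names below are only illustrative:

```python
import re

text = 'abzzabab'

# Capturing group: findall returns the group's last repetition only.
last_capture = re.findall(r'(ab){2}', text)
print(last_capture)   # ['ab']

# Non-capturing group: no groups in the pattern, so findall returns
# the whole matching text.
whole = re.findall(r'(?:ab){2}', text)
print(whole)          # ['abab']

# finditer yields match objects: group(0) is the whole match,
# group(1) is the last repetition of the capturing group.
pairs = [(m.group(0), m.group(1)) for m in re.finditer(r'(ab){2}', text)]
print(pairs)          # [('abab', 'ab')]
```

The non-capturing form is probably the simplest when the group is only
there for the repetition and its content is not needed separately.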

hth,
   vbr


