Finding all regex matches by index?

Ian Kelly ian.g.kelly at gmail.com
Tue May 29 21:54:35 EDT 2012


On Tue, May 29, 2012 at 7:45 PM, MRAB <python at mrabarnett.plus.com> wrote:
> On 30/05/2012 02:33, Roy Smith wrote:
>>
>> I have a long string (possibly 100s of Mbytes) that I want to search for
>> regex matches.  re.finditer() is *almost* what I want, but the problem
>> is that it returns matching strings.  What I need is a list of offsets
>> in the string where the regex matched.  Thus:
>>
>> s = "this is a string"
>> find("is", s) =>  [2, 5]
>>
>> Is there anything that does this?
>
>
> re.finditer() doesn't return matching strings, it returns match
> objects. What you want are the start positions of each match which the
> match object can provide:
>
>
> >>> import re
> >>> s = "this is a string"
> >>> [m.start() for m in re.finditer("is", s)]
> [2, 5]

Or that.  I simply assumed from the OP's post, without checking, that
finditer yielded strings, not matches.
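
For a string in the 100s of Mbytes, it may be preferable to consume the
offsets lazily rather than build the whole list at once.  A minimal
sketch of that idea follows; the helper name find_offsets is
hypothetical (it only echoes the OP's find() and is not part of the re
module), and it relies solely on re.finditer() and the match object's
start() method quoted above.

```python
import re


def find_offsets(pattern, text):
    """Yield the start offset of each non-overlapping match of pattern in text.

    Hypothetical helper for illustration; it wraps re.finditer() so that
    offsets are produced one at a time instead of being collected up front.
    """
    for match in re.finditer(pattern, text):
        yield match.start()


s = "this is a string"
print(list(find_offsets("is", s)))  # [2, 5]
```

Note that finditer reports non-overlapping matches only, so a pattern
that can match at overlapping positions will yield fewer offsets than a
naive character-by-character scan would.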


