Overlapping matches in Regular Expressions

Fredrik Lundh fredrik at pythonware.com
Tue Apr 12 06:36:33 EDT 2005


André Søreng wrote:

> With the re/sre module included with Python 2.4:
>
> pattern = "(?P<id1>avi)|(?P<id2>avi|mp3)"
> string2match = "some string with avi in it"
> matches = re.finditer(pattern, string2match)
> ...
> matches[0].groupdict()
> {'id2': None, 'id1': 'avi'}
>
> Which was expected since overlapping matches are ignored.
> But I would also like to know if other groups had a match.

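that's not how regular expressions work: a regular expression describes a
set of strings (the regular set), and the engine can tell you if a given string
belongs to that set.  in an alternation, the engine tries the alternatives from
left to right and stops at the first one that succeeds, so once "id1" has
matched "avi", the "id2" branch is never tried at that position.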
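
to see this in practice, here is a minimal sketch using the pattern and
string from the question.  the variable names are only for illustration, and
re.search is used instead of finditer to keep it short:

```python
import re

# the pattern from the question: both alternatives can match "avi"
pattern = re.compile(r"(?P<id1>avi)|(?P<id2>avi|mp3)")

match = pattern.search("some string with avi in it")

# only the first alternative that succeeds fills its group;
# the other named group is reported as None
groups = match.groupdict()
print(groups)
```

the match itself says nothing about whether "id2" could also have matched;
it only records which branch the engine actually took.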

> What modifications to the re/sre module are needed to allow
> overlapping matches?

if you want overlapping matches, you have to apply the pattern multiple
times.  for trivial cases like your example, it's probably easier to create a
single pattern that matches all interesting cases, and use a dictionary (or
a number of sets) to do the rest.
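
both suggestions could look roughly like the sketch below.  this is only one
way to do it: the names PATTERNS, COMBINED, IDS_BY_TOKEN and the two helper
functions are made up for this example, and the id-to-pattern mapping is
taken from the pattern in the question.

```python
import re

# approach 1: apply each pattern separately and collect the ids that match
PATTERNS = {
    "id1": re.compile(r"avi"),
    "id2": re.compile(r"avi|mp3"),
}

def ids_by_repeated_search(text):
    """Return the ids of every pattern that matches somewhere in text."""
    return {name for name, pat in PATTERNS.items() if pat.search(text)}

# approach 2: one pattern that matches all interesting strings, plus a
# dictionary that maps each matched string to the ids it belongs to
COMBINED = re.compile(r"avi|mp3")
IDS_BY_TOKEN = {
    "avi": {"id1", "id2"},
    "mp3": {"id2"},
}

def ids_by_lookup(text):
    """Return the ids associated with every token found in text."""
    found = set()
    for match in COMBINED.finditer(text):
        found |= IDS_BY_TOKEN[match.group(0)]
    return found

print(ids_by_repeated_search("some string with avi in it"))
print(ids_by_lookup("some string with avi in it"))
```

the first approach scans the string once per pattern, which is fine for a
handful of patterns.  the second scans it once, but you have to keep the
dictionary in sync with the combined pattern yourself.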

</F> 





