Speeding up multiple regex matches
Alex Martelli
aleax at mail.comcast.net
Fri Nov 18 12:11:10 EST 2005
Talin <viridia at gmail.com> wrote:
...
> 1) Combine all of the regular expressions into one massive regex, and
> let the regex state machine do all the discriminating. The problem with
> this is that it gives you no way to determine which regex was the
> matching one.
Place each regex into its own parenthesized group, and check which group
matched on the resulting match object:
>>> import re
>>> x=re.compile('(aa)|(bb)')
>>> mo=x.search('zaap!')
>>> mo.groups()
('aa', None)
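A minimal sketch of the same idea wrapped in a helper, assuming none of the
individual regexes contains capturing groups of its own (otherwise the group
numbers no longer line up with the pattern positions). The names `patterns`,
`combined` and `which_matched` are made up for illustration:

```python
import re

# Hypothetical sample patterns; none of them contains its own capturing group.
patterns = ['aa', 'bb', 'c+d']

# Wrap each pattern in its own group, so group i+1 corresponds to patterns[i].
combined = re.compile('|'.join('(%s)' % p for p in patterns))

def which_matched(text):
    """Return (pattern_index, matched_text) for the first match, or None."""
    mo = combined.search(text)
    if mo is None:
        return None
    # Only the alternative that actually matched leaves a non-None group.
    for i, g in enumerate(mo.groups()):
        if g is not None:
            return i, g
    return None
```

With these sample patterns, which_matched('zaap!') gives (0, 'aa'), the same
information as the groups() tuple above but as an index back into the list.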
There's a limit of 99 groups, so if you have an unbounded number of regexes
to start with you'll have to split them up 99 or fewer at a time, but
that shouldn't be impossibly hard.
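One way the splitting might look, as a rough sketch: it assumes the regexes
have no capturing groups of their own, and it takes the 99-group limit as
given (whether the limit applies may depend on the Python version). The names
`MAX_GROUPS`, `compile_in_chunks` and `find_match` are invented for this
example:

```python
import re

MAX_GROUPS = 99  # limit assumed from the discussion above

def compile_in_chunks(patterns, size=MAX_GROUPS):
    """Compile patterns in batches of at most `size` groups each.

    Returns a list of (offset, compiled_regex) pairs, where offset is the
    index in `patterns` of the first regex in that batch.
    """
    chunks = []
    for start in range(0, len(patterns), size):
        batch = patterns[start:start + size]
        rx = re.compile('|'.join('(%s)' % p for p in batch))
        chunks.append((start, rx))
    return chunks

def find_match(chunks, text):
    """Return (pattern_index, matched_text) from the first batch that matches."""
    for start, rx in chunks:
        mo = rx.search(text)
        if mo is None:
            continue
        # Map the group position inside the batch back to the global index.
        for i, g in enumerate(mo.groups()):
            if g is not None:
                return start + i, g
    return None
```

Note that this tries the batches in order, so it reports a match from the
earliest batch that matches anywhere in the text, which is not necessarily
the leftmost match over all the regexes taken together.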
Alex