Seeking regex optimizer

Kay Schluehr kay.schluehr at gmx.net
Tue Jun 20 01:35:34 EDT 2006


andrewdalke at gmail.com wrote:
> Kay Schluehr wrote:
> > I have a list of strings ls = [s_1,s_2,...,s_n] and want to create a
> > regular expression sx from it, such that sx.match(s) yields a SRE_Match
> > object when s starts with an s_i for some i in [1,...,n].
>
> Why do you want to use a regex for this?

Because it is part of a tokenizer that already uses regexps, and I do
not intend to rewrite or replace it. Certain groups of tokens
(operators, braces and special characters) should be user-extensible;
all others will stay as they are. I found that certain groups of tokens
can be represented more compactly: for matching ['+', '++'] one might
generate '\+|\+\+' or '\+\+?', and I wanted to know whether there is a
generic approach to solving this "inverse regexp" problem in a
non-trivial fashion.
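
To make the question concrete, here is a minimal sketch of one possible
approach, not necessarily the best one: put the token strings into a
prefix tree (trie) and emit a pattern in which shared prefixes are
factored out. The names build_trie, trie_to_pattern and
compile_prefix_regex are hypothetical helpers made up for this
illustration; only re.escape and re.compile come from the standard
library. Note that Python's '|' tries alternatives from left to right, so
a plain '\+|\+\+' matches only the first '+' of '++', whereas the
factored form with a greedy optional tail prefers the longer token.

```python
import re

END = ""  # hypothetical marker key: "a token may end at this node"


def build_trie(words):
    """Build a nested-dict prefix tree from an iterable of strings."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[END] = {}
    return root


def trie_to_pattern(node):
    """Turn a trie node into a regex source string with shared prefixes factored out."""
    can_end = END in node
    branches = [
        re.escape(ch) + trie_to_pattern(child)
        for ch, child in sorted(node.items())
        if ch != END
    ]
    if not branches:
        return ""
    if len(branches) == 1 and not can_end:
        return branches[0]
    body = "(?:" + "|".join(branches) + ")"
    # A greedy '?' makes the continuation optional but preferred,
    # so the longest listed token wins.
    return body + "?" if can_end else body


def compile_prefix_regex(words):
    """Compile a regex whose match() succeeds when a string starts with one of the words."""
    return re.compile(trie_to_pattern(build_trie(words)))


if __name__ == "__main__":
    sx = compile_prefix_regex(["+", "++", "+=", "(", ")"])
    print(sx.pattern)
    print(sx.match("++x").group())
```

Since every branch below a trie node starts with a different character,
the alternatives never compete with each other, which is why their order
inside the generated group does not matter here. Whether the result is
faster than a plain alternation sorted longest-first is something I have
not measured.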

Regards,
Kay



