Generating all permutations from a regexp

Paul McGuire ptmcg at austin.rr.com
Fri Dec 22 22:39:55 EST 2006


On Dec 22, 8:30 am, "BJörn Lindqvist" <bjou... at gmail.com> wrote:
> With regexps you can search for strings matching it. For example,
> given the regexp: "foobar\d\d\d". "foobar123" would match. I want to
> do the reverse, from a regexp generate all strings that could match
> it.
>
> The regexp: "[A-Z]{3}\d{3}" should generate the strings "AAA000",
> "AAA001", "AAA002" ... "AAB000", "AAB001" ... "ZZZ999".
>
> Is this possible to do?

Here is a first cut at your problem
(http://pyparsing-public.wikispaces.com/space/showimage/invRegex.py).
I used pyparsing to identify the repeatable ranges within a regex, then
attached classes that build generators to the parse actions for each
type of regex element. Some limitations:
- unbounded '*' and '+' repetition is not allowed
- only supports \d, \w, and \s macros
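
To illustrate the generator idea, here is a minimal sketch. It does not
use pyparsing and is not the code from invRegex.py; the helper names
char_class, repeat, and sequence are made up for this example. Each
regex element becomes a callable that returns a fresh generator, and
concatenation is a Cartesian product over the elements:

```python
import itertools
import string


def char_class(chars):
    """Element that yields each character of a class, one at a time."""
    def gen():
        yield from chars
    return gen


def repeat(element, count):
    """Element that yields every string built from `count` draws of `element`."""
    def gen():
        pools = [list(element()) for _ in range(count)]
        for combo in itertools.product(*pools):
            yield "".join(combo)
    return gen


def sequence(*elements):
    """Element that yields every concatenation of its sub-elements."""
    def gen():
        pools = [list(e()) for e in elements]
        # product() varies the rightmost pool fastest, which gives the
        # "AAA000", "AAA001", ... ordering asked for in the question.
        for combo in itertools.product(*pools):
            yield "".join(combo)
    return gen


# Hand-built equivalent of [A-Z]{3}\d{3}
plate = sequence(
    repeat(char_class(string.ascii_uppercase), 3),
    repeat(char_class(string.digits), 3),
)

# Take only the first few results; the full set has 26**3 * 10**3 strings.
first = list(itertools.islice(plate(), 3))
print(first)  # ['AAA000', 'AAA001', 'AAA002']
```

Because every element is bounded, the product is finite, which is why
unbounded '*' and '+' cannot be handled this way.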

Here are the test cases in the program that generate the expected list
of permutations:
[A-E]
[A-D]{3}
X[A-C]{3}Y
X[A-C]{3}\(
X\d
[A-C]\s[A-C]\s[A-C]
[A-C]\s?[A-C][A-C]
[A-C]\s([A-C][A-C])
[A-C]\s([A-C][A-C])?
[A-C]{2}\d{2}
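
If you want to experiment with patterns like these without pyparsing,
the following is a rough hand-written sketch, not the invRegex.py
implementation. The names MACROS, parse_class, parse_sequence, expand,
and invert are hypothetical, and it assumes \s expands to a single
space. It handles literals, escapes, character classes, groups, '?',
and {n} or {m,n}; it has no alternation and does little error checking:

```python
import itertools
import string

MACROS = {
    "d": string.digits,
    "w": string.ascii_letters + string.digits + "_",
    "s": " ",  # assumption: whitespace is limited to a single space
}


def parse_class(pattern, i):
    """Parse a character class starting just after '['.

    Returns (list_of_chars, index_after_closing_bracket).
    """
    chars = []
    while pattern[i] != "]":
        if pattern[i + 1] == "-" and pattern[i + 2] != "]":
            lo, hi = pattern[i], pattern[i + 2]
            chars.extend(chr(c) for c in range(ord(lo), ord(hi) + 1))
            i += 3
        else:
            chars.append(pattern[i])
            i += 1
    return chars, i + 1


def expand(parts):
    """Concatenate one choice from each part, in every combination."""
    return ["".join(combo) for combo in itertools.product(*parts)]


def parse_sequence(pattern, i=0, closer=None):
    """Parse a concatenation; return (list_of_option_lists, next_index)."""
    parts = []
    while i < len(pattern) and pattern[i] != closer:
        ch = pattern[i]
        if ch == "\\":
            nxt = pattern[i + 1]
            # Known macro, otherwise an escaped literal such as \(
            options = list(MACROS.get(nxt, nxt))
            i += 2
        elif ch == "[":
            options, i = parse_class(pattern, i + 1)
        elif ch == "(":
            inner, i = parse_sequence(pattern, i + 1, closer=")")
            options = expand(inner)
            i += 1  # skip the closing ')'
        elif ch in "*+":
            raise ValueError("unbounded repetition is not supported")
        else:
            options = [ch]
            i += 1

        # Optional quantifier following the element just parsed
        if i < len(pattern) and pattern[i] == "?":
            options = [""] + options
            i += 1
        elif i < len(pattern) and pattern[i] == "{":
            end = pattern.index("}", i)
            bounds = pattern[i + 1:end].split(",")
            lo, hi = int(bounds[0]), int(bounds[-1])
            options = [
                "".join(combo)
                for n in range(lo, hi + 1)
                for combo in itertools.product(options, repeat=n)
            ]
            i = end + 1
        parts.append(options)
    return parts, i


def invert(pattern):
    """Return every string the (restricted) pattern can match."""
    parts, _ = parse_sequence(pattern)
    return expand(parts)


print(invert(r"[A-E]"))  # ['A', 'B', 'C', 'D', 'E']
print(len(invert(r"[A-C]\s([A-C][A-C])?")))  # 30
```

This version builds full lists rather than lazy generators, so it is
only suitable for small patterns like the ones above.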



