regex question

John Machin sjmachin at lexicon.net
Fri Aug 4 09:02:30 EDT 2006


taleinat wrote:
> Gabriel Murray <gabriel.murray <at> gmail.com> writes:
>
> >
> > Hello, I'm looking for a regular expression which will match strings as
> > follows: if there are symbols a, b, c and d, then any pattern is valid if it
> > begins with a and ends with d and proceeds in order through the symbols.
> > However, at any point the pattern may reset to an earlier position in the
> > sequence and begin again from there.
> >
> > For example, these would be valid patterns:
> >
> > aabbbaabbcccbbbcccddd
> > aabcabcd
> > abcd
> >
> > But these would not:
> >
> > aaaaabbbbbccccaaaaadddd   (goes straight from a to d)
> > aaaaaaaaaaabbbbbccc   (does not reach d)
> >
> > Can anyone think of a concise way of writing this regex? The ones I can
> > think of are very long and awkward.
> >
> > Gabriel
> >
>
> Your criteria could be defined more simply as the following:
> * must start with an 'a' and end with a 'd'
> * an 'a' must not be followed by 'c' or 'd'
> * a 'b' must not be followed by 'd'
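
For reference, the three quoted criteria can be turned into a regex fairly directly with negative lookaheads. This is only a sketch: the names PATTERN and is_valid are made up for illustration, and it assumes the input contains only the letters a, b, c and d.

```python
import re

# Hypothetical names, not from the original thread.
# First char is an 'a' not followed by c/d; every later 'a' obeys the same
# rule; a 'b' must not be followed by 'd'; the string must end with 'd'.
PATTERN = re.compile(r"a(?![cd])(?:a(?![cd])|b(?!d)|[cd])*d")

def is_valid(text):
    """Return True if the whole string satisfies the three criteria."""
    return PATTERN.fullmatch(text) is not None

for s in ("abcd", "aabcabcd", "aaaaabbbbbccccaaaaadddd", "aaaaaaaaaaabbbbbccc"):
    print(s, is_valid(s))
```

Each lookahead encodes one "must not be followed by" rule, so the pattern stays short for four letters but grows with the alphabet.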

In fact it is so regular that the same thing with N letters needs only
O(1) rules, however large N is. E.g. for N == 26:
* must start with a and end with z
* each step can go to (1) the same spot (2) ahead 1 spot (3) back to any
previous spot.

Does the OP really need a regex????

Give the spots numbers (say from 1 up) instead of letters and a checker
becomes trivial:

[untested]
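def check(size, path):
    """Return True if path is a valid walk over spots 1..size."""
    # Must start at the first spot and finish at the last one.
    if not path or path[0] != 1 or path[-1] != size:
        return False
    pos = 1
    for newpos in path[1:]:
        # Allowed moves: stay put, advance one spot, or go back to any earlier spot.
        if not (1 <= newpos <= min(size, pos + 1)):
            return False
        pos = newpos
    return True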
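
To try the checker on the OP's letter strings, the letters first have to be mapped to spot numbers. A minimal sketch follows; to_path and its alphabet parameter are hypothetical helpers, not part of the code above, and check is repeated only so the snippet runs on its own.

```python
def check(size, path):
    # Same checker as above, repeated so this snippet is self-contained.
    if not path or path[0] != 1 or path[-1] != size:
        return False
    pos = 1
    for newpos in path[1:]:
        if not (1 <= newpos <= min(size, pos + 1)):
            return False
        pos = newpos
    return True

def to_path(text, alphabet="abcd"):
    # Hypothetical helper: map each letter to its 1-based spot number.
    # str.index raises ValueError for a character outside the alphabet.
    return [alphabet.index(ch) + 1 for ch in text]

for s in ("abcd", "aabcabcd", "aaaaabbbbbccccaaaaadddd", "aaaaaaaaaaabbbbbccc"):
    print(s, check(4, to_path(s)))
```

Going to N letters only means passing a longer alphabet and a matching size; the checker itself does not change.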

Cheers,
John



