regular expression reverse match?
Diez B. Roggisch
nospam-deets at web.de
Wed Oct 29 05:32:45 EST 2003
Hi,
> As it is, the program checks the input buffer after each keystroke and
> determines if the buffer is acceptable against a pattern as the string
> is being built. A reverse match. It allows new keystrokes to be
> added to the buffer as long as it's within the pattern.
I don't think its possible with regular expressions, or at least the way you
are using them right now, that is to come from a "in the end, I want to
look it like this, but it shall accept everything that prefixes a valid
result".
That won't work - the reason is simply that regular expressions are
equivalent to finite state automata. I don't know if you are familiar with
these, but the consist of states, which are nodes, and transitions between
these, which are edges/arrows between the states.
Now ususally there is one special starting state S, and there should be at
least on reachable end-state. As the names suggest, recoginizing a
specified String starts at S, and then every read character advances the
internal state until one of the end-states is reached _and_ there is no
more input.
Now lets see at a simple example:
"abc"
Here every character becomes a state, and the automata looks like this:
S-a->(a)-b->(b)-c->[c]
The -*-> means that this transition is taken when * is matched by the
current input. So -a-> is taken when an "a" is input.
[c] is an end-state.
Now what you want is, that _all_ states that are legal are also end-states.
That is very easy to accomplish when you implement the automata directly
like this:
S-a->[a]-b->[b]-c->[c]
However, its way more complicated to create this as a rex:
"a(b(c)?)?"
This looks straightforward, so you might be successful to create rexes like
this using a preprocessing step - but the more complicated your rex gets,
this approach will be hard to follow. Actually, I have currently no idea.
Which doesn't mean its not doable, I just don't have enough time to think
about it right now :)
It looks to me, that if you need this feature not on a per-case base where
you can think about your rexes more thoroughly, you have to sort of roll
out your own rex-implmenetation. Which isn't too hard, but definitely
something thats more than just a couple of lines.
Regards,
Diez
More information about the Python-list
mailing list