regular expression reverse match?
Ron Adam
radam2 at tampabay.rr.com
Wed Oct 29 18:30:36 EST 2003
On Wed, 29 Oct 2003 11:32:45 +0100, "Diez B. Roggisch"
<nospam-deets at web.de> wrote:
>Hi,
>
>> As it is, the program checks the input buffer after each keystroke and
>> determines if the buffer is acceptable against a pattern as the string
>> is being built. A reverse match. It allows new keystrokes to be
>> added to the buffer as long as it's within the pattern.
>
>I don't think its possible with regular expressions, or at least the way you
>are using them right now, that is to come from a "in the end, I want to
>look it like this, but it shall accept everything that prefixes a valid
>result".
>
>That won't work - the reason is simply that regular expressions are
>equivalent to finite state automata. I don't know if you are familiar with
>these, but the consist of states, which are nodes, and transitions between
>these, which are edges/arrows between the states.
I'm not familiar with the terms, but I am familiar with tree data
structures.
[clipped good explanation]
>However, its way more complicated to create this as a rex:
>
>"a(b(c)?)?"
>
>This looks straightforward, so you might be successful to create rexes like
>this using a preprocessing step - but the more complicated your rex gets,
>this approach will be hard to follow. Actually, I have currently no idea.
>Which doesn't mean its not doable, I just don't have enough time to think
>about it right now :)
I'm still definitely learning this, and appreciate the help.
In reply to Emily, I compared rex to simplifying a math problem like
this. I'm hoping this will lead to a method that will work.
> if this were a math problem it might look something like this.
>
> string <= yes$ | no$
> string = (y | ye | yes)$ | (n | no)$
> string = (y$ | ye$ | yes$) | (n$ | no$)
> string = y$ | ye$ | yes$ | n$ | no$
Using the same approach for the example above might be something like:
s <= a( b (c)? )?
s <= a( b | bc )?
s <= a | ab | abc
s = a | (a | ab) | ( a | ab | abc)
s = a | a | ab | a | ab | abc
s = a | ab | abc
I have no idea if what I'm doing here is actually valid. I'm sort of
thinking it out and learning as I go. Are there rules and methods to
manipulating regular expressions in this manner, rex algebra? Or is
this a subset of set theory? I think my statistics instructor did
something similar to this. It was a few years ago.
>It looks to me, that if you need this feature not on a per-case base where
>you can think about your rexes more thoroughly, you have to sort of roll
>out your own rex-implmenetation. Which isn't too hard, but definitely
>something thats more than just a couple of lines.
>
>Regards,
>
>Diez
Looks like I only <obvious understatement> need to create a few
functions to manipulate rex expressions. Simpler than a full
rex-emplementations, but still definitely more than a few lines of
code.
_Ron Adam
More information about the Python-list
mailing list