regular expression reverse match?

Ron Adam radam2 at tampabay.rr.com
Wed Oct 29 18:30:36 EST 2003


On Wed, 29 Oct 2003 11:32:45 +0100, "Diez B. Roggisch"
<nospam-deets at web.de> wrote:

>Hi,
>
>> As it is, the program checks the input buffer after each keystroke and
>> determines if the buffer is acceptable against a pattern as the string
>> is being built.   A reverse match.  It allows new keystrokes to be
>> added to the buffer as long as it's within the pattern.
>
>I don't think its possible with regular expressions, or at least the way you
>are using them right now, that is to come from a "in the end, I want to
>look it like this, but it shall accept everything that prefixes a valid
>result".
>
>That won't work - the reason is simply that regular expressions are
>equivalent to finite state automata. I don't know if you are familiar with
>these, but the consist of states, which are nodes, and transitions between
>these, which are edges/arrows between the states.

I'm not familiar with the terms, but I am familiar with tree data
structures.   


[clipped good explanation]


>However, its way more complicated to create this as a rex:
>
>"a(b(c)?)?"
>
>This looks straightforward, so you might be successful to create rexes like
>this using a preprocessing step - but the more complicated your rex gets,
>this approach will be hard to follow. Actually, I have currently no idea.
>Which doesn't mean its not doable, I just don't have enough time to think
>about it right now :)

I'm still definitely learning this,  and appreciate the help.

In reply to Emily,  I compared rex to simplifying a math problem like
this.  I'm hoping this will lead to a method that will work.

>	if this were a math problem it might look something like this.
>
>	string  <=  yes$ | no$
>	string   =   (y | ye | yes)$ | (n | no)$
>	string   =   (y$ | ye$ | yes$) | (n$ | no$)
>	string   =   y$ | ye$ | yes$ | n$ | no$

Using the same approach for the example above might be something like:

s   <=  a(  b (c)? )?
s   <=  a( b | bc )?
s   <=  a | ab | abc
s    =   a | (a | ab) | ( a | ab | abc)
s    =   a | a | ab | a | ab | abc
s    =   a | ab | abc

I have no idea if what I'm doing here is actually valid.   I'm sort of
thinking it out and learning as I go.   Are there rules and methods to
manipulating regular expressions in this manner,   rex algebra?  Or is
this a subset of set theory?  I think my statistics instructor did
something similar to this.  It was a few years ago. 


>It looks to me, that if you need this feature not on a per-case base where
>you can think about your rexes more thoroughly, you have to sort of roll
>out your own rex-implmenetation. Which isn't too hard, but definitely
>something thats more than just a couple of lines.
>
>Regards,
>
>Diez

Looks like I only <obvious understatement> need to create a few
functions to manipulate rex expressions.   Simpler than a full
rex-emplementations,  but still definitely more than a few lines of
code.  

_Ron Adam







More information about the Python-list mailing list