Regular expression negative look-ahead

Ian Kelly ian.g.kelly at gmail.com
Tue Jul 2 01:44:31 EDT 2013


On Mon, Jul 1, 2013 at 8:27 PM, Jason Friedman <jsf80238 at gmail.com> wrote:
> Found this:
> http://stackoverflow.com/questions/13871833/negative-lookahead-assertion-not-working-in-python.
>
> This pattern seems to work:
> pattern = re.compile(r"^(?!.*(CTL|DEL|RUN))")
>
> But I am not sure why.
>
>
> On Mon, Jul 1, 2013 at 5:07 PM, Jason Friedman <jsf80238 at gmail.com> wrote:
>>
>> I have table names in this form:
>> MY_TABLE
>> MY_TABLE_CTL
>> MY_TABLE_DEL
>> MY_TABLE_RUN
>> YOUR_TABLE
>> YOUR_TABLE_CTL
>> YOUR_TABLE_DEL
>> YOUR_TABLE_RUN
>>
>> I am trying to create a regular expression that will return true for only
>> these tables:
>> MY_TABLE
>> YOUR_TABLE
>>
>> I tried these:
>> pattern = re.compile(r"_(?!(CTL|DEL|RUN))")
>> pattern = re.compile(r"\w+(?!(CTL|DEL|RUN))")
>> pattern = re.compile(r"(?!(CTL|DEL|RUN)$)")
>>
>> But all three match.
>> I do not need to capture anything.


For some reason I don't seem to have a copy of your initial post.

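That regex works because it is anchored at the start of the string
and matches only if ".*(CTL|DEL|RUN)" /doesn't/ match from there.  For
a name such as "MY_TABLE_CTL", that inner pattern does match starting
from the beginning of the string, so the negative lookahead fails and
the pattern as a whole does not match.  For "MY_TABLE", the inner
pattern finds none of the three strings, so the lookahead succeeds.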
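To see this concretely, here is a small sketch that runs the anchored
pattern over the table names from the original question.  The variable
names (names, accepted) are mine, and I am assuming re.search; because
of the leading ^, re.match would behave the same way here.

```python
import re

# Pattern from the quoted message: succeed at the start of the string
# only if CTL, DEL, or RUN does not appear anywhere after it.
pattern = re.compile(r"^(?!.*(CTL|DEL|RUN))")

names = [
    "MY_TABLE", "MY_TABLE_CTL", "MY_TABLE_DEL", "MY_TABLE_RUN",
    "YOUR_TABLE", "YOUR_TABLE_CTL", "YOUR_TABLE_DEL", "YOUR_TABLE_RUN",
]

# Keep only the names for which the negative lookahead succeeds.
accepted = [name for name in names if pattern.search(name)]
print(accepted)  # ['MY_TABLE', 'YOUR_TABLE']
```

One caveat: as written, the pattern rejects a name that contains CTL,
DEL, or RUN anywhere, not only as a suffix.  A hypothetical table
called "RUN_LOG" would be rejected too.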

The other three do not work because their lookahead assertions are
not properly anchored.  The first one can match the first underscore
in "MY_TABLE_CTL" instead of the second; the next three characters are
then "TAB", not any of the verboten strings, so it matches.  The
second one matches any run of word characters in "MY_TABLE_CTL" that
isn't followed by "CTL".  Since \w+ is greedy and the underscore counts
as a word character, it just matches the entire string "MY_TABLE_CTL";
the rest of the string is then empty, so it cannot match any of those
three strings, and it too gets accepted.  The third one simply matches
an empty string that isn't followed by one of those three strings at
the end of the input, so it matches at the very start of the string,
where the next three characters are "MY_" and the negative lookahead
is satisfied.
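The following sketch shows where each of the three attempts actually
lands on "MY_TABLE_CTL".  I am assuming the patterns were applied with
re.search, since the original post does not say; the names attempts
and results are only for illustration.

```python
import re

name = "MY_TABLE_CTL"

# The three unanchored attempts from the original question.
attempts = [
    r"_(?!(CTL|DEL|RUN))",
    r"\w+(?!(CTL|DEL|RUN))",
    r"(?!(CTL|DEL|RUN)$)",
]

results = []
for source in attempts:
    m = re.search(source, name)
    # Record where the match starts and which text it consumed.
    results.append((m.start(), m.group(0)))

print(results)
# [(2, '_'), (0, 'MY_TABLE_CTL'), (0, '')]
```

The first attempt matches the underscore at index 2, the second
swallows the whole name, and the third matches the empty string at
index 0, so all three report a match for a name that should have been
rejected.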

Now, all that said, are you sure you actually need a regular
expression for this?  It seems to me that you're overcomplicating
things.  Since you don't need to capture anything, your need can be
met more simply with:

if not table_name.endswith(('_CTL', '_DEL', '_RUN')):
    pass  # Do whatever
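
As a rough sketch of how that check could be applied to the whole list
of table names from the question (SUFFIXES and is_base_table are
hypothetical names chosen for this example):

```python
# Suffixes that mark the auxiliary tables (hypothetical constant name).
SUFFIXES = ("_CTL", "_DEL", "_RUN")


def is_base_table(table_name):
    """Return True if the name does not end with one of SUFFIXES."""
    # str.endswith accepts a tuple and checks each suffix in turn.
    return not table_name.endswith(SUFFIXES)


names = [
    "MY_TABLE", "MY_TABLE_CTL", "MY_TABLE_DEL", "MY_TABLE_RUN",
    "YOUR_TABLE", "YOUR_TABLE_CTL", "YOUR_TABLE_DEL", "YOUR_TABLE_RUN",
]

base_tables = [name for name in names if is_base_table(name)]
print(base_tables)  # ['MY_TABLE', 'YOUR_TABLE']
```

Because the underscore is part of each suffix, a name that merely
contains "RUN" or "CTL" somewhere else is still accepted.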


