regex negative lookbehind assertion not working correctly?

Gabriel Rossetti gabriel.rossetti at arimaz.com
Tue Mar 31 15:13:25 EDT 2009


MRAB wrote:
> Gabriel Rossetti wrote:
>> Hello everyone,
>>
>> I am trying to write a regex pattern to match an ID in a URL only if 
>> it is not a given ID. Here's an example, the ID not to match is 
>> "14522XXX98", if my URL is "/profile.php?id=14522XXX99" I want it to 
>> match and if it's "/profile.php?id=14522XXX98" I want it not to. I 
>> tried this:
>>
>>  >>> re.search(r"/profile.php\?id=(\d+)(?<!14522XXX98)", 
>> "/profile.php?id=14522XXX98").groups()
>> ('14522XXX9',)
>>
>> which should not match, but it does. Then I tried this:
>>
> [snip]
> How can '(\d+)' be capturing '14522XXX9'? '\d' matches only digits!
:-) Yes, I had replaced some of the digits with X's for the example (the original ID is longer, etc.).
>
> Anyway, your basic problem is that it initially matches '14522XXX98',
> but then the lookbehind rejects that, so it backtracks and releases the
> last character, giving '14522XXX9', which is not rejected because
> '14522XXX9' isn't '14522XXX98'.
>
> Try putting a '\b' after the '\d+' to reject partial IDs.
>
That did it, thanks a lot, I would never have found that.
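
For the archive, here is a minimal sketch of the behaviour described above. It assumes an all-digit stand-in ID ("1452200098" is made up, since the IDs in my example had X's in them), and the names EXCLUDED, leaky and fixed are only for illustration. I also escaped the dot in "profile.php", which my original pattern did not do.

```python
import re

# Hypothetical all-digit stand-in for the ID that must not match.
EXCLUDED = "1452200098"

# Without \b: when the lookbehind rejects the full ID, the engine backtracks,
# \d+ gives up its last digit, and the lookbehind then passes.
leaky = re.compile(r"/profile\.php\?id=(\d+)(?<!" + EXCLUDED + ")")

# With \b: the digits must end at a word boundary, so \d+ cannot stop
# in the middle of the ID and the partial match is no longer possible.
fixed = re.compile(r"/profile\.php\?id=(\d+)\b(?<!" + EXCLUDED + ")")

bad_url = "/profile.php?id=" + EXCLUDED
good_url = "/profile.php?id=1452200099"

print(leaky.search(bad_url).groups())   # partial ID: ('145220009',)
print(fixed.search(bad_url))            # None
print(fixed.search(good_url).groups())  # ('1452200099',)
```

Note that the lookbehind only looks at the last ten characters, so as far as I can tell a longer ID that merely ends in the excluded digits would be rejected as well.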

Gabriel


