python regex "negative lookahead assertions" problems

Helmut Jarausch jarausch at igpm.rwth-aachen.de
Mon Nov 23 04:20:30 EST 2009


On 11/22/09 16:05, Helmut Jarausch wrote:
> On 11/22/09 14:58, Jelle Smet wrote:
>> Hi List,
>>
>> I'm trying to match lines in python using the re module.
>> The end goal is to have a regex which enables me to skip lines which
>> have ok and warning in it.
>> But for some reason I can't get negative lookaheads working, the way
>> it's explained in "http://docs.python.org/library/re.html".
>>
>> Consider this example:
>>
>> Python 2.6.4 (r264:75706, Nov 2 2009, 14:38:03)
>> [GCC 4.4.1] on linux2
>> Type "help", "copyright", "credits" or "license" for more information.
>> >>> import re
>> >>> line='2009-11-22 12:15:441 lmqkjsfmlqshvquhsudfhqf qlsfh qsduidfhqlsiufh qlsiuf qldsfhqlsifhqlius dfh warning qlsfj lqshf lqsuhf lqksjfhqisudfh qiusdfhq iusfh'
>> >>> re.match('.*(?!warning)',line)
>> <_sre.SRE_Match object at 0xb75b1598>
>>
>> I would expect that this would NOT match as it's a negative lookahead
>> and warning is in the string.
>>
>
> '.*' eats all of line. Now, when at end of line, there is no 'warning'
> anymore, so it matches.
> What are you trying to achieve?
>
> If you just want to single out lines with 'ok' or warning in it, why not
> just
> if re.search('(ok|warning)', line): call_skip
>
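
To make the point about '.*' concrete, here is a small sketch. The sample
line is a shortened, made-up stand-in for the one in the original post, and
the variable names are only for illustration:

```python
import re

# Shortened stand-in for the log line in the original post (illustrative only).
line = '2009-11-22 12:15:44 some text warning more text'

# Greedy '.*' consumes the whole line. At the end of the line nothing
# follows, so the negative lookahead (?!warning) succeeds trivially.
m = re.match('.*(?!warning)', line)
matched_whole_line = m is not None and m.end() == len(line)

# Putting '.*' inside the lookahead and testing at the start of the line
# makes the assertion cover the whole line, so this match fails.
rejected = re.match(r'(?!.*warning)', line) is None
```

Both matched_whole_line and rejected come out true here: the first pattern
matches even though 'warning' is present, the second one does not.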

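You probably don't want 'ok' to match inside words like 'joke'.
So a better regex uses word boundaries. Note the raw string: in a plain
string literal, '\b' is a backspace character, not a word boundary.

if re.search(r'\b(ok|warning)\b', line): SKIP_ME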
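
As a rough sketch of how this could be used to filter lines (the sample
lines and the names SKIP, KEEP and kept are made up for illustration), note
that the pattern needs to be a raw string for \b to mean a word boundary.
The second pattern shows one way to express the same filter with a negative
lookahead, which is what the original question was after:

```python
import re

# Raw string, so that \b is a word boundary and not a backspace character.
SKIP = re.compile(r'\b(ok|warning)\b')

# Made-up sample lines, for illustration only.
lines = ['status ok', 'disk warning issued', 'this is a joke', 'all fine']

# Keep only the lines that contain neither 'ok' nor 'warning' as a word.
kept = [l for l in lines if not SKIP.search(l)]

# The same filter as a negative lookahead checked at the start of the line.
KEEP = re.compile(r'(?!.*\b(?:ok|warning)\b)')
kept_lookahead = [l for l in lines if KEEP.match(l)]
```

With these sample lines, both lists contain 'this is a joke' and 'all fine'.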

Helmut.



-- 
Helmut Jarausch

Lehrstuhl fuer Numerische Mathematik
RWTH - Aachen University
D 52056 Aachen, Germany


