Match First Sequence in Regular Expression?

Roger L. Cauvin roger at deadspam.com
Thu Jan 26 11:38:05 EST 2006


"Tim Chase" <python.list at tim.thechases.com> wrote in message 
news:mailman.1085.1138293020.27775.python-list at python.org...
>>>r = re.compile("[^a]*a{3}b+(a+b*)*")
>>>matches = [s for s in listOfStringsToTest if r.match(s)]
>>
>> Wow, I like it, but it allows some strings it shouldn't.  For example:
>>
>> "xyz123aabbaaab"
>>
>> (It skips over the two-letter sequence of 'a' and matches 'bbaaab'.)
>
> Anchoring it to the beginning/end might solve that:
>
> r = re.compile("^[^a]*a{3}b+(a+b*)*$")
>
> this ensures that no "a"s come before the first 3x"a" and nothing but "b" 
> and "a" follows it.

Anchoring may be the key here, but this pattern rejects

"xayz123aaabab"

which it should accept, since the 'a' between the 'x' and the 'y' is not 
directly followed by the letter 'b'.
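
For reference, a minimal sketch that reproduces the behaviour discussed above, using only the two patterns and the two test strings quoted in this thread. The variable names are illustrative, and it is an assumption that the 'bbaaab' result came from search() rather than match(): match() always starts at the beginning of the string, so the unanchored pattern does not skip ahead there.

```python
import re

# The two patterns quoted in the thread.
unanchored = re.compile(r"[^a]*a{3}b+(a+b*)*")
anchored = re.compile(r"^[^a]*a{3}b+(a+b*)*$")

# The two test strings quoted in the thread (names are illustrative).
should_reject = "xyz123aabbaaab"
should_accept = "xayz123aaabab"

# Assumption: search() was used here. It may start anywhere in the string,
# so it skips past "aa" and matches the tail "bbaaab".
found = unanchored.search(should_reject)
print(found.group(0) if found else None)

# match() only tries position 0, so even the unanchored pattern fails here.
print(unanchored.match(should_reject))

# The anchored pattern rejects the first string, as intended ...
print(anchored.match(should_reject))

# ... but it also rejects the second one, because [^a]* cannot step over
# the lone 'a' that sits between 'x' and 'y'.
print(anchored.match(should_accept))
```

The last result is the problem described above: the anchored pattern forbids any 'a' before the run of three, not just an 'a' sequence that is directly followed by 'b'.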

-- 
Roger L. Cauvin
nospam_roger at cauvin.org (omit the "nospam_" part)
Cauvin, Inc.
Product Management / Market Research
http://www.cauvin-inc.com




