Match First Sequence in Regular Expression?

Roger L. Cauvin roger at deadspam.com
Thu Jan 26 12:33:05 EST 2006


"Fredrik Lundh" <fredrik at pythonware.com> wrote in message 
news:mailman.1088.1138296051.27775.python-list at python.org...
> Roger L. Cauvin wrote:
>
>> > $ python test.py
>> > got    expected
>> > ---------------
>> > accept accept
>> > reject reject
>> > accept accept
>> > reject reject
>> > accept accept
>>
>> Thanks, but the second test case I listed contained a typo.  It should have
>> contained a sequence of three of the letter 'a'.  The test cases should be:
>>
>> "xyz123aaabbab" accept
>> "xyz123aabbaaab" reject
>> "xayz123aaabab" accept
>> "xaaayz123abab" reject
>> "xaaayz123aaabab" accept
>>
>> Your pattern fails the second test.
>
> $ more test.py
>
> import re
>
> print "got    expected"
> print "------ --------"
>
> testsuite = (
>    ("xyz123aaabbab", "accept"),
>    ("xyz123aabbaaab", "reject"),
>    ("xayz123aaabab", "accept"),
>    ("xaaayz123abab", "reject"),
>    ("xaaayz123aaabab", "accept"),
>    )
>
> for string, result in testsuite:
>    m = re.search("a+b", string)
>    if m and len(m.group()) == 4:
>        print "accept",
>    else:
>        print "reject",
>    print result
>
> $ python test.py
>
> got    expected
> ------ --------
> accept accept
> reject reject
> accept accept
> reject reject
> accept accept

Thanks, but I'm looking for a solution in terms of a regular expression 
only.  In other words, "accept" means the regular expression matched, and 
"reject" means the regular expression did not match.  I want to see if I can 
fulfill the requirements without additional code (such as checking 
"len(m.group())").
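
To make the requirement concrete, here is a rough sketch of what a 
regex-only check might look like.  The pattern and the helper name 
"classify" are assumptions made for illustration; nobody in this thread 
proposed them.  The idea is to skip forward only while no "ab" starts at 
the current position, then require exactly three 'a' characters followed 
by 'b', so that a match alone means "accept".

```python
import re

# Hypothetical pattern (an assumption, not taken from the thread):
#   (?:(?!ab).)*   consume characters only while no "ab" starts there,
#                  so nothing before the match can contain an a+b sequence
#   (?<!a)aaab     then require exactly three 'a's followed by 'b'
#                  (the lookbehind rules out a fourth 'a' in front)
PATTERN = re.compile(r"(?:(?!ab).)*(?<!a)aaab", re.DOTALL)

testsuite = (
    ("xyz123aaabbab", "accept"),
    ("xyz123aabbaaab", "reject"),
    ("xayz123aaabab", "accept"),
    ("xaaayz123abab", "reject"),
    ("xaaayz123aaabab", "accept"),
)

def classify(string):
    # "accept" means the regular expression matched, "reject" means it
    # did not; no extra checks such as len(m.group()) are involved.
    return "accept" if PATTERN.match(string) else "reject"

for string, expected in testsuite:
    print("%s %s" % (classify(string), expected))
```

PATTERN.match() anchors the attempt at the start of the string, which is 
what lets the skipping part guarantee that no earlier "ab" was passed over.  
I have only checked this against the five cases listed above.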

-- 
Roger L. Cauvin
nospam_roger at cauvin.org (omit the "nospam_" part)
Cauvin, Inc.
Product Management / Market Research
http://www.cauvin-inc.com




