Match First Sequence in Regular Expression?
Roger L. Cauvin
roger at deadspam.com
Thu Jan 26 12:33:05 EST 2006
"Fredrik Lundh" <fredrik at pythonware.com> wrote in message
news:mailman.1088.1138296051.27775.python-list at python.org...
> Roger L. Cauvin wrote:
>
>> > $ python test.py
>> > got expected
>> > ---------------
>> > accept accept
>> > reject reject
>> > accept accept
>> > reject reject
>> > accept accept
>>
>> Thanks, but the second test case I listed contained a typo. It should
>> have contained a sequence of three of the letter 'a'. The test cases
>> should be:
>>
>> "xyz123aaabbab" accept
>> "xyz123aabbaaab" reject
>> "xayz123aaabab" accept
>> "xaaayz123abab" reject
>> "xaaayz123aaabab" accept
>>
>> Your pattern fails the second test.
>
> $ more test.py
>
> import re
>
> print "got expected"
> print "------ --------"
>
> testsuite = (
>     ("xyz123aaabbab", "accept"),
>     ("xyz123aabbaaab", "reject"),
>     ("xayz123aaabab", "accept"),
>     ("xaaayz123abab", "reject"),
>     ("xaaayz123aaabab", "accept"),
> )
>
> for string, result in testsuite:
>     m = re.search("a+b", string)
>     if m and len(m.group()) == 4:
>         print "accept",
>     else:
>         print "reject",
>     print result
>
> $ python test.py
>
> got expected
> ------ --------
> accept accept
> reject reject
> accept accept
> reject reject
> accept accept
Thanks, but I'm looking for a solution in terms of a regular expression
only. In other words, "accept" means the regular expression matched, and
"reject" means the regular expression did not match. I want to see if I can
fulfill the requirements without additional code (such as checking
"len(m.group())").
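
To make the goal concrete, here is a minimal sketch of what a regex-only answer might look like. It assumes the requirement is that the first run of 'a' characters directly followed by a 'b' contains exactly three 'a's. The pattern, the name PATTERN, and the helper verdict() are assumptions of this sketch rather than anything proposed earlier in the thread, and the code uses Python 3 print syntax.

```python
import re

# Assumed requirement: the first run of 'a's that is directly followed by
# a 'b' must contain exactly three 'a's.
PATTERN = re.compile(
    r"^(?:[^b]|(?<!a)b)*"  # prefix in which no 'b' directly follows an 'a'
    r"(?<!a)aaab"          # three 'a's with no 'a' before them, then a 'b'
)

testsuite = (
    ("xyz123aaabbab", "accept"),
    ("xyz123aabbaaab", "reject"),
    ("xayz123aaabab", "accept"),
    ("xaaayz123abab", "reject"),
    ("xaaayz123aaabab", "accept"),
)


def verdict(string):
    # "accept" means the pattern matched; no extra checks on the match.
    return "accept" if PATTERN.search(string) else "reject"


for string, expected in testsuite:
    print(verdict(string), expected)
```

The prefix group only lets a 'b' through when it is not preceded by an 'a', so an earlier shorter or longer run such as "aab" or "aaaab" blocks the match instead of being skipped over.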
--
Roger L. Cauvin
nospam_roger at cauvin.org (omit the "nospam_" part)
Cauvin, Inc.
Product Management / Market Research
http://www.cauvin-inc.com