How to write this repeat matching?

Ian Kelly ian.g.kelly at gmail.com
Sun Jul 6 15:26:44 EDT 2014


On Sun, Jul 6, 2014 at 12:57 PM,  <rxjwg98 at gmail.com> wrote:
> I write the following code:
>
> .......
> import re
>
> line = "abcdb"
>
> matchObj = re.match( 'a[bcd]*b', line)
>
> if matchObj:
>    print "matchObj.group() : ", matchObj.group()
>    print "matchObj.group(0) : ", matchObj.group()
>    print "matchObj.group(1) : ", matchObj.group(1)
>    print "matchObj.group(2) : ", matchObj.group(2)
> else:
>    print "No match!!"
> .........
>
> In which I have used its match pattern, but the result is not 'abcb'

You're never going to get a match of 'abcb' on that string, because
'abcb' is not found anywhere in that string.

There are two possible matches for the given pattern over that string:
'abcdb' and 'ab'.  In the first, the character class [bcd] is repeated
three times (consuming 'bcd'); in the second, it is repeated zero
times.  Because the * quantifier is greedy, it tries the longest
repetition first, so you get the result that repeats three times.  It
cannot repeat one, two or four times because then there would be no
'b' following the [bcd]* portion as required by the pattern.
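
As a small sketch of the greedy behaviour described above, the snippet
below runs the original pattern next to a non-greedy variant.  The
'*?' quantifier and the variable names are my additions for
illustration, and it is written with the print() function rather than
the Python 2 print statement used in the quoted code:

```python
import re

line = "abcdb"

# Greedy: [bcd]* takes as many characters as it can while still
# leaving a final 'b' for the pattern to match.
greedy = re.match(r'a[bcd]*b', line)

# Non-greedy: [bcd]*? takes as few characters as it can, so the
# zero-repetition match wins.
lazy = re.match(r'a[bcd]*?b', line)

print(greedy.group())  # abcdb
print(lazy.group())    # ab
```

Neither form can produce 'abcb', since a match is always a contiguous
piece of the input string.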

>
> Only matchObj.group(0): abcdb
>
> displays. All other group(s) have no content.

Calling match.group(0) is equivalent to calling match.group() without
arguments: both return the string matched by the entire regular
expression.  match.group(1) and match.group(2) would return the text
captured by the first and second capturing groups respectively, but
the pattern does not have any capturing groups, so those calls raise
IndexError rather than returning empty content.  If you want a
capturing group, enclose the part of the pattern that you want it to
capture in parentheses.
For example, if you change the pattern to:

    matchObj = re.match('a([bcd]*)b', line)

then the value of matchObj.group(1) will be 'bcd'.
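
To see the difference side by side, here is a minimal sketch that
compares the pattern with and without parentheses.  The names 'plain'
and 'grouped' are only illustrative, and again it uses print() as a
function rather than the Python 2 statement:

```python
import re

line = "abcdb"

# Without parentheses there are no capturing groups at all.
plain = re.match(r'a[bcd]*b', line)
print(plain.groups())  # ()
try:
    plain.group(1)
except IndexError as exc:
    print("no group 1:", exc)

# With parentheses, group(1) holds what [bcd]* consumed.
grouped = re.match(r'a([bcd]*)b', line)
print(grouped.group(0))  # abcdb
print(grouped.group(1))  # bcd
print(grouped.groups())  # ('bcd',)
```

Checking match.groups() first is a cheap way to find out how many
groups a pattern actually defines before indexing into them.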


