Using re - side effects or misunderstanding

Andrew Henshaw andrew_dot_henshaw_at_earthling_dot_net
Sat Jan 13 15:34:50 EST 2001


Using re with the findall method, I find the handling of groups sometimes
inconsistent.
Suppose I have the following re:

    r = re.compile('abcxyz')

and I execute the following

    r.findall('..abcxyz..')

then I get the list ['abcxyz'], which is fine. Now I decide that the
pattern I'm looking for should have 1 or more repetitions of abc, so I change
my re to

    '(abc)+xyz'

so now I get

    ['abc']

Right, so I need to change my re to '(?:abc)+xyz' to get the desired
behavior:

    ['abcxyz']
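
To make the repetition case concrete, here is a minimal sketch. The input
with two repetitions of abc is chosen only for illustration and is not part
of the session above. It suggests that with one capturing group findall
returns that group's text (only the last repetition), while with the
non-capturing form it returns the whole match.

```python
import re

# Illustrative input with two repetitions of 'abc' (an assumption, not from the post).
text = '..abcabcxyz..'

# One capturing group: findall returns the group's text, and a repeated
# group keeps only its last repetition.
capturing = re.findall(r'(abc)+xyz', text)

# Non-capturing group: the pattern has no groups, so findall returns the
# whole match.
non_capturing = re.findall(r'(?:abc)+xyz', text)

print(capturing)      # ['abc']
print(non_capturing)  # ['abcabcxyz']
```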

But I thought that ?: was for matching but not returning.  And so it is,
under certain circumstances. E.g., if my pattern is

    '(ab)(c)xyz'

I get

    [('ab', 'c')]

(Yikes! A tuple. I'm going to have to change my code a bit to handle this.)

but

    '(ab)(?:c)xyz'

yields

    ['ab']

and

    '(?:ab)(?:c)xyz'

gives

    ['abcxyz']
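
For anyone who wants to reproduce these results, a small script along the
following lines should do it. It also prints each compiled pattern's groups
attribute (the number of capturing groups), since the shape of the findall
result appears to follow that count: whole matches for 0, plain strings for
1, tuples for 2 or more. The variable names are mine.

```python
import re

text = '..abcxyz..'
patterns = ['abcxyz', '(ab)(c)xyz', '(ab)(?:c)xyz', '(?:ab)(?:c)xyz']

results = {}
for p in patterns:
    compiled = re.compile(p)
    # compiled.groups counts only capturing groups; (?:...) is not counted.
    results[p] = (compiled.groups, compiled.findall(text))
    print(p, results[p])
```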

So how do I get the result

    ['abxyz']

??

In other words, adding groups for the purpose of adding repetitions seems to
have a greater side-effect than I would desire.  Is there something that I'm
missing in my use of re's?
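
One possible workaround, offered as a sketch rather than a definitive
answer: capture every piece that should appear in the result, leave the
rest non-capturing, and join the captured pieces for each match. The helper
name findall_joined is hypothetical, and the extra group around xyz is my
own change to the pattern.

```python
import re

def findall_joined(pattern, text):
    """Hypothetical helper: concatenate the capturing groups of each match."""
    joined = []
    for m in re.finditer(pattern, text):
        # m.groups() holds only capturing groups; a group that did not
        # participate in the match is None, so treat it as ''.
        joined.append(''.join(g or '' for g in m.groups()))
    return joined

# 'ab' and 'xyz' are captured; 'c' is matched but not captured.
result = findall_joined(r'(ab)(?:c)(xyz)', '..abcxyz..')
print(result)  # ['abxyz']
```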

AH