Some more odd behaviour from the Regexp library

Mike Meyer mwm at mired.org
Wed Oct 19 23:48:25 EDT 2005


"David Veerasingam" <vdavidster at gmail.com> writes:

> Can anyone explain why it won't give me my captured group?
>
> In [1]: a = 'exit: gkdfjgfjdfsgdjglkghdfgkd'
> In [2]: import re
> In [3]: b = re.search(r'exit: (.*?)', a)
> In [4]: b.group(0)
> Out[4]: 'exit: '
>
> In [5]: b.group(1)
> Out[5]: ''
>
> In [6]: b.group(2)
> IndexError: no such group

It is giving you your captured group.  While the * operator matches as
long a string as possible, the *? operator matches as *short* a string
as possible. Nothing follows the group in your pattern, so the empty
string is the shortest thing .*? can match, and that's all it's ever
going to capture. So b.group(1) is '', which is what it's giving you.
Compare the greedy version:

>>> a = 'exit: gkdfjgfjdfsgdjglkghdfgkd'
>>> import re
>>> b = re.search(r'exit: (.*)', a)
>>> b.group(0)
'exit: gkdfjgfjdfsgdjglkghdfgkd'
>>> b.group(1)
'gkdfjgfjdfsgdjglkghdfgkd'
>>> 

which I suspect is what you actually want.
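
To see the two quantifiers side by side, here is a small sketch run
against the same string. The third pattern, with a trailing $, is an
illustrative addition and not part of the original question; it shows
that *? only grows when something later in the pattern forces it to.

```python
import re

a = 'exit: gkdfjgfjdfsgdjglkghdfgkd'

# Lazy group with nothing after it: the empty string is the shortest
# possible match, so that is what gets captured.
lazy = re.search(r'exit: (.*?)', a)

# Greedy group: consumes everything up to the end of the line.
greedy = re.search(r'exit: (.*)', a)

# Lazy group followed by an end anchor (illustrative addition): the
# group has to keep expanding until $ can match, so it ends up
# capturing the whole tail as well.
anchored = re.search(r'exit: (.*?)$', a)

print(repr(lazy.group(1)))
print(repr(greedy.group(1)))
print(repr(anchored.group(1)))
```

So if you do want the non-greedy form, give it something to stop at.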

Of course, being the founder of SPARE, I have to point out that
a.split(': ') will get you 'exit' plus the same captured string as
the re I used above, with no regular expression at all.
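
A sketch of the string-method route, for comparison. The maxsplit
argument and the str.partition variant are additions here, on the
assumption that the value after 'exit: ' might itself contain ': '.

```python
a = 'exit: gkdfjgfjdfsgdjglkghdfgkd'

# Plain split: a list holding the label and the value.
parts = a.split(': ')

# maxsplit=1 splits only at the first ': ', so a later ': ' inside the
# value is left alone. The unpacking raises ValueError if the separator
# is absent.
label, value = a.split(': ', 1)

# partition always returns a 3-tuple; if the separator is missing, sep
# and tail are empty strings instead of an exception being raised.
head, sep, tail = a.partition(': ')

print(parts)
print(label, value)
print(tail)
```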


      <mike
-- 
Mike Meyer <mwm at mired.org>			http://www.mired.org/home/mwm/
Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.


