How do I get to *all* of the groups of an re search?

Kyler Laird Kyler at news.Lairds.org
Sat Jan 11 11:26:05 EST 2003


Tim Peters <tim.one at comcast.net> writes:

>[Kyler Laird, discovers that a regexp group "in a loop" captures only
> the last place it matched]
>> ...
>> Yes, and that surprises me.  It seems so obvious that it
>> should return all matched pieces and so arbitrary that it
>> only returns the last one.

>It would take potentially unbounded storage to remember all matches,

O.k.

>and
>would also (at least) complicate the meaning of backreferences (what is \1
>supposed to match then?  "a list" of all strings ever matched by group 1?

Yes, that is what I have suggested.

>the concatenation of them?

Lists seem a whole lot more appropriate.  Why would you
eliminate the possibility of accessing them individually?

>at least one of them?

That's the arbitrary choice made now.
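
To make the current behavior concrete, here is a minimal sketch. The
pattern and the sample string are made up for illustration. A group
inside a repetition keeps only its final match, so the usual
workaround is a second pass with re.findall:

```python
import re

text = "one two three"  # hypothetical sample input

# Group 1 sits inside a repeated non-capturing group, so it is
# overwritten on every repetition and ends up holding the last word.
m = re.match(r"(?:(\w+) ?)+", text)
last_only = m.group(1)

# Getting every piece today means applying the inner pattern separately.
all_words = re.findall(r"\w+", text)

print(last_only)
print(all_words)
```

Under the list behavior suggested above, m.group(1) would presumably
hold all three words. That is a proposal, not something the re module
does.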

>a contiguous slice of the
>string spanning the first and last places it matched?  etc).

That would be unexpected behavior.  No other group returns
text that it didn't match.

>> ...
>> Regardless, do you find it useful?  Can you think of any time
>> when you want to match a bunch of things and just end up with
>> the last one?

>Sure.  For example, finding the last component of a path expression, in
>order to isolate the file name, is a common application.

That's readily done using simple REs without depending on this
behavior.
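
For instance, a sketch along these lines (the path is a hypothetical
example) picks out the file name with an anchored character class and
no repeated group, next to the repeated-group version for comparison:

```python
import re

path = "/usr/local/lib/python"  # hypothetical example path

# Simple RE: everything after the final "/" up to the end of the string.
filename = re.search(r"[^/]*$", path).group(0)

# Repeated-group version, which relies on group 1 keeping only the
# last repetition.
via_last_group = re.match(r"(?:/([^/]*))+", path).group(1)

print(filename)
print(via_last_group)
```

Both give the same answer for this path; the first simply does not
depend on which repetition the group happens to keep.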

>> ...
>> I'm not at all interested in how popular the solution is.

>The Python developers were, though.

Yes, I can understand that developers are already programmers
and have been tainted by other languages.  I'm advocating the
use of Python for people who have not learned other languages.
They won't have the experience to look at a piece of Python
documentation and say "Oh, but surely they don't mean *that*."

--kyler