[issue7132] Regexp: capturing groups in repetitions

Philippe Verdy report at bugs.python.org
Wed Oct 14 22:49:19 CEST 2009


Philippe Verdy <verdy_p at wanadoo.fr> added the comment:

Rationale for the compilation flag:

You could think that the compilation flag should not be needed. However, 
without it, a LOT of existing regular expressions would suffer a negative 
performance impact and higher memory usage: those that already contain 
capturing groups in repetitions, where the capturing group is not 
actually used and would have been better written as a non-capturing 
group, (?:X) instead of (X).
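
To illustrate the kind of existing expression meant here, a minimal 
sketch with the current re module (the pattern and input string are made 
up for the example). Both patterns match the same text, but only the 
first one makes the engine record a group:

```python
import re

# A capturing group inside a repetition, although the group is never used.
capturing = re.compile(r"(\d,)+")
# The same pattern written with a non-capturing group.
non_capturing = re.compile(r"(?:\d,)+")

m1 = capturing.match("1,2,3,")
m2 = non_capturing.match("1,2,3,")

print(m1.group(0) == m2.group(0))  # True: the overall match is identical
print(m1.groups())                 # ('3,',): only the last occurrence
print(m2.groups())                 # (): nothing is stored per group
```

If occurrence lists were always stored, the first form would pay for 
them even though nobody reads them.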

The reason is that the MatchObject would have to store lists of 
(start,end) pairs instead of just a single pair. Using a list will not 
be the default, so MatchObject.group(groupIndex), 
MatchObject.start(groupIndex), MatchObject.end(groupIndex), and 
MatchObject.span(groupIndex) will continue to return a single value or 
single pair when the R compilation flag is not set (these methods will 
continue to return only the last occurrence, which is overwritten after 
each matched occurrence of the capturing group).
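
For reference, this is the behaviour that stays unchanged without the 
flag, shown with the current re module (pattern and input are only an 
example):

```python
import re

# Two capturing groups inside a repeated non-capturing group.
m = re.match(r"(?:(\w+)=(\d+);)+", "a=1;b=2;c=3;")

# Only the last occurrence of each group is kept.
print(m.group(1))  # c
print(m.span(1))   # (8, 9)
print(m.group(2))  # 3
print(m.span(2))   # (10, 11)
```

The earlier occurrences ("a", "b", "1", "2") are matched but cannot be 
retrieved from the match object afterwards.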

MatchObject.groups() will also continue to return a single string per 
capturing group without that flag set (i.e. the last occurrence of each 
capturing group). But when the R flag is specified, it will instead 
return a list of lists: each element corresponds to the group with the 
same index and is itself a list of strings, one for each occurrence of 
the capturing group.
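
The R flag does not exist in the re module, so the proposed result can 
only be emulated. A rough sketch, assuming a pattern of the form 
(?:unit)+ and a hypothetical helper named groups_per_occurrence (not 
part of any library), which rescans the matched text with the inner 
unit:

```python
import re

def groups_per_occurrence(unit, text):
    """Hypothetical helper: emulate the proposed list-of-lists result
    for the pattern (?:unit)+ matched at the start of text."""
    unit_re = re.compile(unit)
    whole = re.match(r"(?:%s)+" % unit, text)
    if whole is None:
        return None
    # One list per capturing group in the unit.
    result = [[] for _ in range(unit_re.groups)]
    # Rescan the matched text, one repetition at a time.
    for m in unit_re.finditer(whole.group(0)):
        for i in range(unit_re.groups):
            result[i].append(m.group(i + 1))
    return result

print(groups_per_occurrence(r"(\w+)=(\d+);", "a=1;b=2;c=3;"))
# [['a', 'b', 'c'], ['1', '2', '3']]
```

This is a simplification: rescanning with finditer is not guaranteed to 
split the text the same way the engine did for patterns that rely on 
backtracking, which is one reason to have the engine record the 
occurrences itself.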

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue7132>
_______________________________________

