+ in regular expression

Duncan Booth duncan.booth at invalid.invalid
Tue Oct 9 07:29:16 EDT 2012


Cameron Simpson <cs at zip.com.au> wrote:

>| Because "\s{6}+" 
>| has other meanings in different regex syntaxes and the designers didn't 
>| want confusion?
> 
> I think Python REs are supposed to be Perl compatible; ISTR an opening
> sentence to that effect...
> 
I don't know the full history of how regex engines evolved, but I suspect 
at least part of the answer is that the decisions the Perl developers made 
influenced the other implementations.

Perl allows both '?' and '+' as modifiers on the standard quantifiers, so 
clearly you cannot stack those particular quantifiers in Perl; from there 
it is a short step to treating quantifiers in general as unstackable.
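
As a rough illustration of the point above, here is a small sketch using 
Python's re module. The helper name compiles() is my own invention for the 
example. I have deliberately left "\s{6}+" itself out of the checks, since 
whether that exact pattern is accepted depends on the Python version.

```python
import re

def compiles(pattern):
    """Hypothetical helper: True if re accepts the pattern, else False."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# A '?' after a quantifier is a modifier (non-greedy), not a second quantifier.
lazy = re.match(r"a+?", "aaa").group()
greedy = re.match(r"a+", "aaa").group()

# Two plain quantifiers in a row are rejected by the parser.
stacked_ok = compiles(r"a**")

# Wrapping the quantified element in a group makes it a new element,
# which can then take its own quantifier.
grouped = re.fullmatch(r"(?:\s{6})+", " " * 12)

print(lazy, greedy, stacked_ok, grouped is not None)
```

The non-capturing group is the usual way to say "one or more runs of six 
whitespace characters" without relying on stacked quantifiers.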

The only grammars I can find online for regular expressions split out the 
elements and quantifiers the way I did in my previous post. Python's regex 
parser (and, I would guess, most of the others in existence) tends more 
towards spaghetti code than towards following a grammar (_parse is a 
238-line function). So I think it really is just trying to match existing 
regular expression parsers, and any possible grammar is an excuse for why 
it should be the way it is rather than an explanation.

-- 
Duncan Booth http://kupuguy.blogspot.com


