Python regular expressions just ain't PCRE

Wiseman Wiseman1024 at gmail.com
Fri May 4 20:11:41 EDT 2007


I'm kind of disappointed with the re regular expression module. In
particular, the lack of support for recursion ( (?R) or (?n) ) is a
major drawback for me. So many great things can be accomplished with
recursive regular expressions, such as validating a mathematical
expression or parsing a language with nested parens, quoting, or
expressions.
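
As a rough illustration of the recursion gap, the sketch below tries to compile the usual PCRE-style pattern for balanced parentheses and then falls back to a hand-written depth counter. The names RECURSIVE, re_accepts and balanced are made up for this example; they are not part of re or of any library.

```python
import re

# PCRE-style recursive pattern for one balanced parenthesised group.
# (?R) means "recurse into the whole pattern" in PCRE.
RECURSIVE = r"\((?:[^()]|(?R))*\)"


def re_accepts(pattern):
    """Return True if the re module can compile the pattern (hypothetical helper)."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


def balanced(text):
    """Hand-written stand-in for the recursive pattern (hypothetical helper).

    Tracks nesting depth and rejects a ")" that has no matching "(".
    """
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


print(re_accepts(RECURSIVE))          # re rejects the (?R) extension
print(balanced("(a (b c) (d (e)))"))
print(balanced("(a (b c)"))
```

The counter only handles one kind of bracket and ignores quoting, which is exactly the sort of thing a recursive pattern would express in a single line.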

I also miss once-only subpatterns and possessive quantifiers
( (?>...) and ?+ *+ ++ {...}+ ), which are great for avoiding deep
recursion and inefficiency in some complex patterns with nested
quantifiers. Even java.util.regex supports them.
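
To show what "once-only" means in practice, here is a minimal sketch that emulates (?>a+) with a capturing lookahead followed by a backreference. This is a well-known workaround rather than anything re provides for the purpose, and it relies on the assumption that the engine never backtracks into a lookahead once it has matched. The sketch sticks to syntax that re has long accepted; whether (?>...) itself compiles depends on the Python version in use.

```python
import re

subject = "aaa"

# Ordinary greedy quantifier: a+ first takes all three "a"s, then gives one
# back so that the trailing "a" in the pattern can match.
greedy = re.search(r"a+a", subject)

# Emulated once-only group: the lookahead captures a+ into group 1 without
# consuming anything, then \1 consumes exactly what was captured. Since the
# lookahead is not re-entered after it succeeds, nothing is given back and
# the trailing "a" can never match.
once_only = re.search(r"(?=(a+))\1a", subject)

print(greedy.group(0))
print(once_only)
```

The second search fails at every starting position, which is the behaviour a++a or (?>a+)a would give in an engine that supports them; on a long subject that early failure is what saves the wasted backtracking.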

Are there any plans to support these features in re? They would be
great additions to Python 2.6: they wouldn't clutter anything, and
they'd mean one less reason to use Perl instead of Python.

Note: I know there are LALR parser generators/parsers for Python, but
the very reason why re exists is to provide a much simpler, more
productive way to parse or validate simple languages and process text.
(The pyparse/yappy/yapps/<insert your favourite Python parser
generator here> argument could have been used to skip regular
expression support in the language, or to deprecate re. Would you want
that? And following the same rule, why would we have Python when
there's C?)




More information about the Python-list mailing list