Python regular expressions just ain't PCRE

sjdevnull at yahoo.com
Sat May 5 17:06:37 EDT 2007


Wiseman wrote:
> I'm kind of disappointed with the re regular expressions module. In
> particular, the lack of support for recursion ( (?R) or (?n) ) is a
> major drawback to me. There are so many great things that can be
> accomplished with regular expressions this way, such as validating a
> mathematical expression or parsing a language with nested parens,
> quoting or expressions.

-1 on this from me.  In the past 10 years as a professional
programmer, I've used the weird extended "regex" features maybe 5
times total, whether in Perl or Python.  In contrast, I've had
to work around the slowness of PCRE-style engines by forking off a
grep() or something similar practically every other month.  I think
it'd be far more valuable for most programmers if Python moved toward
dropping the extended semantics, so that one of the efficient
regex libraries (linked in a recent thread here on comp.lang.python)
could be used, and then added a parsing library to the standard
library for more complex jobs.  Alternatively, if the additional
memory used isn't huge, we could consider putting more intelligence in
the re compiler and having it choose between a smarter PCRE engine and
a faster regex engine based on the input.  I'm playing with a patch
for the latter that I hope to get into a useful state for discussion
soon.
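
To make "choose based on the input" concrete, here is a rough sketch of
the kind of check I have in mind.  It is not the patch itself; the
names needs_backtracking and choose_engine are made up for
illustration, and the check is a plain text scan of the pattern that
ignores things like comments in verbose mode.  It only flags
backreferences, lookaround and conditional groups, which are the
features a pure automaton-based matcher can't express directly.

```python
# Group openers that need a backtracking engine (lookaround, named
# backreference, conditional group).  Illustrative list, not exhaustive.
_EXTENDED_GROUP_PREFIXES = ("(?=", "(?!", "(?<=", "(?<!", "(?P=", "(?(")


def needs_backtracking(pattern):
    """Heuristic: True if the pattern text uses backreferences,
    lookaround, or conditional groups."""
    i = 0
    in_class = False
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\":
            # Outside a character class, a backslash followed by 1-9
            # is a numbered backreference.
            nxt = pattern[i + 1:i + 2]
            if not in_class and nxt and nxt in "123456789":
                return True
            i += 2  # skip the escaped character
            continue
        if in_class:
            if ch == "]":
                in_class = False
        elif ch == "[":
            in_class = True
            # A "]" directly after "[" or "[^" is a literal, so skip it.
            j = i + 1
            if pattern[j:j + 1] == "^":
                j += 1
            if pattern[j:j + 1] == "]":
                j += 1
            i = j
            continue
        elif pattern.startswith(_EXTENDED_GROUP_PREFIXES, i):
            return True
        i += 1
    return False


def choose_engine(pattern):
    """Return the (hypothetical) engine name to use for this pattern."""
    return "backtracking" if needs_backtracking(pattern) else "automaton"
```

With something like this in front, a pattern such as (?:ab|cd)+\d would
go to the fast engine, and only patterns like (a)\1 or foo(?=bar) would
pay for the backtracking one.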

But regexes are one area where speed very often decides whether
they're usable or not, and that has far more often been a limitation
for me (and, I'd think, for most programmers) than any lack in their
current Python semantics.  So I'd rather see that attacked first.
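
For anyone who hasn't been bitten by this, a small sketch of the kind
of slowdown I mean.  The pattern and the helper name time_failed_match
are just an illustration I picked, not something from a real workload:
nested quantifiers followed by a subject that can't match force a
backtracking engine to try a very large number of ways to split the
run of "a"s before giving up.  The timings will vary by machine and
Python version, so none are shown here.

```python
import re
import time

# Nested quantifiers: many ways to divide a run of "a"s between the
# inner and outer "+" when the overall match is going to fail.
PATHOLOGICAL = re.compile(r"(a+)+$")


def time_failed_match(n):
    """Return (match result, seconds elapsed) for n 'a's plus a 'b'."""
    subject = "a" * n + "b"
    start = time.perf_counter()
    result = PATHOLOGICAL.match(subject)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    # Watch how the elapsed time grows as the subject gets longer.
    for n in (12, 14, 16, 18):
        _, seconds = time_failed_match(n)
        print(n, round(seconds, 4))
```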



