[Python-Dev] interesting article on regex performance

skip at pobox.com skip at pobox.com
Sat Mar 13 00:22:13 CET 2010


    Collin> re2 is not a full replacement for Python's current regex
    Collin> semantics: it would only serve as an accelerator for a subset of
    Collin> the current regex language. Given that, it makes perfect sense
    Collin> that it would be optional on such minority platforms (much like
    Collin> the incoming JIT).

Sure, but over the years Python has supported at least four different
regular expression modules that I'm aware of (regex, regexp, and the current
re module with different extension modules underneath it; perhaps there were
others).  During some of that time more than one module was distributed with
Python proper.  I think the desire today would be that only one regular
expression module be distributed with Python (that would be my vote anyway).
Getting people to move off the older libraries was difficult.  If re2 can't
replace sre under the covers then I think it belongs on PyPI, not in the Python
distribution.  That said, this suggests to me that any replacement for sre
would be a different NFA or DFA implementation written in C, not C++.

Hopefully that provides some context for my earlier response.

Skip

