[Python-Dev] one last SRE headache
Ka-Ping Yee
ping@lfw.org
Thu, 31 Aug 2000 16:04:26 -0500 (CDT)
On Thu, 31 Aug 2000, Fredrik Lundh wrote:
> I had to add one rule:
>
> If it starts with a zero, it's always an octal number.
> Up to two more octal digits are accepted after the
> leading zero.
Fewer rules are better. Let's not arbitrarily rule out
the possibility of more than 100 groups.
The octal escapes are a different kind of animal from the
backreferences: for a backreference, there is *actually*
a backslash followed by a number in the regular expression,
whereas an octal escape is only a way of spelling a character,
and we already have a reasonable way to put funny characters
into regular expressions.
That is, I propose *removing* the translation of octal
escapes from the regular expression engine. That's the
job of the string literal:
    r'\011'   is a backreference to group 11
    '\\011'   is a backreference to group 11
    '\011'    is a tab character
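
The string-literal half of this can be checked directly. A minimal sketch follows; it only shows which characters each literal hands to the regular expression engine, not how any particular engine then interprets them (the variable names are just for illustration):

```python
# What each literal actually contains before the regex engine sees it.
raw = r'\011'      # backslash, '0', '1', '1'
escaped = '\\011'  # the same four characters, spelled with an escaped backslash
literal = '\011'   # a single character: octal 011 == decimal 9 == tab

print(raw == escaped)            # the two spellings are the same string
print(len(raw), len(literal))    # four characters versus one
print(literal == '\t')           # the octal escape was resolved by the literal
```

In the third case the engine never sees a backslash at all, so it has nothing to translate.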
This makes automatic construction of regular expressions
a tractable problem. We don't want to introduce so many
exceptional cases that an attempt to automatically build
regular expressions will turn into a nightmare of special
cases.
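
As a rough sketch of what "automatic construction" looks like when a backreference is simply a backslash followed by the group number: the helper `backref` below is hypothetical, and the check uses the `re` module as it behaves today, which is an assumption and not the engine under discussion here.

```python
import re

def backref(n):
    """Hypothetical helper: pattern text for a backreference to group n."""
    return "\\" + str(n)

# Build a pattern with 11 single-letter groups, then refer back to the last.
groups = "".join("(%s)" % c for c in "abcdefghijk")
pattern = groups + backref(11)

# Group 11 captured "k", so the subject must end with a second "k".
matched = re.fullmatch(pattern, "abcdefghijk" + "k") is not None
print(pattern)
print(matched)
```

The generator needs no knowledge of octal rules: it writes the number and is done.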
-- ?!ng