[Python-Dev] one last SRE headache

Ka-Ping Yee ping@lfw.org
Thu, 31 Aug 2000 16:04:26 -0500 (CDT)


On Thu, 31 Aug 2000, Fredrik Lundh wrote:
> I had to add one rule:
> 
>     If it starts with a zero, it's always an octal number.
>     Up to two more octal digits are accepted after the
>     leading zero.

Fewer rules are better.  Let's not arbitrarily rule out
the possibility of more than 100 groups.

The octal escapes are a different kind of animal than the
backreferences: for a backreference, there is *actually*
a backslash followed by a number in the regular expression,
whereas an octal escape only stands in for a single
character, and we already have a reasonable way to put
funny characters into regular expressions.

That is, I propose *removing* the translation of octal
escapes from the regular expression engine.  That's the
job of the string literal:

    r'\011'    is a backreference to group 11

    '\\011'    is a backreference to group 11

    '\011'     is a tab character
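
The distinction between the three literals can be checked at the
string level alone. The following is a minimal sketch that inspects
only what each literal contains; it does not exercise any regular
expression engine, and the variable names are purely illustrative.

```python
# What each literal actually contains, before any regex engine sees it.
raw = r'\011'       # raw string: backslash, '0', '1', '1'
escaped = '\\011'   # escaped backslash: the same four characters
tab = '\011'        # octal escape resolved by the string literal itself

# The first two spellings produce an identical four-character string.
assert raw == escaped
assert len(raw) == 4
assert raw[0] == '\\'

# The third is a single character: octal 011 is 9, the tab character.
assert tab == '\t'
assert len(tab) == 1
assert ord(tab) == 9
```

Under the proposal, the engine would only ever see a backslash
followed by digits in the first two cases, so it would have nothing
octal left to interpret.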

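This makes automatic construction of regular expressions
a tractable problem.  If a backslash followed by digits
always means a backreference, a generator never has to
check whether the number it emits will be misread as an
octal escape.  We don't want to introduce so many
exceptional cases that an attempt to automatically build
regular expressions turns into a nightmare.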
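
As a rough illustration of what automatic construction looks like
when a backreference is nothing more than a backslash plus a decimal
number, here is a small sketch. The helper names backref() and
repeated_word_pattern() are hypothetical, made up for this example.
It sticks to group 1, where no octal ambiguity arises, so it does not
depend on how any particular engine treats longer digit runs.

```python
import re

def backref(group_number):
    """Return pattern text for a backreference: a backslash plus decimal digits."""
    return "\\" + str(group_number)

def repeated_word_pattern():
    """Build a pattern that matches a word followed by a space and the same word."""
    # Group 1 captures the word; the generated backreference repeats it.
    return r"(\w+) " + backref(1)

pattern = repeated_word_pattern()

# The generated text is exactly what one would write by hand.
assert pattern == r"(\w+) \1"

# A doubled word matches; a sentence without one does not.
assert re.search(pattern, "the the cat").group(1) == "the"
assert re.search(pattern, "the cat sat") is None
```

The generator needs no special case here: it formats the group number
and prepends a backslash, which is the property the proposal wants to
hold for every group number.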
    

-- ?!ng