Regular expression confusion

John Machin sjmachin at lexicon.net
Sat Sep 23 21:53:30 EDT 2006


York wrote:
> I have two backslashes in a, and I want to replace them with one backslash,
> but I failed:
>
>  >>> import re
>  >>> a = '\\\\'
>  >>> re.sub(r'\\\\', '\\', '\\\\')
> Traceback (most recent call last):
>    File "<stdin>", line 1, in ?
>    File "/usr/lib/python2.3/sre.py", line 143, in sub
>      return _compile(pattern, 0).sub(repl, string, count)
>    File "/usr/lib/python2.3/sre.py", line 258, in _subx
>      template = _compile_repl(template, pattern)
>    File "/usr/lib/python2.3/sre.py", line 245, in _compile_repl
>      raise error, v # invalid expression
> sre_constants.error: bogus escape (end of line)
>  >>>
>
> Does anybody know why?

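Yep. There are *two* levels of escaping happening: (1) the Python compiler,
which processes every string literal, and (2) the re module, which then
processes the first two args (the pattern and the replacement template).
The 3rd arg only goes through Python.
Your replacement '\\' reaches re as a single backslash, which re reads as
the start of an unfinished escape, hence "bogus escape (end of line)".
To get your single backslash out, the replacement needs four backslashes
cooked or two raw: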

| >>> re.sub(r'\\\\', '\\\\', '\\\\')
| '\\'
| >>> re.sub(r'\\\\', r'\\', '\\\\')
| '\\'
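
If it helps to see the two layers separately, here is a small sketch that
counts characters at each stage. The variable names are made up for
illustration, and the last two variants (a callable replacement and plain
str.replace) are alternatives not mentioned above; both avoid the template
layer, assuming you only need a literal substitution.

```python
import re

# Layer 1: the Python compiler turns the literal '\\\\' into 2 characters.
text = '\\\\'
assert len(text) == 2

# The raw pattern keeps all 4 characters; layer 2 (re) reads them as
# "match two literal backslashes".
pattern = r'\\\\'
assert len(pattern) == 4

# The replacement is a template that re also un-escapes, so it must
# contain 2 backslash characters to produce 1.
cooked = re.sub(pattern, '\\\\', text)   # four cooked
raw = re.sub(pattern, r'\\', text)       # two raw

# Two ways to skip the template layer entirely (not from the original post):
via_func = re.sub(pattern, lambda m: '\\', text)  # callable result is used as-is
via_str = text.replace('\\\\', '\\')              # no regex involved at all

for result in (cooked, raw, via_func, via_str):
    print(len(result), repr(result))
```

All four results should be a single backslash, which repr() shows as '\\'.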

Cheers,
John



