Raw string substitution problem

D'Arcy J.M. Cain darcy at druid.net
Thu Dec 17 12:19:52 EST 2009


On Thu, 17 Dec 2009 11:51:26 -0500
Alan G Isaac <alan.isaac at gmail.com> wrote:
>          >>> re.sub('abc', r'a\nb\n.c\a','123abcdefg') == re.sub('abc', 'a\\nb\\n.c\\a',' 123abcdefg') == re.sub('abc', 'a\nb\n.c\a','123abcdefg')
>          True

Was this a straight cut and paste, or did you change it by hand?  Is
the leading space in the middle call's subject string (' 123abcdefg')
a copying error?  As written I get False, because that result starts
with an extra space.
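
A minimal sketch of the leading-space point, assuming a current
Python 3 interpreter rather than the 2.6 in the traceback further
down.  The names template, clean and spaced are only illustrative:

```python
import re

# The same replacement template is used for both calls.
template = r'a\nb\n.c\a'

clean = re.sub('abc', template, '123abcdefg')
spaced = re.sub('abc', template, ' 123abcdefg')  # note the leading space

# The only difference between the two results is that extra space.
print(clean == spaced)        # False
print(spaced == ' ' + clean)  # True
```

With the space removed from the middle call, all three results in the
quoted expression do compare equal.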

>          >>> r'a\nb\n.c\a' == 'a\\nb\\n.c\\a' == 'a\nb\n.c\a'
>          False
> 
> Why are the first two strings being treated as if they are the last one?

They aren't.  The last string is different.

>>> for x in (r'a\nb\n.c\a', 'a\\nb\\n.c\\a', 'a\nb\n.c\a'): print repr(x)
...
'a\\nb\\n.c\\a'
'a\\nb\\n.c\\a'
'a\nb\n.c\x07'
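
A short sketch of where the difference goes, assuming Python 3 (the
variable names are mine, not from the thread).  The first two literals
hold real backslashes; the third was already converted by the string
literal parser.  re.sub then applies its own escape processing to the
replacement, which is why the substitution results end up the same:

```python
import re

raw = r'a\nb\n.c\a'        # raw literal: backslashes are kept
doubled = 'a\\nb\\n.c\\a'  # doubled backslashes: same characters as raw
cooked = 'a\nb\n.c\a'      # parser has already produced newline and BEL

print(raw == doubled)          # True
print(raw == cooked)           # False
print(len(raw), len(cooked))   # 10 7

# re.sub expands \n and \a in the replacement, so raw turns into cooked.
expanded = re.sub('abc', raw, 'abc')
print(expanded == cooked)      # True
```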

> That is, why isn't '\\' being processed in the obvious way?
> This still seems wrong.  Why isn't it?

What do you think is wrong?  What would the "obvious" way of handling
'\\' be?
> 
> More simply, consider::
> 
>          >>> re.sub('abc', '\\', '123abcdefg')
>          Traceback (most recent call last):
>            File "<stdin>", line 1, in <module>
>            File "C:\Python26\lib\re.py", line 151, in sub
>              return _compile(pattern, 0).sub(repl, string, count)
>            File "C:\Python26\lib\re.py", line 273, in _subx
>              template = _compile_repl(template, pattern)
>            File "C:\Python26\lib\re.py", line 260, in _compile_repl
>              raise error, v # invalid expression
>          sre_constants.error: bogus escape (end of line)
> 
> Why is this the proper handling of what one might think would be an
> obvious substitution?

Is this what you want?  What you passed as the replacement is a single
backslash with nothing after it to escape (it runs into the end of the
string), so re barfs.

>>> re.sub('abc', r'\\', '123abcdefg')
'123\\defg'
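
A hedged sketch of the same thing under Python 3, where the failure
surfaces as re.error instead of sre_constants.error.  It also shows
one way to get a replacement taken literally: when repl is a function,
its return value is not put through escape processing.  The names
failed, escaped and literal are just for illustration:

```python
import re

# A lone backslash is an incomplete escape in a replacement template.
try:
    re.sub('abc', '\\', '123abcdefg')
    failed = False
except re.error:
    failed = True
print(failed)  # True

# Escaping the backslash in the template yields one literal backslash.
escaped = re.sub('abc', r'\\', '123abcdefg')
print(escaped)       # 123\defg
print(len(escaped))  # 8

# A function as repl skips template processing entirely.
literal = re.sub('abc', lambda m: '\\', '123abcdefg')
print(literal == escaped)  # True
```

The doubled backslash in the interpreter output above is just repr()
quoting; the result contains a single backslash.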

-- 
D'Arcy J.M. Cain <darcy at druid.net>         |  Democracy is three wolves
http://www.druid.net/darcy/                |  and a sheep voting on
+1 416 425 1212     (DoD#0082)    (eNTP)   |  what's for dinner.


