Goofy re.sub behavior

Fri Oct 31 21:08:28 EST 2003

On Friday 31 October 2003 01:24 pm, Blair Fraser wrote:
> This seems incredibly bizarre.  I'm trying to sub in a string with a
> backslash using re.sub.  If that string has a \b in it I can't stop it
> from being interpreted as a backspace, even though the string is a raw
> string and prints correctly.
>
> >>> sub_in_x = r'{\bf}'   # This one won't work
> >>> sub_in_y = r'{\if}'   # This one will work, but I don't want a \if
> >>> print sub_in_x        # Prints with \b as a backslash and a b
>
> {\bf}
>
> >>> print sub_in_y        # Also prints correctly
>
> {\if}
>
> >>> print re.sub(r'here', sub_in_x, 'do it here!')  # ?!?
>
> do it f}!
>
> >>> print re.sub(r'here', sub_in_y, 'do it here!')  # \i works fine
>
> do it {\if}!
>
>
> In the first case the \b is being re-interpreted as a backspace and
> deleting the opening curly bracket.  However it initially prints as
> the string I want as a substitution.

This is exactly the behavior specified in the re documentation:

  sub(pattern, repl, string[, count])
    Return the string obtained by replacing the leftmost
    non-overlapping occurrences of pattern in string by the
    replacement repl. If the pattern isn't found, string is returned
    unchanged. repl can be a string or a function; if it is a string,
    any backslash escapes in it are processed. That is, "\n" is
    converted to a single newline character, "\r" is converted to a
    carriage return, and so forth. Unknown escapes such as "\j" are left
    alone. Backreferences, such as "\6", are replaced with the
    substring matched by group 6 in the pattern.
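
To see the escape processing described in that excerpt at work, here is
a small sketch that reuses the sub_in_x value from the question and
prints with repr(), so the control character shows up instead of being
acted on by the terminal.  The name "result" is only illustrative, and
the single-argument print(...) form is used so the snippet should run
under either the print statement or the print function.

```python
import re

# Raw string: five characters, namely {, backslash, b, f, }
sub_in_x = r'{\bf}'

# re.sub does its own escape processing on the replacement string,
# so the backslash-b pair becomes a single backspace character.
result = re.sub(r'here', sub_in_x, 'do it here!')

print(repr(sub_in_x))   # the replacement as Python stores it
print(repr(result))     # the backspace appears as \x08 in the repr
```

The opening brace is still in the result; the terminal merely backs up
over it when the backspace is printed, which is why the output looks
like "do it f}!".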

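A raw string only keeps the Python parser from interpreting the
backslash; re.sub still processes it afterwards.  Try doubling up your
backslashes.  Alternatively, if you don't need any of the regular
expression machinery, perhaps you could just use the 'replace' method
of the string class.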
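
A minimal sketch of the ways around this, using the strings from the
question.  The variable names are made up for illustration.  The
function form relies on re.sub inserting a callable's return value
as-is, without escape processing.

```python
import re

text = 'do it here!'
repl = r'{\bf}'   # the replacement we want inserted literally

# Option 1: double each backslash so re.sub turns the pair back
# into one literal backslash.
doubled = re.sub(r'here', repl.replace('\\', '\\\\'), text)

# Option 2: pass a function as repl; its return value is inserted
# without any escape processing.
via_function = re.sub(r'here', lambda match: repl, text)

# Option 3: no regular expression needed, so use the plain string
# replace method.
via_replace = text.replace('here', repl)

print(doubled)
print(via_function)
print(via_replace)
```

All three should print "do it {\bf}!".  The function form is handy when
the replacement text comes from elsewhere and may contain arbitrary
backslashes.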

Gary Herron