Goofy re.sub behavior

Gary Herron gherron at islandtraining.com
Fri Oct 31 21:08:28 EST 2003


On Friday 31 October 2003 01:24 pm, Blair Fraser wrote:
> This seems incredibly bizarre.  I'm trying to sub in a string with a
> backslash using re.sub.  If that string has a \b in it I can't stop it
> from being interpreted as a backspace, even though the string is a raw
> string and prints correctly.
>
> >>> sub_in_x = r'{\bf}'   # This one won't work
> >>> sub_in_y = r'{\if}'   # This one will work, but I don't want a \if
> >>> print sub_in_x        # Prints with \b as a slash and a b
>
> {\bf}
>
> >>> print sub_in_y        # Also prints correctly
>
> {\if}
>
> >>> print re.sub(r'here', sub_in_x, 'do it here!')  # ?!?
>
> do it f}!
>
> >>> print re.sub(r'here', sub_in_y, 'do it here!')  # \i works fine
>
> do it {\if}!
>
>
> In the first case the \b is being re-interpreted as a backspace and
> deleting the opening curly bracket.  However it initially prints as
> the string I want as a substitution.


This is exactly the behavior specified in the re documentation:

  sub(pattern, repl, string[, count])
    Return the string obtained by replacing the leftmost
    non-overlapping occurrences of pattern in string by the
    replacement repl. If the pattern isn't found, string is returned
    unchanged. repl can be a string or a function; if it is a string,
    any backslash escapes in it are processed. That is, "\n" is
    converted to a single newline character, "\r" is converted to a
    carriage return, and so forth. Unknown escapes such as "\j" are left
    alone. Backreferences, such as "\6", are replaced with the
    substring matched by group 6 in the pattern.
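
As a rough illustration of the escape processing described above, here
is a minimal sketch in Python 3 syntax (the sample strings are made up
for the example). Note that current Python 3 releases raise re.error
for an unknown escape made of a backslash and an ASCII letter, rather
than leaving it alone as the older documentation quoted here says.

```python
import re

# "\n" in the replacement string becomes a real newline character,
# even though the replacement is written as a raw string.
with_newline = re.sub(r', ', r'\n', 'a, b')

# "\1" and "\2" are backreferences to groups matched by the pattern.
swapped = re.sub(r'(\w+)@(\w+)', r'\2 at \1', 'gherron@islandtraining')

print(repr(with_newline))
print(swapped)
```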

Try doubling up your backslashes, so that re.sub sees an escaped
backslash instead of the "\b" escape.  Alternatively, if you don't need
any of the regular expression machinery, you could just use the
'replace' method of the string class.
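
The options can be sketched as follows, in Python 3 syntax and reusing
the strings from the question. The function-replacement variant is an
addition not mentioned above: re.sub inserts a function's return value
without processing backslash escapes. The variable names are only for
illustration.

```python
import re

text = 'do it here!'
wanted = r'{\bf}'  # a literal backslash followed by "bf"

# Passed as-is, re.sub turns the \b in the replacement into a
# backspace character (\x08).
processed = re.sub(r'here', wanted, text)

# Doubling each backslash makes re.sub emit a single literal backslash.
doubled = re.sub(r'here', wanted.replace('\\', '\\\\'), text)

# A function replacement: its return value is inserted verbatim.
via_function = re.sub(r'here', lambda match: wanted, text)

# No regular expression needed: plain string replacement.
via_replace = text.replace('here', wanted)

print(repr(processed))
print(doubled)
print(via_function)
print(via_replace)
```

The last three calls all produce the string the question asked for;
str.replace is the simplest when the pattern is a fixed string.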

Gary Herron