efficient mega-replacements

Delaney, Timothy tdelaney at avaya.com
Thu Jan 24 17:45:09 EST 2002


> From: Jonathan Hogg [mailto:jonathan at onegoodidea.com]
> 
> On 23/1/2002 19:36, in article
> mailman.1011813577.11632.python-list at python.org, "Clark C . Evans"
> <cce at clarkevans.com> wrote:
> 
> In case you hadn't realised it I thought I'd note that using "regexp"
> strings will save a lot of the stramash of backslashes in your code, as
> follows:
> 
>             val = val.replace("'","''")\
>                      .replace(r"\",r"\\\\\\\\")\
>                      .replace(chr(13)+chr(10),r"\\\\n")\
>                      .replace(chr(10),r"\\\\n")\
>                      .replace(chr(13),r"\\\\n")\
>                      .replace('"',r'\\"')
> 
> The r'' form of string literals doesn't interpret backslashes.

Firstly, they are not "regexp" strings, but "raw" strings.

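Secondly, a raw string cannot end with an unescaped backslash (more
precisely, with an odd number of backslashes), because the final backslash
would escape the closing quote. So r'\' is invalid.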
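
A small sketch to check this for yourself. The helper name
is_valid_literal is made up for illustration. The literals are built by
concatenation so that the example itself stays valid Python:

```python
import ast


def is_valid_literal(source):
    """Return True if `source` compiles as a Python expression."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True


# Source text for three literals, assembled piece by piece.
raw_one_backslash = "r'" + "\\" + "'"       # the text r'\'
raw_two_backslashes = "r'" + "\\\\" + "'"   # the text r'\\'
plain_escaped = "'" + "\\\\" + "'"          # the text '\\'

print(is_valid_literal(raw_one_backslash))    # False: the quote is escaped
print(is_valid_literal(raw_two_backslashes))  # True

# The raw form keeps both backslashes; the plain form collapses them to one.
print(len(ast.literal_eval(raw_two_backslashes)))  # 2
print(len(ast.literal_eval(plain_escaped)))        # 1
```

So a single backslash has to be written as '\\' (a normal string), not as
a raw string.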

Probably the easiest way to do this is to create a tuple of tuples
containing the match/substitute pairs (a tuple keeps them in order, and the
order in which these operations are performed is important). Then just loop
over the pairs, replacing as required.

Something along the lines of:

matchsub = (
    ("'", "''"),
    ("\\", "\\\\\\\\\\\\\\\\"),
    ("\r\n", "\n"),
    ("\r", "\n"),
    ("\n", "\\\\\\\\n"),
    ('"', '\\\\"'),
)

for m, s in matchsub:
    val = val.replace(m, s)
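
To see why the order of the pairs matters, here is a minimal sketch. The
function name apply_replacements and the two simplified pairs are
assumptions for illustration, not the pairs from the table above:

```python
def apply_replacements(text, pairs):
    """Apply each (match, substitute) pair to text, in the order given."""
    for match, substitute in pairs:
        text = text.replace(match, substitute)
    return text


# Hypothetical pairs: escape backslashes first, then double quotes.
backslash_first = (
    ("\\", "\\\\"),   # \  becomes  \\
    ('"', '\\"'),     # "  becomes  \"
)
# The same pairs in the opposite order.
quote_first = tuple(reversed(backslash_first))

sample = 'say "hi"'
right = apply_replacements(sample, backslash_first)
wrong = apply_replacements(sample, quote_first)

print(right)  # say \"hi\"
print(wrong)  # say \\"hi\\"
```

With the quotes escaped first, the backslashes that step adds are
themselves escaped by the next step, which is almost certainly not what
you want.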

Note that I have changed chr(13) to '\r' and chr(10) to '\n'. Also note that
the easiest way to deal with \r\n, \r and \n is to convert them all to one
type of line ending, then do whatever you want with them.
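
A rough sketch of that normalisation step on its own (the function name
normalize_newlines is made up for illustration):

```python
def normalize_newlines(text):
    """Convert CRLF and lone CR line endings to LF."""
    # CRLF must be handled first, otherwise each CRLF would turn into
    # two line feeds when the lone CR replacement runs.
    return text.replace("\r\n", "\n").replace("\r", "\n")


mixed = "one\r\ntwo\rthree\nfour"
unified = normalize_newlines(mixed)

print(repr(unified))       # 'one\ntwo\nthree\nfour'
print(unified.count("\n"))  # 3
print("\r" in unified)      # False
```

After this, only "\n" needs a match/substitute pair of its own.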

Tim Delaney



