Performing a number of substitutions on a unicode string

Arnaud Delobelle arnodel at gmail.com
Tue Dec 20 10:04:35 EST 2011


On 20 December 2011 14:54, Tim Chase <python.list at tim.thechases.com> wrote:
> On 12/20/11 08:02, Arnaud Delobelle wrote:
>>
>> Hi all,
>>
>> I've got to escape some unicode text according to the following map:
>>
>> escape_map = {
>>     u'\n': u'\\n',
>>     u'\t': u'\\t',
>>     u'\r': u'\\r',
>>     u'\f': u'\\f',
>>     u'\\': u'\\\\'
>> }
>>
>> The simplest solution is to use str.replace:
>>
>> def escape_text(text):
>>     return (text.replace('\\', '\\\\').replace('\n', '\\n')
>>             .replace('\t', '\\t').replace('\r', '\\r').replace('\f', '\\f'))
>>
>> But it creates 4 intermediate strings, which is quite inefficient
>> (I've got tens of MB of unicode strings to escape).
>
>
> You might try
>
>  def escape_text(text):
>    return text.encode("string_escape")
>

I don't think this kind of approach would work, as I'm not encoding
unicode to a byte string: both the input and the output of the
"escape_text" function are unicode.
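
For what it's worth, here is a rough sketch of two single-pass
alternatives that stay unicode-to-unicode. It is written for Python 3,
where str is unicode; on Python 2 the literals would need the u prefix.
The names _escape_table, _escape_re and escape_text_re are made up for
illustration, and I haven't timed either version against the chained
replace() calls, so treat any speed benefit as an assumption to measure.

```python
import re

# Same mapping as escape_map in the original question.
escape_map = {
    '\n': '\\n',
    '\t': '\\t',
    '\r': '\\r',
    '\f': '\\f',
    '\\': '\\\\',
}

# Option 1: translate() wants a table keyed by code point (an int),
# and the values may be replacement strings of any length.
_escape_table = {ord(char): repl for char, repl in escape_map.items()}

def escape_text(text):
    """Escape text in a single pass using str.translate."""
    return text.translate(_escape_table)

# Option 2: one regex pass; each matched character is looked up in the map.
_escape_re = re.compile(r'[\n\t\r\f\\]')

def escape_text_re(text):
    """Escape text in a single pass using re.sub with a callback."""
    return _escape_re.sub(lambda match: escape_map[match.group(0)], text)
```

Both functions take and return unicode, and every character is visited
once, so a backslash produced by one substitution is never escaped a
second time.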

-- 
Arnaud


