Is there a function to remove escape characters from a string ?

Stef Mientki stef.mientki at gmail.com
Thu Dec 25 16:53:34 EST 2008


Steven D'Aprano wrote:
> On Thu, 25 Dec 2008 11:00:18 +0100, Stef Mientki wrote:
>
>   
>> hello,
>>
>> Is there a function to remove escape characters from a string ?
>> (preferable all escape characters except "\n").
>>     
>
>
> Can you explain what you mean? I can think of at least four alternatives:
>   
I have the following kind of strings;
the funny "þ" is character 254 (0xFE in Latin-1 / Windows-1252, so not strictly ASCII), used as a separator character

[FSM]
Counts = "1þ11þ16"     ==>   1,11,16
Init1 = "1þ\BCtrl"     ==>    1,Ctrl
State5 = "8þ\BJUMP_COMPL\b\n>PCWrite = 1\n>PCSource = 10"
         ==> 8, JUMP_COMPL\n>PCWrite = 1\n>PCSource = 10

Having read and tested all your answers, with great solutions that I had
never seen before
(I knew nothing about escape sequences, I'm a Windows guy ;-),
I now see that the sequences I need to remove, like \B and \b, are not
used as "official" escape sequences here (\B is not one at all, and \b
would normally mean backspace).
So in this case the best (easiest to understand) method is a few replace
statements (raw strings, because a plain '\b' would mean a backspace
character rather than a backslash followed by "b"):
s = s.replace(r'\b', '').replace(r'\B', '')
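
For the record, here is a minimal sketch of the whole clean-up applied to
the values shown above. It assumes the markers are stored as a literal
backslash followed by a letter, that "þ" should become a comma, and that
"\n" should be left alone. The names SEPARATOR and clean_value are only
illustrative, not part of any library.

```python
SEPARATOR = "\xfe"  # the "þ" character, code point 254

def clean_value(s):
    # Raw strings: r"\b" is a backslash plus "b", whereas "\b" is a backspace.
    for marker in (r"\B", r"\b"):
        s = s.replace(marker, "")
    # Turn the separator into a comma; "\n" sequences are left untouched.
    return s.replace(SEPARATOR, ",")

print(clean_value("1\xfe11\xfe16"))   # prints 1,11,16
print(clean_value("1\xfe\\BCtrl"))    # prints 1,Ctrl
```

If the file has already been decoded so that "\b" arrives as a real
backspace character, the marker list would need "\b" (non-raw) instead.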

Nevertheless, thank you all for the other examples,

cheers,
Stef


> (1) Remove literal escape sequences (backslash-char):
> "abc\\t\\ad" => "abcd"
> r"abc\t\ad" => "abcd"
>
>
> (2) Replace literal escape sequences with the character they represent:
> "abc\\t\\ad" => "abc\t\ad"
>
>
> (3) Remove characters generated by escape sequences:
> "abc\t\ad" => "abcd"
> "abc" => "abc" but "a\x62c" => "ac"
>
> This is likely to be impossible without deep magic.
>
>
> (4) Remove so-called binary characters which are typically inserted using 
> escape sequences:
> "abc\t\ad" => "abcd"
> "abc" => "abc" but "a\x62c" => "abc"
>
> This is probably the easiest, assuming you have bytes instead of unicode.
>
> import string
> table = string.maketrans('', '')
> delchars =''.join(chr(n) for n in range(32))
>
> s = string.translate(s, table, delchars)
>
>
>
>   
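
For anyone reading this later with Python 3, where string.maketrans and
string.translate no longer exist, alternatives (1), (2) and (4) from the
quoted message might be sketched roughly as follows. The function names
are made up for illustration, (2) assumes ASCII-only input, and (4) keeps
"\n" as originally requested.

```python
import re

def remove_literal_escapes(s):
    # (1) Drop every backslash together with the character that follows it.
    return re.sub(r"\\.", "", s)

def expand_literal_escapes(s):
    # (2) Turn backslash sequences into the characters they stand for.
    # Assumes s is pure ASCII; encode() raises UnicodeEncodeError otherwise.
    return s.encode("ascii").decode("unicode_escape")

# (4) Code points 0-31 are the control characters; keep only the newline.
DELETE_CONTROL = {n: None for n in range(32) if chr(n) != "\n"}

def strip_control_chars(s):
    # str.translate deletes every code point that maps to None.
    return s.translate(DELETE_CONTROL)

print(remove_literal_escapes(r"abc\t\ad"))      # prints abcd
print(repr(strip_control_chars("abc\t\ad\n")))  # prints 'abcd\n'
```

Note that (4) removes characters by value, so "a\x62c" stays "abc": the
escape \x62 is just another way of writing "b".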



