String escaping utility for Python (was: Rawest raw string literals)

Sat Apr 22 20:19:50 EDT 2017

On 23 April 2017 at 00:48, Chris Angelico <rosuav at gmail.com> wrote:
> On Sun, Apr 23, 2017 at 8:30 AM, Mikhail V <mikhailwas at gmail.com> wrote:
>> The purpose is simple: reduce manual work to escape special
>> characters in string literals (and escape non-ASCII characters).
>>
>> Simple usage scenario:
>> - I have a long command-line string in some text editor.
>> - Copy this string and paste into the utility edit box
>> - In the second edit box the same string appears with escaped
>>   characters (i.e. tab becomes \t, etc.)
>> - Further, if I edit the text in the second edit box,
>>   an unescaped string appears in the first box.
>
> Easy.
>
> >>> input()
> This string has "quotes" of 'various' «styles», and \backslashes\ too.
> 'This string has "quotes" of \'various\' «styles», and \\backslashes\\ too.'
>
> The repr of a string does pretty much everything you want. If you want
> a nice GUI, you can easily put one together that uses repr() to escape
> and ast.literal_eval() to unescape.

I am sorry, could you elaborate on what you have shown here?
So in the Python console I can get an escaped string, but what
commands do you use? I never actually use the Python console :/

And yes, the idea is to have a nice GUI. And the idea is exactly the
opposite of "let everyone roll their own tool". Obviously I can spend a
day or two and create such a tool, e.g. with PyQt.
But since the task is very common and quite unambiguous, I think there
is good reason for a standard official tool.
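
As far as I understand the suggestion above, the conversion core of such
a tool would be small. A rough sketch, assuming repr() for escaping and
ast.literal_eval() for unescaping as proposed; the function names escape
and unescape and the ascii_only flag are my own hypothetical names, and
I use the built-in ascii() when non-ASCII characters should be escaped too:

```python
import ast


def escape(text, ascii_only=False):
    """Return text as a Python string literal (hypothetical helper).

    repr() escapes quotes, backslashes and control characters;
    ascii() additionally escapes every non-ASCII character.
    """
    return ascii(text) if ascii_only else repr(text)


def unescape(literal):
    """Turn a Python string literal back into the string it denotes."""
    value = ast.literal_eval(literal)
    if not isinstance(value, str):
        # literal_eval also accepts numbers, lists, etc.; reject those here.
        raise ValueError("not a string literal")
    return value


if __name__ == "__main__":
    original = 'tab\there, "quotes", \\backslashes\\ and «styles»'
    escaped = escape(original, ascii_only=True)
    print(escaped)
    print(unescape(escaped) == original)
```

A GUI would only need to call escape() when the first edit box changes
and unescape() when the second one changes.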

>
>> PS:
>> Also I remember now about the python-ideas thread
>> on entering unicode characters with decimals instead of
>> hex values. It was met somewhat negatively but then it turned out
>> that in recent Python version it can be done with f-strings.
>> E.g. a string :
>>
>> s="абв"
>> one can write as:
>> s = f"{1072:c}{1073:c}{1074:c}"
>> instead of traditional hex:
>> "\u0430\u0431\u0432"
>>
>> I was told, however, that this is not normal usage.
>> Still I find it very helpful, so if this is correct syntax, I'd
>> personally find such a conversion option also very useful.
>
> Most of the world finds the hex form MUCH more logical, since Unicode
> is built around 16s and 256s and such. Please don't proliferate more
> messes - currently, the only place I can think of where decimal is
> supported is HTML character entities, and hex is equally supported
> there.
>
> Of course, the best way to represent most non-ASCII characters is as
> themselves - s="абв" from your example. The main exception is
> combining characters and related incomplete forms, such as this table
> of diacritical marks more-or-less lifted from an app of mine:
>
> {
>     "\\`":"\u0300","\\'":"\u0301","\\^":"\u0302","\\~":"\u0303",
>     "\\-":"\u0304","\\@":"\u0306","\\.":"\u0307","\\\"":"\u0308",
>     "\\o":"\u030A","\\=":"\u030B","\\v":"\u030C","\\<":"\u0326",
>     "\\,":"\u0327","\\k":"\u0328",
> }
>
> All of them are in the 03xx range. Much easier than pointing out that
> they're in the range 768 to 879. Please stick to hex.

I don't insist on decimals. I want to use decimals for my own pleasure
in my own projects, may I?
And don't worry, in my whole life I will not produce so much software
that it will significantly increase the 'messes'.
(Anyway, I have somehow already got used to decimals, ord(char), etc.,
so for me it's too late for the ugly hex.)
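
For what it is worth, the decimal and hex spellings give the same
string, so converting between them is mechanical. A small check,
assuming Python 3.6 or later for f-strings (the variable names are only
for illustration):

```python
# The same three Cyrillic letters written three ways.
decimal_form = f"{1072:c}{1073:c}{1074:c}"  # decimal code points via the 'c' format type
hex_form = "\u0430\u0431\u0432"             # traditional hex escapes
literal_form = "абв"                        # the characters themselves

print(decimal_form == hex_form == literal_form)

# ord() gives the code point as an int; format it either way.
code_points = [ord(ch) for ch in literal_form]
as_decimal = [str(cp) for cp in code_points]
as_hex = [f"U+{cp:04X}" for cp in code_points]
print(as_decimal)
print(as_hex)

# The combining-marks block mentioned above, in both notations.
block_start, block_end = 0x0300, 0x036F
print(f"{block_start}-{block_end}")
```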

Mikhail