[Python-ideas] Make undefined escape sequences have SyntaxWarnings

MRAB python at mrabarnett.plus.com
Thu Oct 11 10:17:48 EDT 2012


On 2012-10-11 06:34, Greg Ewing wrote:
> Steven D'Aprano wrote:
>> If you escape a character, you should get
>> something. If it's a special character, you get the special meaning.
>> If it's not, escaping should be transparent: escaping something that
>> doesn't need escaping is a null op
>
> I think that calling "\n", "\t" etc. "escape sequences" is a misnomer
> that is causing confusion in this discussion.
>
> The term "escape" in this context means to prevent something from
> having a special meaning that it would otherwise have. But the
> backslash in these is being used to *give* a special meaning to
> the following character.
>
> In Python string literals, the only true escape sequences associated
> with the backslash are '\\', "\'" and '\"'.
>
> So the backslash is a bit schizophrenic -- sometimes it's an escape
> character, sometimes it's a prefix that imparts a special meaning.
>
> This means that "\c" where c is not special in any way is somewhat
> ambiguous. Are you redundantly escaping something that doesn't
> need it, are you asking for a special meaning that doesn't exist
> (which is probably a mistake), or do you just want a literal
> backslash?
>
> Python guesses that you want a literal backslash. This seems to be
> motivated by the desire to minimise the need for backslash doubling.
> That sounds fine in theory, but I don't think it helps much in
> practice. I for one don't trust myself to keep the entire set of
> special characters in my head, including all the rarely-used ones,
> so I end up doubling every backslash anyway.
>
> Given that, I wouldn't have minded at all if Python had refused
> to guess in this case, and raised a compile-time error. That would
> have left the way open for extending the set of special chars in
> the future.
>
>> Adding a new escape sequence is almost as big a step as adding a new
>> built-in or new syntax. I see that as a good thing, it discourages too
>> many requests for new escape sequences.
>
> I don't see it makes much difference. We get plenty of requests for
> new syntax of all kinds, and we seem to have enough sense to reject
> them unless they're backed by extremely good arguments. There's no
> reason requests for new special chars should be treated any differently.
>
My own preference is that a backslash followed by an ASCII letter or
digit either has a special meaning currently (with a compile-time error
if it's not correctly formed) or is reserved for future use (with a
compile-time error currently), and that a backslash followed by any other
character (codepoint) is a literal (although there may be some exceptions
to that, such as a backslash followed by a newline being ignored).
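
As a rough illustration of that rule, here is a sketch only. The names
KNOWN_ESCAPES and classify are made up for this example, the set of
currently meaningful escapes is my reading of the Python 3 language
reference for str literals, and I'm assuming "is a literal" means the
character after the backslash is taken literally:

```python
import string

# Characters that currently start a meaningful escape in a Python 3 str
# literal (assumed list): \a \b \f \n \r \t \v \x \N \u \U and octal \0-\7.
KNOWN_ESCAPES = set("abfnrtvxNuU01234567")

# ASCII letters and digits: the only characters the rule would reserve.
ASCII_ALNUM = set(string.ascii_letters + string.digits)


def classify(ch):
    """Classify the single character that follows a backslash.

    Hypothetical helper: it only names the category, it does not parse
    or validate the rest of the escape (e.g. the hex digits after \\x).
    """
    if ch == "\n":
        return "ignored"      # the exception: backslash-newline
    if ch in KNOWN_ESCAPES:
        return "special"      # has a meaning now; malformed use is an error
    if ch in ASCII_ALNUM:
        return "reserved"     # no meaning yet; compile-time error for now
    return "literal"          # anything else, including \\ \' \" and non-ASCII


for ch in ["n", "d", "8", "\\", "'", "\n"]:
    print(repr(ch), classify(ch))
```

So "\d" and "\8" would be rejected today and stay free for later use,
while "\\", "\'" and a backslash before punctuation or a non-ASCII
codepoint would never need to change meaning.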


