Unrecognized backslash escapes in string literals

Dave Angel davea at davea.name
Sun Feb 22 22:01:52 EST 2015


On 02/22/2015 09:41 PM, Ben Finney wrote:
> Chris Angelico <rosuav at gmail.com> writes:
>
>> In Python, unrecognized escape sequences are treated literally,
>> without (as far as I can tell) any sort of warning or anything.
>
> Right. Text string literals are documented to work that way
> <URL:https://docs.python.org/3/library/stdtypes.html#text-sequence-type-str>,
> which refers the reader to the language reference
> <URL:https://docs.python.org/3/reference/lexical_analysis.html#strings>.
>
>> Why is it that Python interprets them this way, and doesn't even give
>> a warning?
>
> Because the interpretation of those literals is unambiguous and correct.

Correct according to a misguided language definition.

>
> It's unfortunate that MS Windows inherited the incompatible “backslash
> is a path separator”, long after backslash was already established in
> many programming languages as the escape character.

Windows "inherited" it from DOS.  But since Windows was nothing but a 
DOS shell for several years, that's not surprising.  The historical 
problem came from CP/M's use of the forward slash for a 
switch-character.  Since MSDOS/PCDOS/QDOS was trying to permit 
transliterated CP/M programs, and because subdirectories were an 
afterthought (version 2.0), they felt they needed to pick a different 
character.  At one time, the switch-character could be set by the user, 
but most programs ignored that, so it died.

>
>> Is there a way to enable such warnings/errors?
>
> A warning or error for a correctly formatted literal with an unambiguous
> meaning would be an un-Pythonic thing to have.
>
> I can see the motivation, but really the best solution is to learn that
> the backslash is an escape character in Python text string literals.
>
> This has the advantage that it's the same escape character used for text
> string literals in virtually every other programming language, so you're
> not needing to learn anything unusual.
>

I might be able to buy that argument if it was done the same way, but as 
it says in:
   https://docs.python.org/3/reference/lexical_analysis.html#strings

"""Unlike Standard C, all unrecognized escape sequences are left in the 
string unchanged, i.e., the backslash is left in the result. (This 
behavior is useful when debugging: if an escape sequence is mistyped, 
the resulting output is more easily recognized as broken.)
"""

The word "broken" is an admission that this was a flawed approach.  If 
it's broken, it should be an error.

I'm not suggesting that the implementation should falsely trigger an 
error, but that the language definition should be changed to define it 
as an error.
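
To make concrete what such an error would catch, here is a rough sketch
of a checker. The names RECOGNIZED and find_unrecognized_escapes are
hypothetical, not part of Python; the sketch only looks at the character
after each backslash in a non-raw str literal and does not validate the
digits that follow \x, \u, \U or the name in \N{...}.

```python
import re

# Characters that may follow a backslash in a Python 3 str literal,
# per the lexical analysis reference (a newline means line continuation).
RECOGNIZED = set("\n\\'\"abfnrtvxNuU01234567")

def find_unrecognized_escapes(body):
    """Return each backslash pair in `body` that is not a recognized escape.

    `body` is the raw source text between the quotes of a non-raw str literal.
    """
    return [
        match.group(0)
        for match in re.finditer(r"\\(.)", body, re.DOTALL)
        if match.group(1) not in RECOGNIZED
    ]

print(find_unrecognized_escapes(r"C:\data\temp\new"))  # ['\\d']
print(find_unrecognized_escapes(r"C:\\data"))          # []
```

Scanning left to right means a doubled backslash is consumed as one
recognized pair, so the "d" after it is not misreported.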

-- 
DaveA


