Unrecognized escape sequences in string literals

Douglas Alan darkwater42 at gmail.com
Wed Aug 12 13:47:55 EDT 2009


On Aug 12, 3:36 am, Steven D'Aprano
<ste... at REMOVE.THIS.cybersource.com.au> wrote:

> On Tue, 11 Aug 2009 13:20:52 -0700, Douglas Alan wrote:

> > My "Annotated C++ Reference Manual" is packed, and surprisingly in
> > Stroustrup's Third Edition, there is no mention of the issue in the
> > entire 1,000 pages. But Microsoft to the rescue:
>
> >      If you want a backslash character to appear within a string, you
> >      must type two backslashes (\\)
>
> > (From http://msdn.microsoft.com/en-us/library/69ze775t.aspx)
>
> Should I assume that Microsoft's C++ compiler treats it as an error, not
> a warning?

In my experience, C++ compilers generally generate warnings for such
situations, where they can. (Clearly, they often can't generate
warnings for running off the end of an array, which is also undefined,
though a really smart C++ compiler might be able to generate a warning
in certain such circumstances.)

> Or is this *still* undefined behaviour, and the MS C++ compiler
> will happily compile "ab\cd" to whatever it feels like?

If it's a decent compiler, it will generate a warning. With Microsoft,
who can say? Either way, it's clearly documented as illegal code.

> > The question of what any specific C++ compiler does if you ignore the
> > warning is irrelevant, as such behavior in C++ is almost *always*
> > undefined. Hence the warning.
>
> So a C++ compiler which follows Python's behaviour would be behaving
> within the language specifications.

It might be, but there are also *recommendations* in the C++ standard
about what to do in such situations, and the recommendations say, I am
pretty sure, not to do that, unless the particular compiler in
question has to meet some very specific backward compatibility needs.

> I note that the bash shell, which claims to follow C semantics, also does
> what Python does:
>
> $ echo $'a s\trin\g with escapes'
> a s     rin\g with escapes

Really? Not on my computers. (One is a Mac, and the other is a Fedora
Core Linux box.) On my computers, bash doesn't seem to have *any*
escape sequences, other than \\, \", \$, and \`. It seems to treat
unknown escape sequences the same as Python does, but as there are
only four known escape sequences, and they are all meant merely to
guard against string interpolation, and the like, it's pretty darn
easy to keep straight.
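
For reference, here is a minimal sketch of the Python behaviour being
compared against: a recognized escape such as \t is translated, while an
unrecognized one such as \g is left in the string with its backslash. The
literal is compiled from a raw string so the example file contains no
unrecognized escape itself. The file name "<example>" is just a
placeholder, and whether a warning is recorded (and of which category)
depends on the Python version, so treat that part as an assumption.

```python
import warnings

# Source text of a string literal containing an unrecognized escape, \g.
# A raw string is used so this file itself contains no such escape.
source = r'"a s\trin\g with escapes"'

with warnings.catch_warnings(record=True) as caught:
    # Record any warning the compiler emits instead of printing or raising it.
    warnings.simplefilter("always")
    value = eval(compile(source, "<example>", "eval"))

# \t became a tab; \g was not recognized, so the backslash was kept.
print(repr(value))

# Newer Python versions warn about the unrecognized escape; older ones are silent.
print([w.category.__name__ for w in caught])
```

The point of the sketch is only that the string still compiles and the
backslash survives, which is the behaviour under discussion.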

> Explain to me again why we're treating underspecified C++ semantics,
> which may or may not do *exactly* what Python does, as if it were the One
> True Way of treating escape sequences?

I'm not saying that C++ does it right for Python. The right thing for
Python to do is to generate an error, as Python doesn't have to deal
with all the crazy complexities that C++ has to.
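
To make "generate an error" concrete, here is a rough sketch of the kind
of check I have in mind, written as an ordinary function rather than as a
change to the compiler. The names RECOGNIZED and check_escapes are
hypothetical, and the check only looks at the character right after each
backslash; it does not verify that \x, \u, \U, or \N are followed by
well-formed arguments.

```python
import re

# Characters that may follow a backslash in a Python str literal:
# newline (line continuation), backslash, quotes, the single-letter escapes,
# octal digits, and the x / N / u / U prefixes.
RECOGNIZED = set("\n\\'\"abfnrtv01234567xNuU")


def check_escapes(literal_body):
    """Raise ValueError if the literal text contains an unrecognized escape.

    Hypothetical helper: literal_body is the text between the quotes,
    exactly as it appears in the source file.
    """
    # Each match consumes a backslash plus the next character, so an escaped
    # backslash is never mistaken for the start of another escape.
    for match in re.finditer(r"\\(.)", literal_body, re.DOTALL):
        if match.group(1) not in RECOGNIZED:
            raise ValueError("unrecognized escape sequence " + match.group(0))


check_escapes(r"a s\trin with escapes")  # passes: \t is recognized

try:
    check_escapes(r"a s\trin\g with escapes")
except ValueError as err:
    print(err)  # prints: unrecognized escape sequence \g
```

With a rule like this, a stray backslash is reported where it is written
instead of silently ending up in the string.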

|>ouglas


