[Python-ideas] String interpolation for all literal strings

Guido van Rossum guido at python.org
Thu Aug 6 16:28:00 CEST 2015


On Thu, Aug 6, 2015 at 6:18 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:

> On 6 August 2015 at 07:24, Oscar Benjamin <oscar.j.benjamin at gmail.com>
> wrote:
> > I strongly dislike this idea. One of the things I like about Python is
> > the fact that a string literal is just a string literal. I don't want
> > to have to scan through a large string and try to work out if it
> > really is just a literal or a dynamic context-dependent expression. I
> > would hold this objection if the proposal was a limited form of
> > variable interpolation (akin to .format) but if any string literal can
> > embed arbitrary expressions then I *really* don't like that idea.
>
> I'm in this camp as well. We already suffer from the problem that,
> unlike tuples, numbers and strings, list, dictionary and set
> "literals" are actually formally displays that provide a shorthand for
> runtime procedural code, rather than literals that can potentially be
> fully resolved at compile time.
>
> This means there are *fundamentally* different limitations on what we
> can do with them. In particular, we can take literals, constant fold
> them, do various other kinds of things with them, because we *know*
> they're not dependent on runtime state - we know everything we need to
> know about them at compile time.
>

I don't buy this argument. We already arrange things so that (x, y) invokes
a tuple constructor after loading x and y, while (1, 2) is loaded as a
single constant.

Syntactically, "xyzzy" remains a constant, while "the \{x} and the \{y}"
becomes an expression that (among other things) loads the values of x and y.
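
As a rough illustration of the tuple case, the sketch below compiles both
expressions and inspects the result. It assumes CPython (constant folding is
an implementation detail, not a language guarantee), and the helper name
opnames is made up for this example.

```python
import dis

def opnames(expr):
    """Compile an expression; return its code object and its opcode names."""
    code = compile(expr, "<example>", "eval")
    return code, [ins.opname for ins in dis.get_instructions(code)]

# x and y are never evaluated here; the expressions are only compiled.
const_code, const_ops = opnames("(1, 2)")
build_code, build_ops = opnames("(x, y)")

# On CPython, (1, 2) is expected to be stored as one folded constant ...
print((1, 2) in const_code.co_consts)
# ... while (x, y) loads both names and then builds the tuple at runtime.
print("BUILD_TUPLE" in build_ops)
print("BUILD_TUPLE" in const_ops)
```

The same source-level syntax thus already compiles to either a constant or a
runtime construction, depending on what appears inside the parentheses.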


> This is an absolute of Python: string literals are constants, not
> arbitrary code execution constructs. Our own peephole generator
> assumes this, AST manipulation code assumes this, people reading code
> assume this, people teaching Python assume this.
>
> I already somewhat dislike the idea of having a "string display" be
> introduced by something as subtle as a prefix character, but so long
> as it gets its own AST node independent of the existing "I'm a
> constant" string node, I can live with it. There's at least a marker
> right up front to say to readers "unlike other strings, this one may
> depend on runtime state". If the prefix was an exclamation mark to
> further distinguish it from the alphabetical prefix characters, I'd be
> even happier :)
>
> Dropping the requirement for the prefix *loses* expressiveness from
> the language, because runtime dependent strings would no longer be
> clearly distinguished from the genuine literals. Having at least f"I
> may be runtime dependent!" as an indicator, and preferably !"I may be
> runtime dependent!" instead, permits a clean simple syntax for
> explicit interpolation, and dropping the prefix saves only one
> character at writing time, while making every single string literal
> potentially runtime dependent at reading time.
>

Here you're just expressing the POV of someone coming from Python 3.5 (or
earlier). To future generations, like to users of all those languages
mentioned in the Wikipedia article, it'll be second nature to scan string
literals for interpolations, and since most strings are short most readers
won't even be aware that they're doing it. And if there's a long string
(say some template) somewhere, you have to look carefully anyway to notice
things like an embedded "+x+" somewhere, or a trailing method call (e.g.
.strip()).


> Editors and IDEs can also be updated far more easily, since existing
> strings can continue to be marked up as is, while prefixed strings
> can potentially be highlighted differently to indicate that they may
> contain arbitrary code (and should also be scanned for name references
> and type compatibility with string interpolation).
>

For an automated tool it's trivial to scan strings for \{. And yes, the
part between \{ and } should be marked up differently (and probably the
:format or !r/!s differently again).
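
A minimal sketch of such a scan, assuming the \{...} syntax discussed in this
thread and ignoring nested braces and escaped backslashes. The pattern and the
function name find_interpolations are hypothetical, not part of any proposal
or existing tool.

```python
import re

# Hypothetical pattern for the proposed syntax: a backslash, an opening
# brace, then everything up to the next closing brace. Nested braces and
# escaped backslashes are deliberately not handled in this sketch.
INTERPOLATION = re.compile(r"\\\{([^{}]*)\}")

def find_interpolations(source_text):
    """Return (start, end, inner_text) for each backslash-brace span."""
    return [(m.start(), m.end(), m.group(1))
            for m in INTERPOLATION.finditer(source_text)]

# A raw string keeps the backslashes exactly as they would appear in source.
sample = r"the \{x} and the \{y!r}"
spans = find_interpolations(sample)
for start, end, inner in spans:
    print(start, end, inner)
```

A highlighter could use the (start, end) offsets to mark up each span, and
split the inner text on "!" or ":" to style the conversion or format part
separately.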

Also, your phrase "contain arbitrary code" still sounds like a worry about
code injection. You might as well worry about code injection in function
calls.

-- 
--Guido van Rossum (python.org/~guido)