[Python-ideas] String interpolation for all literal strings

Thu Aug 6 06:18:58 CEST 2015

On 6 August 2015 at 07:24, Oscar Benjamin <oscar.j.benjamin at gmail.com> wrote:
> On 5 August 2015 at 19:56, Eric V. Smith <eric at trueblade.com> wrote:
>>
>> In the "Briefer string format" thread, Guido suggested [1] in passing
>> that it would have been nice if all literal strings had always supported
>> string interpolation.
>>
>> I've come around to this idea as well, and I'm going to propose it for
>> inclusion in 3.6. Once I'm done with my f-string PEP, I'll consider
>> either modifying it or creating a new (and very similar) PEP.
>>
>> The concept would be that all strings are scanned for \{ and } pairs. If
>> any are found, then they'd be interpreted in the same way as the other
>> discussion on "f-strings". That is, the expression between the \{ and }
>> would be extracted and searched for conversion characters and format
>> specifiers. The expression would be evaluated, converted if needed, have
>> its __format__ method called, and the resulting string inserted back in
>> to the original string.
>
> I strongly dislike this idea. One of the things I like about Python is
> the fact that a string literal is just a string literal. I don't want
> to have to scan through a large string and try to work out if it
> really is just a literal or a dynamic context-dependent expression. I
> would hold this objection if the proposal was a limited form of
> variable interpolation (akin to .format) but if any string literal can
> embed arbitrary expressions then I *really* don't like that idea.

I'm in this camp as well. We already suffer from the problem that,
unlike tuples, numbers and strings, the list, dictionary and set
"literals" are formally displays that provide a shorthand for
runtime procedural code, rather than literals that can potentially be
fully resolved at compile time.

This means there are *fundamentally* different limitations on what we
can do with them. In particular, we can take literals, constant fold
them, do various other kinds of things with them, because we *know*
they're not dependent on runtime state - we know everything we need to
know about them at compile time.
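
A minimal sketch of that difference, assuming CPython (the function
names are made up for illustration, and co_consts and constant folding
are implementation details rather than language guarantees):

```python
# Sketch only: co_consts and constant folding are CPython implementation
# details, and the function names here are hypothetical.

def literal_string():
    return "spam" + "eggs"   # both operands are literals

def list_display():
    return [1, 2, 3]         # a display: built by runtime code

# The compiler can fold the two literals into a single constant up front.
folded = "spameggs" in literal_string.__code__.co_consts

# The literal version hands back the same constant object on every call,
# while the display builds a fresh list object each time it runs.
same_string = literal_string() is literal_string()
fresh_list = list_display() is not list_display()

print(folded, same_string, fresh_list)
```

The string result is known before the function ever runs; the list
only exists once the display has actually been executed.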

This is an absolute of Python: string literals are constants, not
arbitrary code execution constructs. Our own peephole optimizer
assumes this, AST manipulation code assumes this, people reading code
assume this, people teaching Python assume this.

I already somewhat dislike the idea of having a "string display" be
introduced by something as subtle as a prefix character, but so long
as it gets its own AST node independent of the existing "I'm a
constant" string node, I can live with it. There's at least a marker
right up front to say to readers "unlike other strings, this one may
depend on runtime state". If the prefix was an exclamation mark to
further distinguish it from the alphabetical prefix characters, I'd be
even happier :)
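
For illustration, assuming a Python version where the f prefix exists
(3.6 or later, which is an assumption relative to this thread), the
ast module can show the kind of separation I mean. The helper name
below is hypothetical:

```python
import ast

def string_node_type(source):
    """Return the AST node class name for a single expression."""
    return type(ast.parse(source, mode="eval").body).__name__

# A plain string literal parses to the ordinary constant node
# (ast.Constant on 3.8+, ast.Str on older versions).
plain = string_node_type('"I am a constant"')

# A prefixed string parses to a different node type, so tools that
# assume "string node == constant" are not silently broken.
prefixed = string_node_type('f"I may depend on {state}"')

print(plain, prefixed)
```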

Dropping the requirement for the prefix *loses* expressiveness from
the language, because runtime dependent strings would no longer be
clearly distinguished from the genuine literals. Having at least f"I
may be runtime dependent!" as an indicator, and preferably !"I may be
runtime dependent!" instead, permits a clean simple syntax for
explicit interpolation. Dropping the prefix saves only one character
at writing time, while making every single string literal potentially
runtime dependent at reading time.

Editors and IDEs can also be updated far more easily, since existing
strings can continue to be marked up as is, while prefixed strings
can potentially be highlighted differently to indicate that they may
contain arbitrary code (and should also be scanned for name references
and type compatibility with string interpolation).
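
A rough sketch of how a tool might scan only the prefixed strings for
name references, again assuming a Python version with the f prefix
(3.6+) and using a hypothetical helper name:

```python
import ast

def names_in_prefixed_strings(source):
    """Collect names referenced inside prefixed (interpolated) strings."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        # Only the dedicated interpolation node is inspected; plain
        # string constants are skipped without looking inside them.
        if isinstance(node, ast.JoinedStr):
            for sub in ast.walk(node):
                if isinstance(sub, ast.Name):
                    names.add(sub.id)
    return names

sample = (
    'greeting = "hello {not_a_name}"\n'
    'message = f"hello {user} from {host}"\n'
)
found = names_in_prefixed_strings(sample)
print(sorted(found))
```

The unprefixed string never needs to be parsed for expressions, which
is exactly the property that would be lost if every literal could
interpolate.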

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia