[Python-ideas] Draft PEP on string interpolation

Nick Coghlan ncoghlan at gmail.com
Fri Aug 21 08:40:19 CEST 2015


On 21 August 2015 at 12:52, Guido van Rossum <guido at python.org> wrote:
> Yeah, I think Nick meant that as a way of implementing the "formatting
> mini-language" for bytes, given that bytes don't have __format__ or format.
> But using %(name)s for the *syntax* in bytes was never on the table. I think
> we're better off not supporting this type of string interpolation for bytes
> at all.

Yeah, I'm OK with doing this as a text-only thing - while printf-style
formatting is certainly useful, binary data is still often best
approached as a serialisation problem more so than as an interpolation
one.

I really like Mike's language survey in his draft, and the main thing
I'd highlight in relation to that is that the interpolation syntax
used in JavaScript (with the leading "$" for substitution expressions)
is essentially the same as that used in PEPs 215, 292 & 501 (with the
key difference being to make the braces optional when leaving them out
is unambiguous).
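
For anyone who wants to try the optional-braces rule, the PEP 292 form is
already available as string.Template in the standard library. A small
sketch (the template text and variable names are only illustrative):

```python
from string import Template

# Braces can be left out when the end of the name is unambiguous...
plain = Template("Hello $name!").substitute(name="world")

# ...but are needed when the name runs straight into identifier characters.
braced = Template("${noun}s are interpolated").substitute(noun="value")

print(plain)
print(braced)
```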

One key pragmatic benefit of that is that I expect the number of folks
needing to context switch between JavaScript code and Python code will
vastly outstrip the number of folks context switching between C# and
Python.

One key compatibility benefit of that particular syntax is that it
interoperates much better with the "{{ global_variable }}"
substitution used for Mozilla's l20n templating (http://l20n.org/).
That makes it more compatible with the similar syntax used for Django
and Jinja2 variable substitution, and the "{% %}" syntax used for
Django and Jinja2 blocks.

However, those latter examples *do* highlight a "What could possibly
go wrong?" question we need to ensure we ask, which is how we want to
address the likelihood of folks writing things like:

    myquery = i"SELECT $column FROM $table;"
    mycommand = i"cat $filename"
    mypage = i"<html><body>$content</body></html>"

It's the opposite of the "interpolating untrusted strings that may
contain arbitrary expressions" problem - what happens when the
variables being *substituted* are untrusted? It's easy to say "don't
do that", but if doing the right thing incurs all the repetition
currently involved in calling str.format, we're going to see a *lot*
of people doing the wrong thing. At that point, the JavaScript
backticks-with-arbitrary-named-callable solution starts looking very
attractive:

    myquery = sql`SELECT $column FROM $table;`
    mycommand = sh`cat $filename`
    mypage = html`<html><body>$content</body></html>`
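
To make the difference concrete, here is a rough emulation in today's
Python of what a named "html" callable might do with the substituted
values. This is only a sketch: html_interpolate is a hypothetical helper,
string.Template stands in for the proposed syntax, and real HTML
templating needs context-aware escaping rather than one blanket
html.escape call.

```python
import html
from string import Template


def html_interpolate(template, **values):
    """Hypothetical stand-in for html`...`: escape values, then substitute."""
    escaped = {name: html.escape(str(value)) for name, value in values.items()}
    return Template(template).substitute(escaped)


# Untrusted input that would be substituted into the page.
content = "<script>alert('oops')</script>"

# Plain interpolation passes the markup through untouched.
naive = Template("<body>$content</body>").substitute(content=content)

# The named callable gets a chance to neutralise it first.
safe = html_interpolate("<body>$content</body>", content=content)

print(naive)
print(safe)
```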

At that point, internationalisation could just be:

    translated = _`This $value and this $other_value are interpolated after translation lookup`

From an implementation perspective, that could be a matter of:

* adding a new "__interpolate__" magic method with a suitable signature
* changing the builtin "format" to implement __interpolate__ as str.format
* adding an "interpolator" builtin decorator that just did:

    def interpolator(f):
        f.__interpolate__ = f.__call__
        return f
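
As a very rough illustration of how those pieces might fit together, the
sketch below emulates the dispatch in current Python. Everything beyond
the decorator itself is an assumption made for the example: the
(template, namespace) signature, the interpolate() helper standing in for
the new syntax, and the use of string.Template and shlex.quote inside
"sh".

```python
import shlex
from string import Template


def interpolator(f):
    # Same decorator as above: expose the callable through the new hook.
    f.__interpolate__ = f.__call__
    return f


def interpolate(target, template, namespace):
    """Hypothetical desugaring of target`template` (signature is assumed)."""
    return target.__interpolate__(template, namespace)


@interpolator
def sh(template, namespace):
    # Quote every substituted value before it reaches the command line.
    quoted = {name: shlex.quote(str(value)) for name, value in namespace.items()}
    return Template(template).substitute(quoted)


filename = "notes.txt; rm -rf ~"
mycommand = interpolate(sh, "cat $filename", {"filename": filename})
print(mycommand)
```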

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

